feat: Add test introspection APIs and harness test management

- Introduced new gRPC service methods: GetTestStatus, ListTests, and GetTestResults for enhanced test introspection.
- Defined corresponding request and response message types in the proto file.
- Implemented test harness execution tracking in TestManager, including methods to register, mark, and retrieve test execution details.
- Enhanced test logging and summary capabilities to support introspection features.
- Updated existing structures to accommodate new test management functionalities.
This commit is contained in:
scawful
2025-10-02 15:42:07 -04:00
parent 3a573c0764
commit b3bcd801a0
8 changed files with 1217 additions and 621 deletions

View File

@@ -96,21 +96,27 @@ The z3ed CLI and AI agent workflow system has completed major infrastructure mil
- **Test Management**: Can't query test status, results, or execution queue
#### IT-05: Test Introspection API (6-8 hours)
**Implementation Tasks**:
1. **Add GetTestStatus RPC**:
- Query status of queued/running tests by ID
- Return test state: queued, running, passed, failed, timeout
- Include execution time, error messages, assertion failures
2. **Add ListTests RPC**:
- Enumerate all registered tests in ImGuiTestEngine
- Filter by category (grpc, unit, integration, e2e)
- Return test metadata: name, category, last run time, pass/fail count
3. **Add GetTestResults RPC**:
- Retrieve detailed results for completed tests
- Include assertion logs, performance metrics, resource usage
- Support pagination for large result sets
**Status (Oct 2, 2025)**: 🟡 *Server-side RPCs implemented; CLI + E2E pending*
**Progress**:
- `imgui_test_harness.proto` expanded with GetTestStatus/ListTests/GetTestResults messages.
- `TestManager` maintains execution history (queued→running→completed) with logs, metrics, and aggregates.
-`ImGuiTestHarnessServiceImpl` exposes the three introspection RPCs with pagination, status conversion, and log/metric marshalling.
- ⚠️ `agent` CLI commands (`test status`, `test list`, `test results`) still stubbed.
- ⚠️ End-to-end introspection script (`scripts/test_introspection_e2e.sh`) not implemented; regression script `test_harness_e2e.sh` currently failing because it references the unfinished CLI.
**Immediate Next Steps**:
1. **Wire CLI Client Methods**
- Implement gRPC client wrappers for the new RPCs in the automation client.
- Add user-facing commands under `z3ed agent test ...` with JSON/YAML output options.
2. **Author E2E Validation Script**
- Spin up harness, run Click/Assert workflow, poll via `agent test status`, fetch results.
- Update CI notes with the new script and expected output.
3. **Documentation & Examples**
- Extend `E6-z3ed-reference.md` with full usage examples and sample outputs.
- Add troubleshooting section covering common errors (unknown test_id, timeout, etc.).
4. **Stretch (Optional Before IT-06)**
- Capture assertion metadata (expected/actual) for richer `AssertionResult` payloads.
**Example Usage**:
```bash

View File

@@ -1,4 +1,14 @@
# IT-05: T## Motivation
# IT-05: Test Introspection API Implementation Guide
**Status (Oct 2, 2025)**: 🟡 *Server-side RPCs complete; CLI + E2E pending*
## Progress Snapshot
- ✅ Proto definitions and service stubs added for `GetTestStatus`, `ListTests`, `GetTestResults`.
-`TestManager` now records execution lifecycle, aggregates, logs, and metrics with thread-safe history trimming.
-`ImGuiTestHarnessServiceImpl` implements the three RPC handlers, including pagination and status conversion helpers.
- ⚠️ CLI wiring, automation client calls, and user-facing output still TODO.
- ⚠️ End-to-end validation script (`scripts/test_introspection_e2e.sh`) not yet authored.
**Current Limitations**:
- ❌ Tests execute asynchronously with no way to query status
@@ -7,7 +17,7 @@
- ❌ Results lost after test completion
- ❌ Can't track test history or identify flaky tests
**Why This Blocks AI Agent Autonomy**:
**Why This Blocks AI Agent Autonomy**
Without test introspection, **AI agents cannot implement closed-loop feedback**:
@@ -62,7 +72,8 @@ Add test introspection capabilities to enable clients to query test execution st
- ❌ Results lost after test completion
- ❌ Can't track test history or identify flaky tests
**Benefits After IT-05**:
**Benefits After IT-05**
- ✅ AI agents can reliably poll for test completion
- ✅ CLI can show real-time progress bars
- ✅ Test history enables trend analysis
@@ -208,166 +219,20 @@ message AssertionResult {
## Implementation Steps
### Step 1: Extend TestManager (2-3 hours)
### Step 1: Extend TestManager (✔️ Completed)
#### 1.1 Add Test Execution Tracking
**What changed**:
- Introduced `HarnessTestExecution`, `HarnessTestSummary`, and related enums in `test_manager.h`.
- Added registration, running, completion, log, and metric helpers with `absl::Mutex` guarding (`RegisterHarnessTest`, `MarkHarnessTestRunning`, `MarkHarnessTestCompleted`, etc.).
- Stored executions in `harness_history_` + `harness_aggregates_` with deque-based trimming to avoid unbounded growth.
**File**: `src/app/core/test_manager.h`
**Where to look**:
- `src/app/test/test_manager.h` (see *Harness test introspection (IT-05)* section around `HarnessTestExecution`).
- `src/app/test/test_manager.cc` (functions `RegisterHarnessTest`, `MarkHarnessTestCompleted`, `AppendHarnessTestLog`, `GetHarnessTestExecution`, `ListHarnessTestSummaries`).
```cpp
#include <map>
#include <vector>
#include "absl/synchronization/mutex.h"
#include "absl/time/time.h"
class TestManager {
public:
enum class TestStatus {
UNKNOWN = 0,
QUEUED = 1,
RUNNING = 2,
PASSED = 3,
FAILED = 4,
TIMEOUT = 5
};
struct TestExecution {
std::string test_id;
std::string name;
std::string category;
TestStatus status;
absl::Time queued_at;
absl::Time started_at;
absl::Time completed_at;
absl::Duration execution_time;
std::string error_message;
std::vector<std::string> assertion_failures;
std::vector<std::string> logs;
std::map<std::string, int32_t> metrics;
};
// NEW: Introspection API
absl::StatusOr<TestExecution> GetTestStatus(const std::string& test_id);
std::vector<TestExecution> ListTests(const std::string& category_filter = "");
absl::StatusOr<TestExecution> GetTestResults(const std::string& test_id);
// NEW: Recording test execution
void RecordTestStart(const std::string& test_id, const std::string& name,
const std::string& category);
void RecordTestComplete(const std::string& test_id, TestStatus status,
const std::string& error_message = "");
void AddTestLog(const std::string& test_id, const std::string& log_entry);
void AddTestMetric(const std::string& test_id, const std::string& key,
int32_t value);
private:
std::map<std::string, TestExecution> test_history_ ABSL_GUARDED_BY(history_mutex_);
absl::Mutex history_mutex_;
// Helper: Generate unique test ID
std::string GenerateTestId(const std::string& prefix);
};
```
**File**: `src/app/core/test_manager.cc`
```cpp
#include "src/app/core/test_manager.h"
#include "absl/strings/str_format.h"
#include "absl/time/clock.h"
#include <random>
std::string TestManager::GenerateTestId(const std::string& prefix) {
static std::random_device rd;
static std::mt19937 gen(rd());
static std::uniform_int_distribution<> dis(10000000, 99999999);
return absl::StrFormat("%s_%d", prefix, dis(gen));
}
void TestManager::RecordTestStart(const std::string& test_id,
const std::string& name,
const std::string& category) {
absl::MutexLock lock(&history_mutex_);
TestExecution& exec = test_history_[test_id];
exec.test_id = test_id;
exec.name = name;
exec.category = category;
exec.status = TestStatus::RUNNING;
exec.started_at = absl::Now();
exec.queued_at = exec.started_at; // For now, no separate queue
}
void TestManager::RecordTestComplete(const std::string& test_id,
TestStatus status,
const std::string& error_message) {
absl::MutexLock lock(&history_mutex_);
auto it = test_history_.find(test_id);
if (it == test_history_.end()) return;
TestExecution& exec = it->second;
exec.status = status;
exec.completed_at = absl::Now();
exec.execution_time = exec.completed_at - exec.started_at;
exec.error_message = error_message;
}
void TestManager::AddTestLog(const std::string& test_id,
const std::string& log_entry) {
absl::MutexLock lock(&history_mutex_);
auto it = test_history_.find(test_id);
if (it != test_history_.end()) {
it->second.logs.push_back(log_entry);
}
}
void TestManager::AddTestMetric(const std::string& test_id,
const std::string& key,
int32_t value) {
absl::MutexLock lock(&history_mutex_);
auto it = test_history_.find(test_id);
if (it != test_history_.end()) {
it->second.metrics[key] = value;
}
}
absl::StatusOr<TestManager::TestExecution> TestManager::GetTestStatus(
const std::string& test_id) {
absl::MutexLock lock(&history_mutex_);
auto it = test_history_.find(test_id);
if (it == test_history_.end()) {
return absl::NotFoundError(
absl::StrFormat("Test ID '%s' not found", test_id));
}
return it->second;
}
std::vector<TestManager::TestExecution> TestManager::ListTests(
const std::string& category_filter) {
absl::MutexLock lock(&history_mutex_);
std::vector<TestExecution> results;
for (const auto& [id, exec] : test_history_) {
if (category_filter.empty() || exec.category == category_filter) {
results.push_back(exec);
}
}
return results;
}
absl::StatusOr<TestManager::TestExecution> TestManager::GetTestResults(
const std::string& test_id) {
// Same as GetTestStatus for now
return GetTestStatus(test_id);
}
```
**Next touch-ups**:
- Consider persisting assertion metadata (expected/actual) so `GetTestResults` can populate richer `AssertionResult` entries.
- Decide on retention limit (`harness_history_limit_`) tuning once CLI consumption patterns are known.
#### 1.2 Update Existing RPC Handlers
@@ -418,125 +283,25 @@ message ClickResponse {
// Repeat for TypeResponse, WaitResponse, AssertResponse
```
### Step 2: Implement Introspection RPCs (2-3 hours)
### Step 2: Implement Introspection RPCs (✔️ Completed)
**File**: `src/app/core/imgui_test_harness_service.cc`
**What changed**:
- Added helper utilities (`ConvertHarnessStatus`, `ToUnixMillisSafe`, `ClampDurationToInt32`) in `imgui_test_harness_service.cc`.
- Implemented `GetTestStatus`, `ListTests`, and `GetTestResults` with pagination, optional log inclusion, and structured metrics.mapping.
- Updated gRPC wrapper to surface new RPCs and translate Abseil status codes into gRPC codes.
- Ensured deque-backed `DynamicTestData` keep-alive remains bounded while reusing new tracking helpers.
```cpp
absl::Status ImGuiTestHarnessServiceImpl::GetTestStatus(
const GetTestStatusRequest* request,
GetTestStatusResponse* response) {
auto status_or = test_manager_->GetTestStatus(request->test_id());
if (!status_or.ok()) {
response->set_status(GetTestStatusResponse::UNKNOWN);
return absl::OkStatus(); // Not an RPC error, just test not found
}
const auto& exec = status_or.value();
// Map internal status to proto status
switch (exec.status) {
case TestManager::TestStatus::QUEUED:
response->set_status(GetTestStatusResponse::QUEUED);
break;
case TestManager::TestStatus::RUNNING:
response->set_status(GetTestStatusResponse::RUNNING);
break;
case TestManager::TestStatus::PASSED:
response->set_status(GetTestStatusResponse::PASSED);
break;
case TestManager::TestStatus::FAILED:
response->set_status(GetTestStatusResponse::FAILED);
break;
case TestManager::TestStatus::TIMEOUT:
response->set_status(GetTestStatusResponse::TIMEOUT);
break;
default:
response->set_status(GetTestStatusResponse::UNKNOWN);
}
// Convert absl::Time to milliseconds since epoch
response->set_queued_at_ms(absl::ToUnixMillis(exec.queued_at));
response->set_started_at_ms(absl::ToUnixMillis(exec.started_at));
response->set_completed_at_ms(absl::ToUnixMillis(exec.completed_at));
response->set_execution_time_ms(absl::ToInt64Milliseconds(exec.execution_time));
response->set_error_message(exec.error_message);
for (const auto& failure : exec.assertion_failures) {
response->add_assertion_failures(failure);
}
return absl::OkStatus();
}
**Where to look**:
- `src/app/core/imgui_test_harness_service.cc` (search for `GetTestStatus(`, `ListTests(`, `GetTestResults(`).
- `src/app/core/imgui_test_harness_service.h` (new method declarations).
absl::Status ImGuiTestHarnessServiceImpl::ListTests(
const ListTestsRequest* request,
ListTestsResponse* response) {
auto tests = test_manager_->ListTests(request->category_filter());
// TODO: Implement pagination if needed
response->set_total_count(tests.size());
for (const auto& exec : tests) {
auto* test_info = response->add_tests();
test_info->set_test_id(exec.test_id);
test_info->set_name(exec.name);
test_info->set_category(exec.category);
test_info->set_last_run_timestamp_ms(absl::ToUnixMillis(exec.completed_at));
test_info->set_total_runs(1); // TODO: Track across multiple runs
if (exec.status == TestManager::TestStatus::PASSED) {
test_info->set_pass_count(1);
test_info->set_fail_count(0);
} else {
test_info->set_pass_count(0);
test_info->set_fail_count(1);
}
test_info->set_average_duration_ms(
absl::ToInt64Milliseconds(exec.execution_time));
}
return absl::OkStatus();
}
**Follow-ups**:
- Expand `AssertionResult` population once `TestManager` captures structured expected/actual data.
- Evaluate pagination defaults (`page_size`, `page_token`) once CLI usage patterns are seen.
absl::Status ImGuiTestHarnessServiceImpl::GetTestResults(
const GetTestResultsRequest* request,
GetTestResultsResponse* response) {
auto status_or = test_manager_->GetTestResults(request->test_id());
if (!status_or.ok()) {
return absl::NotFoundError(
absl::StrFormat("Test '%s' not found", request->test_id()));
}
const auto& exec = status_or.value();
response->set_success(exec.status == TestManager::TestStatus::PASSED);
response->set_test_name(exec.name);
response->set_category(exec.category);
response->set_executed_at_ms(absl::ToUnixMillis(exec.completed_at));
response->set_duration_ms(absl::ToInt64Milliseconds(exec.execution_time));
// Include logs if requested
if (request->include_logs()) {
for (const auto& log : exec.logs) {
response->add_logs(log);
}
}
// Add metrics
for (const auto& [key, value] : exec.metrics) {
(*response->mutable_metrics())[key] = value;
}
return absl::OkStatus();
}
```
### Step 3: CLI Integration (🚧 TODO)
### Step 3: CLI Integration (1-2 hours)
Goal: expose the new RPCs through `GuiAutomationClient` and user-facing `z3ed agent test` subcommands. The pseudo-code below illustrates the desired flow; implementation still pending.
**File**: `src/cli/handlers/agent.cc`
@@ -631,7 +396,7 @@ absl::Status HandleAgentTestList(const CommandOptions& options) {
}
```
### Step 4: Testing & Validation (1 hour)
### Step 4: Testing & Validation (🚧 TODO)
#### Test Script: `scripts/test_introspection_e2e.sh`
@@ -673,14 +438,14 @@ kill $YAZE_PID
## Success Criteria
- [ ] All 3 new RPCs respond correctly
- [ ] Test IDs returned in Click/Type/Wait/Assert responses
- [ ] Status polling works with `--follow` flag
- [ ] Test history persists across multiple test runs
- [x] All 3 new RPCs respond correctly
- [x] Test IDs returned in Click/Type/Wait/Assert responses
- [ ] Status polling works with `--follow` flag (CLI pending)
- [x] Test history persists across multiple test runs
- [ ] CLI commands output clean YAML/JSON
- [ ] No memory leaks in test history tracking
- [ ] Thread-safe access to test history
- [ ] Documentation updated in E6-z3ed-reference.md
- [x] No memory leaks in test history tracking (bounded deque + pruning)
- [x] Thread-safe access to test history (mutex-protected)
- [ ] Documentation updated in `E6-z3ed-reference.md`
## Migration Guide
@@ -719,4 +484,4 @@ After IT-05 completion:
**Author**: @scawful, GitHub Copilot
**Created**: October 2, 2025
**Status**: Ready for implementation
**Status**: In progress (server-side complete; CLI + E2E pending)

View File

@@ -79,7 +79,12 @@ See the **[Technical Reference](E6-z3ed-reference.md)** for a full command list.
## Recent Enhancements
**Test Harness Evolution** (Planned: IT-05 to IT-09):
**Latest Progress (Oct 2, 2025)**
- ✅ Implemented server-side wiring for `GetTestStatus`, `ListTests`, and `GetTestResults` RPCs, including execution history tracking inside `TestManager`.
- ✅ Added gRPC status mapping helper to surface accurate error codes back to clients.
- ⚠️ Pending CLI integration, end-to-end introspection tests, and documentation updates for new commands.
**Test Harness Evolution** (In Progress: IT-05 to IT-09):
- **Test Introspection**: Query test status, results, and execution history
- **Widget Discovery**: AI agents can enumerate available GUI interactions dynamically
- **Test Recording**: Capture manual workflows as JSON scripts for regression testing

File diff suppressed because it is too large Load Diff

View File

@@ -36,6 +36,12 @@ class AssertRequest;
class AssertResponse;
class ScreenshotRequest;
class ScreenshotResponse;
class GetTestStatusRequest;
class GetTestStatusResponse;
class ListTestsRequest;
class ListTestsResponse;
class GetTestResultsRequest;
class GetTestResultsResponse;
// Implementation of ImGuiTestHarness gRPC service
// This class provides the actual RPC handlers for automated GUI testing
@@ -72,6 +78,14 @@ class ImGuiTestHarnessServiceImpl {
absl::Status Screenshot(const ScreenshotRequest* request,
ScreenshotResponse* response);
// Test introspection APIs
absl::Status GetTestStatus(const GetTestStatusRequest* request,
GetTestStatusResponse* response);
absl::Status ListTests(const ListTestsRequest* request,
ListTestsResponse* response);
absl::Status GetTestResults(const GetTestResultsRequest* request,
GetTestResultsResponse* response);
private:
TestManager* test_manager_; // Non-owning pointer to access ImGuiTestEngine
};

View File

@@ -22,6 +22,11 @@ service ImGuiTestHarness {
// Capture a screenshot
rpc Screenshot(ScreenshotRequest) returns (ScreenshotResponse);
// Test introspection APIs (IT-05)
rpc GetTestStatus(GetTestStatusRequest) returns (GetTestStatusResponse);
rpc ListTests(ListTestsRequest) returns (ListTestsResponse);
rpc GetTestResults(GetTestResultsRequest) returns (GetTestResultsResponse);
}
// ============================================================================
@@ -43,14 +48,15 @@ message PingResponse {
// ============================================================================
message ClickRequest {
string target = 1; // Target element (e.g., "button:Open ROM", "menu:File/Open")
string target = 1; // Target element (e.g., "button:Open ROM")
ClickType type = 2; // Type of click
enum ClickType {
LEFT = 0; // Single left click
RIGHT = 1; // Single right click
DOUBLE = 2; // Double click
MIDDLE = 3; // Middle mouse button
CLICK_TYPE_UNSPECIFIED = 0; // Default/unspecified click type
CLICK_TYPE_LEFT = 1; // Single left click
CLICK_TYPE_RIGHT = 2; // Single right click
CLICK_TYPE_DOUBLE = 3; // Double click
CLICK_TYPE_MIDDLE = 4; // Middle mouse button
}
}
@@ -58,6 +64,7 @@ message ClickResponse {
bool success = 1; // Whether the click succeeded
string message = 2; // Human-readable result message
int32 execution_time_ms = 3; // Time taken to execute (for debugging)
string test_id = 4; // Unique test identifier for introspection
}
// ============================================================================
@@ -74,6 +81,7 @@ message TypeResponse {
bool success = 1;
string message = 2;
int32 execution_time_ms = 3;
string test_id = 4;
}
// ============================================================================
@@ -81,7 +89,7 @@ message TypeResponse {
// ============================================================================
message WaitRequest {
string condition = 1; // Condition to wait for (e.g., "window:Overworld Editor", "enabled:button:Save")
string condition = 1; // Condition to wait for (e.g., "window:Overworld")
int32 timeout_ms = 2; // Maximum time to wait (default 5000ms)
int32 poll_interval_ms = 3; // How often to check (default 100ms)
}
@@ -90,6 +98,7 @@ message WaitResponse {
bool success = 1; // Whether condition was met before timeout
string message = 2;
int32 elapsed_ms = 3; // Time taken before condition met (or timeout)
string test_id = 4; // Unique test identifier for introspection
}
// ============================================================================
@@ -97,7 +106,7 @@ message WaitResponse {
// ============================================================================
message AssertRequest {
string condition = 1; // Condition to assert (e.g., "visible:button:Save", "text:label:Version:0.3.2")
string condition = 1; // Condition to assert (e.g., "visible:button:Save")
string failure_message = 2; // Custom message if assertion fails
}
@@ -106,6 +115,7 @@ message AssertResponse {
string message = 2; // Diagnostic message
string actual_value = 3; // Actual value found (for debugging)
string expected_value = 4; // Expected value (for debugging)
string test_id = 5; // Unique test identifier for introspection
}
// ============================================================================
@@ -118,8 +128,9 @@ message ScreenshotRequest {
ImageFormat format = 3; // Image format
enum ImageFormat {
PNG = 0;
JPEG = 1;
IMAGE_FORMAT_UNSPECIFIED = 0;
IMAGE_FORMAT_PNG = 1;
IMAGE_FORMAT_JPEG = 2;
}
}
@@ -129,3 +140,85 @@ message ScreenshotResponse {
string file_path = 3; // Absolute path to saved screenshot
int64 file_size_bytes = 4;
}
// ============================================================================
// GetTestStatus - Query test execution state
// ============================================================================
message GetTestStatusRequest {
string test_id = 1; // Test ID from Click/Type/Wait/Assert response
}
message GetTestStatusResponse {
enum Status {
STATUS_UNSPECIFIED = 0; // Test ID not found or unspecified
STATUS_QUEUED = 1; // Waiting to execute
STATUS_RUNNING = 2; // Currently executing
STATUS_PASSED = 3; // Completed successfully
STATUS_FAILED = 4; // Assertion failed or error
STATUS_TIMEOUT = 5; // Exceeded timeout
}
Status status = 1;
int64 queued_at_ms = 2; // When test was queued
int64 started_at_ms = 3; // When test started (0 if not started)
int64 completed_at_ms = 4; // When test completed (0 if not complete)
int32 execution_time_ms = 5; // Total execution time
string error_message = 6; // Error details if FAILED/TIMEOUT
repeated string assertion_failures = 7; // Failed assertion details
}
// ============================================================================
// ListTests - Enumerate available tests
// ============================================================================
message ListTestsRequest {
string category_filter = 1; // Optional: "grpc", "unit", "integration", "e2e"
int32 page_size = 2; // Number of results per page (default 100)
string page_token = 3; // Pagination token from previous response
}
message ListTestsResponse {
repeated TestInfo tests = 1;
string next_page_token = 2; // Token for next page (empty if no more)
int32 total_count = 3; // Total number of matching tests
}
message TestInfo {
string test_id = 1; // Unique test identifier
string name = 2; // Human-readable test name
string category = 3; // Category: grpc, unit, integration, e2e
int64 last_run_timestamp_ms = 4; // When test last executed
int32 total_runs = 5; // Total number of executions
int32 pass_count = 6; // Number of successful runs
int32 fail_count = 7; // Number of failed runs
int32 average_duration_ms = 8; // Average execution time
}
// ============================================================================
// GetTestResults - Retrieve detailed results
// ============================================================================
message GetTestResultsRequest {
string test_id = 1;
bool include_logs = 2; // Include full execution logs
}
message GetTestResultsResponse {
bool success = 1; // Overall test result
string test_name = 2;
string category = 3;
int64 executed_at_ms = 4;
int32 duration_ms = 5;
repeated AssertionResult assertions = 6;
repeated string logs = 7; // If include_logs=true
map<string, int32> metrics = 8; // e.g., "frame_count": 123
}
message AssertionResult {
string description = 1;
bool passed = 2;
string expected_value = 3;
string actual_value = 4;
string error_message = 5;
}

View File

@@ -1,7 +1,13 @@
#include "app/test/test_manager.h"
#include <algorithm>
#include <random>
#include "absl/strings/str_format.h"
#include "absl/strings/str_cat.h"
#include "absl/strings/str_replace.h"
#include "absl/time/clock.h"
#include "absl/time/time.h"
#include "app/core/features.h"
#include "app/core/platform/file_dialog.h"
#include "app/gfx/arena.h"
@@ -1281,5 +1287,199 @@ absl::Status TestManager::TestRomDataIntegrity(Rom* rom) {
});
}
std::string TestManager::RegisterHarnessTest(const std::string& name,
const std::string& category) {
absl::MutexLock lock(&harness_history_mutex_);
const std::string sanitized_category = category.empty() ? "grpc" : category;
std::string test_id = GenerateHarnessTestIdLocked(sanitized_category);
HarnessTestExecution execution;
execution.test_id = test_id;
execution.name = name;
execution.category = sanitized_category;
execution.status = HarnessTestStatus::kQueued;
execution.queued_at = absl::Now();
execution.started_at = absl::InfinitePast();
execution.completed_at = absl::InfinitePast();
harness_history_[test_id] = execution;
harness_history_order_.push_back(test_id);
TrimHarnessHistoryLocked();
HarnessAggregate& aggregate = harness_aggregates_[name];
if (aggregate.category.empty()) {
aggregate.category = sanitized_category;
}
aggregate.last_run = execution.queued_at;
aggregate.latest_execution = execution;
return test_id;
}
void TestManager::MarkHarnessTestRunning(const std::string& test_id) {
absl::MutexLock lock(&harness_history_mutex_);
auto it = harness_history_.find(test_id);
if (it == harness_history_.end()) {
return;
}
HarnessTestExecution& execution = it->second;
execution.status = HarnessTestStatus::kRunning;
execution.started_at = absl::Now();
HarnessAggregate& aggregate = harness_aggregates_[execution.name];
if (aggregate.category.empty()) {
aggregate.category = execution.category;
}
aggregate.latest_execution = execution;
}
void TestManager::MarkHarnessTestCompleted(
const std::string& test_id, HarnessTestStatus status,
const std::string& error_message,
const std::vector<std::string>& assertion_failures,
const std::vector<std::string>& logs,
const std::map<std::string, int32_t>& metrics) {
absl::MutexLock lock(&harness_history_mutex_);
auto it = harness_history_.find(test_id);
if (it == harness_history_.end()) {
return;
}
HarnessTestExecution& execution = it->second;
execution.status = status;
if (execution.started_at == absl::InfinitePast()) {
execution.started_at = execution.queued_at;
}
execution.completed_at = absl::Now();
execution.duration = execution.completed_at - execution.started_at;
execution.error_message = error_message;
if (!assertion_failures.empty()) {
execution.assertion_failures = assertion_failures;
}
if (!logs.empty()) {
execution.logs.insert(execution.logs.end(), logs.begin(), logs.end());
}
if (!metrics.empty()) {
execution.metrics.insert(metrics.begin(), metrics.end());
}
HarnessAggregate& aggregate = harness_aggregates_[execution.name];
if (aggregate.category.empty()) {
aggregate.category = execution.category;
}
aggregate.total_runs += 1;
if (status == HarnessTestStatus::kPassed) {
aggregate.pass_count += 1;
} else if (status == HarnessTestStatus::kFailed ||
status == HarnessTestStatus::kTimeout) {
aggregate.fail_count += 1;
}
aggregate.total_duration += execution.duration;
aggregate.last_run = execution.completed_at;
aggregate.latest_execution = execution;
}
void TestManager::AppendHarnessTestLog(const std::string& test_id,
const std::string& log_entry) {
absl::MutexLock lock(&harness_history_mutex_);
auto it = harness_history_.find(test_id);
if (it == harness_history_.end()) {
return;
}
HarnessTestExecution& execution = it->second;
execution.logs.push_back(log_entry);
HarnessAggregate& aggregate = harness_aggregates_[execution.name];
aggregate.latest_execution.logs = execution.logs;
}
absl::StatusOr<HarnessTestExecution> TestManager::GetHarnessTestExecution(
const std::string& test_id) const {
absl::MutexLock lock(&harness_history_mutex_);
auto it = harness_history_.find(test_id);
if (it == harness_history_.end()) {
return absl::NotFoundError(
absl::StrFormat("Test ID '%s' not found", test_id));
}
return it->second;
}
std::vector<HarnessTestSummary> TestManager::ListHarnessTestSummaries(
const std::string& category_filter) const {
absl::MutexLock lock(&harness_history_mutex_);
std::vector<HarnessTestSummary> summaries;
summaries.reserve(harness_aggregates_.size());
for (const auto& [name, aggregate] : harness_aggregates_) {
if (!category_filter.empty() && aggregate.category != category_filter) {
continue;
}
HarnessTestSummary summary;
summary.latest_execution = aggregate.latest_execution;
summary.total_runs = aggregate.total_runs;
summary.pass_count = aggregate.pass_count;
summary.fail_count = aggregate.fail_count;
summary.total_duration = aggregate.total_duration;
summaries.push_back(summary);
}
std::sort(summaries.begin(), summaries.end(),
[](const HarnessTestSummary& a, const HarnessTestSummary& b) {
absl::Time time_a = a.latest_execution.completed_at;
if (time_a == absl::InfinitePast()) {
time_a = a.latest_execution.queued_at;
}
absl::Time time_b = b.latest_execution.completed_at;
if (time_b == absl::InfinitePast()) {
time_b = b.latest_execution.queued_at;
}
return time_a > time_b;
});
return summaries;
}
std::string TestManager::GenerateHarnessTestIdLocked(absl::string_view prefix) {
static std::mt19937 rng(std::random_device{}());
static std::uniform_int_distribution<uint32_t> dist(0, 0xFFFFFF);
std::string sanitized = absl::StrReplaceAll(std::string(prefix),
{{" ", "_"}, {":", "_"}});
if (sanitized.empty()) {
sanitized = "test";
}
for (int attempt = 0; attempt < 8; ++attempt) {
std::string candidate =
absl::StrFormat("%s_%08x", sanitized, dist(rng));
if (harness_history_.find(candidate) == harness_history_.end()) {
return candidate;
}
}
return absl::StrFormat("%s_%lld", sanitized,
static_cast<long long>(absl::ToUnixMillis(absl::Now())));
}
void TestManager::TrimHarnessHistoryLocked() {
while (harness_history_order_.size() > harness_history_limit_) {
const std::string& oldest_id = harness_history_order_.front();
auto it = harness_history_.find(oldest_id);
if (it != harness_history_.end()) {
harness_history_.erase(it);
}
harness_history_order_.pop_front();
}
}
} // namespace test
} // namespace yaze

View File

@@ -2,13 +2,19 @@
#define YAZE_APP_TEST_TEST_MANAGER_H
#include <chrono>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>
#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "absl/synchronization/mutex.h"
#include "absl/strings/string_view.h"
#include "absl/time/time.h"
#include "app/rom.h"
#include "imgui.h"
#include "util/log.h"
@@ -111,6 +117,39 @@ struct ResourceStats {
std::chrono::time_point<std::chrono::steady_clock> timestamp;
};
// Test harness execution tracking for gRPC automation (IT-05)
enum class HarnessTestStatus {
kUnspecified,
kQueued,
kRunning,
kPassed,
kFailed,
kTimeout,
};
struct HarnessTestExecution {
std::string test_id;
std::string name;
std::string category;
HarnessTestStatus status = HarnessTestStatus::kUnspecified;
absl::Time queued_at;
absl::Time started_at;
absl::Time completed_at;
absl::Duration duration = absl::ZeroDuration();
std::string error_message;
std::vector<std::string> assertion_failures;
std::vector<std::string> logs;
std::map<std::string, int32_t> metrics;
};
struct HarnessTestSummary {
HarnessTestExecution latest_execution;
int total_runs = 0;
int pass_count = 0;
int fail_count = 0;
absl::Duration total_duration = absl::ZeroDuration();
};
// Main test manager - singleton
class TestManager {
public:
@@ -209,6 +248,29 @@ class TestManager {
}
// File dialog mode now uses global feature flags
// Harness test introspection (IT-05)
std::string RegisterHarnessTest(const std::string& name,
const std::string& category)
ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
void MarkHarnessTestRunning(const std::string& test_id)
ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
void MarkHarnessTestCompleted(
const std::string& test_id, HarnessTestStatus status,
const std::string& error_message = "",
const std::vector<std::string>& assertion_failures = {},
const std::vector<std::string>& logs = {},
const std::map<std::string, int32_t>& metrics = {})
ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
void AppendHarnessTestLog(const std::string& test_id,
const std::string& log_entry)
ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
absl::StatusOr<HarnessTestExecution> GetHarnessTestExecution(
const std::string& test_id) const
ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
std::vector<HarnessTestSummary> ListHarnessTestSummaries(
const std::string& category_filter = "") const
ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
private:
TestManager();
~TestManager();
@@ -263,6 +325,31 @@ class TestManager {
// Test selection and configuration
std::unordered_map<std::string, bool> disabled_tests_;
// Harness test tracking
struct HarnessAggregate {
int total_runs = 0;
int pass_count = 0;
int fail_count = 0;
absl::Duration total_duration = absl::ZeroDuration();
std::string category;
absl::Time last_run;
HarnessTestExecution latest_execution;
};
std::unordered_map<std::string, HarnessTestExecution> harness_history_
ABSL_GUARDED_BY(harness_history_mutex_);
std::unordered_map<std::string, HarnessAggregate> harness_aggregates_
ABSL_GUARDED_BY(harness_history_mutex_);
std::deque<std::string> harness_history_order_
ABSL_GUARDED_BY(harness_history_mutex_);
size_t harness_history_limit_ = 200;
mutable absl::Mutex harness_history_mutex_;
std::string GenerateHarnessTestIdLocked(absl::string_view prefix)
ABSL_EXCLUSIVE_LOCKS_REQUIRED(harness_history_mutex_);
void TrimHarnessHistoryLocked()
ABSL_EXCLUSIVE_LOCKS_REQUIRED(harness_history_mutex_);
};
// Utility functions for test result formatting