feat: Add test introspection APIs and harness test management
- Introduced new gRPC service methods: GetTestStatus, ListTests, and GetTestResults for enhanced test introspection.
- Defined corresponding request and response message types in the proto file.
- Implemented test harness execution tracking in TestManager, including methods to register, mark, and retrieve test execution details.
- Enhanced test logging and summary capabilities to support introspection features.
- Updated existing structures to accommodate new test management functionalities.
@@ -96,21 +96,27 @@ The z3ed CLI and AI agent workflow system has completed major infrastructure mil
- **Test Management**: Can't query test status, results, or execution queue
#### IT-05: Test Introspection API (6-8 hours)
**Implementation Tasks**:

1. **Add GetTestStatus RPC**:
   - Query status of queued/running tests by ID
   - Return test state: queued, running, passed, failed, timeout
   - Include execution time, error messages, assertion failures

2. **Add ListTests RPC**:
   - Enumerate all registered tests in ImGuiTestEngine
   - Filter by category (grpc, unit, integration, e2e)
   - Return test metadata: name, category, last run time, pass/fail count

3. **Add GetTestResults RPC**:
   - Retrieve detailed results for completed tests
   - Include assertion logs, performance metrics, resource usage
   - Support pagination for large result sets
**Status (Oct 2, 2025)**: 🟡 *Server-side RPCs implemented; CLI + E2E pending*
**Progress**:

- ✅ `imgui_test_harness.proto` expanded with GetTestStatus/ListTests/GetTestResults messages.
- ✅ `TestManager` maintains execution history (queued → running → completed) with logs, metrics, and aggregates.
- ✅ `ImGuiTestHarnessServiceImpl` exposes the three introspection RPCs with pagination, status conversion, and log/metric marshalling.
- ⚠️ `agent` CLI commands (`test status`, `test list`, `test results`) still stubbed.
- ⚠️ End-to-end introspection script (`scripts/test_introspection_e2e.sh`) not implemented; regression script `test_harness_e2e.sh` currently fails because it references the unfinished CLI.

**Immediate Next Steps**:

1. **Wire CLI Client Methods**
   - Implement gRPC client wrappers for the new RPCs in the automation client.
   - Add user-facing commands under `z3ed agent test ...` with JSON/YAML output options.
2. **Author E2E Validation Script**
   - Spin up the harness, run a Click/Assert workflow, poll via `agent test status`, fetch results.
   - Update CI notes with the new script and expected output.
3. **Documentation & Examples**
   - Extend `E6-z3ed-reference.md` with full usage examples and sample outputs.
   - Add a troubleshooting section covering common errors (unknown test_id, timeout, etc.).
4. **Stretch (Optional Before IT-06)**
   - Capture assertion metadata (expected/actual) for richer `AssertionResult` payloads.
**Example Usage**:

```bash
# Planned subcommands (CLI wiring still pending; shown for illustration only)
z3ed agent test list
z3ed agent test status <test_id>
z3ed agent test results <test_id>
```
@@ -1,4 +1,14 @@
# IT-05: Test Introspection API – Implementation Guide

**Status (Oct 2, 2025)**: 🟡 *Server-side RPCs complete; CLI + E2E pending*

## Progress Snapshot

- ✅ Proto definitions and service stubs added for `GetTestStatus`, `ListTests`, `GetTestResults`.
- ✅ `TestManager` now records execution lifecycle, aggregates, logs, and metrics with thread-safe history trimming.
- ✅ `ImGuiTestHarnessServiceImpl` implements the three RPC handlers, including pagination and status conversion helpers.
- ⚠️ CLI wiring, automation client calls, and user-facing output still TODO.
- ⚠️ End-to-end validation script (`scripts/test_introspection_e2e.sh`) not yet authored.
## Motivation

**Current Limitations**:

- ❌ Tests execute asynchronously with no way to query status
@@ -7,7 +17,7 @@
- ❌ Results lost after test completion
- ❌ Can't track test history or identify flaky tests

**Why This Blocks AI Agent Autonomy**

Without test introspection, **AI agents cannot implement closed-loop feedback**:
@@ -62,7 +72,8 @@ Add test introspection capabilities to enable clients to query test execution st
- ❌ Results lost after test completion
- ❌ Can't track test history or identify flaky tests

**Benefits After IT-05**

- ✅ AI agents can reliably poll for test completion
- ✅ CLI can show real-time progress bars
- ✅ Test history enables trend analysis
@@ -208,166 +219,20 @@ message AssertionResult {
## Implementation Steps

### Step 1: Extend TestManager (✔️ Completed)

#### 1.1 Add Test Execution Tracking

**What changed**:

- Introduced `HarnessTestExecution`, `HarnessTestSummary`, and related enums in `test_manager.h`.
- Added registration, running, completion, log, and metric helpers guarded by `absl::Mutex` (`RegisterHarnessTest`, `MarkHarnessTestRunning`, `MarkHarnessTestCompleted`, etc.).
- Stored executions in `harness_history_` + `harness_aggregates_` with deque-based trimming to avoid unbounded growth.

**Where to look**:

- `src/app/test/test_manager.h` (see the *Harness test introspection (IT-05)* section around `HarnessTestExecution`).
- `src/app/test/test_manager.cc` (functions `RegisterHarnessTest`, `MarkHarnessTestCompleted`, `AppendHarnessTestLog`, `GetHarnessTestExecution`, `ListHarnessTestSummaries`).
```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

#include "absl/base/thread_annotations.h"
#include "absl/status/statusor.h"
#include "absl/synchronization/mutex.h"
#include "absl/time/time.h"

class TestManager {
 public:
  enum class TestStatus {
    UNKNOWN = 0,
    QUEUED = 1,
    RUNNING = 2,
    PASSED = 3,
    FAILED = 4,
    TIMEOUT = 5
  };

  struct TestExecution {
    std::string test_id;
    std::string name;
    std::string category;
    TestStatus status;
    absl::Time queued_at;
    absl::Time started_at;
    absl::Time completed_at;
    absl::Duration execution_time;
    std::string error_message;
    std::vector<std::string> assertion_failures;
    std::vector<std::string> logs;
    std::map<std::string, int32_t> metrics;
  };

  // NEW: Introspection API
  absl::StatusOr<TestExecution> GetTestStatus(const std::string& test_id);
  std::vector<TestExecution> ListTests(const std::string& category_filter = "");
  absl::StatusOr<TestExecution> GetTestResults(const std::string& test_id);

  // NEW: Recording test execution
  void RecordTestStart(const std::string& test_id, const std::string& name,
                       const std::string& category);
  void RecordTestComplete(const std::string& test_id, TestStatus status,
                          const std::string& error_message = "");
  void AddTestLog(const std::string& test_id, const std::string& log_entry);
  void AddTestMetric(const std::string& test_id, const std::string& key,
                     int32_t value);

 private:
  std::map<std::string, TestExecution> test_history_
      ABSL_GUARDED_BY(history_mutex_);
  absl::Mutex history_mutex_;

  // Helper: Generate unique test ID
  std::string GenerateTestId(const std::string& prefix);
};
```
**File**: `src/app/test/test_manager.cc`
```cpp
#include "app/test/test_manager.h"

#include <random>

#include "absl/strings/str_format.h"
#include "absl/time/clock.h"

std::string TestManager::GenerateTestId(const std::string& prefix) {
  static std::random_device rd;
  static std::mt19937 gen(rd());
  static std::uniform_int_distribution<> dis(10000000, 99999999);
  return absl::StrFormat("%s_%d", prefix, dis(gen));
}

void TestManager::RecordTestStart(const std::string& test_id,
                                  const std::string& name,
                                  const std::string& category) {
  absl::MutexLock lock(&history_mutex_);

  TestExecution& exec = test_history_[test_id];
  exec.test_id = test_id;
  exec.name = name;
  exec.category = category;
  exec.status = TestStatus::RUNNING;
  exec.started_at = absl::Now();
  exec.queued_at = exec.started_at;  // For now, no separate queue
}

void TestManager::RecordTestComplete(const std::string& test_id,
                                     TestStatus status,
                                     const std::string& error_message) {
  absl::MutexLock lock(&history_mutex_);

  auto it = test_history_.find(test_id);
  if (it == test_history_.end()) return;

  TestExecution& exec = it->second;
  exec.status = status;
  exec.completed_at = absl::Now();
  exec.execution_time = exec.completed_at - exec.started_at;
  exec.error_message = error_message;
}

void TestManager::AddTestLog(const std::string& test_id,
                             const std::string& log_entry) {
  absl::MutexLock lock(&history_mutex_);

  auto it = test_history_.find(test_id);
  if (it != test_history_.end()) {
    it->second.logs.push_back(log_entry);
  }
}

void TestManager::AddTestMetric(const std::string& test_id,
                                const std::string& key, int32_t value) {
  absl::MutexLock lock(&history_mutex_);

  auto it = test_history_.find(test_id);
  if (it != test_history_.end()) {
    it->second.metrics[key] = value;
  }
}

absl::StatusOr<TestManager::TestExecution> TestManager::GetTestStatus(
    const std::string& test_id) {
  absl::MutexLock lock(&history_mutex_);

  auto it = test_history_.find(test_id);
  if (it == test_history_.end()) {
    return absl::NotFoundError(
        absl::StrFormat("Test ID '%s' not found", test_id));
  }

  return it->second;
}

std::vector<TestManager::TestExecution> TestManager::ListTests(
    const std::string& category_filter) {
  absl::MutexLock lock(&history_mutex_);

  std::vector<TestExecution> results;
  for (const auto& [id, exec] : test_history_) {
    if (category_filter.empty() || exec.category == category_filter) {
      results.push_back(exec);
    }
  }

  return results;
}

absl::StatusOr<TestManager::TestExecution> TestManager::GetTestResults(
    const std::string& test_id) {
  // Same as GetTestStatus for now
  return GetTestStatus(test_id);
}
```
**Next touch-ups**:

- Consider persisting assertion metadata (expected/actual) so `GetTestResults` can populate richer `AssertionResult` entries.
- Decide on retention limit (`harness_history_limit_`) tuning once CLI consumption patterns are known.

#### 1.2 Update Existing RPC Handlers
@@ -418,125 +283,25 @@ message ClickResponse {
// Repeat for TypeResponse, WaitResponse, AssertResponse
```

### Step 2: Implement Introspection RPCs (✔️ Completed)

**File**: `src/app/core/imgui_test_harness_service.cc`

**What changed**:

- Added helper utilities (`ConvertHarnessStatus`, `ToUnixMillisSafe`, `ClampDurationToInt32`) in `imgui_test_harness_service.cc`.
- Implemented `GetTestStatus`, `ListTests`, and `GetTestResults` with pagination, optional log inclusion, and structured metrics mapping.
- Updated the gRPC wrapper to surface the new RPCs and translate Abseil status codes into gRPC codes.
- Ensured the deque-backed `DynamicTestData` keep-alive remains bounded while reusing the new tracking helpers.
```cpp
absl::Status ImGuiTestHarnessServiceImpl::GetTestStatus(
    const GetTestStatusRequest* request, GetTestStatusResponse* response) {
  auto status_or = test_manager_->GetTestStatus(request->test_id());
  if (!status_or.ok()) {
    response->set_status(GetTestStatusResponse::STATUS_UNSPECIFIED);
    return absl::OkStatus();  // Not an RPC error, just test not found
  }

  const auto& exec = status_or.value();

  // Map internal status to proto status
  switch (exec.status) {
    case TestManager::TestStatus::QUEUED:
      response->set_status(GetTestStatusResponse::STATUS_QUEUED);
      break;
    case TestManager::TestStatus::RUNNING:
      response->set_status(GetTestStatusResponse::STATUS_RUNNING);
      break;
    case TestManager::TestStatus::PASSED:
      response->set_status(GetTestStatusResponse::STATUS_PASSED);
      break;
    case TestManager::TestStatus::FAILED:
      response->set_status(GetTestStatusResponse::STATUS_FAILED);
      break;
    case TestManager::TestStatus::TIMEOUT:
      response->set_status(GetTestStatusResponse::STATUS_TIMEOUT);
      break;
    default:
      response->set_status(GetTestStatusResponse::STATUS_UNSPECIFIED);
  }

  // Convert absl::Time to milliseconds since epoch
  response->set_queued_at_ms(absl::ToUnixMillis(exec.queued_at));
  response->set_started_at_ms(absl::ToUnixMillis(exec.started_at));
  response->set_completed_at_ms(absl::ToUnixMillis(exec.completed_at));
  response->set_execution_time_ms(
      absl::ToInt64Milliseconds(exec.execution_time));
  response->set_error_message(exec.error_message);

  for (const auto& failure : exec.assertion_failures) {
    response->add_assertion_failures(failure);
  }

  return absl::OkStatus();
}
```

**Where to look**:

- `src/app/core/imgui_test_harness_service.cc` (search for `GetTestStatus(`, `ListTests(`, `GetTestResults(`).
- `src/app/core/imgui_test_harness_service.h` (new method declarations).

```cpp
absl::Status ImGuiTestHarnessServiceImpl::ListTests(
    const ListTestsRequest* request, ListTestsResponse* response) {
  auto tests = test_manager_->ListTests(request->category_filter());

  // TODO: Implement pagination if needed
  response->set_total_count(tests.size());

  for (const auto& exec : tests) {
    auto* test_info = response->add_tests();
    test_info->set_test_id(exec.test_id);
    test_info->set_name(exec.name);
    test_info->set_category(exec.category);
    test_info->set_last_run_timestamp_ms(absl::ToUnixMillis(exec.completed_at));
    test_info->set_total_runs(1);  // TODO: Track across multiple runs

    if (exec.status == TestManager::TestStatus::PASSED) {
      test_info->set_pass_count(1);
      test_info->set_fail_count(0);
    } else {
      test_info->set_pass_count(0);
      test_info->set_fail_count(1);
    }

    test_info->set_average_duration_ms(
        absl::ToInt64Milliseconds(exec.execution_time));
  }

  return absl::OkStatus();
}

absl::Status ImGuiTestHarnessServiceImpl::GetTestResults(
    const GetTestResultsRequest* request, GetTestResultsResponse* response) {
  auto status_or = test_manager_->GetTestResults(request->test_id());
  if (!status_or.ok()) {
    return absl::NotFoundError(
        absl::StrFormat("Test '%s' not found", request->test_id()));
  }

  const auto& exec = status_or.value();

  response->set_success(exec.status == TestManager::TestStatus::PASSED);
  response->set_test_name(exec.name);
  response->set_category(exec.category);
  response->set_executed_at_ms(absl::ToUnixMillis(exec.completed_at));
  response->set_duration_ms(absl::ToInt64Milliseconds(exec.execution_time));

  // Include logs if requested
  if (request->include_logs()) {
    for (const auto& log : exec.logs) {
      response->add_logs(log);
    }
  }

  // Add metrics
  for (const auto& [key, value] : exec.metrics) {
    (*response->mutable_metrics())[key] = value;
  }

  return absl::OkStatus();
}
```

**Follow-ups**:

- Expand `AssertionResult` population once `TestManager` captures structured expected/actual data.
- Evaluate pagination defaults (`page_size`, `page_token`) once CLI usage patterns are seen.
### Step 3: CLI Integration (🚧 TODO)
Goal: expose the new RPCs through `GuiAutomationClient` and user-facing `z3ed agent test` subcommands. The pseudo-code below illustrates the desired flow; implementation is still pending.

**File**: `src/cli/handlers/agent.cc`
@@ -631,7 +396,7 @@ absl::Status HandleAgentTestList(const CommandOptions& options) {
}
```
### Step 4: Testing & Validation (🚧 TODO)

#### Test Script: `scripts/test_introspection_e2e.sh`
@@ -673,14 +438,14 @@ kill $YAZE_PID
## Success Criteria

- [x] All 3 new RPCs respond correctly
- [x] Test IDs returned in Click/Type/Wait/Assert responses
- [ ] Status polling works with `--follow` flag (CLI pending)
- [x] Test history persists across multiple test runs
- [ ] CLI commands output clean YAML/JSON
- [x] No memory leaks in test history tracking (bounded deque + pruning)
- [x] Thread-safe access to test history (mutex-protected)
- [ ] Documentation updated in `E6-z3ed-reference.md`
## Migration Guide
@@ -719,4 +484,4 @@ After IT-05 completion:
**Author**: @scawful, GitHub Copilot
**Created**: October 2, 2025
**Status**: In progress (server-side complete; CLI + E2E pending)
@@ -79,7 +79,12 @@ See the **[Technical Reference](E6-z3ed-reference.md)** for a full command list.
## Recent Enhancements

**Latest Progress (Oct 2, 2025)**

- ✅ Implemented server-side wiring for `GetTestStatus`, `ListTests`, and `GetTestResults` RPCs, including execution history tracking inside `TestManager`.
- ✅ Added gRPC status mapping helper to surface accurate error codes back to clients.
- ⚠️ Pending CLI integration, end-to-end introspection tests, and documentation updates for new commands.

**Test Harness Evolution** (In Progress: IT-05 to IT-09):

- **Test Introspection**: Query test status, results, and execution history
- **Widget Discovery**: AI agents can enumerate available GUI interactions dynamically
- **Test Recording**: Capture manual workflows as JSON scripts for regression testing
File diff suppressed because it is too large
@@ -36,6 +36,12 @@ class AssertRequest;
class AssertResponse;
class ScreenshotRequest;
class ScreenshotResponse;
class GetTestStatusRequest;
class GetTestStatusResponse;
class ListTestsRequest;
class ListTestsResponse;
class GetTestResultsRequest;
class GetTestResultsResponse;

// Implementation of ImGuiTestHarness gRPC service
// This class provides the actual RPC handlers for automated GUI testing
@@ -72,6 +78,14 @@ class ImGuiTestHarnessServiceImpl {
  absl::Status Screenshot(const ScreenshotRequest* request,
                          ScreenshotResponse* response);

  // Test introspection APIs
  absl::Status GetTestStatus(const GetTestStatusRequest* request,
                             GetTestStatusResponse* response);
  absl::Status ListTests(const ListTestsRequest* request,
                         ListTestsResponse* response);
  absl::Status GetTestResults(const GetTestResultsRequest* request,
                              GetTestResultsResponse* response);

 private:
  TestManager* test_manager_;  // Non-owning pointer to access ImGuiTestEngine
};
@@ -22,6 +22,11 @@ service ImGuiTestHarness {
  // Capture a screenshot
  rpc Screenshot(ScreenshotRequest) returns (ScreenshotResponse);

  // Test introspection APIs (IT-05)
  rpc GetTestStatus(GetTestStatusRequest) returns (GetTestStatusResponse);
  rpc ListTests(ListTestsRequest) returns (ListTestsResponse);
  rpc GetTestResults(GetTestResultsRequest) returns (GetTestResultsResponse);
}

// ============================================================================
@@ -43,14 +48,15 @@ message PingResponse {
// ============================================================================

message ClickRequest {
  string target = 1;   // Target element (e.g., "button:Open ROM")
  ClickType type = 2;  // Type of click

  enum ClickType {
    CLICK_TYPE_UNSPECIFIED = 0;  // Default/unspecified click type
    CLICK_TYPE_LEFT = 1;         // Single left click
    CLICK_TYPE_RIGHT = 2;        // Single right click
    CLICK_TYPE_DOUBLE = 3;       // Double click
    CLICK_TYPE_MIDDLE = 4;       // Middle mouse button
  }
}
@@ -58,6 +64,7 @@ message ClickResponse {
  bool success = 1;             // Whether the click succeeded
  string message = 2;           // Human-readable result message
  int32 execution_time_ms = 3;  // Time taken to execute (for debugging)
  string test_id = 4;           // Unique test identifier for introspection
}
// ============================================================================
@@ -74,6 +81,7 @@ message TypeResponse {
  bool success = 1;
  string message = 2;
  int32 execution_time_ms = 3;
  string test_id = 4;
}

// ============================================================================
@@ -81,7 +89,7 @@ message TypeResponse {
// ============================================================================

message WaitRequest {
  string condition = 1;        // Condition to wait for (e.g., "window:Overworld")
  int32 timeout_ms = 2;        // Maximum time to wait (default 5000ms)
  int32 poll_interval_ms = 3;  // How often to check (default 100ms)
}

@@ -90,6 +98,7 @@ message WaitResponse {
  bool success = 1;      // Whether condition was met before timeout
  string message = 2;
  int32 elapsed_ms = 3;  // Time taken before condition met (or timeout)
  string test_id = 4;    // Unique test identifier for introspection
}

// ============================================================================
@@ -97,7 +106,7 @@ message WaitResponse {
// ============================================================================

message AssertRequest {
  string condition = 1;        // Condition to assert (e.g., "visible:button:Save")
  string failure_message = 2;  // Custom message if assertion fails
}

@@ -106,6 +115,7 @@ message AssertResponse {
  string message = 2;         // Diagnostic message
  string actual_value = 3;    // Actual value found (for debugging)
  string expected_value = 4;  // Expected value (for debugging)
  string test_id = 5;         // Unique test identifier for introspection
}

// ============================================================================
@@ -118,8 +128,9 @@ message ScreenshotRequest {
  ImageFormat format = 3;  // Image format

  enum ImageFormat {
    IMAGE_FORMAT_UNSPECIFIED = 0;
    IMAGE_FORMAT_PNG = 1;
    IMAGE_FORMAT_JPEG = 2;
  }
}

@@ -129,3 +140,85 @@ message ScreenshotResponse {
  string file_path = 3;  // Absolute path to saved screenshot
  int64 file_size_bytes = 4;
}

// ============================================================================
// GetTestStatus - Query test execution state
// ============================================================================

message GetTestStatusRequest {
  string test_id = 1;  // Test ID from Click/Type/Wait/Assert response
}

message GetTestStatusResponse {
  enum Status {
    STATUS_UNSPECIFIED = 0;  // Test ID not found or unspecified
    STATUS_QUEUED = 1;       // Waiting to execute
    STATUS_RUNNING = 2;      // Currently executing
    STATUS_PASSED = 3;       // Completed successfully
    STATUS_FAILED = 4;       // Assertion failed or error
    STATUS_TIMEOUT = 5;      // Exceeded timeout
  }

  Status status = 1;
  int64 queued_at_ms = 2;       // When test was queued
  int64 started_at_ms = 3;      // When test started (0 if not started)
  int64 completed_at_ms = 4;    // When test completed (0 if not complete)
  int32 execution_time_ms = 5;  // Total execution time
  string error_message = 6;     // Error details if FAILED/TIMEOUT
  repeated string assertion_failures = 7;  // Failed assertion details
}

// ============================================================================
// ListTests - Enumerate available tests
// ============================================================================

message ListTestsRequest {
  string category_filter = 1;  // Optional: "grpc", "unit", "integration", "e2e"
  int32 page_size = 2;         // Number of results per page (default 100)
  string page_token = 3;       // Pagination token from previous response
}

message ListTestsResponse {
  repeated TestInfo tests = 1;
  string next_page_token = 2;  // Token for next page (empty if no more)
  int32 total_count = 3;       // Total number of matching tests
}

message TestInfo {
  string test_id = 1;               // Unique test identifier
  string name = 2;                  // Human-readable test name
  string category = 3;              // Category: grpc, unit, integration, e2e
  int64 last_run_timestamp_ms = 4;  // When test last executed
  int32 total_runs = 5;             // Total number of executions
  int32 pass_count = 6;             // Number of successful runs
  int32 fail_count = 7;             // Number of failed runs
  int32 average_duration_ms = 8;    // Average execution time
}

// ============================================================================
// GetTestResults - Retrieve detailed results
// ============================================================================

message GetTestResultsRequest {
  string test_id = 1;
  bool include_logs = 2;  // Include full execution logs
}

message GetTestResultsResponse {
  bool success = 1;  // Overall test result
  string test_name = 2;
  string category = 3;
  int64 executed_at_ms = 4;
  int32 duration_ms = 5;
  repeated AssertionResult assertions = 6;
  repeated string logs = 7;        // If include_logs=true
  map<string, int32> metrics = 8;  // e.g., "frame_count": 123
}

message AssertionResult {
  string description = 1;
  bool passed = 2;
  string expected_value = 3;
  string actual_value = 4;
  string error_message = 5;
}
@@ -1,7 +1,13 @@
#include "app/test/test_manager.h"

#include <algorithm>
#include <random>

#include "absl/strings/str_cat.h"
#include "absl/strings/str_format.h"
#include "absl/strings/str_replace.h"
#include "absl/time/clock.h"
#include "absl/time/time.h"
#include "app/core/features.h"
#include "app/core/platform/file_dialog.h"
#include "app/gfx/arena.h"
@@ -1281,5 +1287,199 @@ absl::Status TestManager::TestRomDataIntegrity(Rom* rom) {
  });
}

std::string TestManager::RegisterHarnessTest(const std::string& name,
                                             const std::string& category) {
  absl::MutexLock lock(&harness_history_mutex_);

  const std::string sanitized_category = category.empty() ? "grpc" : category;
  std::string test_id = GenerateHarnessTestIdLocked(sanitized_category);

  HarnessTestExecution execution;
  execution.test_id = test_id;
  execution.name = name;
  execution.category = sanitized_category;
  execution.status = HarnessTestStatus::kQueued;
  execution.queued_at = absl::Now();
  execution.started_at = absl::InfinitePast();
  execution.completed_at = absl::InfinitePast();

  harness_history_[test_id] = execution;
  harness_history_order_.push_back(test_id);
  TrimHarnessHistoryLocked();

  HarnessAggregate& aggregate = harness_aggregates_[name];
  if (aggregate.category.empty()) {
    aggregate.category = sanitized_category;
  }
  aggregate.last_run = execution.queued_at;
  aggregate.latest_execution = execution;

  return test_id;
}

void TestManager::MarkHarnessTestRunning(const std::string& test_id) {
  absl::MutexLock lock(&harness_history_mutex_);

  auto it = harness_history_.find(test_id);
  if (it == harness_history_.end()) {
    return;
  }

  HarnessTestExecution& execution = it->second;
  execution.status = HarnessTestStatus::kRunning;
  execution.started_at = absl::Now();

  HarnessAggregate& aggregate = harness_aggregates_[execution.name];
  if (aggregate.category.empty()) {
    aggregate.category = execution.category;
  }
  aggregate.latest_execution = execution;
}

void TestManager::MarkHarnessTestCompleted(
    const std::string& test_id, HarnessTestStatus status,
    const std::string& error_message,
    const std::vector<std::string>& assertion_failures,
    const std::vector<std::string>& logs,
    const std::map<std::string, int32_t>& metrics) {
  absl::MutexLock lock(&harness_history_mutex_);

  auto it = harness_history_.find(test_id);
  if (it == harness_history_.end()) {
    return;
  }

  HarnessTestExecution& execution = it->second;
  execution.status = status;
  if (execution.started_at == absl::InfinitePast()) {
    execution.started_at = execution.queued_at;
  }
  execution.completed_at = absl::Now();
  execution.duration = execution.completed_at - execution.started_at;
  execution.error_message = error_message;
  if (!assertion_failures.empty()) {
    execution.assertion_failures = assertion_failures;
  }
  if (!logs.empty()) {
    execution.logs.insert(execution.logs.end(), logs.begin(), logs.end());
  }
  if (!metrics.empty()) {
    execution.metrics.insert(metrics.begin(), metrics.end());
  }

  HarnessAggregate& aggregate = harness_aggregates_[execution.name];
  if (aggregate.category.empty()) {
    aggregate.category = execution.category;
  }
  aggregate.total_runs += 1;
  if (status == HarnessTestStatus::kPassed) {
    aggregate.pass_count += 1;
  } else if (status == HarnessTestStatus::kFailed ||
             status == HarnessTestStatus::kTimeout) {
    aggregate.fail_count += 1;
  }
  aggregate.total_duration += execution.duration;
  aggregate.last_run = execution.completed_at;
  aggregate.latest_execution = execution;
}

void TestManager::AppendHarnessTestLog(const std::string& test_id,
                                       const std::string& log_entry) {
  absl::MutexLock lock(&harness_history_mutex_);

  auto it = harness_history_.find(test_id);
  if (it == harness_history_.end()) {
    return;
  }

  HarnessTestExecution& execution = it->second;
  execution.logs.push_back(log_entry);

  HarnessAggregate& aggregate = harness_aggregates_[execution.name];
  aggregate.latest_execution.logs = execution.logs;
}

absl::StatusOr<HarnessTestExecution> TestManager::GetHarnessTestExecution(
    const std::string& test_id) const {
  absl::MutexLock lock(&harness_history_mutex_);

  auto it = harness_history_.find(test_id);
  if (it == harness_history_.end()) {
    return absl::NotFoundError(
        absl::StrFormat("Test ID '%s' not found", test_id));
  }

  return it->second;
}

std::vector<HarnessTestSummary> TestManager::ListHarnessTestSummaries(
    const std::string& category_filter) const {
  absl::MutexLock lock(&harness_history_mutex_);
  std::vector<HarnessTestSummary> summaries;
|
||||
summaries.reserve(harness_aggregates_.size());
|
||||
|
||||
for (const auto& [name, aggregate] : harness_aggregates_) {
|
||||
if (!category_filter.empty() && aggregate.category != category_filter) {
|
||||
continue;
|
||||
}
|
||||
|
||||
HarnessTestSummary summary;
|
||||
summary.latest_execution = aggregate.latest_execution;
|
||||
summary.total_runs = aggregate.total_runs;
|
||||
summary.pass_count = aggregate.pass_count;
|
||||
summary.fail_count = aggregate.fail_count;
|
||||
summary.total_duration = aggregate.total_duration;
|
||||
summaries.push_back(summary);
|
||||
}
|
||||
|
||||
std::sort(summaries.begin(), summaries.end(),
|
||||
[](const HarnessTestSummary& a, const HarnessTestSummary& b) {
|
||||
absl::Time time_a = a.latest_execution.completed_at;
|
||||
if (time_a == absl::InfinitePast()) {
|
||||
time_a = a.latest_execution.queued_at;
|
||||
}
|
||||
absl::Time time_b = b.latest_execution.completed_at;
|
||||
if (time_b == absl::InfinitePast()) {
|
||||
time_b = b.latest_execution.queued_at;
|
||||
}
|
||||
return time_a > time_b;
|
||||
});
|
||||
|
||||
return summaries;
|
||||
}

std::string TestManager::GenerateHarnessTestIdLocked(absl::string_view prefix) {
  static std::mt19937 rng(std::random_device{}());
  static std::uniform_int_distribution<uint32_t> dist(0, 0xFFFFFF);

  std::string sanitized =
      absl::StrReplaceAll(std::string(prefix), {{" ", "_"}, {":", "_"}});
  if (sanitized.empty()) {
    sanitized = "test";
  }

  for (int attempt = 0; attempt < 8; ++attempt) {
    std::string candidate = absl::StrFormat("%s_%08x", sanitized, dist(rng));
    if (harness_history_.find(candidate) == harness_history_.end()) {
      return candidate;
    }
  }

  return absl::StrFormat(
      "%s_%lld", sanitized,
      static_cast<long long>(absl::ToUnixMillis(absl::Now())));
}

void TestManager::TrimHarnessHistoryLocked() {
  while (harness_history_order_.size() > harness_history_limit_) {
    const std::string& oldest_id = harness_history_order_.front();
    auto it = harness_history_.find(oldest_id);
    if (it != harness_history_.end()) {
      harness_history_.erase(it);
    }
    harness_history_order_.pop_front();
  }
}

}  // namespace test
}  // namespace yaze
@@ -2,13 +2,19 @@
#define YAZE_APP_TEST_TEST_MANAGER_H

#include <chrono>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "absl/synchronization/mutex.h"
#include "absl/strings/string_view.h"
#include "absl/time/time.h"
#include "app/rom.h"
#include "imgui.h"
#include "util/log.h"
@@ -111,6 +117,39 @@ struct ResourceStats {
  std::chrono::time_point<std::chrono::steady_clock> timestamp;
};

// Test harness execution tracking for gRPC automation (IT-05)
enum class HarnessTestStatus {
  kUnspecified,
  kQueued,
  kRunning,
  kPassed,
  kFailed,
  kTimeout,
};

struct HarnessTestExecution {
  std::string test_id;
  std::string name;
  std::string category;
  HarnessTestStatus status = HarnessTestStatus::kUnspecified;
  absl::Time queued_at;
  absl::Time started_at;
  absl::Time completed_at;
  absl::Duration duration = absl::ZeroDuration();
  std::string error_message;
  std::vector<std::string> assertion_failures;
  std::vector<std::string> logs;
  std::map<std::string, int32_t> metrics;
};

struct HarnessTestSummary {
  HarnessTestExecution latest_execution;
  int total_runs = 0;
  int pass_count = 0;
  int fail_count = 0;
  absl::Duration total_duration = absl::ZeroDuration();
};

// Main test manager - singleton
class TestManager {
 public:
@@ -209,6 +248,29 @@ class TestManager {
  }
  // File dialog mode now uses global feature flags

  // Harness test introspection (IT-05)
  std::string RegisterHarnessTest(const std::string& name,
                                  const std::string& category)
      ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
  void MarkHarnessTestRunning(const std::string& test_id)
      ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
  void MarkHarnessTestCompleted(
      const std::string& test_id, HarnessTestStatus status,
      const std::string& error_message = "",
      const std::vector<std::string>& assertion_failures = {},
      const std::vector<std::string>& logs = {},
      const std::map<std::string, int32_t>& metrics = {})
      ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
  void AppendHarnessTestLog(const std::string& test_id,
                            const std::string& log_entry)
      ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
  absl::StatusOr<HarnessTestExecution> GetHarnessTestExecution(
      const std::string& test_id) const
      ABSL_LOCKS_EXCLUDED(harness_history_mutex_);
  std::vector<HarnessTestSummary> ListHarnessTestSummaries(
      const std::string& category_filter = "") const
      ABSL_LOCKS_EXCLUDED(harness_history_mutex_);

 private:
  TestManager();
  ~TestManager();
@@ -263,6 +325,31 @@ class TestManager {

  // Test selection and configuration
  std::unordered_map<std::string, bool> disabled_tests_;

  // Harness test tracking
  struct HarnessAggregate {
    int total_runs = 0;
    int pass_count = 0;
    int fail_count = 0;
    absl::Duration total_duration = absl::ZeroDuration();
    std::string category;
    absl::Time last_run;
    HarnessTestExecution latest_execution;
  };

  std::unordered_map<std::string, HarnessTestExecution> harness_history_
      ABSL_GUARDED_BY(harness_history_mutex_);
  std::unordered_map<std::string, HarnessAggregate> harness_aggregates_
      ABSL_GUARDED_BY(harness_history_mutex_);
  std::deque<std::string> harness_history_order_
      ABSL_GUARDED_BY(harness_history_mutex_);
  size_t harness_history_limit_ = 200;
  mutable absl::Mutex harness_history_mutex_;

  std::string GenerateHarnessTestIdLocked(absl::string_view prefix)
      ABSL_EXCLUSIVE_LOCKS_REQUIRED(harness_history_mutex_);
  void TrimHarnessHistoryLocked()
      ABSL_EXCLUSIVE_LOCKS_REQUIRED(harness_history_mutex_);
};

// Utility functions for test result formatting