feat: Implement auto-capture of screenshots and diagnostics on test failures

- Added a new helper function `CaptureHarnessScreenshot` to encapsulate SDL screenshot logic. - Updated `ImGuiTestHarnessServiceImpl::Screenshot` to utilize the new screenshot helper. - Enhanced `TestManager::CaptureFailureContext` to automatically capture screenshots and widget state on test failures. - Introduced new fields in the `GetTestResultsResponse` proto for screenshot path, size, failure context, and widget state. - Updated CLI and gRPC client to expose new diagnostic fields in test results. - Ensured that screenshots are saved in a structured directory under the system's temp directory. - Improved logging for auto-capture events, including success and failure messages.
2025-10-02 23:36:09 -04:00
parent c348f7f91f
commit 0447d6f8a1
11 changed files with 509 additions and 419 deletions
--- a/docs/z3ed/E6-z3ed-implementation-plan.md
+++ b/docs/z3ed/E6-z3ed-implementation-plan.md
@@ -107,27 +107,23 @@ The z3ed CLI and AI agent workflow system has completed major infrastructure mil
 - **Application Consistency**: z3ed, EditorManager, and core services emit heterogeneous error formats
 #### IT-05: Test Introspection API (6-8 hours)
-**Status (Oct 2, 2025)**: 🟡 *Server-side RPCs implemented; CLI + E2E pending*
+**Status (Oct 2, 2025)**: ✅ Completed
-**Progress**:
+**Highlights**:
- ✅ `imgui_test_harness.proto` expanded with GetTestStatus/ListTests/GetTestResults messages.
+- `imgui_test_harness.proto` now exposes `GetTestStatus`, `ListTests`, and
- ✅ `TestManager` maintains execution history (queued→running→completed) with logs, metrics, and aggregates.
+  `GetTestResults` RPCs backed by `TestManager`'s execution history.
- ✅ `ImGuiTestHarnessServiceImpl` exposes the three introspection RPCs with pagination, status conversion, and log/metric marshalling.
+- CLI commands (`z3ed agent test status|list|results`) are fully wired with
- ⚠️ `agent` CLI commands (`test status`, `test list`, `test results`) still stubbed.
+  JSON/YAML formatting, follow-mode polling, and filtering options.
- ⚠️ End-to-end introspection script (`scripts/test_introspection_e2e.sh`) not implemented; regression script `test_harness_e2e.sh` currently failing because it references the unfinished CLI.
+- `GuiAutomationClient` provides typed wrappers for introspection APIs so agent
  workflows can poll status programmatically.
 - Regression coverage lives in `scripts/test_harness_e2e.sh`; a slimmer
  introspection smoke (`scripts/test_introspection_e2e.sh`) is queued for CI
  automation but manual verification paths are documented.
-**Immediate Next Steps**:
+**Future Enhancements**:
-1. **Wire CLI Client Methods**
+- Capture richer assertion metadata (expected/actual pairs) for improved
-  - Implement gRPC client wrappers for the new RPCs in the automation client.
+  failure messaging when the underlying harness exposes it.
-  - Add user-facing commands under `z3ed agent test ...` with JSON/YAML output options.
+- Add pagination helpers to CLI once history volume grows (low priority).
 2. **Author E2E Validation Script**
  - Spin up harness, run Click/Assert workflow, poll via `agent test status`, fetch results.
  - Update CI notes with the new script and expected output.
 3. **Documentation & Examples**
  - Extend `E6-z3ed-reference.md` with full usage examples and sample outputs.
  - Add troubleshooting section covering common errors (unknown test_id, timeout, etc.).
 4. **Stretch (Optional Before IT-06)**
  - Capture assertion metadata (expected/actual) for richer `AssertionResult` payloads.
 **Example Usage**:
 ```bash
@@ -270,16 +266,16 @@ message DiscoverWidgetsResponse {
 **Implementation Tracks**:
 1. **Harness-Level Diagnostics**
  - ✅ IT-08a: Screenshot RPC implemented (SDL-based, BMP format, 1536x864)
-  - ✅ IT-08b: Auto-capture screenshots and context on test failure
+  - ✅ IT-08b: Auto-capture screenshots and context on test failure using shared
-  - <20> IT-08c: Widget tree dumps and recent ImGui events on failure (NEXT)
+    helper that writes to `${TMPDIR}/yaze/test-results/<test_id>/`
-  - Serialize results to both structured JSON (for automation) and human-friendly HTML bundles
+  - ✅ IT-08c: Widget tree JSON dumps emitted alongside failure context
-  - Persist artifacts under `test-results/<test_id>/` with timestamped directories
+  - ⏳ HTML bundle exporter (screenshots + widget tree) remains a stretch goal
 2. **CLI Experience Improvements**
  - Surface artifact paths, failure context, and widget state in CLI output (DONE)
  - Standardize error envelopes in z3ed (`absl::Status` + structured payload)
-  - Surface artifact paths, summarized failure reason, and next-step hints in CLI output
+  - Add `--format html` flag to emit rich bundles (planned)
-  - Add `--format html` / `--format json` flags to `z3ed agent test results` to emit richer context
+  - Integrate with recording workflow: replay failures using captured state (planned)
  - Integrate with recording workflow: replay failures using captured state for fast reproduction
 3. **EditorManager & Application Integration**
  - Introduce shared `ErrorAnnotatedResult` utility exposing `status`, `context`, `actionable_hint`
@@ -299,7 +295,7 @@ message DiscoverWidgetsResponse {
  "assertion": "visible:Overworld",
  "expected": "visible",
  "actual": "hidden",
-  "screenshot": "/tmp/yaze_test_12345678.png",
+  "screenshot": "/tmp/yaze/test-results/grpc_assert_12345678/failure_1696357220000.bmp",
  "widget_state": {
    "active_window": "Main Window",
    "focused_widget": null,
--- a/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md
+++ b/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md
@@ -58,182 +58,72 @@ absl::Status ImGuiTestHarnessServiceImpl::Screenshot(
  if (!backend_data || !backend_data->Renderer) {
    response->set_success(false);
    response->set_message("SDL renderer not available");
    return absl::FailedPreconditionError("No SDL renderer available");
  }
  SDL_Renderer* renderer = backend_data->Renderer;
  // 2. Get renderer output size
  int width, height;
  SDL_GetRendererOutputSize(renderer, &width, &height);
  // 3. Create surface to hold screenshot
  SDL_Surface* surface = SDL_CreateRGBSurface(0, width, height, 32,
                                              0x00FF0000, 0x0000FF00,
                                              0x000000FF, 0xFF000000);
  // 4. Read pixels from renderer (ARGB8888 format)
  SDL_RenderReadPixels(renderer, nullptr, SDL_PIXELFORMAT_ARGB8888,
                      surface->pixels, surface->pitch);
  // 5. Determine output path (custom or auto-generated)
  std::string output_path = request->output_path();
  if (output_path.empty()) {
    output_path = absl::StrFormat("/tmp/yaze_screenshot_%lld.bmp",
                                  absl::ToUnixMillis(absl::Now()));
  }
  // 6. Save to BMP file
  SDL_SaveBMP(surface, output_path.c_str());
  // 7. Get file size and clean up
  std::ifstream file(output_path, std::ios::binary | std::ios::ate);
  int64_t file_size = file.tellg();
  SDL_FreeSurface(surface);
  // 8. Return success response
  response->set_success(true);
  response->set_message(absl::StrFormat("Screenshot saved to %s (%dx%d)",
                                        output_path, width, height));
  response->set_file_path(output_path);
  response->set_file_size_bytes(file_size);
  return absl::OkStatus();
 }
 ```
 ### Testing Results
 **Test Command**:
 ```bash
 grpcurl -plaintext \
  -import-path /Users/scawful/Code/yaze/src/app/core/proto \
  -proto imgui_test_harness.proto \
  -d '{"output_path": "/tmp/test_screenshot.bmp"}' \
  localhost:50052 yaze.test.ImGuiTestHarness/Screenshot
 ```
 **Response**:
 ```json
 {
  "success": true,
  "message": "Screenshot saved to /tmp/test_screenshot.bmp (1536x864)",
  "filePath": "/tmp/test_screenshot.bmp",
  "fileSizeBytes": "5308538"
 }
 ```
 **File Verification**:
 ```bash
 $ ls -lh /tmp/test_screenshot.bmp
 -rw-r--r--  1 scawful  wheel   5.1M Oct  2 20:16 /tmp/test_screenshot.bmp
 $ file /tmp/test_screenshot.bmp
 /tmp/test_screenshot.bmp: PC bitmap, Windows 95/NT4 and newer format, 1536 x 864 x 32, cbSize 5308538, bits offset 122
 ```
 ✅ **Result**: Screenshot successfully captured, saved, and validated!
 ---
 ## Design Decisions
 ### Why BMP Format?
 **Chosen**: SDL's built-in `SDL_SaveBMP` function  
 **Rationale**:
 - ✅ Zero external dependencies (no need for libpng, stb_image_write, etc.)
 - ✅ Guaranteed to work on all platforms where SDL works
 - ✅ Simple, reliable, and fast
 - ✅ Adequate for debugging/error reporting (file size not critical)
 - ⚠️ Larger file sizes (5.3MB vs ~500KB for PNG), but acceptable for temporary debug files
 **Future Consideration**: If disk space becomes an issue, can add PNG encoding using stb_image_write (single-header library, easy to integrate)
 ### SDL Backend Integration
 **Challenge**: How to access the SDL_Renderer from ImGui?  
 **Solution**: 
 - ImGui's `BackendRendererUserData` points to an `ImGui_ImplSDLRenderer2_Data` struct
 - This struct contains the `Renderer` pointer as its first member
 - Cast `BackendRendererUserData` to access the renderer safely
 **Why Not Store Renderer Globally?**
 - Multiple ImGui contexts could use different renderers
 - Backend data pattern follows ImGui's architecture conventions
 - More maintainable and future-proof
 ---
 ## Integration with Test System
 ### Current Usage (Manual RPC)
 AI agents or CLI tools can manually capture screenshots:
 ```bash
 # Capture screenshot after opening editor
 z3ed agent test --prompt "Open Overworld Editor"
 grpcurl ... yaze.test.ImGuiTestHarness/Screenshot
 ```
 ### Next Step: Auto-Capture on Failure
 The screenshot RPC is now ready to be integrated with TestManager to automatically capture context when tests fail:
 **Planned Implementation** (IT-08 Phase 2):
 ```cpp
 // In TestManager::MarkHarnessTestCompleted()
 if (test_result == IMGUI_TEST_STATUS_FAILED || 
    test_result == IMGUI_TEST_STATUS_TIMEOUT) {
  // Auto-capture screenshot
  ScreenshotRequest req;
  req.set_output_path(absl::StrFormat("/tmp/test_%s_failure.bmp", test_id));
  ScreenshotResponse resp;
  harness_service_->Screenshot(&req, &resp);
  test_history_[test_id].screenshot_path = resp.file_path();
  // Also capture widget state (IT-08 Phase 3)
  test_history_[test_id].widget_state = CaptureWidgetState();
 }
 ```
 ---
 ---
 ## IT-08b: Auto-Capture on Test Failure ✅ COMPLETE
-**Date Completed**: October 2, 2025  
+    ## IT-08b: Auto-Capture on Test Failure ✅ COMPLETE
 **Time**: 1.5 hours
-### Implementation Summary
+    **Date Completed**: October 2, 2025  
    **Artifacts**: `CaptureFailureContext`, `screenshot_utils.{h,cc}`, CLI introspection updates
-Successfully implemented automatic screenshot and context capture when tests fail or timeout.
+    ### Highlights
-### What Was Built
+    - **Shared SDL helper**: New `CaptureHarnessScreenshot()` centralizes renderer
      capture and writes BMP files into `${TMPDIR}/yaze/test-results/<test_id>/`.
    - **TestManager integration**: Failure context now records ImGui window/nav
      state, widget hierarchy (`CaptureWidgetState`), and screenshot metadata while
      keeping `HarnessTestExecution` aggregates in sync.
    - **Graceful fallbacks**: When `YAZE_WITH_GRPC` is disabled we emit a harness
      log noting that screenshot capture is unavailable.
    - **End-user surfacing**: `GuiAutomationClient::GetTestResults` and
      `z3ed agent test results` expose `screenshot_path`, `screenshot_size_bytes`,
      `failure_context`, and `widget_state` in both YAML and JSON modes.
-1. **TestManager Integration**:
+    ### Key Touch Points
   - Added failure diagnostic fields to `HarnessTestExecution` struct
   - Modified `MarkHarnessTestCompleted()` to auto-trigger capture on failure/timeout
   - Implemented `CaptureFailureContext()` method with execution context capture
-2. **Failure Context Capture**:
+    | File | Purpose |
-   - Frame count at failure time
+    |------|---------|
-   - Active window name
+    | `src/app/core/service/screenshot_utils.{h,cc}` | SDL renderer capture reused by RPC + auto-capture |
-   - Focused widget ID
+    | `src/app/test/test_manager.cc` | Auto-capture pipeline with per-test artifact directories |
-   - Screenshot path placeholder for future RPC integration
+    | `src/app/core/service/imgui_test_harness_service.cc` | Screenshot RPC delegates to shared helper |
    | `src/cli/service/gui_automation_client.*` | Propagates new proto fields to CLI |
    | `src/cli/handlers/agent/test_commands.cc` | Presents diagnostics to users/agents |
-3. **Proto Schema Updates**:
+    ### Validation Checklist
   - Added `screenshot_path`, `screenshot_size_bytes`, `failure_context`, `widget_state` to `GetTestResultsResponse`
-4. **gRPC Service Integration**:
+    ```bash
-   - Updated `GetTestResults` RPC to include failure diagnostics in response
+    # Build (needs YAZE_WITH_GRPC=ON)
    cmake --build build-grpc-test --target yaze -j$(sysctl -n hw.ncpu)
    # Start harness
    ./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
      --enable_test_harness --test_harness_port=50052 \
      --rom_file=assets/zelda3.sfc &
    # Queue a failing automation step
    grpcurl -plaintext \
      -import-path src/app/core/proto \
      -proto imgui_test_harness.proto \
      -d '{"target":"button:DoesNotExist","type":"LEFT"}' \
      localhost:50052 yaze.test.ImGuiTestHarness/Click
    # Fetch diagnostics
    z3ed agent test results --test-id <captured_id> --include-logs --format yaml
    # Inspect artifact directory
    ls ${TMPDIR}/yaze/test-results/<captured_id>/
    ```
    You should see a `.bmp` failure screenshot, widget JSON in the CLI output, and
    logs noting the auto-capture event. When the helper fails (e.g., renderer not
    ready) the harness log and CLI output record the failure reason.
    ### Next Steps
    - Wire the same helper into HTML bundle generation (IT-08c follow-up).
    - Add configurable artifact root (`--error-artifact-dir`) for CI separation.
    - Consider PNG encoding via `stb_image_write` if file size becomes an issue.
    ---
 ### Technical Implementation
 **Location**: `/Users/scawful/Code/yaze/src/app/test/test_manager.{h,cc}`
@@ -323,7 +213,7 @@ grpcurl -plaintext \
  "category": "grpc",
  "executedAtMs": "1696357200000",
  "durationMs": 150,
-  "screenshotPath": "/tmp/yaze_test_grpc_click_12345678_failure.bmp",
+  "screenshotPath": "/tmp/yaze/test-results/grpc_click_12345678/failure_1696357200000.bmp",
  "failureContext": "Frame: 1234, Active Window: Main Window, Focused Widget: 0x00000000"
 }
 ```
@@ -336,12 +226,12 @@ grpcurl -plaintext \
 - ✅ No deadlocks (mutex released before calling CaptureFailureContext)
 - ✅ Proto schema updated with new fields
-### Next Steps
+### Retro Notes
-The screenshot path is currently a placeholder. Future integration will:
+- Placeholder screenshot paths have been replaced by the shared helper that
-1. Call the Screenshot RPC from within CaptureFailureContext
+  writes into `${TMPDIR}/yaze/test-results/<test_id>/` and records byte sizes.
-2. Wait for screenshot completion and store the actual file size
+- Widget state capture (IT-08c) is now invoked directly from
-3. Integrate with IT-08c for widget state dumps
+  `CaptureFailureContext`, removing the TODOs from the original plan.
 ---
--- a/docs/z3ed/IT-08b-AUTO-CAPTURE.md
+++ b/docs/z3ed/IT-08b-AUTO-CAPTURE.md
@@ -122,128 +122,117 @@ void TestManager::MarkHarnessTestCompleted(const std::string& test_id,
  history.execution_time_ms = absl::ToInt64Milliseconds(
      history.end_time - history.start_time);
-  // Auto-capture diagnostics on failure
+  # IT-08b: Auto-Capture on Test Failure
  if (status == ImGuiTestStatus_Error || status == ImGuiTestStatus_Warning) {
    CaptureFailureContext(test_id);
  }
  // Notify waiting threads
  cv_.notify_all();
 }
 ```
-### Step 4: Update GetTestResults RPC (30 minutes)
+  **Status**: ✅ Complete  
  **Completed**: October 2, 2025  
  **Owner**: Harness Platform Team  
  **Depends On**: IT-08a (Screenshot RPC), IT-05 (execution history store)
-**File**: `src/app/core/proto/imgui_test_harness.proto`
+  ---
-Add fields to response:
+  ## Summary
-```proto
+  Harness failures now emit rich diagnostics automatically. Whenever a GUI test
-message GetTestResultsResponse {
+  transitions into `FAILED` or `TIMEOUT` we capture:
  string test_id = 1;
  TestStatus status = 2;
  int64 execution_time_ms = 3;
  repeated string logs = 4;
  map<string, string> metrics = 5;
  // IT-08b: Failure diagnostics
  string screenshot_path = 6;
  int64 screenshot_size_bytes = 7;
  string failure_context = 8;
  // IT-08c: Widget state (future)
  string widget_state = 9;
 }
 ```
-**File**: `src/app/core/service/imgui_test_harness_service.cc`
+  - A full-frame SDL screenshot written to a stable per-test artifact folder
  - ImGui execution context (frame number, active/nav/hovered windows & IDs)
  - Serialized widget hierarchy snapshot (`CaptureWidgetState`) for IT-08c
  - Append-only log entries surfaced through `GetTestResults`
-Update implementation:
+  All artifacts are exposed through both the gRPC API and the `z3ed agent test
  results` command (JSON/YAML), enabling AI agents and humans to retrieve the same
  diagnostics without extra RPC calls.
-```cpp
+  ---
 absl::Status ImGuiTestHarnessServiceImpl::GetTestResults(
    const GetTestResultsRequest* request,
    GetTestResultsResponse* response) {
  const std::string& test_id = request->test_id();
  auto history = test_manager_->GetTestHistory(test_id);
  if (!history.has_value()) {
    return absl::NotFoundError(
        absl::StrFormat("Test not found: %s", test_id));
  }
  const auto& h = history.value();
  // Basic info
  response->set_test_id(h.test_id);
  response->set_status(ConvertImGuiTestStatusToProto(h.status));
  response->set_execution_time_ms(h.execution_time_ms);
  // Logs and metrics
  for (const auto& log : h.logs) {
    response->add_logs(log);
  }
  for (const auto& [key, value] : h.metrics) {
    (*response->mutable_metrics())[key] = value;
  }
  // IT-08b: Failure diagnostics
  if (!h.screenshot_path.empty()) {
    response->set_screenshot_path(h.screenshot_path);
    response->set_screenshot_size_bytes(h.screenshot_size_bytes);
  }
  if (!h.failure_context.empty()) {
    response->set_failure_context(h.failure_context);
  }
  // IT-08c: Widget state (future)
  if (!h.widget_state.empty()) {
    response->set_widget_state(h.widget_state);
  }
  return absl::OkStatus();
 }
 ```
---
+  ## What Shipped
-## Testing
+  ### Shared Screenshot Helper
  - New helper (`screenshot_utils.{h,cc}`) centralizes SDL capture logic.
  - Generates deterministic default paths under
    `${TMPDIR}/yaze/test-results/<test_id>/failure_<timestamp>.bmp`.
  - Reused by the manual `Screenshot` RPC to avoid duplicate code.
-### Build and Start Test Harness
+  ### TestManager Auto-Capture Pipeline
  - `CaptureFailureContext` now:
    - Computes ImGui context metadata even when the test finishes on a worker
      thread.
    - Allocates artifact folders per test ID and requests a screenshot via the
      shared helper (guarded when gRPC is disabled).
    - Persists screenshot path, byte size, failure context, and widget state back
      into `HarnessTestExecution` while keeping aggregate caches in sync.
    - Emits structured harness logs for success/failure of the auto-capture.
-```bash
+  ### CLI & Client Updates
-# 1. Rebuild with changes
+  - `GuiAutomationClient::GetTestResults` propagates new proto fields:
-cmake --build build-grpc-test --target yaze -j$(sysctl -n hw.ncpu)
+    `screenshot_path`, `screenshot_size_bytes`, `failure_context`, `widget_state`.
  - `z3ed agent test results` shows diagnostics in both human (YAML) and machine
    (JSON) modes, including `null` markers when artifacts are unavailable.
  - JSON output is now agent-ready: screenshot path + size enable downstream
    fetchers, failure context aids chain-of-thought prompts, widget state allows
    LLMs to reason about UI layout when debugging.
-# 2. Start test harness
+  ### Build Integration
-./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
+  - gRPC build stanza now compiles the new helper files so both harness server and
-  --enable_test_harness \
+    in-process capture use the same implementation.
  --test_harness_port=50052 \
  --rom_file=assets/zelda3.sfc &
 ```
-### Trigger Test Failure
+  ---
-```bash
+  ## Developer Notes
 # 3. Trigger a failing test (nonexistent widget)
 grpcurl -plaintext \
  -import-path src/app/core/proto \
  -proto imgui_test_harness.proto \
  -d '{"target":"nonexistent_widget","type":"LEFT"}' \
  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-# Response should indicate failure
+  | Concern | Resolution |
-```
+  |---------|------------|
  | Deadlocks while capturing | Screenshot helper runs outside `harness_history_mutex_`; mutex is reacquired only for bookkeeping. |
  | Non-gRPC builds | Auto-capture logs a descriptive "unavailable" message and skips the SDL call, keeping deterministic behaviour when harness is stubbed. |
  | Artifact collisions | Paths are timestamped and namespaced per test ID; directories are created idempotently with error-code handling. |
  | Large widget dumps | Stored as JSON strings; CLI wraps them with quoting so they can be piped to `jq`/`yq` safely. |
-### Verify Screenshot Captured
+  ---
-```bash
+  ## Usage
 # 4. Check for auto-captured screenshot
 ls -lh /tmp/yaze_test_*_failure.bmp
-# Expected: BMP file created (5.3MB)
+  1. Trigger a harness failure (e.g. click a nonexistent widget):
-```
+     ```bash
     z3ed agent test --prompt "Click widget:nonexistent"
     ```
  2. Fetch diagnostics:
     ```bash
     z3ed agent test results --test-id grpc_click_deadbeef --include-logs --format json
     ```
  3. Inspect artifacts:
     ```bash
     open "$(jq -r '.screenshot_path' results.json)"
     ```
  Example YAML excerpt:
  ```yaml
  screenshot_path: "/var/folders/.../yaze/test-results/grpc_click_deadbeef/failure_1727890045123.bmp"
  screenshot_size_bytes: 5308538
  failure_context: "frame=1287 current_window=MainWindow nav_window=Agent hovered_window=Agent active_id=0x00000000 hovered_id=0x00000000"
  widget_state: '{"active_window":"MainWindow","visible_windows":["MainWindow","Agent"],"focused_widget":null}'
  ```
  ---
  ## Validation
  - Manual harness failure emits screenshot + widget dump under `/tmp`.
  - `GetTestResults` returns the new fields (verified via `grpcurl`).
  - CLI JSON/YAML output includes diagnostics with correct escaping.
  - Non-gRPC build path compiles (guarded sections).
  ---
  ## Follow-Up
  - IT-08c leverages the persisted widget JSON to produce HTML bundles.
  - IT-08d will standardize error envelopes across CLI/services using these
    diagnostics.
  - Investigate persisting artifacts under configurable directories
    (`--artifact-dir`) for CI separation.
 ### Query Test Results
--- a/src/app/app.cmake
+++ b/src/app/app.cmake
@@ -264,6 +264,8 @@ if(YAZE_WITH_GRPC)
  target_sources(yaze PRIVATE
    ${CMAKE_SOURCE_DIR}/src/app/core/service/imgui_test_harness_service.cc
    ${CMAKE_SOURCE_DIR}/src/app/core/service/imgui_test_harness_service.h
    ${CMAKE_SOURCE_DIR}/src/app/core/service/screenshot_utils.cc
    ${CMAKE_SOURCE_DIR}/src/app/core/service/screenshot_utils.h
    ${CMAKE_SOURCE_DIR}/src/app/core/service/widget_discovery_service.cc
    ${CMAKE_SOURCE_DIR}/src/app/core/service/widget_discovery_service.h
    ${CMAKE_SOURCE_DIR}/src/app/core/testing/test_recorder.cc
--- a/src/app/core/service/imgui_test_harness_service.cc
+++ b/src/app/core/service/imgui_test_harness_service.cc
@@ -24,6 +24,7 @@
 #include "absl/time/time.h"
 #include "app/core/proto/imgui_test_harness.grpc.pb.h"
 #include "app/core/proto/imgui_test_harness.pb.h"
 #include "app/core/service/screenshot_utils.h"
 #include "app/core/testing/test_script_parser.h"
 #include "app/test/test_manager.h"
 #include "yaze.h"  // For YAZE_VERSION_STRING
@@ -1187,82 +1188,30 @@ absl::Status ImGuiTestHarnessServiceImpl::Assert(const AssertRequest* request,
  return finalize(absl::OkStatus());
 }
 // Helper struct matching imgui_impl_sdlrenderer2.cpp backend data
 struct ImGui_ImplSDLRenderer2_Data {
  SDL_Renderer* Renderer;
 };
 absl::Status ImGuiTestHarnessServiceImpl::Screenshot(
    const ScreenshotRequest* request, ScreenshotResponse* response) {
-  // Get the SDL renderer from ImGui backend
+  if (!response) {
-  ImGuiIO& io = ImGui::GetIO();
+    return absl::InvalidArgumentError("response cannot be null");
-  auto* backend_data = static_cast<ImGui_ImplSDLRenderer2_Data*>(io.BackendRendererUserData);
+  }
-  
+
-  if (!backend_data || !backend_data->Renderer) {
+  const std::string requested_path =
      request ? request->output_path() : std::string();
  absl::StatusOr<ScreenshotArtifact> artifact_or =
      CaptureHarnessScreenshot(requested_path);
  if (!artifact_or.ok()) {
    response->set_success(false);
-    response->set_message("SDL renderer not available");
+    response->set_message(std::string(artifact_or.status().message()));
-    return absl::FailedPreconditionError("No SDL renderer available");
+    return artifact_or.status();
  }
-  
+
-  SDL_Renderer* renderer = backend_data->Renderer;
+  const ScreenshotArtifact& artifact = *artifact_or;
  // Get renderer output size
  int width, height;
  if (SDL_GetRendererOutputSize(renderer, &width, &height) != 0) {
    response->set_success(false);
    response->set_message(absl::StrFormat("Failed to get renderer size: %s", SDL_GetError()));
    return absl::InternalError("Failed to get renderer output size");
  }
  // Create surface to hold screenshot
  SDL_Surface* surface = SDL_CreateRGBSurface(0, width, height, 32,
                                              0x00FF0000, 0x0000FF00,
                                              0x000000FF, 0xFF000000);
  if (!surface) {
    response->set_success(false);
    response->set_message(absl::StrFormat("Failed to create surface: %s", SDL_GetError()));
    return absl::InternalError("Failed to create SDL surface");
  }
  // Read pixels from renderer
  if (SDL_RenderReadPixels(renderer, nullptr, SDL_PIXELFORMAT_ARGB8888,
                          surface->pixels, surface->pitch) != 0) {
    SDL_FreeSurface(surface);
    response->set_success(false);
    response->set_message(absl::StrFormat("Failed to read pixels: %s", SDL_GetError()));
    return absl::InternalError("Failed to read renderer pixels");
  }
  // Determine output path
  std::string output_path = request->output_path();
  if (output_path.empty()) {
    // Default: /tmp/yaze_screenshot_<timestamp>.bmp
    output_path = absl::StrFormat("/tmp/yaze_screenshot_%lld.bmp",
                                  absl::ToUnixMillis(absl::Now()));
  }
  // Save to BMP file (SDL built-in, no external deps needed)
  if (SDL_SaveBMP(surface, output_path.c_str()) != 0) {
    SDL_FreeSurface(surface);
    response->set_success(false);
    response->set_message(absl::StrFormat("Failed to save BMP: %s", SDL_GetError()));
    return absl::InternalError("Failed to save screenshot");
  }
  // Get file size
  std::ifstream file(output_path, std::ios::binary | std::ios::ate);
  int64_t file_size = file.tellg();
  file.close();
  // Clean up and return success
  SDL_FreeSurface(surface);
  response->set_success(true);
  response->set_message(absl::StrFormat("Screenshot saved to %s (%dx%d)",
-                                        output_path, width, height));
+                                        artifact.file_path, artifact.width,
-  response->set_file_path(output_path);
+                                        artifact.height));
-  response->set_file_size_bytes(file_size);
+  response->set_file_path(artifact.file_path);
-  
+  response->set_file_size_bytes(artifact.file_size_bytes);
  return absl::OkStatus();
 }
--- a/src/app/core/service/screenshot_utils.cc
+++ b/src/app/core/service/screenshot_utils.cc
@@ -0,0 +1,108 @@
 #include "app/core/service/screenshot_utils.h"
 #ifdef YAZE_WITH_GRPC
 #include <SDL.h>
 #include <filesystem>
 #include <string>
 #include "absl/status/status.h"
 #include "absl/status/statusor.h"
 #include "absl/strings/str_format.h"
 #include "absl/time/clock.h"
 #include "imgui.h"
 namespace yaze {
 namespace test {
 namespace {
 struct ImGui_ImplSDLRenderer2_Data {
  SDL_Renderer* Renderer;
 };
 std::filesystem::path DefaultScreenshotPath() {
  std::filesystem::path base_dir =
      std::filesystem::temp_directory_path() / "yaze" / "test-results";
  std::error_code ec;
  std::filesystem::create_directories(base_dir, ec);
  const int64_t timestamp_ms = absl::ToUnixMillis(absl::Now());
  return base_dir /
         std::filesystem::path(
             absl::StrFormat("harness_%lld.bmp", static_cast<long long>(timestamp_ms)));
 }
 }  // namespace
 absl::StatusOr<ScreenshotArtifact> CaptureHarnessScreenshot(
    const std::string& preferred_path) {
  ImGuiIO& io = ImGui::GetIO();
  auto* backend_data =
      static_cast<ImGui_ImplSDLRenderer2_Data*>(io.BackendRendererUserData);
  if (!backend_data || !backend_data->Renderer) {
    return absl::FailedPreconditionError("SDL renderer not available");
  }
  SDL_Renderer* renderer = backend_data->Renderer;
  int width = 0;
  int height = 0;
  if (SDL_GetRendererOutputSize(renderer, &width, &height) != 0) {
    return absl::InternalError(
        absl::StrFormat("Failed to get renderer size: %s", SDL_GetError()));
  }
  std::filesystem::path output_path = preferred_path.empty()
                                         ? DefaultScreenshotPath()
                                         : std::filesystem::path(preferred_path);
  if (output_path.has_parent_path()) {
    std::error_code ec;
    std::filesystem::create_directories(output_path.parent_path(), ec);
  }
  SDL_Surface* surface = SDL_CreateRGBSurface(0, width, height, 32, 0x00FF0000,
                                              0x0000FF00, 0x000000FF,
                                              0xFF000000);
  if (!surface) {
    return absl::InternalError(
        absl::StrFormat("Failed to create SDL surface: %s", SDL_GetError()));
  }
  if (SDL_RenderReadPixels(renderer, nullptr, SDL_PIXELFORMAT_ARGB8888,
                           surface->pixels, surface->pitch) != 0) {
    SDL_FreeSurface(surface);
    return absl::InternalError(
        absl::StrFormat("Failed to read renderer pixels: %s", SDL_GetError()));
  }
  if (SDL_SaveBMP(surface, output_path.string().c_str()) != 0) {
    SDL_FreeSurface(surface);
    return absl::InternalError(
        absl::StrFormat("Failed to save BMP: %s", SDL_GetError()));
  }
  SDL_FreeSurface(surface);
  std::error_code ec;
  const int64_t file_size =
      std::filesystem::file_size(output_path, ec);
  if (ec) {
    return absl::InternalError(
        absl::StrFormat("Failed to stat screenshot %s: %s",
                        output_path.string(), ec.message()));
  }
  ScreenshotArtifact artifact;
  artifact.file_path = output_path.string();
  artifact.width = width;
  artifact.height = height;
  artifact.file_size_bytes = file_size;
  return artifact;
 }
 }  // namespace test
 }  // namespace yaze
 #endif  // YAZE_WITH_GRPC
--- a/src/app/core/service/screenshot_utils.h
+++ b/src/app/core/service/screenshot_utils.h
@@ -0,0 +1,31 @@
 #ifndef YAZE_APP_CORE_SERVICE_SCREENSHOT_UTILS_H_
 #define YAZE_APP_CORE_SERVICE_SCREENSHOT_UTILS_H_
 #ifdef YAZE_WITH_GRPC
 #include <string>
 #include "absl/status/statusor.h"
 namespace yaze {
 namespace test {
 struct ScreenshotArtifact {
  std::string file_path;
  int width = 0;
  int height = 0;
  int64_t file_size_bytes = 0;
 };
 // Captures the current renderer output into a BMP file.
 // If preferred_path is empty, an appropriate path under the system temp
 // directory is generated automatically. Returns the resolved artifact metadata
 // on success.
 absl::StatusOr<ScreenshotArtifact> CaptureHarnessScreenshot(
    const std::string& preferred_path = "");
 }  // namespace test
 }  // namespace yaze
 #endif  // YAZE_WITH_GRPC
 #endif  // YAZE_APP_CORE_SERVICE_SCREENSHOT_UTILS_H_
--- a/src/app/test/test_manager.cc
+++ b/src/app/test/test_manager.cc
@@ -1,14 +1,17 @@
 #include "app/test/test_manager.h"
 #include <algorithm>
 #include <filesystem>
 #include <random>
 #include "absl/status/statusor.h"
 #include "absl/strings/str_cat.h"
 #include "absl/strings/str_format.h"
 #include "absl/strings/str_replace.h"
 #include "absl/synchronization/mutex.h"
 #include "absl/time/clock.h"
 #include "absl/time/time.h"
 #include "app/core/service/screenshot_utils.h"
 #include "app/core/widget_state_capture.h"
 #include "app/core/features.h"
 #include "app/core/platform/file_dialog.h"
@@ -36,6 +39,25 @@ class EditorManager;
 namespace yaze {
 namespace test {
 namespace {
 std::string GenerateFailureScreenshotPath(const std::string& test_id) {
  std::filesystem::path base_dir =
      std::filesystem::temp_directory_path() / "yaze" / "test-results" /
      test_id;
  std::error_code ec;
  std::filesystem::create_directories(base_dir, ec);
  const int64_t timestamp_ms = absl::ToUnixMillis(absl::Now());
  std::filesystem::path file_path =
      base_dir /
      std::filesystem::path(absl::StrFormat(
          "failure_%lld.bmp", static_cast<long long>(timestamp_ms)));
  return file_path.string();
 }
 }  // namespace
 // Utility function implementations
 const char* TestStatusToString(TestStatus status) {
  switch (status) {
@@ -1636,68 +1658,117 @@ void TestManager::TrimHarnessHistoryLocked() {
 void TestManager::CaptureFailureContext(const std::string& test_id) {
  // IT-08b: Capture failure diagnostics
  // Note: This method is called with the harness_history_mutex_ unlocked
-  // to avoid deadlock when Screenshot RPC calls back into TestManager
+  // to avoid deadlock when Screenshot helper touches SDL state.
-  absl::MutexLock lock(&harness_history_mutex_);
+  // 1. Capture execution context metadata from ImGui.
-  auto it = harness_history_.find(test_id);
+  std::string failure_context;
  if (it == harness_history_.end()) {
    return;
  }
  HarnessTestExecution& execution = it->second;
  // 1. Capture execution context (frame count, active window, etc.)
  ImGuiContext* ctx = ImGui::GetCurrentContext();
  if (ctx != nullptr) {
 #if defined(YAZE_ENABLE_IMGUI_TEST_ENGINE) && YAZE_ENABLE_IMGUI_TEST_ENGINE
-  ImGuiWindow* current_window = ctx->CurrentWindow;
+    ImGuiWindow* current_window = ctx->CurrentWindow;
-  ImGuiWindow* nav_window = ctx->NavWindow;
+    ImGuiWindow* nav_window = ctx->NavWindow;
-  ImGuiWindow* hovered_window = ctx->HoveredWindow;
+    ImGuiWindow* hovered_window = ctx->HoveredWindow;
-  const char* current_name =
+    const char* current_name =
-    (current_window && current_window->Name) ? current_window->Name : "none";
+        (current_window && current_window->Name) ? current_window->Name : "none";
-  const char* nav_name =
+    const char* nav_name =
-    (nav_window && nav_window->Name) ? nav_window->Name : "none";
+        (nav_window && nav_window->Name) ? nav_window->Name : "none";
-  const char* hovered_name =
+    const char* hovered_name = (hovered_window && hovered_window->Name)
-    (hovered_window && hovered_window->Name) ? hovered_window->Name : "none";
+                                    ? hovered_window->Name
                                    : "none";
-  ImGuiID active_id = ImGui::GetActiveID();
+    ImGuiID active_id = ImGui::GetActiveID();
-  ImGuiID hovered_id = ImGui::GetHoveredID();
+    ImGuiID hovered_id = ImGui::GetHoveredID();
-  execution.failure_context =
+    failure_context = absl::StrFormat(
-    absl::StrFormat(
+        "frame=%d current_window=%s nav_window=%s hovered_window=%s "
-      "frame=%d current_window=%s nav_window=%s hovered_window=%s active_id=0x%08X hovered_id=0x%08X",
+        "active_id=0x%08X hovered_id=0x%08X",
-      ImGui::GetFrameCount(), current_name, nav_name, hovered_name,
+        ImGui::GetFrameCount(), current_name, nav_name, hovered_name,
-      active_id, hovered_id);
+        active_id, hovered_id);
 #else
-  execution.failure_context =
+    failure_context =
-    absl::StrFormat("frame=%d", ImGui::GetFrameCount());
+        absl::StrFormat("frame=%d", ImGui::GetFrameCount());
 #endif
  } else {
-  execution.failure_context = "ImGui context not available";
+    failure_context = "ImGui context not available";
  }
-  // 2. Screenshot capture would happen here via gRPC call
+  std::string artifact_path;
-  // Note: Screenshot RPC implementation is in ImGuiTestHarnessServiceImpl
+  {
-  // The screenshot_path will be set by the RPC handler when it completes
+    absl::MutexLock lock(&harness_history_mutex_);
-  // For now, we just set a placeholder path to indicate where it should be saved
+    auto it = harness_history_.find(test_id);
-  if (execution.screenshot_path.empty()) {
+    if (it == harness_history_.end()) {
-    execution.screenshot_path =
+      return;
-        absl::StrFormat("/tmp/yaze_test_%s_failure.bmp", test_id);
+    }
-    execution.screenshot_size_bytes = 0;
+
    HarnessTestExecution& execution = it->second;
    execution.failure_context = failure_context;
    if (execution.screenshot_path.empty()) {
      execution.screenshot_path = GenerateFailureScreenshotPath(test_id);
    }
    artifact_path = execution.screenshot_path;
  }
-  // 3. Widget state capture (IT-08c)
+  // 2. Capture widget state snapshot (IT-08c) and failure screenshot.
-  execution.widget_state = core::CaptureWidgetState();
+  std::string widget_state = core::CaptureWidgetState();
 #if defined(YAZE_WITH_GRPC)
  absl::StatusOr<ScreenshotArtifact> screenshot_artifact =
      CaptureHarnessScreenshot(artifact_path);
 #endif
-  // Keep aggregate cache in sync with the latest execution snapshot.
+  {
-  auto aggregate_it = harness_aggregates_.find(execution.name);
+    absl::MutexLock lock(&harness_history_mutex_);
-  if (aggregate_it != harness_aggregates_.end()) {
+    auto it = harness_history_.find(test_id);
-    aggregate_it->second.latest_execution = execution;
+    if (it == harness_history_.end()) {
      return;
    }
    HarnessTestExecution& execution = it->second;
    execution.failure_context = failure_context;
    execution.widget_state = widget_state;
 #if defined(YAZE_WITH_GRPC)
    if (screenshot_artifact.ok()) {
      execution.screenshot_path = screenshot_artifact->file_path;
      execution.screenshot_size_bytes = screenshot_artifact->file_size_bytes;
      execution.logs.push_back(absl::StrFormat(
          "[auto-capture] Failure screenshot saved to %s (%lld bytes)",
          execution.screenshot_path,
          static_cast<long long>(execution.screenshot_size_bytes)));
    } else {
      execution.logs.push_back(absl::StrFormat(
          "[auto-capture] Screenshot capture failed: %s",
          screenshot_artifact.status().message()));
    }
 #else
    execution.logs.push_back(
        "[auto-capture] Screenshot capture unavailable (YAZE_WITH_GRPC=OFF)");
 #endif
    // Keep aggregate cache in sync with the latest execution snapshot.
    auto aggregate_it = harness_aggregates_.find(execution.name);
    if (aggregate_it != harness_aggregates_.end()) {
      aggregate_it->second.latest_execution = execution;
    }
  }
-  util::logf("[TestManager] Captured failure context for test %s: %s",
+#if defined(YAZE_WITH_GRPC)
-             test_id.c_str(), execution.failure_context.c_str());
+  if (screenshot_artifact.ok()) {
-  util::logf("[TestManager] Widget state: %s", execution.widget_state.c_str());
+    util::logf("[TestManager] Captured failure context for test %s: %s",
               test_id.c_str(), failure_context.c_str());
    util::logf("[TestManager] Failure screenshot stored at %s (%lld bytes)",
               screenshot_artifact->file_path.c_str(),
               static_cast<long long>(screenshot_artifact->file_size_bytes));
  } else {
    util::logf("[TestManager] Failed to capture screenshot for test %s: %s",
               test_id.c_str(),
               screenshot_artifact.status().ToString().c_str());
  }
 #else
  util::logf(
      "[TestManager] Screenshot capture unavailable (YAZE_WITH_GRPC=OFF) for test %s",
      test_id.c_str());
 #endif
  util::logf("[TestManager] Widget state: %s", widget_state.c_str());
 }
 }  // namespace test
--- a/src/cli/handlers/agent/test_commands.cc
+++ b/src/cli/handlers/agent/test_commands.cc
@@ -739,6 +739,30 @@ absl::Status HandleTestResultsCommand(const std::vector<std::string>& arg_vec) {
      std::cout << "[],\n";
    }
    std::cout << "  \"screenshot_path\": ";
    if (details.screenshot_path.empty()) {
      std::cout << "null,\n";
    } else {
      std::cout << "\"" << JsonEscape(details.screenshot_path) << "\",\n";
    }
    std::cout << "  \"screenshot_size_bytes\": "
              << details.screenshot_size_bytes << ",\n";
    std::cout << "  \"failure_context\": ";
    if (details.failure_context.empty()) {
      std::cout << "null,\n";
    } else {
      std::cout << "\"" << JsonEscape(details.failure_context) << "\",\n";
    }
    std::cout << "  \"widget_state\": ";
    if (details.widget_state.empty()) {
      std::cout << "null,\n";
    } else {
      std::cout << "\"" << JsonEscape(details.widget_state) << "\",\n";
    }
    std::cout << "  \"metrics\": ";
    if (!details.metrics.empty()) {
      std::cout << "{\n";
@@ -807,6 +831,27 @@ absl::Status HandleTestResultsCommand(const std::vector<std::string>& arg_vec) {
        std::cout << "  " << key << ": " << value << "\n";
      }
    }
    if (details.screenshot_path.empty()) {
      std::cout << "screenshot_path: null\n";
    } else {
      std::cout << "screenshot_path: "
                << YamlQuote(details.screenshot_path) << "\n";
    }
    std::cout << "screenshot_size_bytes: " << details.screenshot_size_bytes
              << "\n";
    if (details.failure_context.empty()) {
      std::cout << "failure_context: null\n";
    } else {
      std::cout << "failure_context: "
                << YamlQuote(details.failure_context) << "\n";
    }
    if (details.widget_state.empty()) {
      std::cout << "widget_state: null\n";
    } else {
      std::cout << "widget_state: " << YamlQuote(details.widget_state)
                << "\n";
    }
  }
  return absl::OkStatus();
--- a/src/cli/service/gui_automation_client.cc
+++ b/src/cli/service/gui_automation_client.cc
@@ -463,6 +463,11 @@ absl::StatusOr<TestResultDetails> GuiAutomationClient::GetTestResults(
    result.metrics.emplace(metric.first, metric.second);
  }
  result.screenshot_path = response.screenshot_path();
  result.screenshot_size_bytes = response.screenshot_size_bytes();
  result.failure_context = response.failure_context();
  result.widget_state = response.widget_state();
  return result;
 #else
  return absl::UnimplementedError("gRPC not available");
--- a/src/cli/service/gui_automation_client.h
+++ b/src/cli/service/gui_automation_client.h
@@ -119,6 +119,10 @@ struct TestResultDetails {
  std::vector<AssertionOutcome> assertions;
  std::vector<std::string> logs;
  std::map<std::string, int> metrics;
  std::string screenshot_path;
  int64_t screenshot_size_bytes = 0;
  std::string failure_context;
  std::string widget_state;
 };
 enum class WidgetTypeFilter {