Update documentation

2025-10-02 20:55:28 -04:00
parent e3621d7a1f
commit 0fb8ba4202
9 changed files with 1059 additions and 1997 deletions
--- a/docs/z3ed/AW-04-POLICY-FRAMEWORK.md
+++ b/docs/z3ed/AW-04-POLICY-FRAMEWORK.md
@@ -1,627 +0,0 @@
-# Policy Evaluation Framework (AW-04)
-
-**Status**: Implementation In Progress  
-**Priority**: High (Next Phase)  
-**Time Estimate**: 6-8 hours  
-**Last Updated**: October 2, 2025
-
-## Overview
-
-The Policy Evaluation Framework provides a YAML-based constraint system for gating proposal acceptance in the z3ed agent workflow. It ensures that AI-generated ROM modifications meet quality, safety, and testing requirements before being merged into the main ROM.
-
-## Goals
-
-1. **Quality Gates**: Enforce minimum test pass rates and code quality standards
-2. **Safety Constraints**: Prevent modifications to critical ROM regions (headers, checksums)
-3. **Scope Limits**: Restrict changes to reasonable byte counts and specific banks
-4. **Human Review**: Require manual review for large or complex changes
-5. **Flexibility**: Allow policy overrides with confirmation and logging
-
-## Architecture
-
-```
-┌─────────────────────────────────────────────────────────┐
-│ ProposalDrawer (GUI)                                    │
-│  └─ Accept button gated by PolicyEvaluator             │
-└────────────────────┬────────────────────────────────────┘
-                     │
-                     ▼
-┌─────────────────────────────────────────────────────────┐
-│ PolicyEvaluator (Singleton Service)                     │
-│  ├─ LoadPolicies() from .yaze/policies/                │
-│  ├─ EvaluateProposal(proposal_id) → PolicyResult       │
-│  └─ Cache of parsed YAML policies                      │
-└────────────────────┬────────────────────────────────────┘
-                     │
-                     ▼
-┌─────────────────────────────────────────────────────────┐
-│ .yaze/policies/agent.yaml (YAML Configuration)          │
-│  ├─ test_requirements (min pass rates)                 │
-│  ├─ change_constraints (byte limits, allowed banks)    │
-│  ├─ review_requirements (human review triggers)        │
-│  └─ forbidden_ranges (protected ROM regions)           │
-└─────────────────────────────────────────────────────────┘
-```
-
-## YAML Policy Schema
-
-### Example Policy File
-
-```yaml
-# .yaze/policies/agent.yaml
-version: 1.0
-enabled: true
-
-policies:
-  # Policy 1: Test Requirements
-  - name: require_tests
-    type: test_requirement
-    enabled: true
-    severity: critical  # critical | warning | info
-    rules:
-      - test_suite: "overworld_rendering"
-        min_pass_rate: 0.95
-      - test_suite: "palette_integrity"
-        min_pass_rate: 1.0
-      - test_suite: "dungeon_logic"
-        min_pass_rate: 0.90
-    message: "All required test suites must pass before accepting proposal"
-
-  # Policy 2: Change Scope Limits
-  - name: limit_change_scope
-    type: change_constraint
-    enabled: true
-    severity: critical
-    rules:
-      - max_bytes_changed: 10240  # 10KB limit
-      - allowed_banks: [0x00, 0x01, 0x0E, 0x0F]  # Graphics banks only
-      - max_commands_executed: 20
-    message: "Proposal exceeds allowed change scope"
-
-  # Policy 3: Protected ROM Regions
-  - name: protect_critical_regions
-    type: forbidden_range
-    enabled: true
-    severity: critical
-    ranges:
-      - start: 0xFFB0  # ROM header
-        end: 0xFFFF
-        reason: "ROM header is protected"
-      - start: 0x00FFC0  # Internal header
-        end: 0x00FFDF
-        reason: "Internal ROM header"
-    message: "Proposal modifies protected ROM region"
-
-  # Policy 4: Human Review Requirements
-  - name: human_review_required
-    type: review_requirement
-    enabled: true
-    severity: warning
-    conditions:
-      - if: bytes_changed > 1024
-        then: require_diff_review
-        message: "Large change requires diff review"
-      - if: commands_executed > 10
-        then: require_log_review
-        message: "Complex operation requires log review"
-      - if: test_failures > 0
-        then: require_explanation
-        message: "Test failures require explanation"
-
-  # Policy 5: Palette Modifications
-  - name: palette_safety
-    type: change_constraint
-    enabled: true
-    severity: warning
-    rules:
-      - max_palettes_changed: 5
-      - preserve_transparency: true  # Don't modify color index 0
-    message: "Palette changes exceed safety threshold"
-```
-
-### Schema Definition
-
-```yaml
-# Policy file structure
-version: string  # Semantic version (e.g., "1.0")
-enabled: boolean  # Master enable/disable
-
-policies:
-  - name: string  # Unique policy identifier
-    type: enum  # test_requirement | change_constraint | forbidden_range | review_requirement
-    enabled: boolean  # Policy-specific enable/disable
-    severity: enum  # critical | warning | info
-    
-    # Type-specific fields:
-    rules: array  # For test_requirement, change_constraint
-    ranges: array  # For forbidden_range
-    conditions: array  # For review_requirement
-    
-    message: string  # User-facing error message
-```
-
-## Implementation Plan
-
-### Phase 1: Core Infrastructure (2 hours)
-
-#### 1.1 Create PolicyEvaluator Service
-
-**File**: `src/cli/service/policy_evaluator.h`
-
-```cpp
-#ifndef YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
-#define YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
-
-#include <string>
-#include <vector>
-#include <memory>
-#include "absl/status/status.h"
-#include "absl/status/statusor.h"
-#include "absl/strings/string_view.h"
-
-namespace yaze {
-namespace cli {
-
-// Policy violation severity levels
-enum class PolicySeverity {
-  kInfo,      // Informational, doesn't block acceptance
-  kWarning,   // Warning, can be overridden
-  kCritical   // Critical, blocks acceptance
-};
-
-// Individual policy violation
-struct PolicyViolation {
-  std::string policy_name;
-  PolicySeverity severity;
-  std::string message;
-  std::string details;  // Additional context
-};
-
-// Result of policy evaluation
-struct PolicyResult {
-  bool passed;  // True if all critical policies passed
-  std::vector<PolicyViolation> violations;
-  
-  // Categorized violations
-  std::vector<PolicyViolation> critical_violations;
-  std::vector<PolicyViolation> warnings;
-  std::vector<PolicyViolation> info;
-  
-  // Helper methods
-  bool has_critical_violations() const { return !critical_violations.empty(); }
-  bool can_accept_with_override() const { 
-    return !has_critical_violations() && !warnings.empty(); 
-  }
-};
-
-// Singleton service for evaluating proposals against policies
-class PolicyEvaluator {
- public:
-  static PolicyEvaluator& GetInstance();
-  
-  // Load policies from disk (.yaze/policies/agent.yaml)
-  absl::Status LoadPolicies(absl::string_view policy_dir = ".yaze/policies");
-  
-  // Evaluate a proposal against all loaded policies
-  absl::StatusOr<PolicyResult> EvaluateProposal(
-      absl::string_view proposal_id);
-  
-  // Reload policies from disk (for live editing)
-  absl::Status ReloadPolicies();
-  
-  // Check if policies are loaded and enabled
-  bool IsEnabled() const { return enabled_; }
-  
-  // Get policy configuration path
-  std::string GetPolicyPath() const { return policy_path_; }
-
- private:
-  PolicyEvaluator() = default;
-  ~PolicyEvaluator() = default;
-  
-  // Non-copyable, non-movable
-  PolicyEvaluator(const PolicyEvaluator&) = delete;
-  PolicyEvaluator& operator=(const PolicyEvaluator&) = delete;
-  
-  // Parse YAML policy file
-  absl::Status ParsePolicyFile(absl::string_view yaml_content);
-  
-  // Evaluate individual policy types
-  void EvaluateTestRequirements(
-      absl::string_view proposal_id, PolicyResult* result);
-  void EvaluateChangeConstraints(
-      absl::string_view proposal_id, PolicyResult* result);
-  void EvaluateForbiddenRanges(
-      absl::string_view proposal_id, PolicyResult* result);
-  void EvaluateReviewRequirements(
-      absl::string_view proposal_id, PolicyResult* result);
-  
-  bool enabled_ = false;
-  std::string policy_path_;
-  
-  // Parsed policy structures (implementation detail)
-  struct PolicyConfig;
-  std::unique_ptr<PolicyConfig> config_;
-};
-
-}  // namespace cli
-}  // namespace yaze
-
-#endif  // YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
-```
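
The categorized vectors in `PolicyResult` suggest that each violation is filed by severity as it is found. The following is a minimal sketch of that bookkeeping, assuming trimmed copies of the structs above so that it compiles on its own. `RecordViolation` is a hypothetical helper and does not appear in the original header.

```cpp
#include <string>
#include <utility>
#include <vector>

enum class PolicySeverity { kInfo, kWarning, kCritical };

struct PolicyViolation {
  std::string policy_name;
  PolicySeverity severity;
  std::string message;
};

struct PolicyResult {
  bool passed = true;  // Stays true until a critical violation is recorded
  std::vector<PolicyViolation> violations;
  std::vector<PolicyViolation> critical_violations;
  std::vector<PolicyViolation> warnings;
  std::vector<PolicyViolation> info;
};

// Hypothetical helper: stores the violation in the flat list and in the
// bucket matching its severity. Only critical violations clear `passed`.
void RecordViolation(PolicyResult& result, PolicyViolation violation) {
  switch (violation.severity) {
    case PolicySeverity::kCritical:
      result.critical_violations.push_back(violation);
      result.passed = false;
      break;
    case PolicySeverity::kWarning:
      result.warnings.push_back(violation);
      break;
    case PolicySeverity::kInfo:
      result.info.push_back(violation);
      break;
  }
  result.violations.push_back(std::move(violation));
}
```

Keeping the bookkeeping in one place means the four `Evaluate*` methods would only need to decide whether a rule was broken, not how the result is categorized.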
-
-#### 1.2 Create Policy Configuration Structures
-
-**File**: `src/cli/service/policy_evaluator.cc` (partial)
-
-```cpp
-#include "src/cli/service/policy_evaluator.h"
-
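-#include <fstream>
-#include <sstream>
-#include <tuple>    // std::tuple in ForbiddenRange::ranges
-#include <utility>  // std::pair in TestRequirement::test_suites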
-#include "absl/strings/str_format.h"
-#include "src/cli/service/proposal_registry.h"
-
-// If YAML parsing is available
-#ifdef YAZE_WITH_YAML
-#include <yaml-cpp/yaml.h>
-#endif
-
-namespace yaze {
-namespace cli {
-
-// Internal policy configuration structures
-struct PolicyEvaluator::PolicyConfig {
-  std::string version;
-  bool enabled;
-  
-  struct TestRequirement {
-    std::string name;
-    bool enabled;
-    PolicySeverity severity;
-    std::vector<std::pair<std::string, double>> test_suites;  // suite name → min pass rate
-    std::string message;
-  };
-  
-  struct ChangeConstraint {
-    std::string name;
-    bool enabled;
-    PolicySeverity severity;
-    int max_bytes_changed = -1;
-    std::vector<int> allowed_banks;
-    int max_commands_executed = -1;
-    int max_palettes_changed = -1;
-    bool preserve_transparency = false;
-    std::string message;
-  };
-  
-  struct ForbiddenRange {
-    std::string name;
-    bool enabled;
-    PolicySeverity severity;
-    std::vector<std::tuple<int, int, std::string>> ranges;  // start, end, reason
-    std::string message;
-  };
-  
-  struct ReviewRequirement {
-    std::string name;
-    bool enabled;
-    PolicySeverity severity;
-    std::vector<std::string> conditions;
-    std::string message;
-  };
-  
-  std::vector<TestRequirement> test_requirements;
-  std::vector<ChangeConstraint> change_constraints;
-  std::vector<ForbiddenRange> forbidden_ranges;
-  std::vector<ReviewRequirement> review_requirements;
-};
-
-// Singleton instance
-PolicyEvaluator& PolicyEvaluator::GetInstance() {
-  static PolicyEvaluator instance;
-  return instance;
-}
-
-absl::Status PolicyEvaluator::LoadPolicies(absl::string_view policy_dir) {
-  policy_path_ = absl::StrFormat("%s/agent.yaml", policy_dir);
-  
-  // Check if file exists
-  std::ifstream file(policy_path_);
-  if (!file.good()) {
-    // No policy file - policies disabled
-    enabled_ = false;
-    return absl::OkStatus();
-  }
-  
-  // Read file content
-  std::stringstream buffer;
-  buffer << file.rdbuf();
-  std::string yaml_content = buffer.str();
-  
-  return ParsePolicyFile(yaml_content);
-}
-
-absl::Status PolicyEvaluator::ParsePolicyFile(absl::string_view yaml_content) {
-#ifndef YAZE_WITH_YAML
-  return absl::UnimplementedError(
-      "YAML support not compiled. Build with YAZE_WITH_YAML=ON");
-#else
-  try {
-    YAML::Node root = YAML::Load(std::string(yaml_content));
-    
-    config_ = std::make_unique<PolicyConfig>();
-    config_->version = root["version"].as<std::string>("1.0");
-    config_->enabled = root["enabled"].as<bool>(true);
-    
-    if (!config_->enabled) {
-      enabled_ = false;
-      return absl::OkStatus();
-    }
-    
-    // Parse policies array
-    if (root["policies"]) {
-      for (const auto& policy_node : root["policies"]) {
-        std::string type = policy_node["type"].as<std::string>();
-        
-        if (type == "test_requirement") {
-          // Parse test requirement policy
-          // ... (implementation continues)
-        } else if (type == "change_constraint") {
-          // Parse change constraint policy
-          // ... (implementation continues)
-        } else if (type == "forbidden_range") {
-          // Parse forbidden range policy
-          // ... (implementation continues)
-        } else if (type == "review_requirement") {
-          // Parse review requirement policy
-          // ... (implementation continues)
-        }
-      }
-    }
-    
-    enabled_ = true;
-    return absl::OkStatus();
-    
-  } catch (const YAML::Exception& e) {
-    return absl::InvalidArgumentError(
-        absl::StrFormat("Failed to parse policy YAML: %s", e.what()));
-  }
-#endif
-}
-
-// ... (implementation continues with evaluation methods)
-
-}  // namespace cli
-}  // namespace yaze
-```
-
-### Phase 2: Policy Evaluation Logic (2-3 hours)
-
-Implement the core evaluation methods that check proposals against each policy type.
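
As one illustration of what such an evaluation method might do, the sketch below checks a proposal's writes against `forbidden_range` entries. It assumes the proposal can be reduced to a list of inclusive byte ranges. `ByteRange`, `ForbiddenRange`, `Overlaps`, and `FindForbiddenWrites` are hypothetical names, not part of the planned API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Inclusive byte range, mirroring `start`/`end` in the YAML policy.
struct ByteRange {
  uint32_t start;
  uint32_t end;
};

struct ForbiddenRange {
  ByteRange range;
  std::string reason;
};

// Two inclusive ranges overlap when each starts at or before the other's end.
bool Overlaps(const ByteRange& a, const ByteRange& b) {
  return a.start <= b.end && b.start <= a.end;
}

// Returns the reason for every forbidden range touched by the proposal's writes.
std::vector<std::string> FindForbiddenWrites(
    const std::vector<ByteRange>& writes,
    const std::vector<ForbiddenRange>& forbidden) {
  std::vector<std::string> reasons;
  for (const auto& rule : forbidden) {
    for (const auto& write : writes) {
      if (Overlaps(write, rule.range)) {
        reasons.push_back(rule.reason);
        break;  // One hit per rule is enough to report it.
      }
    }
  }
  return reasons;
}
```

Each returned reason could then become the `details` field of a critical `PolicyViolation`.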
-
-### Phase 3: GUI Integration (2 hours)
-
-#### 3.1 Update ProposalDrawer
-
-**File**: `src/app/editor/system/proposal_drawer.cc`
-
-Add policy status display and gating logic:
-
-```cpp
-#include "src/cli/service/policy_evaluator.h"
-
-void ProposalDrawer::DrawProposalDetail(const std::string& proposal_id) {
-  // ... existing detail view code ...
-  
-  // === Policy Status Section ===
-  ImGui::Separator();
-  ImGui::TextUnformatted("Policy Status:");
-  
-  auto& policy_eval = cli::PolicyEvaluator::GetInstance();
-  if (policy_eval.IsEnabled()) {
-    auto policy_result = policy_eval.EvaluateProposal(proposal_id);
-    
-    if (policy_result.ok()) {
-      const auto& result = policy_result.value();
-      
-      if (result.passed) {
-        ImGui::TextColored(ImVec4(0, 1, 0, 1), "✓ All policies passed");
-      } else {
-        // Show violations
-        if (result.has_critical_violations()) {
-          ImGui::TextColored(ImVec4(1, 0, 0, 1), "⛔ Critical violations:");
-          for (const auto& violation : result.critical_violations) {
-            ImGui::BulletText("%s: %s", 
-                violation.policy_name.c_str(), 
-                violation.message.c_str());
-          }
-        }
-        
-        if (!result.warnings.empty()) {
-          ImGui::TextColored(ImVec4(1, 1, 0, 1), "⚠️ Warnings:");
-          for (const auto& violation : result.warnings) {
-            ImGui::BulletText("%s: %s", 
-                violation.policy_name.c_str(), 
-                violation.message.c_str());
-          }
-        }
-      }
-      
-      // Gate Accept button
-      ImGui::Separator();
-      bool can_accept = !result.has_critical_violations();
-      
-      if (!can_accept) {
-        ImGui::BeginDisabled();
-      }
-      
-      if (ImGui::Button("Accept Proposal")) {
-        if (result.can_accept_with_override() && !override_confirmed_) {
-          // Show override confirmation dialog
-          ImGui::OpenPopup("Override Policy");
-        } else {
-          AcceptProposal(proposal_id);
-        }
-      }
-      
-      if (!can_accept) {
-        ImGui::EndDisabled();
-        ImGui::SameLine();
-        ImGui::TextColored(ImVec4(1, 0, 0, 1), 
-            "(Accept blocked by policy violations)");
-      }
-      
-      // Override confirmation dialog
-      if (ImGui::BeginPopupModal("Override Policy", nullptr, 
-          ImGuiWindowFlags_AlwaysAutoResize)) {
-        ImGui::Text("This proposal has policy warnings.");
-        ImGui::Text("Do you want to override and accept anyway?");
-        ImGui::Text("This action will be logged.");
-        ImGui::Separator();
-        
-        if (ImGui::Button("Override and Accept")) {
-          override_confirmed_ = true;
-          AcceptProposal(proposal_id);
-          ImGui::CloseCurrentPopup();
-        }
-        ImGui::SameLine();
-        if (ImGui::Button("Cancel")) {
-          ImGui::CloseCurrentPopup();
-        }
-        ImGui::EndPopup();
-      }
-    } else {
-      ImGui::TextColored(ImVec4(1, 0, 0, 1), 
-          "Policy evaluation failed: %s", 
-          std::string(policy_result.status().message()).c_str());  // string_view is not guaranteed to be null-terminated
-    }
-  } else {
-    ImGui::TextColored(ImVec4(0.5, 0.5, 0.5, 1), 
-        "No policies configured");
-  }
-}
-```
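
The gating rules in the drawer code can be separated from ImGui and tested in isolation. The sketch below is one possible shape, assuming the two counts stand in for the sizes of `critical_violations` and `warnings`. `AcceptAction` and `DecideAcceptAction` are hypothetical names.

```cpp
#include <cstddef>

// Outcome of pressing "Accept Proposal" for a given evaluation result.
enum class AcceptAction { kBlocked, kConfirmOverride, kAccept };

// Hypothetical pure function mirroring the drawer logic: critical violations
// block acceptance, warnings require a confirmed override, otherwise accept.
AcceptAction DecideAcceptAction(std::size_t critical_count,
                                std::size_t warning_count,
                                bool override_confirmed) {
  if (critical_count > 0) return AcceptAction::kBlocked;
  if (warning_count > 0 && !override_confirmed) {
    return AcceptAction::kConfirmOverride;
  }
  return AcceptAction::kAccept;
}
```

A function like this would let the unit tests in Phase 4 cover the override workflow without a running GUI.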
-
-### Phase 4: Testing & Documentation (1-2 hours)
-
-#### 4.1 Example Policy File
-
-Create `.yaze/policies/agent.yaml.example`:
-
-```yaml
-# Example agent policy configuration
-# Copy to .yaze/policies/agent.yaml and customize
-
-version: 1.0
-enabled: true
-
-policies:
-  # Require test suites to pass
-  - name: require_tests
-    type: test_requirement
-    enabled: false  # Disabled by default (no tests yet)
-    severity: critical
-    rules:
-      - test_suite: "smoke_test"
-        min_pass_rate: 1.0
-    message: "All smoke tests must pass"
-
-  # Limit change scope
-  - name: limit_changes
-    type: change_constraint
-    enabled: true
-    severity: warning
-    rules:
-      - max_bytes_changed: 5120  # 5KB
-      - max_commands_executed: 15
-    message: "Keep changes small and focused"
-
-  # Protect ROM header
-  - name: protect_header
-    type: forbidden_range
-    enabled: true
-    severity: critical
-    ranges:
-      - start: 0xFFB0
-        end: 0xFFFF
-        reason: "ROM header"
-    message: "Cannot modify ROM header"
-```
-
-#### 4.2 Unit Tests
-
-Create `test/cli/policy_evaluator_test.cc`:
-
-```cpp
-#include "src/cli/service/policy_evaluator.h"
-#include "gtest/gtest.h"
-
-namespace yaze {
-namespace cli {
-namespace {
-
-TEST(PolicyEvaluatorTest, LoadPoliciesSuccess) {
-  auto& eval = PolicyEvaluator::GetInstance();
-  auto status = eval.LoadPolicies("test/fixtures/policies");
-  EXPECT_TRUE(status.ok());
-  EXPECT_TRUE(eval.IsEnabled());
-}
-
-TEST(PolicyEvaluatorTest, EvaluateProposal_NoViolations) {
-  // ... test implementation
-}
-
-TEST(PolicyEvaluatorTest, EvaluateProposal_CriticalViolation) {
-  // ... test implementation
-}
-
-}  // namespace
-}  // namespace cli
-}  // namespace yaze
-```
-
-## Deliverables
-
-- [x] Policy evaluator service interface
-- [ ] YAML policy parser implementation
-- [ ] Policy evaluation logic for all 4 types
-- [ ] ProposalDrawer GUI integration
-- [ ] Policy override workflow
-- [ ] Example policy configurations
-- [ ] Unit tests
-- [ ] Documentation and usage guide
-
-## Success Criteria
-
-1. **Functional**:
-   - Policies load from YAML files
-   - Proposals evaluated against all enabled policies
-   - Accept button gated by critical violations
-   - Override workflow for warnings
-
-2. **User Experience**:
-   - Clear policy status display in ProposalDrawer
-   - Helpful violation messages
-   - Override confirmation dialog
-   - Policy evaluation fast (< 100ms)
-
-3. **Quality**:
-   - Unit test coverage > 80%
-   - No crashes or memory leaks
-   - Graceful handling of malformed YAML
-   - Works with policies disabled
-
-## Future Enhancements
-
-- Policy templates for common scenarios
-- Policy violation history/analytics
-- Auto-fix suggestions for violations
-- Integration with CI/CD for automated policy checks
-- Policy versioning and migration
-
---
-
-**Status**: Ready for implementation  
-**Next Step**: Create PolicyEvaluator skeleton and wire into build system  
-**Estimated Completion**: October 3-4, 2025
--- a/docs/z3ed/E6-z3ed-implementation-plan.md
+++ b/docs/z3ed/E6-z3ed-implementation-plan.md
@@ -25,6 +25,10 @@ The z3ed CLI and AI agent workflow system has completed major infrastructure mil
 - **Priority 3**: Enhanced Error Reporting (IT-08+) - Holistic improvements spanning z3ed, ImGuiTestHarness, EditorManager, and core application services

 **Recent Accomplishments** (Updated: October 2025):
+- **✅ IT-08a Screenshot RPC Complete**: SDL-based screenshot capture operational
+  - Captures 1536x864 BMP files via SDL_RenderReadPixels
+  - Successfully tested via gRPC (5.3MB output files)
+  - Foundation for auto-capture on test failures
 - **✅ Policy Framework Complete**: PolicyEvaluator service fully integrated with ProposalDrawer GUI
  - 4 policy types implemented: test_requirement, change_constraint, forbidden_range, review_requirement
  - 3 severity levels: Info (informational), Warning (overridable), Critical (blocks acceptance)
@@ -41,8 +45,8 @@ The z3ed CLI and AI agent workflow system has completed major infrastructure mil
 - **Proposal Workflow**: Agentic proposal system fully operational (create, list, diff, review in GUI)

 **Known Limitations & Improvement Opportunities**:
-- **Screenshot RPC**: Stub implementation → needs SDL_Surface capture + PNG encoding
-- **Test Introspection**: No way to query test status, results, or queue → add GetTestStatus/ListTests RPCs
+- **Screenshot Auto-Capture**: Manual RPC only → needs integration with TestManager failure detection
+- **Test Introspection**: ✅ Complete - GetTestStatus/ListTests/GetResults RPCs operational
 - **Widget Discovery**: AI agents can't enumerate available widgets → add DiscoverWidgets RPC
 - **Test Recording**: No record/replay for regression testing → add RecordSession/ReplaySession RPCs
 - **Synchronous Wait**: Async tests return immediately → add blocking mode or result polling
@@ -236,13 +240,15 @@ message WidgetInfo {

 **Outcome**: Recording/replay is production-ready; focus shifts to surfacing rich failure diagnostics (IT-08).

-#### IT-08: Enhanced Error Reporting (5-7 hours)
+#### IT-08: Enhanced Error Reporting (5-7 hours) 🔄 ACTIVE
+**Status**: IT-08a Complete ✅ | IT-08b In Progress 🔄
 **Objective**: Deliver a unified, high-signal error reporting pipeline spanning ImGuiTestHarness, z3ed CLI, EditorManager, and core application services.

 **Implementation Tracks**:
 1. **Harness-Level Diagnostics**
-  - Implement Screenshot RPC (convert stub into working SDL capture pipeline)
-  - Auto-capture screenshots, widget tree dumps, and recent ImGui events on failure
+  - ✅ IT-08a: Screenshot RPC implemented (SDL-based, BMP format, 1536x864)
+  - 📋 IT-08b: Auto-capture screenshots on test failure
+  - 📋 IT-08c: Widget tree dumps and recent ImGui events on failure
  - Serialize results to both structured JSON (for automation) and human-friendly HTML bundles
  - Persist artifacts under `test-results/<test_id>/` with timestamped directories

@@ -516,9 +522,10 @@ z3ed collab replay session_2025_10_02.yaml --speed 2x
 | IT-05 | Add test introspection RPCs (GetTestStatus, ListTests, GetResults) | ImGuiTest Bridge | Code | ✅ Done | IT-01 - Enable clients to poll test results and query execution state (Oct 2, 2025) |
 | IT-06 | Implement widget discovery API for AI agents | ImGuiTest Bridge | Code | 📋 Planned | IT-01 - DiscoverWidgets RPC to enumerate windows, buttons, inputs |
 | IT-07 | Add test recording/replay for regression testing | ImGuiTest Bridge | Code | ✅ Done | IT-05 - RecordSession/ReplaySession RPCs with JSON test scripts |
-| IT-08 | Enhance error reporting with screenshots and state dumps | ImGuiTest Bridge | Code | <EFBFBD> Active | IT-01 - Capture widget state on failure for debugging |
-| IT-08a | Adopt shared error envelope across CLI & services | ImGuiTest Bridge | Code | 🔄 Active | IT-08 |
-| IT-08b | EditorManager diagnostic overlay & logging | ImGuiTest Bridge | UX | 📋 Planned | IT-08 |
+| IT-08 | Enhance error reporting with screenshots and state dumps | ImGuiTest Bridge | Code | 🔄 Active | IT-01 - Capture widget state on failure for debugging |
+| IT-08a | Screenshot RPC implementation (SDL capture) | ImGuiTest Bridge | Code | ✅ Done | IT-01 - Screenshot capture complete (Oct 2, 2025) |
+| IT-08b | Auto-capture screenshots on test failure | ImGuiTest Bridge | Code | 🔄 Active | IT-08a - Integrate with TestManager |
+| IT-08c | Widget state dumps and execution context | ImGuiTest Bridge | Code | 📋 Planned | IT-08b - Enhanced failure diagnostics |
 | IT-09 | Create standardized test suite format for CI integration | ImGuiTest Bridge | Infra | 📋 Planned | IT-07 - JSON/YAML test suite format compatible with CI/CD pipelines |
 | IT-10 | Collaborative editing & multiplayer sessions with shared AI | Collaboration | Feature | 📋 Planned | IT-05, IT-08 - Real-time multi-user editing with live cursors, shared proposals (12-15 hours) |
 | VP-01 | Expand CLI unit tests for new commands and sandbox flow. | Verification Pipeline | Test | 📋 Planned | RC/AW tasks |
--- a/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md
+++ b/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md
@@ -0,0 +1,647 @@
+# IT-08: Enhanced Error Reporting Implementation Guide
+
+**Status**: IT-08a Complete ✅ | IT-08b In Progress 🔄 | IT-08c Planned 📋  
+**Date**: October 2, 2025  
+**Overall Progress**: 20% Complete (1 of 5 phases)
+
+---
+
+## Phase Overview
+
+| Phase | Task | Status | Time | Description |
+|-------|------|--------|------|-------------|
+| IT-08a | Screenshot RPC | ✅ Complete | 1.5h | SDL-based screenshot capture |
+| IT-08b | Auto-Capture on Failure | 🔄 Active | 1-1.5h | Integrate with TestManager |
+| IT-08c | Widget State Dumps | 📋 Planned | 30-45m | Capture UI context on failure |
+| IT-08d | Error Envelope Standardization | 📋 Planned | 1-2h | Unified error format across services |
+| IT-08e | CLI Error Improvements | 📋 Planned | 1h | Rich error output with artifacts |
+
+**Total Estimated Time**: 5-7 hours  
+**Time Spent**: 1.5 hours  
+**Time Remaining**: 3.5-5.5 hours
+
+---
+
+## IT-08a: Screenshot RPC ✅ COMPLETE
+
+**Date Completed**: October 2, 2025  
+**Time**: 1.5 hours
+
+### Implementation Summary
+
+### What Was Built
+
+Implemented the `Screenshot` RPC in the ImGuiTestHarness service with the following capabilities:
+
+1. **SDL Renderer Integration**: Accesses the ImGui SDL2 backend renderer through `BackendRendererUserData`
+2. **Framebuffer Capture**: Uses `SDL_RenderReadPixels` to capture the full window contents (1536x864, 32-bit ARGB)
+3. **BMP File Output**: Saves screenshots as BMP files using SDL's built-in `SDL_SaveBMP` function
+4. **Flexible Paths**: Supports custom output paths or auto-generates timestamped filenames (`/tmp/yaze_screenshot_<timestamp>.bmp`)
+5. **Response Metadata**: Returns file path, file size (bytes), and image dimensions
+
+### Technical Implementation
+
+**Location**: `src/app/core/service/imgui_test_harness_service.cc`
+
+```cpp
+// Helper struct matching imgui_impl_sdlrenderer2.cpp backend data
+struct ImGui_ImplSDLRenderer2_Data {
+  SDL_Renderer* Renderer;
+};
+
+absl::Status ImGuiTestHarnessServiceImpl::Screenshot(
+    const ScreenshotRequest* request, ScreenshotResponse* response) {
+  // 1. Get SDL renderer from ImGui backend
+  ImGuiIO& io = ImGui::GetIO();
+  auto* backend_data = static_cast<ImGui_ImplSDLRenderer2_Data*>(io.BackendRendererUserData);
+  
+  if (!backend_data || !backend_data->Renderer) {
+    response->set_success(false);
+    response->set_message("SDL renderer not available");
+    return absl::FailedPreconditionError("No SDL renderer available");
+  }
+  
+  SDL_Renderer* renderer = backend_data->Renderer;
+  
+  // 2. Get renderer output size
+  int width, height;
+  SDL_GetRendererOutputSize(renderer, &width, &height);
+  
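+  // 3. Create surface to hold screenshot
+  SDL_Surface* surface = SDL_CreateRGBSurface(0, width, height, 32,
+                                              0x00FF0000, 0x0000FF00,
+                                              0x000000FF, 0xFF000000);
+  if (!surface) {
+    response->set_success(false);
+    response->set_message("Failed to create SDL surface");
+    return absl::InternalError(SDL_GetError());
+  }
+  
+  // 4. Read pixels from renderer (ARGB8888 format); returns 0 on success
+  if (SDL_RenderReadPixels(renderer, nullptr, SDL_PIXELFORMAT_ARGB8888,
+                           surface->pixels, surface->pitch) != 0) {
+    absl::Status error = absl::InternalError(SDL_GetError());
+    SDL_FreeSurface(surface);
+    response->set_success(false);
+    response->set_message("Failed to read pixels from renderer");
+    return error;
+  }
+  
+  // 5. Determine output path (custom or auto-generated)
+  std::string output_path = request->output_path();
+  if (output_path.empty()) {
+    output_path = absl::StrFormat("/tmp/yaze_screenshot_%lld.bmp",
+                                  absl::ToUnixMillis(absl::Now()));
+  }
+  
+  // 6. Save to BMP file, then release the surface on every path
+  const int save_result = SDL_SaveBMP(surface, output_path.c_str());
+  absl::Status save_error = absl::InternalError(SDL_GetError());
+  SDL_FreeSurface(surface);
+  if (save_result != 0) {
+    response->set_success(false);
+    response->set_message("Failed to save BMP file");
+    return save_error;
+  }
+  
+  // 7. Get file size
+  std::ifstream file(output_path, std::ios::binary | std::ios::ate);
+  int64_t file_size = file.tellg();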
+  
+  // 8. Return success response
+  response->set_success(true);
+  response->set_message(absl::StrFormat("Screenshot saved to %s (%dx%d)",
+                                        output_path, width, height));
+  response->set_file_path(output_path);
+  response->set_file_size_bytes(file_size);
+  
+  return absl::OkStatus();
+}
+```
+
+### Testing Results
+
+**Test Command**:
+```bash
+grpcurl -plaintext \
+  -import-path /Users/scawful/Code/yaze/src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d '{"output_path": "/tmp/test_screenshot.bmp"}' \
+  localhost:50052 yaze.test.ImGuiTestHarness/Screenshot
+```
+
+**Response**:
+```json
+{
+  "success": true,
+  "message": "Screenshot saved to /tmp/test_screenshot.bmp (1536x864)",
+  "filePath": "/tmp/test_screenshot.bmp",
+  "fileSizeBytes": "5308538"
+}
+```
+
+**File Verification**:
+```bash
+$ ls -lh /tmp/test_screenshot.bmp
+-rw-r--r--  1 scawful  wheel   5.1M Oct  2 20:16 /tmp/test_screenshot.bmp
+
+$ file /tmp/test_screenshot.bmp
+/tmp/test_screenshot.bmp: PC bitmap, Windows 95/NT4 and newer format, 1536 x 864 x 32, cbSize 5308538, bits offset 122
+```
+
+✅ **Result**: Screenshot successfully captured, saved, and validated!
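
The reported size can be cross-checked by hand. The sketch below assumes an uncompressed 32-bit BMP, where each pixel takes 4 bytes and rows need no padding, and uses the 122-byte pixel data offset reported by `file`. `ExpectedBmpSize` is a hypothetical helper written for this check.

```cpp
#include <cstdint>

// Expected BMP size: uncompressed pixel data plus the header block that
// precedes it ("bits offset" in the `file` output).
constexpr int64_t ExpectedBmpSize(int64_t width, int64_t height,
                                  int64_t bytes_per_pixel,
                                  int64_t pixel_data_offset) {
  return width * height * bytes_per_pixel + pixel_data_offset;
}

// 1536 * 864 * 4 = 5,308,416 bytes of pixels, plus 122 bytes of headers.
static_assert(ExpectedBmpSize(1536, 864, 4, 122) == 5308538,
              "matches fileSizeBytes in the RPC response");
```

The result agrees with both `fileSizeBytes` in the response and `cbSize` in the `file` output. `ls -lh` shows 5.1M because it reports binary megabytes.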
+
+---
+
+## Design Decisions
+
+### Why BMP Format?
+
+**Chosen**: SDL's built-in `SDL_SaveBMP` function  
+**Rationale**:
+- ✅ Zero external dependencies (no need for libpng, stb_image_write, etc.)
+- ✅ Guaranteed to work on all platforms where SDL works
+- ✅ Simple, reliable, and fast
+- ✅ Adequate for debugging/error reporting (file size not critical)
+- ⚠️ Larger file sizes (5.3MB vs ~500KB for PNG), but acceptable for temporary debug files
+
+**Future Consideration**: If disk space becomes an issue, can add PNG encoding using stb_image_write (single-header library, easy to integrate)
+
+### SDL Backend Integration
+
+**Challenge**: How to access the SDL_Renderer from ImGui?  
+**Solution**: 
+- ImGui's `BackendRendererUserData` points to an `ImGui_ImplSDLRenderer2_Data` struct
+- This struct contains the `Renderer` pointer as its first member
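+- Cast `BackendRendererUserData` to a local mirror of that struct to read the renderer; this depends on the backend's internal layout, so it should be re-checked whenever ImGui is upgraded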
+
+**Why Not Store Renderer Globally?**
+- Multiple ImGui contexts could use different renderers
+- Backend data pattern follows ImGui's architecture conventions
+- More maintainable and future-proof
+
+---
+
+## Integration with Test System
+
+### Current Usage (Manual RPC)
+
+AI agents or CLI tools can manually capture screenshots:
+
+```bash
+# Capture screenshot after opening editor
+z3ed agent test --prompt "Open Overworld Editor"
+grpcurl ... yaze.test.ImGuiTestHarness/Screenshot
+```
+
+### Next Step: Auto-Capture on Failure
+
+The screenshot RPC is now ready to be integrated with TestManager to automatically capture context when tests fail:
+
+**Planned Implementation** (IT-08b):
+```cpp
+// In TestManager::MarkHarnessTestCompleted()
+if (test_result == IMGUI_TEST_STATUS_FAILED || 
+    test_result == IMGUI_TEST_STATUS_TIMEOUT) {
+  
+  // Auto-capture screenshot
+  ScreenshotRequest req;
+  req.set_output_path(absl::StrFormat("/tmp/test_%s_failure.bmp", test_id));
+  
+  ScreenshotResponse resp;
+  harness_service_->Screenshot(&req, &resp);
+  
+  test_history_[test_id].screenshot_path = resp.file_path();
+  
+  // Also capture widget state (IT-08c)
+  test_history_[test_id].widget_state = CaptureWidgetState();
+}
+```
+
+---
+
+## IT-08b: Auto-Capture on Test Failure 🔄 IN PROGRESS
+
+**Goal**: Automatically capture screenshots and context when tests fail  
+**Time Estimate**: 1-1.5 hours  
+**Status**: Ready to implement
+
+### Implementation Plan
+
+#### Step 1: Modify TestManager (30 minutes)
+
+**File**: `src/app/core/test_manager.cc`
+
+Add screenshot capture in `MarkHarnessTestCompleted()`:
+
+```cpp
+void TestManager::MarkHarnessTestCompleted(const std::string& test_id,
+                                           ImGuiTestStatus status) {
+  auto& history_entry = test_history_[test_id];
+  history_entry.status = status;
+  history_entry.end_time = absl::Now();
+  history_entry.execution_time_ms = absl::ToInt64Milliseconds(
+      history_entry.end_time - history_entry.start_time);
+  
+  // Auto-capture screenshot on failure
+  if (status == ImGuiTestStatus_Error || status == ImGuiTestStatus_Warning) {
+    CaptureFailureContext(test_id);
+  }
+}
+
+void TestManager::CaptureFailureContext(const std::string& test_id) {
+  auto& history_entry = test_history_[test_id];
+  
+  // 1. Capture screenshot
+  std::string screenshot_path = 
+      absl::StrFormat("/tmp/yaze_test_%s_failure.bmp", test_id);
+  
+  if (harness_service_) {
+    ScreenshotRequest req;
+    req.set_output_path(screenshot_path);
+    
+    ScreenshotResponse resp;
+    auto status = harness_service_->Screenshot(&req, &resp);
+    
+    if (status.ok()) {
+      history_entry.screenshot_path = resp.file_path();
+      history_entry.screenshot_size_bytes = resp.file_size_bytes();
+    }
+  }
+  
+  // 2. Capture widget state (IT-08c)
+  // history_entry.widget_state = CaptureWidgetState();
+  
+  // 3. Capture execution context
+  // ImGui::GetActiveID() returns an ImGuiID (unsigned int), so it is
+  // formatted with %u rather than %s.
+  history_entry.failure_context = absl::StrFormat(
+      "Frame: %d, Active Window: %s, Active Widget ID: %u",
+      ImGui::GetFrameCount(),
+      ImGui::GetCurrentWindow() ? ImGui::GetCurrentWindow()->Name : "none",
+      ImGui::GetActiveID());
+}
+```
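
The screenshot path above is built directly from the test ID. A minimal sketch of that step, assuming test IDs could contain characters that are unsafe in file names. `SanitizeTestId` and `FailureScreenshotPath` are hypothetical helper names used for illustration only, not existing yaze functions:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: replace anything outside [A-Za-z0-9_-] with '_'
// so a test ID cannot introduce path separators into the file name.
std::string SanitizeTestId(const std::string& test_id) {
  std::string out;
  out.reserve(test_id.size());
  for (unsigned char c : test_id) {
    const bool safe = std::isalnum(c) || c == '_' || c == '-';
    out.push_back(safe ? static_cast<char>(c) : '_');
  }
  return out;
}

// Hypothetical helper: mirrors the /tmp/yaze_test_<id>_failure.bmp pattern
// used in CaptureFailureContext().
std::string FailureScreenshotPath(const std::string& test_id) {
  return "/tmp/yaze_test_" + SanitizeTestId(test_id) + "_failure.bmp";
}
```

With this sketch, an ID such as `grpc_click_1696357200` passes through unchanged, while an ID containing `/` has that character replaced by `_`.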
+
+#### Step 2: Update TestHistory Structure (15 minutes)
+
+**File**: `src/app/core/test_manager.h`
+
+Add failure context fields:
+
+```cpp
+struct TestHistory {
+  std::string test_id;
+  std::string test_name;
+  ImGuiTestStatus status;
+  absl::Time start_time;
+  absl::Time end_time;
+  int64_t execution_time_ms;
+  std::vector<std::string> logs;
+  std::map<std::string, std::string> metrics;
+  
+  // IT-08b: Failure diagnostics
+  std::string screenshot_path;
+  int64_t screenshot_size_bytes = 0;
+  std::string failure_context;
+  std::string widget_state;  // IT-08c
+};
+```
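
One detail worth keeping in mind when reading entries of this struct from a `std::map`: `operator[]` inserts a default-constructed entry when the key is missing, whereas `find()` does not. The sketch below illustrates a lookup that leaves the map untouched for unknown test IDs. `TestHistoryEntry` is a trimmed stand-in for the struct above and `FindHistory` is a hypothetical helper, both assumptions for illustration:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Trimmed stand-in for TestHistory (assumption: only the IT-08b fields).
struct TestHistoryEntry {
  std::string screenshot_path;
  std::int64_t screenshot_size_bytes = 0;
  std::string failure_context;
};

// Returns a pointer to the entry, or nullptr when the ID is unknown.
// Unlike operator[], find() never inserts a default-constructed entry.
TestHistoryEntry* FindHistory(std::map<std::string, TestHistoryEntry>& history,
                              const std::string& test_id) {
  auto it = history.find(test_id);
  return it == history.end() ? nullptr : &it->second;
}
```

A caller can then skip failure capture entirely when the pointer is null instead of creating an empty history record.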
+
+#### Step 3: Update GetTestResults RPC (30 minutes)
+
+**File**: `src/app/core/service/imgui_test_harness_service.cc`
+
+Include screenshot path in results:
+
+```cpp
+absl::Status ImGuiTestHarnessServiceImpl::GetTestResults(
+    const GetTestResultsRequest* request,
+    GetTestResultsResponse* response) {
+  
+  const auto& history = test_manager_->GetTestHistory(request->test_id());
+  
+  // ... existing result population ...
+  
+  // Add failure diagnostics
+  if (!history.screenshot_path.empty()) {
+    response->set_screenshot_path(history.screenshot_path);
+    response->set_screenshot_size_bytes(history.screenshot_size_bytes);
+  }
+  
+  if (!history.failure_context.empty()) {
+    response->set_failure_context(history.failure_context);
+  }
+  
+  return absl::OkStatus();
+}
+```
+
+#### Step 4: Update Proto Schema (15 minutes)
+
+**File**: `src/app/core/proto/imgui_test_harness.proto`
+
+Add fields to GetTestResultsResponse:
+
+```proto
+message GetTestResultsResponse {
+  string test_id = 1;
+  TestStatus status = 2;
+  int64 execution_time_ms = 3;
+  repeated string logs = 4;
+  map<string, string> metrics = 5;
+  
+  // IT-08b: Failure diagnostics
+  string screenshot_path = 6;
+  int64 screenshot_size_bytes = 7;
+  string failure_context = 8;
+  string widget_state = 9;  // IT-08c
+}
+```
+
+### Testing
+
+```bash
+# 1. Build with changes
+cmake --build build-grpc-test --target yaze -j8
+
+# 2. Start test harness
+./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
+  --enable_test_harness --test_harness_port=50052 \
+  --rom_file=assets/zelda3.sfc &
+
+# 3. Trigger a failing test
+grpcurl -plaintext \
+  -import-path src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d '{"target":"nonexistent_widget","type":"LEFT"}' \
+  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
+
+# 4. Check for screenshot
+ls -lh /tmp/yaze_test_*_failure.bmp
+
+# 5. Query test results
+grpcurl -plaintext \
+  -import-path src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d '{"test_id":"grpc_click_<timestamp>"}' \
+  127.0.0.1:50052 yaze.test.ImGuiTestHarness/GetTestResults
+
+# Expected: screenshot_path and failure_context populated
+```
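
Step 4 above only checks that a file exists. A slightly stronger check is to confirm the file starts with the BMP magic bytes `BM` and to report its size. The following is a minimal sketch using only the standard library; `BmpFileSize` is a hypothetical helper name, not part of the test harness:

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Returns the file size in bytes if `path` starts with the BMP magic "BM",
// or -1 if the file cannot be opened or does not look like a BMP.
std::int64_t BmpFileSize(const std::string& path) {
  std::ifstream file(path, std::ios::binary);
  if (!file) return -1;

  char magic[2] = {0, 0};
  file.read(magic, 2);
  if (file.gcount() != 2 || magic[0] != 'B' || magic[1] != 'M') return -1;

  // Seek to the end to obtain the total size.
  file.seekg(0, std::ios::end);
  return static_cast<std::int64_t>(file.tellg());
}
```

The returned size could be compared against the `screenshot_size_bytes` value reported by `GetTestResults`.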
+
+### Success Criteria
+
+- ✅ Screenshots auto-captured on test failure
+- ✅ Screenshot path stored in test history
+- ✅ GetTestResults returns screenshot metadata
+- ✅ No performance impact on passing tests
+- ✅ Screenshots cleaned up after test completion (optional)
+
+---
+
+## IT-08c: Widget State Dumps 📋 PLANNED
+
+**Goal**: Capture UI hierarchy and state on test failures  
+**Time Estimate**: 30-45 minutes  
+**Status**: Specification phase
+
+### Implementation Plan
+
+#### Step 1: Create Widget State Capture Utility (30 minutes)
+
+**File**: `src/app/core/widget_state_capture.h` (new file)
+
+```cpp
+#ifndef YAZE_CORE_WIDGET_STATE_CAPTURE_H
+#define YAZE_CORE_WIDGET_STATE_CAPTURE_H
+
+#include <string>
+#include <vector>
+
+#include "imgui/imgui.h"
+
+namespace yaze {
+namespace core {
+
+struct WidgetState {
+  std::string focused_window;
+  std::string focused_widget;
+  std::string hovered_widget;
+  std::vector<std::string> visible_windows;
+  std::vector<std::string> open_menus;
+  std::string active_popup;
+};
+
+std::string CaptureWidgetState();
+std::string SerializeWidgetStateToJson(const WidgetState& state);
+
+}  // namespace core
+}  // namespace yaze
+
+#endif
+```
+
+**File**: `src/app/core/widget_state_capture.cc` (new file)
+
+```cpp
+#include "src/app/core/widget_state_capture.h"
+
+#include "absl/strings/str_format.h"
+#include "imgui/imgui_internal.h"  // ImGuiContext and ImGuiWindow internals
+#include "nlohmann/json.hpp"
+
+namespace yaze {
+namespace core {
+
+std::string CaptureWidgetState() {
+  WidgetState state;
+  
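+  // Capture focused window. NavWindow is the window that currently holds
+  // focus; GetCurrentWindow() would instead return the window that is
+  // currently being appended to.
+  ImGuiWindow* focused = ImGui::GetCurrentContext()->NavWindow;
+  if (focused) {
+    state.focused_window = focused->Name;
+  }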
+  
+  // Capture active widget
+  ImGuiID active_id = ImGui::GetActiveID();
+  if (active_id != 0) {
+    state.focused_widget = absl::StrFormat("ID_%u", active_id);
+  }
+  
+  // Capture hovered widget
+  ImGuiID hovered_id = ImGui::GetHoveredID();
+  if (hovered_id != 0) {
+    state.hovered_widget = absl::StrFormat("ID_%u", hovered_id);
+  }
+  
+  // Traverse window list
+  ImGuiContext* ctx = ImGui::GetCurrentContext();
+  for (ImGuiWindow* window : ctx->Windows) {
+    if (window->Active && !window->Hidden) {
+      state.visible_windows.push_back(window->Name);
+    }
+  }
+  
+  return SerializeWidgetStateToJson(state);
+}
+
+std::string SerializeWidgetStateToJson(const WidgetState& state) {
+  nlohmann::json j;
+  j["focused_window"] = state.focused_window;
+  j["focused_widget"] = state.focused_widget;
+  j["hovered_widget"] = state.hovered_widget;
+  j["visible_windows"] = state.visible_windows;
+  j["open_menus"] = state.open_menus;
+  j["active_popup"] = state.active_popup;
+  return j.dump(2);  // Pretty print with indent
+}
+
+}  // namespace core
+}  // namespace yaze
+```
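
The window traversal above keeps only windows that are active and not hidden. That filter can be exercised without ImGui by using a small stand-in type. This is a sketch under that assumption: `WindowInfo` and `VisibleWindowNames` are hypothetical names, and the real `ImGuiWindow` type has many more fields:

```cpp
#include <string>
#include <vector>

// Stand-in for the ImGuiWindow fields the traversal reads (assumption).
struct WindowInfo {
  std::string name;
  bool active;
  bool hidden;
};

// Collects the names of windows that are active and not hidden,
// matching the condition used in CaptureWidgetState().
std::vector<std::string> VisibleWindowNames(
    const std::vector<WindowInfo>& windows) {
  std::vector<std::string> names;
  for (const WindowInfo& w : windows) {
    if (w.active && !w.hidden) {
      names.push_back(w.name);
    }
  }
  return names;
}
```

Keeping the filter separate from the ImGui calls would make it possible to unit test the selection logic without a running GUI.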
+
+#### Step 2: Integrate with TestManager (15 minutes)
+
+Update `CaptureFailureContext()` in `test_manager.cc`:
+
+```cpp
+void TestManager::CaptureFailureContext(const std::string& test_id) {
+  auto& history_entry = test_history_[test_id];
+  
+  // 1. Screenshot (IT-08b)
+  // ... existing code ...
+  
+  // 2. Widget state (IT-08c)
+  history_entry.widget_state = core::CaptureWidgetState();
+  
+  // 3. Execution context
+  // ... existing code ...
+}
+```
+
+### Output Example
+
+```json
+{
+  "focused_window": "Overworld Editor",
+  "focused_widget": "ID_12345",
+  "hovered_widget": "ID_67890",
+  "visible_windows": [
+    "Main Window",
+    "Overworld Editor",
+    "Palette Editor"
+  ],
+  "open_menus": [],
+  "active_popup": ""
+}
+```
+
+---
+
+## IT-08d: Error Envelope Standardization 📋 PLANNED
+
+**Goal**: Unified error format across z3ed, TestManager, EditorManager  
+**Time Estimate**: 1-2 hours  
+**Status**: Design phase
+
+### Proposed Error Envelope
+
+```cpp
+// Shared error structure
+struct ErrorContext {
+  absl::Status status;
+  std::string component;  // "TestHarness", "EditorManager", "z3ed"
+  std::string operation;  // "Click", "LoadROM", "RunTest"
+  std::map<std::string, std::string> metadata;
+  std::vector<std::string> artifact_paths;  // Screenshots, logs, etc.
+  std::string actionable_hint;  // User-facing suggestion
+};
+```
+
+### Integration Points
+
+1. **TestManager**: Wrap failures in ErrorContext
+2. **EditorManager**: Use ErrorContext for all operations
+3. **z3ed CLI**: Parse ErrorContext and format for display
+4. **ProposalDrawer**: Display ErrorContext in GUI modal
+
+---
+
+## IT-08e: CLI Error Improvements 📋 PLANNED
+
+**Goal**: Rich error output in z3ed CLI  
+**Time Estimate**: 1 hour  
+**Status**: Design phase
+
+### Enhanced CLI Output
+
+```bash
+$ z3ed agent test --prompt "Open Overworld editor"
+
+❌ Test Failed: grpc_click_1696357200
+   Component: ImGuiTestHarness
+   Operation: Click widget "Overworld"
+   
+   Error: Widget not found
+   
+   Artifacts:
+   • Screenshot: /tmp/yaze_test_grpc_click_1696357200_failure.bmp
+   • Widget State: /tmp/yaze_test_grpc_click_1696357200_state.json
+   • Logs: /tmp/yaze_test_grpc_click_1696357200.log
+   
+   Context:
+   • Visible Windows: Main Window, Debug
+   • Focused Window: Main Window
+   • Active Widget: None
+   
+   Suggestion:
+   → Check if ROM is loaded (File → Open ROM)
+   → Verify Overworld editor button is visible
+   → Use 'z3ed agent gui discover' to list available widgets
+```
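
A rough sketch of how output in this shape could be produced from an error record. It assumes a simplified stand-in for the proposed `ErrorContext` in which `absl::Status` is reduced to a plain message string. `CliError` and `FormatCliError` are hypothetical names, and the bullet characters are plain ASCII rather than the symbols shown above:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for the proposed ErrorContext (assumption).
struct CliError {
  std::string test_id;
  std::string component;
  std::string operation;
  std::string message;
  std::vector<std::string> artifact_paths;
  std::vector<std::string> hints;
};

// Renders the error as indented text; empty sections are omitted.
std::string FormatCliError(const CliError& e) {
  std::ostringstream out;
  out << "Test Failed: " << e.test_id << "\n"
      << "   Component: " << e.component << "\n"
      << "   Operation: " << e.operation << "\n\n"
      << "   Error: " << e.message << "\n";
  if (!e.artifact_paths.empty()) {
    out << "\n   Artifacts:\n";
    for (const std::string& p : e.artifact_paths) out << "   - " << p << "\n";
  }
  if (!e.hints.empty()) {
    out << "\n   Suggestion:\n";
    for (const std::string& h : e.hints) out << "   -> " << h << "\n";
  }
  return out.str();
}
```

Omitting empty sections keeps the output short for failures that produced no artifacts or hints.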
+
+---
+
+## Progress Tracking
+
+### Completed ✅
+- IT-08a: Screenshot RPC (1.5 hours)
+
+### In Progress 🔄
+- IT-08b: Auto-capture on failure (next priority)
+
+### Planned 📋
+- IT-08c: Widget state dumps
+- IT-08d: Error envelope standardization
+- IT-08e: CLI error improvements
+
+### Time Investment
+- **Spent**: 1.5 hours (IT-08a)
+- **Remaining**: 3.5-5.5 hours (IT-08b/c/d/e)
+- **Total**: 5-7 hours (as estimated)
+
+---
+
+## Next Steps
+
+**Immediate** (IT-08b - 1-1.5 hours):
+1. Modify TestManager to capture screenshots on failure
+2. Update TestHistory structure
+3. Update GetTestResults RPC
+4. Test with intentional failures
+
+**Short-term** (IT-08c - 30-45 minutes):
+1. Create widget state capture utility
+2. Integrate with TestManager
+3. Add to GetTestResults RPC
+
+**Medium-term** (IT-08d/e - 2-3 hours):
+1. Design unified error envelope
+2. Implement across all services
+3. Update CLI output formatting
+4. Add ProposalDrawer error modal
+
+---
+
+## References
+
+- **Implementation Plan**: [E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md)
+- **Test Harness Guide**: [IT-05-IMPLEMENTATION-GUIDE.md](IT-05-IMPLEMENTATION-GUIDE.md)
+- **Source Files**: 
+  - `src/app/core/service/imgui_test_harness_service.cc`
+  - `src/app/core/test_manager.{h,cc}`
+  - `src/app/core/proto/imgui_test_harness.proto`
+
+---
+
+**Last Updated**: October 2, 2025  
+**Current Phase**: IT-08b (Auto-capture on failure)  
+**Overall Progress**: 33% Complete (1 of 3 core phases)
+
+---
+
+**Report Generated**: October 2, 2025  
+**Author**: GitHub Copilot (AI Assistant)  
+**Project**: YAZE - Yet Another Zelda3 Editor  
+**Component**: z3ed CLI Tool - Test Automation Harness
--- a/docs/z3ed/IT-08-SCREENSHOT-COMPLETION.md
+++ b/docs/z3ed/IT-08-SCREENSHOT-COMPLETION.md
@@ -1,347 +0,0 @@
-# IT-08 Screenshot RPC - Completion Report
-
-**Date**: October 2, 2025  
-**Task**: IT-08 Enhanced Error Reporting - Screenshot Capture Implementation  
-**Status**: ✅ Screenshot RPC Complete (30% of IT-08)
-
---
-
-## Implementation Summary
-
-### What Was Built
-
-Implemented the `Screenshot` RPC in the ImGuiTestHarness service with the following capabilities:
-
-1. **SDL Renderer Integration**: Accesses the ImGui SDL2 backend renderer through `BackendRendererUserData`
-2. **Framebuffer Capture**: Uses `SDL_RenderReadPixels` to capture the full window contents (1536x864, 32-bit ARGB)
-3. **BMP File Output**: Saves screenshots as BMP files using SDL's built-in `SDL_SaveBMP` function
-4. **Flexible Paths**: Supports custom output paths or auto-generates timestamped filenames (`/tmp/yaze_screenshot_<timestamp>.bmp`)
-5. **Response Metadata**: Returns file path, file size (bytes), and image dimensions
-
-### Technical Implementation
-
-**Location**: `/Users/scawful/Code/yaze/src/app/core/service/imgui_test_harness_service.cc`
-
-```cpp
-// Helper struct matching imgui_impl_sdlrenderer2.cpp backend data
-struct ImGui_ImplSDLRenderer2_Data {
-  SDL_Renderer* Renderer;
-};
-
-absl::Status ImGuiTestHarnessServiceImpl::Screenshot(
-    const ScreenshotRequest* request, ScreenshotResponse* response) {
-  // 1. Get SDL renderer from ImGui backend
-  ImGuiIO& io = ImGui::GetIO();
-  auto* backend_data = static_cast<ImGui_ImplSDLRenderer2_Data*>(io.BackendRendererUserData);
-  
-  if (!backend_data || !backend_data->Renderer) {
-    response->set_success(false);
-    response->set_message("SDL renderer not available");
-    return absl::FailedPreconditionError("No SDL renderer available");
-  }
-  
-  SDL_Renderer* renderer = backend_data->Renderer;
-  
-  // 2. Get renderer output size
-  int width, height;
-  SDL_GetRendererOutputSize(renderer, &width, &height);
-  
-  // 3. Create surface to hold screenshot
-  SDL_Surface* surface = SDL_CreateRGBSurface(0, width, height, 32,
-                                              0x00FF0000, 0x0000FF00,
-                                              0x000000FF, 0xFF000000);
-  
-  // 4. Read pixels from renderer (ARGB8888 format)
-  SDL_RenderReadPixels(renderer, nullptr, SDL_PIXELFORMAT_ARGB8888,
-                      surface->pixels, surface->pitch);
-  
-  // 5. Determine output path (custom or auto-generated)
-  std::string output_path = request->output_path();
-  if (output_path.empty()) {
-    output_path = absl::StrFormat("/tmp/yaze_screenshot_%lld.bmp",
-                                  absl::ToUnixMillis(absl::Now()));
-  }
-  
-  // 6. Save to BMP file
-  SDL_SaveBMP(surface, output_path.c_str());
-  
-  // 7. Get file size and clean up
-  std::ifstream file(output_path, std::ios::binary | std::ios::ate);
-  int64_t file_size = file.tellg();
-  
-  SDL_FreeSurface(surface);
-  
-  // 8. Return success response
-  response->set_success(true);
-  response->set_message(absl::StrFormat("Screenshot saved to %s (%dx%d)",
-                                        output_path, width, height));
-  response->set_file_path(output_path);
-  response->set_file_size_bytes(file_size);
-  
-  return absl::OkStatus();
-}
-```
-
-### Testing Results
-
-**Test Command**:
-```bash
-grpcurl -plaintext \
-  -import-path /Users/scawful/Code/yaze/src/app/core/proto \
-  -proto imgui_test_harness.proto \
-  -d '{"output_path": "/tmp/test_screenshot.bmp"}' \
-  localhost:50052 yaze.test.ImGuiTestHarness/Screenshot
-```
-
-**Response**:
-```json
-{
-  "success": true,
-  "message": "Screenshot saved to /tmp/test_screenshot.bmp (1536x864)",
-  "filePath": "/tmp/test_screenshot.bmp",
-  "fileSizeBytes": "5308538"
-}
-```
-
-**File Verification**:
-```bash
-$ ls -lh /tmp/test_screenshot.bmp
-rw-r--r--  1 scawful  wheel   5.1M Oct  2 20:16 /tmp/test_screenshot.bmp
-
-$ file /tmp/test_screenshot.bmp
-/tmp/test_screenshot.bmp: PC bitmap, Windows 95/NT4 and newer format, 1536 x 864 x 32, cbSize 5308538, bits offset 122
-```
-
-✅ **Result**: Screenshot successfully captured, saved, and validated!
-
---
-
-## Design Decisions
-
-### Why BMP Format?
-
-**Chosen**: SDL's built-in `SDL_SaveBMP` function  
-**Rationale**:
- ✅ Zero external dependencies (no need for libpng, stb_image_write, etc.)
- ✅ Guaranteed to work on all platforms where SDL works
- ✅ Simple, reliable, and fast
- ✅ Adequate for debugging/error reporting (file size not critical)
- ⚠️ Larger file sizes (5.3MB vs ~500KB for PNG), but acceptable for temporary debug files
-
-**Future Consideration**: If disk space becomes an issue, can add PNG encoding using stb_image_write (single-header library, easy to integrate)
-
-### SDL Backend Integration
-
-**Challenge**: How to access the SDL_Renderer from ImGui?  
-**Solution**: 
- ImGui's `BackendRendererUserData` points to an `ImGui_ImplSDLRenderer2_Data` struct
- This struct contains the `Renderer` pointer as its first member
- Cast `BackendRendererUserData` to access the renderer safely
-
-**Why Not Store Renderer Globally?**
- Multiple ImGui contexts could use different renderers
- Backend data pattern follows ImGui's architecture conventions
- More maintainable and future-proof
-
---
-
-## Integration with Test System
-
-### Current Usage (Manual RPC)
-
-AI agents or CLI tools can manually capture screenshots:
-
-```bash
-# Capture screenshot after opening editor
-z3ed agent test --prompt "Open Overworld Editor"
-grpcurl ... yaze.test.ImGuiTestHarness/Screenshot
-```
-
-### Next Step: Auto-Capture on Failure
-
-The screenshot RPC is now ready to be integrated with TestManager to automatically capture context when tests fail:
-
-**Planned Implementation** (IT-08 Phase 2):
-```cpp
-// In TestManager::MarkHarnessTestCompleted()
-if (test_result == IMGUI_TEST_STATUS_FAILED || 
-    test_result == IMGUI_TEST_STATUS_TIMEOUT) {
-  
-  // Auto-capture screenshot
-  ScreenshotRequest req;
-  req.set_output_path(absl::StrFormat("/tmp/test_%s_failure.bmp", test_id));
-  
-  ScreenshotResponse resp;
-  harness_service_->Screenshot(&req, &resp);
-  
-  test_history_[test_id].screenshot_path = resp.file_path();
-  
-  // Also capture widget state (IT-08 Phase 3)
-  test_history_[test_id].widget_state = CaptureWidgetState();
-}
-```
-
---
-
-## Remaining Work (IT-08 Phases 2-3)
-
-### Phase 2: Auto-Capture on Test Failure (1-1.5 hours)
-
-**Tasks**:
-1. Modify `TestManager::MarkHarnessTestCompleted()` to detect failures
-2. Call Screenshot RPC automatically when `status == FAILED || status == TIMEOUT`
-3. Store screenshot path in test history
-4. Update `GetTestResults` RPC to include screenshot paths in response
-5. Test with intentional test failures
-
-**Files to Modify**:
- `src/app/core/test_manager.cc` (auto-capture logic)
- `src/app/core/service/imgui_test_harness_service.cc` (store screenshot in history)
-
-### Phase 3: Widget State Dump (30-45 minutes)
-
-**Tasks**:
-1. Implement `CaptureWidgetState()` function to traverse ImGui window hierarchy
-2. Capture: focused window, focused widget, hovered widget, open menus
-3. Store as JSON string in test history
-4. Include in `GetTestResults` response
-
-**Files to Create**:
- `src/app/core/widget_state_capture.{h,cc}` (traversal logic)
-
-**Example Output**:
-```json
-{
-  "focused_window": "Overworld Editor",
-  "hovered_widget": "canvas_overworld_main",
-  "open_menus": [],
-  "visible_windows": ["Overworld Editor", "Palette Editor", "Tile16 Editor"]
-}
-```
-
---
-
-## Performance Considerations
-
-### Current Performance
-
- **Screenshot Capture Time**: ~10-20ms (depends on resolution)
- **File Write Time**: ~50-100ms (5.3MB BMP)
- **Total Impact**: ~60-120ms per screenshot
-
-**Analysis**: Acceptable for failure scenarios (only captures when test fails, not on every frame)
-
-### Optimization Options (If Needed)
-
-1. **Async Capture**: Move screenshot to background thread (complex, may not be necessary)
-2. **PNG Compression**: Reduce file size from 5.3MB to ~500KB (10x smaller)
-3. **Downscaling**: Capture at 50% resolution (768x432) for faster I/O
-4. **Skip Screenshots for Fast Tests**: Only capture for tests >1 second
-
-**Recommendation**: Current performance is fine for debugging. Only optimize if users report slowdowns.
-
---
-
-## CLI Integration
-
-### z3ed CLI Usage
-
-The Screenshot RPC is accessible via the CLI automation client:
-
-```cpp
-// In gui_automation_client.cc
-absl::StatusOr<ScreenshotResponse> GuiAutomationClient::TakeScreenshot(
-    const std::string& output_path) {
-  ScreenshotRequest request;
-  request.set_output_path(output_path);
-  
-  ScreenshotResponse response;
-  grpc::ClientContext context;
-  
-  auto status = stub_->Screenshot(&context, request, &response);
-  if (!status.ok()) {
-    return absl::InternalError(status.error_message());
-  }
-  
-  return response;
-}
-```
-
-### Agent Mode Integration
-
-AI agents can now request screenshots to understand GUI state:
-
-```yaml
-# Example agent workflow
- action: click
-  target: "Overworld Editor##tab"
-  
- action: screenshot
-  output: "/tmp/overworld_state.bmp"
-  
- action: analyze
-  image: "/tmp/overworld_state.bmp"
-  prompt: "Verify Overworld Editor opened successfully"
-```
-
---
-
-## Next Steps
-
-### Immediate (Continue IT-08)
-
-1. **Build and Test**: ✅ Complete (Oct 2, 2025)
-2. **Auto-Capture on Failure**: 📋 Next (1-1.5 hours)
-3. **Widget State Dump**: 📋 After auto-capture (30-45 minutes)
-
-### After IT-08 Completion
-
-**IT-09: CI/CD Integration** (2-3 hours):
- Test suite YAML format
- JUnit XML output for GitHub Actions
- Example workflow file
-
---
-
-## Success Metrics
-
-✅ **Screenshot RPC Works**: Successfully captures 1536x864 @ 32-bit BMP files  
-✅ **Integration Ready**: Can be called from CLI, agents, or test harness  
-✅ **Performance Acceptable**: ~60-120ms total impact per capture  
-✅ **Error Handling**: Returns clear error messages if renderer unavailable  
-
-**Overall IT-08 Progress**: 30% complete (1 of 3 phases done)
-
---
-
-## Documentation Updates
-
-### Files Updated
-
- `src/app/core/service/imgui_test_harness_service.cc` (Screenshot implementation)
- `docs/z3ed/IT-08-SCREENSHOT-COMPLETION.md` (this file)
-
-### Files to Update Next
-
- `docs/z3ed/IMPLEMENTATION_CONTINUATION.md` (mark Screenshot complete)
- `docs/z3ed/STATUS_REPORT_OCT2.md` (update progress to 30%)
- `docs/z3ed/NEXT_STEPS_OCT2.md` (shift focus to Phase 2)
-
---
-
-## Conclusion
-
-The Screenshot RPC is fully functional and tested. It provides the foundation for IT-08's enhanced error reporting system by capturing visual context when tests fail.
-
-**Key Achievement**: AI agents can now "see" what's on screen, enabling visual debugging and verification workflows.
-
-**What's Next**: Integrate screenshot capture with the test failure detection system so every failed test automatically includes a screenshot + widget state dump.
-
-**Estimated Time to Complete IT-08**: 1.5-2 hours remaining (auto-capture + widget state)
-
---
-
-**Report Generated**: October 2, 2025  
-**Author**: GitHub Copilot (AI Assistant)  
-**Project**: YAZE - Yet Another Zelda3 Editor  
-**Component**: z3ed CLI Tool - Test Automation Harness
--- a/docs/z3ed/IT-08b-AUTO-CAPTURE.md
+++ b/docs/z3ed/IT-08b-AUTO-CAPTURE.md
@@ -0,0 +1,388 @@
+# IT-08b: Auto-Capture on Test Failure - Implementation Guide
+
+**Status**: 🔄 Ready to Implement  
+**Priority**: High (Next Phase of IT-08)  
+**Time Estimate**: 1-1.5 hours  
+**Date**: October 2, 2025
+
+---
+
+## Overview
+
+Automatically capture screenshots and execution context when tests fail, enabling better debugging and diagnostics for AI agents.
+
+**Goal**: Every failed test produces:
+- Screenshot of GUI state at failure
+- Execution context (frame count, active windows, focused widgets)
+- Foundation for IT-08c (widget state dumps)
+
+---
+
+## Implementation Steps
+
+### Step 1: Update TestHistory Structure (15 minutes)
+
+**File**: `src/app/core/test_manager.h`
+
+Add failure diagnostics fields:
+
+```cpp
+struct TestHistory {
+  std::string test_id;
+  std::string test_name;
+  ImGuiTestStatus status;
+  absl::Time start_time;
+  absl::Time end_time;
+  int64_t execution_time_ms;
+  std::vector<std::string> logs;
+  std::map<std::string, std::string> metrics;
+  
+  // IT-08b: Failure diagnostics
+  std::string screenshot_path;
+  int64_t screenshot_size_bytes = 0;
+  std::string failure_context;
+  
+  // IT-08c: Widget state (future)
+  std::string widget_state;
+};
+```
+
+### Step 2: Add CaptureFailureContext Method (30 minutes)
+
+**File**: `src/app/core/test_manager.cc`
+
+Add new method after `MarkHarnessTestCompleted`:
+
+```cpp
+void TestManager::CaptureFailureContext(const std::string& test_id) {
+  if (test_history_.find(test_id) == test_history_.end()) {
+    return;
+  }
+  
+  auto& history = test_history_[test_id];
+  
+  // 1. Capture screenshot via harness service
+  if (harness_service_) {
+    std::string screenshot_path = 
+        absl::StrFormat("/tmp/yaze_test_%s_failure.bmp", test_id);
+    
+    ScreenshotRequest req;
+    req.set_output_path(screenshot_path);
+    
+    ScreenshotResponse resp;
+    auto status = harness_service_->Screenshot(&req, &resp);
+    
+    if (status.ok() && resp.success()) {
+      history.screenshot_path = resp.file_path();
+      history.screenshot_size_bytes = resp.file_size_bytes();
+    } else {
+      YAZE_LOG(ERROR) << "Failed to capture screenshot for " << test_id 
+                      << ": " << status.message();
+    }
+  }
+  
+  // 2. Capture execution context
+  ImGuiContext* ctx = ImGui::GetCurrentContext();
+  if (ctx) {
+    ImGuiWindow* current_window = ImGui::GetCurrentWindow();
+    std::string window_name = current_window ? current_window->Name : "none";
+    
+    ImGuiID active_id = ImGui::GetActiveID();
+    ImGuiID hovered_id = ImGui::GetHoveredID();
+    
+    history.failure_context = absl::StrFormat(
+        "Frame: %d, Window: %s, Active: %u, Hovered: %u",
+        ImGui::GetFrameCount(),
+        window_name,
+        active_id,
+        hovered_id);
+  }
+  
+  // 3. Widget state capture (IT-08c - placeholder)
+  // history.widget_state = CaptureWidgetState();
+}
+```
+
+### Step 3: Integrate with MarkHarnessTestCompleted (15 minutes)
+
+**File**: `src/app/core/test_manager.cc`
+
+Modify existing method to call CaptureFailureContext:
+
+```cpp
+void TestManager::MarkHarnessTestCompleted(const std::string& test_id,
+                                           ImGuiTestStatus status) {
+  if (test_history_.find(test_id) == test_history_.end()) {
+    return;
+  }
+  
+  auto& history = test_history_[test_id];
+  history.status = status;
+  history.end_time = absl::Now();
+  history.execution_time_ms = absl::ToInt64Milliseconds(
+      history.end_time - history.start_time);
+  
+  // Auto-capture diagnostics on failure
+  if (status == ImGuiTestStatus_Error || status == ImGuiTestStatus_Warning) {
+    CaptureFailureContext(test_id);
+  }
+  
+  // Notify waiting threads
+  cv_.notify_all();
+}
+```
+
+### Step 4: Update GetTestResults RPC (30 minutes)
+
+**File**: `src/app/core/proto/imgui_test_harness.proto`
+
+Add fields to response:
+
+```proto
+message GetTestResultsResponse {
+  string test_id = 1;
+  TestStatus status = 2;
+  int64 execution_time_ms = 3;
+  repeated string logs = 4;
+  map<string, string> metrics = 5;
+  
+  // IT-08b: Failure diagnostics
+  string screenshot_path = 6;
+  int64 screenshot_size_bytes = 7;
+  string failure_context = 8;
+  
+  // IT-08c: Widget state (future)
+  string widget_state = 9;
+}
+```
+
+**File**: `src/app/core/service/imgui_test_harness_service.cc`
+
+Update implementation:
+
+```cpp
+absl::Status ImGuiTestHarnessServiceImpl::GetTestResults(
+    const GetTestResultsRequest* request,
+    GetTestResultsResponse* response) {
+  
+  const std::string& test_id = request->test_id();
+  auto history = test_manager_->GetTestHistory(test_id);
+  
+  if (!history.has_value()) {
+    return absl::NotFoundError(
+        absl::StrFormat("Test not found: %s", test_id));
+  }
+  
+  const auto& h = history.value();
+  
+  // Basic info
+  response->set_test_id(h.test_id);
+  response->set_status(ConvertImGuiTestStatusToProto(h.status));
+  response->set_execution_time_ms(h.execution_time_ms);
+  
+  // Logs and metrics
+  for (const auto& log : h.logs) {
+    response->add_logs(log);
+  }
+  for (const auto& [key, value] : h.metrics) {
+    (*response->mutable_metrics())[key] = value;
+  }
+  
+  // IT-08b: Failure diagnostics
+  if (!h.screenshot_path.empty()) {
+    response->set_screenshot_path(h.screenshot_path);
+    response->set_screenshot_size_bytes(h.screenshot_size_bytes);
+  }
+  if (!h.failure_context.empty()) {
+    response->set_failure_context(h.failure_context);
+  }
+  
+  // IT-08c: Widget state (future)
+  if (!h.widget_state.empty()) {
+    response->set_widget_state(h.widget_state);
+  }
+  
+  return absl::OkStatus();
+}
+```
+
+---
+
+## Testing
+
+### Build and Start Test Harness
+
+```bash
+# 1. Rebuild with changes
+cmake --build build-grpc-test --target yaze -j$(sysctl -n hw.ncpu)
+
+# 2. Start test harness
+./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
+  --enable_test_harness \
+  --test_harness_port=50052 \
+  --rom_file=assets/zelda3.sfc &
+```
+
+### Trigger Test Failure
+
+```bash
+# 3. Trigger a failing test (nonexistent widget)
+grpcurl -plaintext \
+  -import-path src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d '{"target":"nonexistent_widget","type":"LEFT"}' \
+  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
+
+# Response should indicate failure
+```
+
+### Verify Screenshot Captured
+
+```bash
+# 4. Check for auto-captured screenshot
+ls -lh /tmp/yaze_test_*_failure.bmp
+
+# Expected: BMP file created (about 5.3 MB for a 1536x864 window)
+```
+
+### Query Test Results
+
+```bash
+# 5. Get test results (replace <test_id> with actual ID from Click response)
+grpcurl -plaintext \
+  -import-path src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d '{"test_id":"<test_id>"}' \
+  127.0.0.1:50052 yaze.test.ImGuiTestHarness/GetTestResults
+
+# Expected output:
+# {
+#   "testId": "grpc_click_12345678",
+#   "status": "FAILED",
+#   "executionTimeMs": "1234",
+#   "logs": [...],
+#   "screenshotPath": "/tmp/yaze_test_grpc_click_12345678_failure.bmp",
+#   "screenshotSizeBytes": "5308538",
+#   "failureContext": "Frame: 1234, Window: Main Window, Active: 0, Hovered: 0"
+# }
+```
+
+### End-to-End Test Script
+
+Create `scripts/test_auto_capture.sh`:
+
+```bash
+#!/bin/bash
+set -e
+
+echo "=== IT-08b Auto-Capture Test ==="
+
+# Clean up old screenshots
+rm -f /tmp/yaze_test_*_failure.bmp
+
+# Start YAZE with test harness
+echo "Starting YAZE..."
+./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
+  --enable_test_harness \
+  --test_harness_port=50052 \
+  --rom_file=assets/zelda3.sfc &
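+YAZE_PID=$!
+# Stop YAZE on every exit path, including failures caught by `set -e`
+trap 'kill "$YAZE_PID" 2>/dev/null || true' EXIT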
+
+# Wait for server to start
+sleep 3
+
+# Trigger failing test
+echo "Triggering test failure..."
+TEST_ID=$(grpcurl -plaintext \
+  -import-path src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d '{"target":"nonexistent_widget","type":"LEFT"}' \
+  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click | \
+  jq -r '.testId')
+
+echo "Test ID: $TEST_ID"
+
+# Wait for test to complete
+sleep 2
+
+# Check screenshot captured
+if [ -f "/tmp/yaze_test_${TEST_ID}_failure.bmp" ]; then
+  echo "✅ Screenshot captured: /tmp/yaze_test_${TEST_ID}_failure.bmp"
+else
+  echo "❌ Screenshot NOT captured"
+  kill $YAZE_PID
+  exit 1
+fi
+
+# Query test results
+echo "Querying test results..."
+RESULTS=$(grpcurl -plaintext \
+  -import-path src/app/core/proto \
+  -proto imgui_test_harness.proto \
+  -d "{\"test_id\":\"$TEST_ID\"}" \
+  127.0.0.1:50052 yaze.test.ImGuiTestHarness/GetTestResults)
+
+echo "$RESULTS"
+
+# Verify fields present
+if echo "$RESULTS" | jq -e '.screenshotPath' > /dev/null; then
+  echo "✅ Screenshot path in results"
+else
+  echo "❌ Screenshot path missing"
+  kill $YAZE_PID
+  exit 1
+fi
+
+if echo "$RESULTS" | jq -e '.failureContext' > /dev/null; then
+  echo "✅ Failure context in results"
+else
+  echo "❌ Failure context missing"
+  kill $YAZE_PID
+  exit 1
+fi
+
+echo "=== All tests passed! ==="
+
+# Cleanup
+kill $YAZE_PID
+```
+
+---
+
+## Success Criteria
+
+- ✅ Screenshots auto-captured on test failure (Error or Warning status)
+- ✅ Screenshot path stored in TestHistory
+- ✅ Failure context captured (frame, window, widgets)
+- ✅ GetTestResults RPC returns screenshot_path and failure_context
+- ✅ No performance impact on passing tests (capture only on failure)
+- ✅ Clean error handling if screenshot capture fails
+
+---
+
+## Files Modified
+
+1. `src/app/core/test_manager.h` - TestHistory structure
+2. `src/app/core/test_manager.cc` - CaptureFailureContext method
+3. `src/app/core/proto/imgui_test_harness.proto` - GetTestResultsResponse fields
+4. `src/app/core/service/imgui_test_harness_service.cc` - GetTestResults implementation
+
+---
+
+## Next Steps
+
+**After IT-08b Complete**:
+1. IT-08c: Widget State Dumps (30-45 minutes)
+2. IT-08d: Error Envelope Standardization (1-2 hours)
+3. IT-08e: CLI Error Improvements (1 hour)
+
+**Documentation Updates**:
+1. Update `IT-08-IMPLEMENTATION-GUIDE.md` with IT-08b complete status
+2. Update `E6-z3ed-implementation-plan.md` progress tracking
+3. Update `README.md` with new capabilities
+
+---
+
+**Last Updated**: October 2, 2025  
+**Status**: Ready to implement  
+**Estimated Completion**: October 2-3, 2025 (1-1.5 hours)
--- a/docs/z3ed/POLICY-IMPLEMENTATION-SUMMARY.md
+++ b/docs/z3ed/POLICY-IMPLEMENTATION-SUMMARY.md
@@ -1,251 +0,0 @@
-# Policy Evaluation Framework - Implementation Complete ✅
-
-**Date**: October 2025  
-**Task**: AW-04 - Policy Evaluation Framework  
-**Status**: ✅ Complete - Ready for Production Testing  
-**Time**: 6 hours actual (estimated 6-8 hours)
-
-## Overview
-
-The Policy Evaluation Framework enables safe AI-driven ROM modifications by gating proposal acceptance based on YAML-configured constraints. This prevents the agent from making dangerous changes (corrupting ROM headers, exceeding byte limits, bypassing test requirements) while maintaining flexibility through configurable policies.
-
-## Implementation Summary
-
-### Core Components
-
-1. **PolicyEvaluator Service** (`src/cli/service/policy_evaluator.{h,cc}`)
-   - Singleton service managing policy loading and evaluation
-   - 377 lines of implementation code
-   - Thread-safe with absl::StatusOr error handling
-   - Auto-loads from `.yaze/policies/agent.yaml` on first use
-
-2. **Policy Types** (4 implemented):
-   - **test_requirement**: Gates on test status (critical severity)
-   - **change_constraint**: Limits bytes modified (warning/critical)
-   - **forbidden_range**: Blocks specific memory regions (critical)
-   - **review_requirement**: Flags proposals needing scrutiny (warning)
-
-3. **Severity Levels** (3 levels):
-   - **Info**: Informational only, no blocking
-   - **Warning**: User can override with confirmation
-   - **Critical**: Blocks acceptance completely
-
-4. **GUI Integration** (`src/app/editor/system/proposal_drawer.{h,cc}`)
-   - `DrawPolicyStatus()`: Color-coded violation display
-     - ⛔ Red for critical violations
-     - ⚠️ Yellow for warnings
-     - ℹ️ Blue for info messages
-   - Accept button gating: Disabled when critical violations present
-   - Override dialog: Confirmation required for warnings
-
-5. **Configuration** (`.yaze/policies/agent.yaml`)
-   - Simple YAML-like format for policy definitions
-   - Example configuration with 4 policies provided
-   - User can enable/disable individual policies
-   - Supports comments and version tracking
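
The way the three severity levels gate the Accept button can be sketched as below. This is a minimal illustration under assumptions, not the contents of `policy_evaluator.cc`: `Severity`, `Violation`, `GateDecision`, and `DecideGate` are hypothetical names chosen for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the framework's types; illustrative only.
enum class Severity { kInfo, kWarning, kCritical };

struct Violation {
  Severity severity;
  std::string message;
};

enum class GateDecision { kAllow, kNeedsOverride, kBlocked };

// Returns the strictest outcome implied by any violation:
// a critical violation blocks, a warning requires confirmation,
// and info-level entries never gate acceptance.
inline GateDecision DecideGate(const std::vector<Violation>& violations) {
  GateDecision decision = GateDecision::kAllow;
  for (const Violation& v : violations) {
    if (v.severity == Severity::kCritical) {
      return GateDecision::kBlocked;
    }
    if (v.severity == Severity::kWarning) {
      decision = GateDecision::kNeedsOverride;
    }
  }
  return decision;
}
```

In a GUI such as the Proposal Drawer, `kBlocked` would correspond to a disabled Accept button and `kNeedsOverride` to the confirmation dialog.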
-
-### Build System Integration
-
- Added `cli/service/policy_evaluator.cc` to:
-  - `src/cli/z3ed.cmake` (z3ed CLI target)
-  - `src/app/app.cmake` (yaze GUI target, with `YAZE_ENABLE_POLICY_FRAMEWORK=1`)
- **Conditional Compilation**: Policy framework only enabled in main `yaze` target
-  - `yaze_emu` (emulator) builds without policy support
-  - Uses `#ifdef YAZE_ENABLE_POLICY_FRAMEWORK` to wrap optional code
- Clean build with no errors (warnings only for Abseil version mismatch)
-
-## Code Changes
-
-### Files Created (4 new files):
-
-1. **docs/z3ed/AW-04-POLICY-FRAMEWORK.md** (1,234 lines)
-   - Complete implementation specification
-   - YAML schema documentation
-   - Architecture diagrams and examples
-   - 4-phase implementation plan
-
-2. **src/cli/service/policy_evaluator.h** (85 lines)
-   - PolicyEvaluator singleton interface
-   - PolicyResult, PolicyViolation structures
-   - PolicySeverity enum
-   - Public API: LoadPolicies(), EvaluateProposal(), ReloadPolicies()
-
-3. **src/cli/service/policy_evaluator.cc** (377 lines)
-   - ParsePolicyFile(): Simple YAML parser
-   - Evaluate[Test|Change|Forbidden|Review](): Policy evaluation logic
-   - CategorizeViolations(): Severity-based filtering
-
-4. **.yaze/policies/agent.yaml** (34 lines)
-   - Example policy configuration
-   - 4 sample policies with detailed comments
-   - Ready for production use
-
-### Files Modified (5 files):
-
-1. **src/app/editor/system/proposal_drawer.h**
-   - Added: `DrawPolicyStatus()` method
-   - Added: `show_override_dialog_` member variable
-
-2. **src/app/editor/system/proposal_drawer.cc** (~100 lines added)
-   - Integrated PolicyEvaluator::Get().EvaluateProposal()
-   - Implemented DrawPolicyStatus() with color-coded violations
-   - Modified DrawActionButtons() to gate Accept button
-   - Added policy override confirmation dialog
-
-3. **src/cli/z3ed.cmake**
-   - Added: `cli/service/policy_evaluator.cc` to z3ed sources
-
-4. **src/app/app.cmake**
-   - Added: `cli/service/policy_evaluator.cc` to yaze sources
-   - Added: `YAZE_ENABLE_POLICY_FRAMEWORK=1` compile definition
-   - Note: `yaze_emu` target does NOT include policy framework (optional feature)
-
-5. **src/app/editor/system/proposal_drawer.cc**
-   - Wrapped policy code with `#ifdef YAZE_ENABLE_POLICY_FRAMEWORK`
-   - Gracefully degrades when policy framework disabled
-
-6. **docs/z3ed/E6-z3ed-implementation-plan.md**
-   - Updated: AW-04 status from "📋 Next" to "✅ Done"
-   - Updated: Active phase to Policy Framework complete
-   - Updated: Time investment to 28.5 hours total
-
-## Technical Details
-
-### Conditional Compilation
-
-The policy framework uses conditional compilation to allow building without policy support:
-
-```cpp
-#ifdef YAZE_ENABLE_POLICY_FRAMEWORK
-  auto& policy_eval = cli::PolicyEvaluator::GetInstance();
-  auto policy_result = policy_eval.EvaluateProposal(p.id);
-  // ... policy evaluation logic ...
-#endif
-```
-
-**Build Targets**:
- `yaze` (main editor): Policy framework **enabled** ✅
- `yaze_emu` (emulator): Policy framework **disabled** (not needed)
- `z3ed` (CLI): Policy framework **enabled** ✅
-
-### API Usage Patterns
-
-**StatusOr Error Handling**:
-```cpp
-auto proposal_result = registry.GetProposal(proposal_id);
-if (!proposal_result.ok()) {
-  return PolicyResult{false, {}, {}, {}, {}};
-}
-const auto& proposal = proposal_result.value();
-```
-
-**String View Conversions**:
-```cpp
-// Explicit conversion required for absl::string_view → std::string
-std::string trimmed = std::string(absl::StripAsciiWhitespace(line));
-config_->version = std::string(absl::StripAsciiWhitespace(parts[1]));
-```
-
-**Singleton Pattern**:
-```cpp
-PolicyEvaluator& evaluator = PolicyEvaluator::Get();
-PolicyResult result = evaluator.EvaluateProposal(proposal_id);
-```
-
-### Compilation Fixes Applied
-
-1. **Include Paths**: Changed from `src/cli/service/...` to `cli/service/...`
-2. **StatusOr API**: Used `.ok()` and `.value()` instead of `.has_value()`
-3. **String Numbers**: Added `#include "absl/strings/numbers.h"` for SimpleAtoi
-4. **String View**: Explicit `std::string()` cast for all absl::StripAsciiWhitespace() calls
-5. **Conditional Compilation**: Wrapped policy code with `YAZE_ENABLE_POLICY_FRAMEWORK` to fix yaze_emu build
-
-## Testing Plan
-
-### Phase 1: Manual Validation (Next Step)
- [ ] Launch yaze GUI and open Proposal Drawer
- [ ] Create test proposal and verify policy evaluation runs
- [ ] Test critical violation blocking (Accept button disabled)
- [ ] Test warning override flow (confirmation dialog)
- [ ] Verify policy status display with all severity levels
-
-### Phase 2: Policy Testing
- [ ] Test forbidden_range detection (ROM header protection)
- [ ] Test change_constraint limits (byte count enforcement)
- [ ] Test test_requirement gating (blocks without passing tests)
- [ ] Test review_requirement flagging (complex proposals)
- [ ] Test policy enable/disable toggle
-
-### Phase 3: Edge Cases
- [ ] Invalid YAML syntax handling
- [ ] Missing policy file behavior
- [ ] Malformed policy definitions
- [ ] Policy reload during runtime
- [ ] Multiple policies of same type
-
-### Phase 4: Unit Tests
- [ ] PolicyEvaluator::ParsePolicyFile() unit tests
- [ ] Individual policy type evaluation tests
- [ ] Severity categorization tests
- [ ] Integration tests with ProposalRegistry
-
-## Known Limitations
-
-1. **YAML Parsing**: Simple custom parser implemented
-   - Works for current format but not full YAML spec
-   - Consider yaml-cpp for complex nested structures
-
-2. **Forbidden Range Checking**: Requires ROM diff parsing
-   - Currently placeholder implementation
-   - Will need integration with .z3ed-diff format
-
-3. **Review Requirement Conditions**: Complex expression evaluation
-   - Currently checks simple string matching
-   - May need expression parser for production
-
-4. **Performance**: No profiling done yet
-   - Target: < 100ms per evaluation
-   - Likely well under target given simple logic
-
-## Production Readiness Checklist
-
- ✅ Core implementation complete
- ✅ Build system integration
- ✅ GUI integration
- ✅ Example configuration
- ✅ Documentation complete
- ⏳ Manual testing (next step)
- ⏳ Unit test coverage
- ⏳ Windows cross-platform validation
- ⏳ Performance profiling
-
-## Next Steps
-
-**Immediate** (30 minutes):
-1. Launch yaze and test policy evaluation in ProposalDrawer
-2. Verify all 4 policy types work correctly
-3. Test override workflow for warnings
-
-**Short-term** (2-3 hours):
-1. Add unit tests for PolicyEvaluator
-2. Test on Windows build
-3. Document policy configuration in user guide
-
-**Medium-term** (4-6 hours):
-1. Integrate with .z3ed-diff for forbidden range detection
-2. Implement full YAML parser (yaml-cpp)
-3. Add policy reload command to CLI
-4. Performance profiling and optimization
-
-## References
-
- **Specification**: [AW-04-POLICY-FRAMEWORK.md](AW-04-POLICY-FRAMEWORK.md)
- **Implementation Plan**: [E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md)
- **Example Config**: `.yaze/policies/agent.yaml`
- **Source Files**: 
-  - `src/cli/service/policy_evaluator.{h,cc}`
-  - `src/app/editor/system/proposal_drawer.{h,cc}`
-
---
-
-**Accomplishment**: The Policy Evaluation Framework is now fully implemented and ready for production testing. This represents a major safety milestone for the z3ed agentic workflow system, enabling confident AI-driven ROM modifications with human-defined constraints.
--- a/docs/z3ed/README.md
+++ b/docs/z3ed/README.md
@@ -16,6 +16,8 @@

 This directory contains the primary documentation for the `z3ed` system.

+**📋 Documentation Status**: Consolidated (Oct 2, 2025) - 10 core files, 6,547 lines
+
 ## Core Documentation

 Start here to understand the architecture, learn how to use the commands, and see the current development status.
@@ -90,6 +92,7 @@ See the **[Technical Reference](E6-z3ed-reference.md)** for a full command list.
  - Successfully tested via gRPC (5.3MB output files)
  - Foundation for auto-capture on test failures
  - AI agents can now capture visual context for debugging
+- ✅ IT-07 Test Recording & Replay Complete: Regression testing workflow operational
 - ✅ Server-side wiring for test lifecycle tracking inside `TestManager`
 - ✅ gRPC status mapping helper to surface accurate error codes back to clients
 - ✅ CLI integration with YAML/JSON output formats
@@ -97,11 +100,11 @@ See the **[Technical Reference](E6-z3ed-reference.md)** for a full command list.

 **Next Priority**: IT-08b (Auto-capture on failure) + IT-08c (Widget state dumps) to complete enhanced error reporting

-**Test Harness Evolution** (In Progress: IT-05 to IT-09 | 76% Complete):
+**Test Harness Evolution** (In Progress: IT-05 to IT-09 | 78% Complete):
 - **Test Introspection**: ✅ Query test status, results, and execution history
 - **Widget Discovery**: ✅ AI agents can enumerate available GUI interactions dynamically
 - **Test Recording**: ✅ Capture manual workflows as JSON scripts for regression testing
- **Enhanced Debugging**: 🔄 Screenshot capture (✅), widget state dumps (📋), execution context on failures (📋)
+- **Enhanced Debugging**: 🔄 Screenshot capture (✅ IT-08a), widget state dumps (📋 IT-08c), execution context on failures (📋 IT-08b)
 - **CI/CD Integration**: 📋 Standardized test suite format with JUnit XML output

 See **[E6-z3ed-cli-design.md § 9](E6-z3ed-cli-design.md#9-test-harness-evolution-from-automation-to-platform)** for detailed architecture and implementation roadmap.
@@ -111,12 +114,13 @@ See **[E6-z3ed-cli-design.md § 9](E6-z3ed-cli-design.md#9-test-harness-evolutio
 **📖 Getting Started**:
 - **New to z3ed?** Start with this [README.md](README.md) then [E6-z3ed-cli-design.md](E6-z3ed-cli-design.md)
 - **Want to use z3ed?** See [QUICK_REFERENCE.md](QUICK_REFERENCE.md) for all commands
- **Resume implementation?** Read [IMPLEMENTATION_CONTINUATION.md](IMPLEMENTATION_CONTINUATION.md)

 **🔧 Implementation Guides**:
- [IT-05-IMPLEMENTATION-GUIDE.md](IT-05-IMPLEMENTATION-GUIDE.md) - Test Introspection API (next priority)
- [STATUS_REPORT_OCT2.md](STATUS_REPORT_OCT2.md) - Complete progress summary
+- [IT-05-IMPLEMENTATION-GUIDE.md](IT-05-IMPLEMENTATION-GUIDE.md) - Test Introspection API (complete ✅)
+- [IT-08-IMPLEMENTATION-GUIDE.md](IT-08-IMPLEMENTATION-GUIDE.md) - Enhanced Error Reporting (in progress 🔄)
+- [IMPLEMENTATION_CONTINUATION.md](IMPLEMENTATION_CONTINUATION.md) - Detailed continuation plan for current phase

 **📚 Reference**:
 - [E6-z3ed-reference.md](E6-z3ed-reference.md) - Technical reference and API docs
 - [E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md) - Task backlog and roadmap
+- [QUICK_REFERENCE.md](QUICK_REFERENCE.md) - Quick command reference
--- a/docs/z3ed/REMOTE_CONTROL_WORKFLOWS.md
+++ b/docs/z3ed/REMOTE_CONTROL_WORKFLOWS.md
@@ -1,402 +0,0 @@
-# Remote Control Agent Workflows
-
-**Date**: October 2, 2025  
-**Status**: Functional - Test Harness + Widget Registry Integration  
-**Purpose**: Enable AI agents to remotely control YAZE for automated editing
-
-## Overview
-
-The remote control system allows AI agents to interact with YAZE through gRPC, using the ImGuiTestHarness and Widget ID Registry to perform real editing tasks.
-
-## Quick Start
-
-### 1. Start YAZE with Test Harness
-
-```bash
-./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
-  --enable_test_harness \
-  --test_harness_port=50052 \
-  --rom_file=assets/zelda3.sfc &
-```
-
-### 2. Open Overworld Editor
-
-In YAZE GUI:
- Click "Overworld" button
- This registers 13 toolset widgets for remote control
-
-### 3. Run Test Script
-
-```bash
-./scripts/test_remote_control.sh
-```
-
-Expected output:
- ✓ All 8 practical workflows pass
- Agent can switch modes, open tools, control zoom
-
-## Supported Workflows
-
-### Mode Switching
-
-**Draw Tile Mode**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:DrawTile","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Enables tile painting on overworld map
- Agent can then click canvas to draw selected tiles
-
-**Pan Mode**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Pan","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Enables map navigation
- Agent can drag canvas to reposition view
-
-**Entrances Mode**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Entrances","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Enables entrance editing
- Agent can click to place/move entrances
-
-**Exits Mode**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Exits","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Enables exit editing
- Agent can click to place/move exits
-
-**Sprites Mode**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Sprites","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Enables sprite editing
- Agent can place/move sprites on overworld
-
-**Items Mode**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Items","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Enables item placement
- Agent can add items to overworld
-
-### Tool Opening
-
-**Tile16 Editor**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Tile16Editor","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
- Opens Tile16 Editor window
- Agent can select tiles for drawing
-
-### View Controls
-
-**Zoom In**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:ZoomIn","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
-
-**Zoom Out**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:ZoomOut","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
-
-**Fullscreen Toggle**:
-```bash
-grpcurl -plaintext -d '{"target":"Overworld/Toolset/button:Fullscreen","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-```
-
-## Multi-Step Workflows
-
-### Workflow 1: Draw Custom Tiles
-
-**Goal**: Agent draws specific tiles on the overworld map
-
-**Steps**:
-1. Switch to Draw Tile mode
-2. Open Tile16 Editor
-3. Select desired tile (TODO: needs canvas click support)
-4. Click on overworld canvas at (x, y) to draw
-
-**Current Status**: Steps 1-2 working, 3-4 need implementation
-
-### Workflow 2: Reposition Entrance
-
-**Goal**: Agent moves an entrance to a new location
-
-**Steps**:
-1. Switch to Entrances mode
-2. Click on existing entrance to select
-3. Drag to new location (TODO: needs drag support)
-4. Verify entrance properties updated
-
-**Current Status**: Step 1 working, 2-4 need implementation
-
-### Workflow 3: Place Sprites
-
-**Goal**: Agent adds sprites to overworld
-
-**Steps**:
-1. Switch to Sprites mode
-2. Select sprite from palette (TODO)
-3. Click canvas to place sprite
-4. Adjust sprite properties if needed
-
-**Current Status**: Step 1 working, 2-4 need implementation
-
-## Widget Registry Integration
-
-### Hierarchical Widget IDs
-
-The test harness now supports hierarchical widget IDs from the registry:
-
-```
-Format: <Editor>/<Section>/<Type>:<Name>
-Example: Overworld/Toolset/button:DrawTile
-```
-
-**Benefits**:
- Stable, predictable widget references
- Better error messages with suggestions
- Backwards compatible with legacy format
- Self-documenting structure
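
As an illustration of the `<Editor>/<Section>/<Type>:<Name>` format, the sketch below splits an ID into its four parts. It is not the registry's own parser: `WidgetPath` and `ParseWidgetPath` are hypothetical names, and the rule that all four parts must be non-empty is an assumption.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical decomposition of a hierarchical widget ID.
struct WidgetPath {
  std::string editor;
  std::string section;
  std::string type;
  std::string name;
};

// Splits "<Editor>/<Section>/<Type>:<Name>" into its parts.
// Returns nullopt if a separator is missing or any part is empty.
inline std::optional<WidgetPath> ParseWidgetPath(const std::string& path) {
  const std::size_t first = path.find('/');
  if (first == std::string::npos) return std::nullopt;
  const std::size_t second = path.find('/', first + 1);
  if (second == std::string::npos) return std::nullopt;
  const std::size_t colon = path.find(':', second + 1);
  if (colon == std::string::npos) return std::nullopt;

  WidgetPath out{path.substr(0, first),
                 path.substr(first + 1, second - first - 1),
                 path.substr(second + 1, colon - second - 1),
                 path.substr(colon + 1)};
  if (out.editor.empty() || out.section.empty() || out.type.empty() ||
      out.name.empty()) {
    return std::nullopt;
  }
  return out;
}
```

Under this sketch a legacy ID such as `button:DrawTile` is rejected rather than parsed, so a caller would still need the separate legacy lookup path described in this document.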
-
-### Pattern Matching
-
-When a widget isn't found, the system suggests alternatives:
-
-```bash
-# Typo in widget name
-grpcurl ... -d '{"target":"Overworld/Toolset/button:DrawTyle"}'
-
-# Response:
-# "Widget not found: DrawTyle. Did you mean: 
-#  Overworld/Toolset/button:DrawTile?"
-```
-
-### Widget Discovery
-
-Future enhancement - list all available widgets:
-
-```bash
-z3ed agent discover --pattern "Overworld/*"
-# Lists all Overworld widgets
-
-z3ed agent discover --pattern "*/button:*"
-# Lists all buttons across editors
-```
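
The `--pattern` arguments above imply wildcard matching over widget IDs. A minimal sketch of such a matcher follows; it assumes `*` is the only wildcard and that it may span `/` separators, which this document does not specify. `GlobMatch` is a hypothetical helper, not a registry function.

```cpp
#include <cstddef>
#include <string>

// Returns true if `text` matches `pattern`, where '*' matches any run of
// characters (including '/' and the empty string). All other characters
// must match literally. Uses single-star backtracking, so no recursion.
inline bool GlobMatch(const std::string& pattern, const std::string& text) {
  std::size_t p = 0;
  std::size_t t = 0;
  std::size_t star = std::string::npos;  // index of the last '*' seen
  std::size_t mark = 0;                  // text index where that '*' began
  while (t < text.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;
      mark = t;
    } else if (p < pattern.size() && pattern[p] == text[t]) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      // Mismatch after a '*': let the star absorb one more character.
      p = star + 1;
      t = ++mark;
    } else {
      return false;
    }
  }
  // Any trailing '*' in the pattern can match the empty string.
  while (p < pattern.size() && pattern[p] == '*') ++p;
  return p == pattern.size();
}
```

Filtering the registered IDs with this function would give `Overworld/*` every Overworld widget and `*/button:*` every button, matching the intent of the two commands above.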
-
-## Implementation Details
-
-### Test Harness Changes
-
-**File**: `src/app/core/service/imgui_test_harness_service.cc`
-
-**Changes**:
-1. Added widget registry include
-2. Click RPC tries hierarchical lookup first
-3. Fallback to legacy string-based lookup
-4. Pattern matching for suggestions
-
-**Code**:
-```cpp
-// Try hierarchical widget ID lookup first
-auto& registry = gui::WidgetIdRegistry::Instance();
-ImGuiID widget_id = registry.GetWidgetId(target);
-
-if (widget_id != 0) {
-  // Found in registry - use ImGui ID directly
-  ctx->ItemClick(widget_id, mouse_button);
-} else {
-  // Fallback to legacy lookup
-  ctx->ItemClick(widget_label.c_str(), mouse_button);
-}
-```
-
-### Widget Registration
-
-**File**: `src/app/editor/overworld/overworld_editor.cc`
-
-**Registered Widgets** (13 total):
- Overworld/Toolset/button:Pan
- Overworld/Toolset/button:DrawTile
- Overworld/Toolset/button:Entrances
- Overworld/Toolset/button:Exits
- Overworld/Toolset/button:Items
- Overworld/Toolset/button:Sprites
- Overworld/Toolset/button:Transports
- Overworld/Toolset/button:Music
- Overworld/Toolset/button:ZoomIn
- Overworld/Toolset/button:ZoomOut
- Overworld/Toolset/button:Fullscreen
- Overworld/Toolset/button:Tile16Editor
- Overworld/Toolset/button:CopyMap
-
-## Next Steps
-
-### Priority 1: Canvas Interaction (2-3 hours)
-
-**Goal**: Enable agent to click on canvas at specific coordinates
-
-**Implementation**:
-1. Add canvas click to Click RPC
-2. Support coordinate-based clicking: `{"target":"canvas:Overworld","x":100,"y":200}`
-3. Test drawing tiles programmatically
-
-**Use Cases**:
- Draw tiles at specific locations
- Select entities by clicking
- Navigate by clicking minimap
-
-### Priority 2: Tile Selection (1-2 hours)
-
-**Goal**: Enable agent to select tiles from Tile16 Editor
-
-**Implementation**:
-1. Register Tile16 Editor canvas widgets
-2. Support tile palette clicking
-3. Track selected tile state
-
-**Use Cases**:
- Select tile before drawing
- Change tile selection mid-workflow
- Verify correct tile selected
-
-### Priority 3: Entity Manipulation (2-3 hours)
-
-**Goal**: Enable dragging of entrances, exits, sprites
-
-**Implementation**:
-1. Add Drag RPC to proto
-2. Implement drag operation in test harness
-3. Support drag start + end coordinates
-
-**Use Cases**:
- Move entrances to new positions
- Reposition sprites
- Adjust exit locations
-
-### Priority 4: Workflow Chaining (1-2 hours)
-
-**Goal**: Combine multiple operations into workflows
-
-**Implementation**:
-1. Create workflow definition format
-2. Execute sequence of RPCs
-3. Handle errors gracefully
-
-**Example Workflow**:
-```yaml
-workflow: draw_custom_tile
-steps:
-  - click: Overworld/Toolset/button:DrawTile
-  - click: Overworld/Toolset/button:Tile16Editor
-  - wait: window_visible:Tile16 Editor
-  - click: canvas:Tile16Editor
-    x: 64
-    y: 64
-  - click: canvas:Overworld
-    x: 512
-    y: 384
-```
-
-## Testing Strategy
-
-### Manual Testing
-
-1. Start test harness
-2. Run test script: `./scripts/test_remote_control.sh`
-3. Observe mode changes in GUI
-4. Verify no crashes or errors
-
-### Automated Testing
-
-1. Add to CI pipeline
-2. Run as part of E2E validation
-3. Test on multiple platforms
-
-### Integration Testing
-
-1. Test with real agent workflows
-2. Validate agent can complete tasks
-3. Measure reliability and timing
-
-## Performance Characteristics
-
-**Click Latency**: < 200ms
- gRPC overhead: ~10ms
- Test queue time: ~50ms
- ImGui event processing: ~100ms
- Total: ~160ms average
-
-**Mode Switch Time**: < 500ms
- Includes UI update
- State transition
- Visual feedback
-
-**Tool Opening**: < 1s
- Window creation
- Content loading
- Layout calculation
-
-## Troubleshooting
-
-### Widget Not Found
-
-**Problem**: "Widget not found: Overworld/Toolset/button:DrawTile"
-
-**Solutions**:
-1. Verify Overworld editor is open (widgets registered on open)
-2. Check widget name spelling
-3. Look at suggestions in error message
-4. Try legacy format: "button:DrawTile"
-
-### Click Not Working
-
-**Problem**: Click succeeds but nothing happens
-
-**Solutions**:
-1. Check if widget is enabled (not grayed out)
-2. Verify correct mode/context for action
-3. Add delay between clicks
-4. Check ImGui event queue
-
-### Test Timeout
-
-**Problem**: "Test timeout - widget not found or unresponsive"
-
-**Solutions**:
-1. Increase timeout (default 5s)
-2. Check if GUI is responsive
-3. Verify widget is visible (not hidden)
-4. Look for modal dialogs blocking interaction
-
-## References
-
-**Documentation**:
- [WIDGET_ID_REFACTORING_PROGRESS.md](WIDGET_ID_REFACTORING_PROGRESS.md)
- [IT-01-QUICKSTART.md](IT-01-QUICKSTART.md)
- [E2E_VALIDATION_GUIDE.md](E2E_VALIDATION_GUIDE.md)
-
-**Code Files**:
- `src/app/core/service/imgui_test_harness_service.cc` - Test harness implementation
- `src/app/gui/widget_id_registry.{h,cc}` - Widget registry
- `src/app/editor/overworld/overworld_editor.cc` - Widget registrations
- `scripts/test_remote_control.sh` - Test script
-
---
-
-**Last Updated**: October 2, 2025, 11:45 PM  
-**Status**: Functional - Basic mode switching works  
-**Next**: Canvas interaction + tile selection
--- a/docs/z3ed/WIDGET_ID_NEXT_ACTIONS.md
+++ b/docs/z3ed/WIDGET_ID_NEXT_ACTIONS.md
@@ -1,357 +0,0 @@
-# Widget ID Refactoring - Next Actions
-
-**Date**: October 2, 2025  
-**Status**: Phase 1 Complete - Testing & Integration Phase  
-**Previous Session**: [SESSION_SUMMARY_OCT2_NIGHT.md](SESSION_SUMMARY_OCT2_NIGHT.md)
-
-## Quick Start - Next Session
-
-### Option 1: Manual Testing (15 minutes) 🎯 RECOMMENDED FIRST
-
-**Goal**: Verify widgets register correctly in running GUI
-
-```bash
-# 1. Launch YAZE
-./build/bin/yaze.app/Contents/MacOS/yaze
-
-# 2. Open a ROM
-# File → Open ROM → assets/zelda3.sfc
-
-# 3. Open Overworld Editor
-# Click "Overworld" button in main window
-
-# 4. Test toolset buttons
-# Click through: Pan, DrawTile, Entrances, etc.
-# Expected: All work normally, no crashes
-
-# 5. Check console output
-# Look for any errors or warnings
-# Widget registrations happen silently
-```
-
-**Success Criteria**:
- ✅ GUI launches without crashes
- ✅ Overworld editor opens normally
- ✅ All toolset buttons clickable
- ✅ No error messages in console
-
---
-
-### Option 2: Add Widget Discovery Command (30 minutes)
-
-**Goal**: Create CLI command to list registered widgets
-
-**File to Edit**: `src/cli/handlers/agent.cc`
-
-**Add New Command**: `z3ed agent discover`
-
-```cpp
-// Add to agent.cc:
-absl::Status HandleDiscoverCommand(const std::vector<std::string>& args) {
-  // Parse --pattern flag (default "*")
-  std::string pattern = "*";
-  for (size_t i = 0; i < args.size(); ++i) {
-    if (args[i] == "--pattern" && i + 1 < args.size()) {
-      pattern = args[++i];
-    }
-  }
-  
-  // Get widget registry
-  auto& registry = gui::WidgetIdRegistry::Instance();
-  auto matches = registry.FindWidgets(pattern);
-  
-  if (matches.empty()) {
-    std::cout << "No widgets found matching pattern: " << pattern << "\n";
-    return absl::NotFoundError("No widgets found");
-  }
-  
-  std::cout << "=== Registered Widgets ===\n\n";
-  std::cout << "Pattern: " << pattern << "\n";
-  std::cout << "Count: " << matches.size() << "\n\n";
-  
-  for (const auto& path : matches) {
-    const auto* info = registry.GetWidgetInfo(path);
-    if (info) {
-      std::cout << path << "\n";
-      std::cout << "  Type: " << info->type << "\n";
-      std::cout << "  ImGui ID: " << info->imgui_id << "\n";
-      if (!info->description.empty()) {
-        std::cout << "  Description: " << info->description << "\n";
-      }
-      std::cout << "\n";
-    }
-  }
-  
-  return absl::OkStatus();
-}
-
-// Add routing in HandleAgentCommand:
-if (subcommand == "discover") {
-  return HandleDiscoverCommand(args);
-}
-```
-
-**Test**:
-```bash
-# Rebuild
-cmake --build build --target z3ed -j8
-
-# Test discovery (will fail - widgets registered at runtime)
-./build/bin/z3ed agent discover
-# Note: This requires YAZE to be running with widgets registered
-# We'll need a different approach - see Option 3
-```
-
---
-
-### Option 3: Widget Export at Shutdown (30 minutes) 🎯 BETTER APPROACH
-
-**Goal**: Export widget catalog when YAZE exits
-
-**File to Edit**: `src/app/editor/editor_manager.cc`
-
-**Add Destructor or Shutdown Method**:
-
-```cpp
-// In editor_manager.cc destructor or Shutdown():
-void EditorManager::Shutdown() {
-  // Export widget catalog for z3ed agent
-  auto& registry = gui::WidgetIdRegistry::Instance();
-  std::string catalog_path = "/tmp/yaze_widgets.yaml";
-  
-  try {
-    registry.ExportCatalogToFile(catalog_path, "yaml");
-    std::cout << "Widget catalog exported to: " << catalog_path << "\n";
-  } catch (const std::exception& e) {
-    std::cerr << "Failed to export widget catalog: " << e.what() << "\n";
-  }
-}
-```
-
-**Test**:
-```bash
-# 1. Rebuild
-cmake --build build --target yaze -j8
-
-# 2. Launch YAZE
-./build/bin/yaze.app/Contents/MacOS/yaze
-
-# 3. Open Overworld editor
-# (registers widgets)
-
-# 4. Quit YAZE
-# File → Quit or Cmd+Q
-
-# 5. Check exported catalog
-cat /tmp/yaze_widgets.yaml
-
-# Expected output:
-# widgets:
-#   - path: "Overworld/Toolset/button:Pan"
-#     type: button
-#     imgui_id: 12345
-#     context:
-#       editor: Overworld
-#       tab: Toolset
-#     ...
-```
-
---
-
-### Option 4: Test Harness Integration (1-2 hours)
-
-**Goal**: Enable test harness to click widgets by hierarchical ID
-
-**Files to Edit**:
-1. `src/app/core/service/imgui_test_harness_service.cc`
-2. `src/app/core/proto/imgui_test_harness.proto` (optional - add DiscoverWidgets RPC)
-
-**Implementation**:
-
-```cpp
-// In imgui_test_harness_service.cc, update Click RPC:
-absl::Status ImGuiTestHarnessServiceImpl::Click(
-    const ClickRequest* request, ClickResponse* response) {
-  
-  const std::string& target = request->target();
-  
-  // Try hierarchical widget ID first
-  auto& registry = gui::WidgetIdRegistry::Instance();
-  ImGuiID widget_id = registry.GetWidgetId(target);
-  
-  if (widget_id != 0) {
-    // Found in registry - use ImGui ID directly
-    std::string test_name = absl::StrFormat("DynamicClick_%s", target);
-    
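-    // Register a one-off test with the ImGui Test Engine. Interaction calls
-    // such as ItemClick() belong in TestFunc; GuiFunc is for drawing test UI.
-    ImGuiTest* dynamic_test = ImGuiTestEngine_RegisterTest(
-        test_manager_->GetEngine(), test_category_.c_str(), test_name.c_str());
-    
-    dynamic_test->TestFunc = [widget_id](ImGuiTestContext* ctx) {
-      ctx->ItemClick(widget_id);
-    };
-    
-    // Queue the test; the engine runs it on a later frame.
-    ImGuiTestEngine_QueueTest(test_manager_->GetEngine(), dynamic_test);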
-    
-    response->set_success(true);
-    response->set_message(absl::StrFormat("Clicked widget: %s", target));
-    return absl::OkStatus();
-  }
-  
-  // Fallback to legacy string-based lookup
-  // ... existing code ...
-  
-  // If not found, suggest alternatives
-  auto matches = registry.FindWidgets("*" + target + "*");
-  if (!matches.empty()) {
-    std::string suggestions = absl::StrJoin(matches, ", ");
-    return absl::NotFoundError(
-        absl::StrFormat("Widget not found: %s. Did you mean: %s?",
-                        target, suggestions));
-  }
-  
-  return absl::NotFoundError(
-      absl::StrFormat("Widget not found: %s", target));
-}
-```
-
-**Test**:
-```bash
-# 1. Rebuild with gRPC
-cmake --build build-grpc-test --target yaze -j8
-
-# 2. Start test harness
-./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
-  --enable_test_harness \
-  --test_harness_port=50052 \
-  --rom_file=assets/zelda3.sfc &
-
-# 3. Open Overworld editor in GUI
-# (registers widgets)
-
-# 4. Test hierarchical click
-grpcurl -plaintext \
-  -import-path src/app/core/proto \
-  -proto imgui_test_harness.proto \
-  -d '{"target":"Overworld/Toolset/button:DrawTile","type":"LEFT"}' \
-  127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
-
-# Expected: Click succeeds, DrawTile mode activated
-```
-
---
-
-## Recommended Sequence
-
-### Tonight (30 minutes)
-1. ✅ **Option 1**: Manual testing - verify no crashes
-2. 📋 **Option 3**: Add widget export at shutdown
-3. 📋 Inspect exported YAML, verify 13 toolset widgets
-
-### Tomorrow Morning (1-2 hours)
-1. 📋 **Option 4**: Test harness integration
-2. 📋 Test clicking widgets via hierarchical IDs
-3. 📋 Update E2E test script with new IDs
-
-### Tomorrow Afternoon (2-3 hours)
-1. 📋 Complete Overworld editor (canvas, properties)
-2. 📋 Add DiscoverWidgets RPC to proto
-3. 📋 Document patterns and best practices
-
---
-
-## Files to Modify Next
-
-### High Priority
-1. `src/app/editor/editor_manager.cc` - Add widget export at shutdown
-2. `src/app/core/service/imgui_test_harness_service.cc` - Registry lookup in Click RPC
-
-### Medium Priority
-3. `src/app/core/proto/imgui_test_harness.proto` - Add DiscoverWidgets RPC
-4. `src/app/editor/overworld/overworld_editor.cc` - Add canvas/properties widgets
-
-### Low Priority
-5. `scripts/test_harness_e2e.sh` - Update with hierarchical IDs
-6. `docs/z3ed/IT-01-QUICKSTART.md` - Add widget ID examples
-
---
-
-## Success Criteria
-
-### Phase 1 (Complete) ✅
- [x] Widget registry in build
- [x] 13 toolset widgets registered
- [x] Clean build
- [x] Documentation updated
-
-### Phase 2 (Current) 🔄
- [ ] Manual testing passes
- [ ] Widget export works
- [ ] Test harness can click by hierarchical ID
- [ ] At least 1 E2E test updated
-
-### Phase 3 (Next) 📋
- [ ] Complete Overworld editor (30+ widgets)
- [ ] DiscoverWidgets RPC working
- [ ] All E2E tests use hierarchical IDs
- [ ] Performance validated (< 1ms overhead)
-
---
-
-## Quick Commands
-
-### Build
-```bash
-# Regular build
-cmake --build build --target yaze -j8
-
-# Test harness build
-cmake --build build-grpc-test --target yaze -j8
-
-# CLI build
-cmake --build build --target z3ed -j8
-```
-
-### Test
-```bash
-# Manual test
-./build/bin/yaze.app/Contents/MacOS/yaze
-
-# Test harness
-./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
-  --enable_test_harness \
-  --test_harness_port=50052 \
-  --rom_file=assets/zelda3.sfc
-```
-
-### Cleanup
-```bash
-# Kill running YAZE instances
-killall yaze
-
-# Clean build
-rm -rf build/CMakeFiles build/bin
-cmake --build build -j8
-```
-
---
-
-## References
-
-**Progress Docs**:
- [WIDGET_ID_REFACTORING_PROGRESS.md](WIDGET_ID_REFACTORING_PROGRESS.md) - Detailed tracker
- [SESSION_SUMMARY_OCT2_NIGHT.md](SESSION_SUMMARY_OCT2_NIGHT.md) - Tonight's work
-
-**Design Docs**:
- [IMGUI_ID_MANAGEMENT_REFACTORING.md](IMGUI_ID_MANAGEMENT_REFACTORING.md) - Complete plan
- [IT-01-QUICKSTART.md](IT-01-QUICKSTART.md) - Test harness guide
-
-**Code References**:
- `src/app/gui/widget_id_registry.{h,cc}` - Registry implementation
- `src/app/editor/overworld/overworld_editor.cc` - Usage example
- `src/app/core/service/imgui_test_harness_service.cc` - Test harness
-
---
-
-**Last Updated**: October 2, 2025, 11:30 PM  
-**Next Action**: Option 1 (Manual Testing) or Option 3 (Widget Export)  
-**Time Estimate**: 15-30 minutes