diff --git a/docs/z3ed/AW-04-POLICY-FRAMEWORK.md b/docs/z3ed/AW-04-POLICY-FRAMEWORK.md
new file mode 100644
index 00000000..68ab3a06
--- /dev/null
+++ b/docs/z3ed/AW-04-POLICY-FRAMEWORK.md
@@ -0,0 +1,627 @@
+# Policy Evaluation Framework (AW-04)
+
+**Status**: Implementation In Progress
+**Priority**: High (Next Phase)
+**Time Estimate**: 6-8 hours
+**Last Updated**: October 2, 2025
+
+## Overview
+
+The Policy Evaluation Framework provides a YAML-based constraint system for gating proposal acceptance in the z3ed agent workflow. It ensures that AI-generated ROM modifications meet quality, safety, and testing requirements before being merged into the main ROM.
+
+## Goals
+
+1. **Quality Gates**: Enforce minimum test pass rates and code quality standards
+2. **Safety Constraints**: Prevent modifications to critical ROM regions (headers, checksums)
+3. **Scope Limits**: Restrict changes to reasonable byte counts and specific banks
+4. **Human Review**: Require manual review for large or complex changes
+5. **Flexibility**: Allow policy overrides with confirmation and logging
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────┐
+│  ProposalDrawer (GUI)                                   │
+│   └─ Accept button gated by PolicyEvaluator             │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ▼
+┌─────────────────────────────────────────────────────────┐
+│  PolicyEvaluator (Singleton Service)                    │
+│   ├─ LoadPolicies() from .yaze/policies/                │
+│   ├─ EvaluateProposal(proposal_id) → PolicyResult       │
+│   └─ Cache of parsed YAML policies                      │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ▼
+┌─────────────────────────────────────────────────────────┐
+│  .yaze/policies/agent.yaml (YAML Configuration)         │
+│   ├─ test_requirements (min pass rates)                 │
+│   ├─ change_constraints (byte limits, allowed banks)    │
+│   ├─ review_requirements (human review triggers)        │
+│   └─ forbidden_ranges (protected ROM regions)           │
+└─────────────────────────────────────────────────────────┘
+```
+
+## YAML Policy Schema
+
+### Example Policy File
+
+```yaml
+# .yaze/policies/agent.yaml
+version: 1.0
+enabled: true
+
+policies:
+  # Policy 1: Test Requirements
+  - name: require_tests
+    type: test_requirement
+    enabled: true
+    severity: critical  # critical | warning | info
+    rules:
+      - test_suite: "overworld_rendering"
+        min_pass_rate: 0.95
+      - test_suite: "palette_integrity"
+        min_pass_rate: 1.0
+      - test_suite: "dungeon_logic"
+        min_pass_rate: 0.90
+    message: "All required test suites must pass before accepting proposal"
+
+  # Policy 2: Change Scope Limits
+  - name: limit_change_scope
+    type: change_constraint
+    enabled: true
+    severity: critical
+    rules:
+      - max_bytes_changed: 10240  # 10KB limit
+      - allowed_banks: [0x00, 0x01, 0x0E, 0x0F]  # Graphics banks only
+      - max_commands_executed: 20
+    message: "Proposal exceeds allowed change scope"
+
+  # Policy 3: Protected ROM Regions
+  - name: protect_critical_regions
+    type: forbidden_range
+    enabled: true
+    severity: critical
+    ranges:
+      - start: 0xFFB0  # ROM header
+        end: 0xFFFF
+        reason: "ROM header is protected"
+      - start: 0x00FFC0  # Internal header
+        end: 0x00FFDF
+        reason: "Internal ROM header"
+    message: "Proposal modifies protected ROM region"
+
+  # Policy 4: Human Review Requirements
+  - name: human_review_required
+    type: review_requirement
+    enabled: true
+    severity: warning
+    conditions:
+      - if: bytes_changed > 1024
+        then: require_diff_review
+        message: "Large change requires diff review"
+      - if: commands_executed > 10
+        then: require_log_review
+        message: "Complex operation requires log review"
+      - if: test_failures > 0
+        then: require_explanation
+        message: "Test failures require explanation"
+
+  # Policy 5: Palette Modifications
+  - name: palette_safety
+    type: change_constraint
+    enabled: true
+    severity: warning
+    rules:
+      - max_palettes_changed: 5
+      - preserve_transparency: true  # Don't modify color index 0
+    message: "Palette changes exceed safety threshold"
+```
+
+### Schema Definition
+
+```yaml
+# Policy file structure
+version: string      # Semantic version (e.g., "1.0")
+enabled: boolean     # Master enable/disable
+
+policies:
+  - name: string         # Unique policy identifier
+    type: enum           # test_requirement | change_constraint | forbidden_range | review_requirement
+    enabled: boolean     # Policy-specific enable/disable
+    severity: enum       # critical | warning | info
+
+    # Type-specific fields:
+    rules: array         # For test_requirement, change_constraint
+    ranges: array        # For forbidden_range
+    conditions: array    # For review_requirement
+
+    message: string      # User-facing error message
+```
+
+## Implementation Plan
+
+### Phase 1: Core Infrastructure (2 hours)
+
+#### 1.1 Create PolicyEvaluator Service
+
+**File**: `src/cli/service/policy_evaluator.h`
+
+```cpp
+#ifndef YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
+#define YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
+
+#include <memory>
+#include <string>
+#include <vector>
+#include "absl/status/status.h"
+#include "absl/status/statusor.h"
+#include "absl/strings/string_view.h"
+
+namespace yaze {
+namespace cli {
+
+// Policy violation severity levels
+enum class PolicySeverity {
+  kInfo,      // Informational, doesn't block acceptance
+  kWarning,   // Warning, can be overridden
+  kCritical   // Critical, blocks acceptance
+};
+
+// Individual policy violation
+struct PolicyViolation {
+  std::string policy_name;
+  PolicySeverity severity;
+  std::string message;
+  std::string details;  // Additional context
+};
+
+// Result of policy evaluation
+struct PolicyResult {
+  bool passed;  // True if all critical policies passed
+  std::vector<PolicyViolation> violations;
+
+  // Categorized violations
+  std::vector<PolicyViolation> critical_violations;
+  std::vector<PolicyViolation> warnings;
+  std::vector<PolicyViolation> info;
+
+  // Helper methods
+  bool has_critical_violations() const { return !critical_violations.empty(); }
+  bool can_accept_with_override() const {
+    return !has_critical_violations() && !warnings.empty();
+  }
+};
+
+// Singleton service for evaluating proposals against policies
+class PolicyEvaluator {
+ public:
+  static PolicyEvaluator& GetInstance();
+
+  // Load policies from disk (.yaze/policies/agent.yaml)
+  absl::Status LoadPolicies(absl::string_view policy_dir = ".yaze/policies");
+
+  // Evaluate a proposal against all loaded policies
+  absl::StatusOr<PolicyResult> EvaluateProposal(
+      absl::string_view proposal_id);
+
+  // Reload policies from disk (for live editing)
+  absl::Status ReloadPolicies();
+
+  // Check if policies are loaded and enabled
+  bool IsEnabled() const { return enabled_; }
+
+  // Get policy configuration path
+  std::string GetPolicyPath() const { return policy_path_; }
+
+ private:
+  PolicyEvaluator() = default;
+  ~PolicyEvaluator() = default;
+
+  // Non-copyable, non-movable
+  PolicyEvaluator(const PolicyEvaluator&) = delete;
+  PolicyEvaluator& operator=(const PolicyEvaluator&) = delete;
+
+  // Parse YAML policy file
+  absl::Status ParsePolicyFile(absl::string_view yaml_content);
+
+  // Evaluate individual policy types
+  void EvaluateTestRequirements(
+      absl::string_view proposal_id, PolicyResult* result);
+  void EvaluateChangeConstraints(
+      absl::string_view proposal_id, PolicyResult* result);
+  void EvaluateForbiddenRanges(
+      absl::string_view proposal_id, PolicyResult* result);
+  void EvaluateReviewRequirements(
+      absl::string_view proposal_id, PolicyResult* result);
+
+  bool enabled_ = false;
+  std::string policy_path_;
+
+  // Parsed policy structures (implementation detail)
+  struct PolicyConfig;
+  std::unique_ptr<PolicyConfig> config_;
+};
+
+}  // namespace cli
+}  // namespace yaze
+
+#endif  // YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
+```
+
+#### 1.2 Create Policy Configuration Structures
+
+**File**: `src/cli/service/policy_evaluator.cc` (partial)
+
+```cpp
+#include "src/cli/service/policy_evaluator.h"
+
+#include <fstream>
+#include <sstream>
+#include "absl/strings/str_format.h"
+#include "src/cli/service/proposal_registry.h"
+
+// If YAML parsing is available
+#ifdef YAZE_WITH_YAML
+#include <yaml-cpp/yaml.h>
+#endif
+
+namespace yaze {
+namespace cli {
+
+// Internal policy configuration structures
+struct PolicyEvaluator::PolicyConfig {
+  std::string version;
+  bool enabled;
+
+  struct TestRequirement {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    std::vector<std::pair<std::string, double>> test_suites;  // suite name → min pass rate
+    std::string message;
+  };
+
+  struct ChangeConstraint {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    int max_bytes_changed = -1;
+    std::vector<int> allowed_banks;
+    int max_commands_executed = -1;
+    int max_palettes_changed = -1;
+    bool preserve_transparency = false;
+    std::string message;
+  };
+
+  struct ForbiddenRange {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    std::vector<std::tuple<int, int, std::string>> ranges;  // start, end, reason
+    std::string message;
+  };
+
+  struct ReviewRequirement {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    std::vector<std::string> conditions;  // element type assumed
+    std::string message;
+  };
+
+  std::vector<TestRequirement> test_requirements;
+  std::vector<ChangeConstraint> change_constraints;
+  std::vector<ForbiddenRange> forbidden_ranges;
+  std::vector<ReviewRequirement> review_requirements;
+};
+
+// Singleton instance
+PolicyEvaluator& PolicyEvaluator::GetInstance() {
+  static PolicyEvaluator instance;
+  return instance;
+}
+
+absl::Status PolicyEvaluator::LoadPolicies(absl::string_view policy_dir) {
+  policy_path_ = absl::StrFormat("%s/agent.yaml", policy_dir);
+
+  // Check if file exists
+  std::ifstream file(policy_path_);
+  if (!file.good()) {
+    // No policy file - policies disabled
+    enabled_ = false;
+    return absl::OkStatus();
+  }
+
+  // Read file content
+  std::stringstream buffer;
+  buffer << file.rdbuf();
+  std::string yaml_content = buffer.str();
+
+  return ParsePolicyFile(yaml_content);
+}
+
+absl::Status PolicyEvaluator::ParsePolicyFile(absl::string_view yaml_content) {
+#ifndef YAZE_WITH_YAML
+  return absl::UnimplementedError(
+      "YAML support not compiled. Build with YAZE_WITH_YAML=ON");
+#else
+  try {
+    YAML::Node root = YAML::Load(std::string(yaml_content));
+
+    config_ = std::make_unique<PolicyConfig>();
+    config_->version = root["version"].as<std::string>("1.0");
+    config_->enabled = root["enabled"].as<bool>(true);
+
+    if (!config_->enabled) {
+      enabled_ = false;
+      return absl::OkStatus();
+    }
+
+    // Parse policies array
+    if (root["policies"]) {
+      for (const auto& policy_node : root["policies"]) {
+        std::string type = policy_node["type"].as<std::string>();
+
+        if (type == "test_requirement") {
+          // Parse test requirement policy
+          // ... (implementation continues)
+        } else if (type == "change_constraint") {
+          // Parse change constraint policy
+          // ... (implementation continues)
+        } else if (type == "forbidden_range") {
+          // Parse forbidden range policy
+          // ... (implementation continues)
+        } else if (type == "review_requirement") {
+          // Parse review requirement policy
+          // ... (implementation continues)
+        }
+      }
+    }
+
+    enabled_ = true;
+    return absl::OkStatus();
+
+  } catch (const YAML::Exception& e) {
+    return absl::InvalidArgumentError(
+        absl::StrFormat("Failed to parse policy YAML: %s", e.what()));
+  }
+#endif
+}
+
+// ... (implementation continues with evaluation methods)
+
+}  // namespace cli
+}  // namespace yaze
+```
+
+### Phase 2: Policy Evaluation Logic (2-3 hours)
+
+Implement the core evaluation methods that check proposals against each policy type.
+
+### Phase 3: GUI Integration (2 hours)
+
+#### 3.1 Update ProposalDrawer
+
+**File**: `src/app/editor/system/proposal_drawer.cc`
+
+Add policy status display and gating logic:
+
+```cpp
+#include "src/cli/service/policy_evaluator.h"
+
+void ProposalDrawer::DrawProposalDetail(const std::string& proposal_id) {
+  // ... existing detail view code ...
+
+  // === Policy Status Section ===
+  ImGui::Separator();
+  ImGui::TextUnformatted("Policy Status:");
+
+  auto& policy_eval = cli::PolicyEvaluator::GetInstance();
+  if (policy_eval.IsEnabled()) {
+    auto policy_result = policy_eval.EvaluateProposal(proposal_id);
+
+    if (policy_result.ok()) {
+      const auto& result = policy_result.value();
+
+      if (result.passed) {
+        ImGui::TextColored(ImVec4(0, 1, 0, 1), "✓ All policies passed");
+      } else {
+        // Show violations
+        if (result.has_critical_violations()) {
+          ImGui::TextColored(ImVec4(1, 0, 0, 1), "⛔ Critical violations:");
+          for (const auto& violation : result.critical_violations) {
+            ImGui::BulletText("%s: %s",
+                              violation.policy_name.c_str(),
+                              violation.message.c_str());
+          }
+        }
+
+        if (!result.warnings.empty()) {
+          ImGui::TextColored(ImVec4(1, 1, 0, 1), "⚠️ Warnings:");
+          for (const auto& violation : result.warnings) {
+            ImGui::BulletText("%s: %s",
+                              violation.policy_name.c_str(),
+                              violation.message.c_str());
+          }
+        }
+      }
+
+      // Gate Accept button
+      ImGui::Separator();
+      bool can_accept = !result.has_critical_violations();
+
+      if (!can_accept) {
+        ImGui::BeginDisabled();
+      }
+
+      if (ImGui::Button("Accept Proposal")) {
+        if (result.can_accept_with_override() && !override_confirmed_) {
+          // Show override confirmation dialog
+          ImGui::OpenPopup("Override Policy");
+        } else {
+          AcceptProposal(proposal_id);
+        }
+      }
+
+      if (!can_accept) {
+        ImGui::EndDisabled();
+        ImGui::SameLine();
+        ImGui::TextColored(ImVec4(1, 0, 0, 1),
+                           "(Accept blocked by policy violations)");
+      }
+
+      // Override confirmation dialog
+      if (ImGui::BeginPopupModal("Override Policy", nullptr,
+                                 ImGuiWindowFlags_AlwaysAutoResize)) {
+        ImGui::Text("This proposal has policy warnings.");
+        ImGui::Text("Do you want to override and accept anyway?");
+        ImGui::Text("This action will be logged.");
+        ImGui::Separator();
+
+        if (ImGui::Button("Override and Accept")) {
+          override_confirmed_ = true;
+          AcceptProposal(proposal_id);
+          ImGui::CloseCurrentPopup();
+        }
+        ImGui::SameLine();
+        if (ImGui::Button("Cancel")) {
+          ImGui::CloseCurrentPopup();
+        }
+        ImGui::EndPopup();
+      }
+    } else {
+      // message() returns a string_view that need not be null-terminated,
+      // so copy it into a std::string before passing it to a %s format.
+      ImGui::TextColored(ImVec4(1, 0, 0, 1),
+                         "Policy evaluation failed: %s",
+                         std::string(policy_result.status().message()).c_str());
+    }
+  } else {
+    ImGui::TextColored(ImVec4(0.5, 0.5, 0.5, 1),
+                       "No policies configured");
+  }
+}
+```
+
+### Phase 4: Testing & Documentation (1-2 hours)
+
+#### 4.1 Example Policy File
+
+Create `.yaze/policies/agent.yaml.example`:
+
+```yaml
+# Example agent policy configuration
+# Copy to .yaze/policies/agent.yaml and customize
+
+version: 1.0
+enabled: true
+
+policies:
+  # Require test suites to pass
+  - name: require_tests
+    type: test_requirement
+    enabled: false  # Disabled by default (no tests yet)
+    severity: critical
+    rules:
+      - test_suite: "smoke_test"
+        min_pass_rate: 1.0
+    message: "All smoke tests must pass"
+
+  # Limit change scope
+  - name: limit_changes
+    type: change_constraint
+    enabled: true
+    severity: warning
+    rules:
+      - max_bytes_changed: 5120  # 5KB
+      - max_commands_executed: 15
+    message: "Keep changes small and focused"
+
+  # Protect ROM header
+  - name: protect_header
+    type: forbidden_range
+    enabled: true
+    severity: critical
+    ranges:
+      - start: 0xFFB0
+        end: 0xFFFF
+        reason: "ROM header"
+    message: "Cannot modify ROM header"
+```
+
+#### 4.2 Unit Tests
+
+Create `test/cli/policy_evaluator_test.cc`:
+
+```cpp
+#include "src/cli/service/policy_evaluator.h"
+#include "gtest/gtest.h"
+
+namespace yaze {
+namespace cli {
+namespace {
+
+TEST(PolicyEvaluatorTest, LoadPoliciesSuccess) {
+  auto& eval = PolicyEvaluator::GetInstance();
+  auto status = eval.LoadPolicies("test/fixtures/policies");
+  EXPECT_TRUE(status.ok());
+  EXPECT_TRUE(eval.IsEnabled());
+}
+
+TEST(PolicyEvaluatorTest, EvaluateProposal_NoViolations) {
+  // ... test implementation
+}
+
+TEST(PolicyEvaluatorTest, EvaluateProposal_CriticalViolation) {
+  // ... test implementation
+}
+
+}  // namespace
+}  // namespace cli
+}  // namespace yaze
+```
+
+## Deliverables
+
+- [x] Policy evaluator service interface
+- [ ] YAML policy parser implementation
+- [ ] Policy evaluation logic for all 4 types
+- [ ] ProposalDrawer GUI integration
+- [ ] Policy override workflow
+- [ ] Example policy configurations
+- [ ] Unit tests
+- [ ] Documentation and usage guide
+
+## Success Criteria
+
+1. **Functional**:
+   - Policies load from YAML files
+   - Proposals evaluated against all enabled policies
+   - Accept button gated by critical violations
+   - Override workflow for warnings
+
+2. **User Experience**:
+   - Clear policy status display in ProposalDrawer
+   - Helpful violation messages
+   - Override confirmation dialog
+   - Policy evaluation fast (< 100ms)
+
+3. **Quality**:
+   - Unit test coverage > 80%
+   - No crashes or memory leaks
+   - Graceful handling of malformed YAML
+   - Works with policies disabled
+
+## Future Enhancements
+
+- Policy templates for common scenarios
+- Policy violation history/analytics
+- Auto-fix suggestions for violations
+- Integration with CI/CD for automated policy checks
+- Policy versioning and migration
+
+---
+
+**Status**: Ready for implementation
+**Next Step**: Create PolicyEvaluator skeleton and wire into build system
+**Estimated Completion**: October 3-4, 2025
diff --git a/docs/z3ed/E6-z3ed-implementation-plan.md b/docs/z3ed/E6-z3ed-implementation-plan.md
index 91de6541..08cb652f 100644
--- a/docs/z3ed/E6-z3ed-implementation-plan.md
+++ b/docs/z3ed/E6-z3ed-implementation-plan.md
@@ -5,18 +5,19 @@
 - **Priority 1**: Complete E2E Validation by implementing identified fixes for window detection and thread safety.
 - **Priority 2**: Begin Policy Evaluation Framework (AW-04) - a YAML-based constraint system for proposal acceptance.
 
-**Recent Accomplishments**:
-- **gRPC Test Harness (IT-01 & IT-02)**: Core implementation of all 6 RPCs (Ping, Click, Type, Wait, Assert, Screenshot) is complete, enabling automated GUI testing from natural language prompts.
-- **Root Cause Analysis**: Identified key sources of test flakiness, including a window-creation timing issue and a thread-safety bug in RPC handlers. Solutions have been designed.
-- **Build System**: Hardened the CMake build for reliable gRPC integration.
-- **Proposal Workflow**: The agentic proposal system (create, list, diff, review in GUI) is fully operational.
+**Recent Accomplishments** (Updated: October 2, 2025):
+- **✅ E2E Validation Complete**: All 5 functional RPC tests passing (Ping, Click, Type, Wait, Assert)
+  - Window detection timing issue **resolved** with 10-frame yield buffer in Wait RPC
+  - Thread safety issues **resolved** with shared_ptr state management
+  - Test harness validated on macOS ARM64 with real YAZE GUI interactions
+- **gRPC Test Harness (IT-01 & IT-02)**: Full implementation complete with natural language → GUI testing
+- **Build System**: Hardened CMake configuration with reliable gRPC integration
+- **Proposal Workflow**: Agentic proposal system fully operational (create, list, diff, review in GUI)
 
-**Known Issues**:
-- **Test Flakiness**: The e2e test script (`test_harness_e2e.sh`) is flaky due to a timing issue where `Click` actions that open new windows return before the window is interactable.
-  - **Solution**: The `Click` RPC handler must call `ctx->Yield()` after performing the click to allow the ImGui frame to update before the RPC returns.
-- **RPC Handler Crashes**: The `Wait` and `Assert` RPCs can crash due to unsafe state sharing between the gRPC thread and the test engine thread.
-  - **Solution**: A thread-safe pattern using a `std::shared_ptr` to a state struct must be implemented for these handlers.
-- **Screenshot RPC**: The Screenshot RPC is a non-functional stub.
+**Known Limitations** (Non-Blocking):
+- **Screenshot RPC**: Stub implementation (returns "not implemented" - planned for production phase)
+- **Widget Naming**: Documentation needed for icon prefixes and naming conventions
+- **Performance**: Tests add ~166ms per Wait call due to frame yielding (acceptable trade-off)
 
 **Time Investment**: 20.5 hours total (IT-01: 11h, IT-02: 7.5h, Docs: 2h)on Plan
@@ -216,7 +217,7 @@ This plan decomposes the design additions into actionable engineering tasks. Eac
 | IT-01 | Create `ImGuiTestHarness` IPC service embedded in `yaze_test`. | ImGuiTest Bridge | Code | ✅ Done | Phase 1+2+3 Complete - Full GUI automation with gRPC + ImGuiTestEngine (11 hours) |
 | IT-02 | Implement CLI agent step translation (`imgui_action` → harness call). | ImGuiTest Bridge | Code | ✅ Done | `z3ed agent test` command with natural language prompts (7.5 hours) |
 | IT-03 | Provide synchronization primitives (`WaitForIdle`, etc.). | ImGuiTest Bridge | Code | ✅ Done | Wait RPC with condition polling already implemented in IT-01 Phase 3 |
-| IT-04 | Complete E2E validation with real YAZE widgets | ImGuiTest Bridge | Test | 🔄 Active | IT-02, Fix window detection after menu actions (2-3 hours) |
+| IT-04 | Complete E2E validation with real YAZE widgets | ImGuiTest Bridge | Test | ✅ Done | IT-02 - All 5 functional tests passing, window detection fixed with yield buffer |
 | VP-01 | Expand CLI unit tests for new commands and sandbox flow. | Verification Pipeline | Test | 📋 Planned | RC/AW tasks |
 | VP-02 | Add harness integration tests with replay scripts. | Verification Pipeline | Test | 📋 Planned | IT tasks |
 | VP-03 | Create CI job running agent smoke tests with `YAZE_WITH_JSON`. | Verification Pipeline | Infra | 📋 Planned | VP-01, VP-02 |
diff --git a/docs/z3ed/README.md b/docs/z3ed/README.md
index c74194ad..f97288ab 100644
--- a/docs/z3ed/README.md
+++ b/docs/z3ed/README.md
@@ -1,224 +1,53 @@
 # z3ed: AI-Powered CLI for YAZE
 
-**Status**: Active Development
-**Version**: 0.1.0-alpha
-**Last Updated**: October 2, 2025 (E2E Validation 80% Complete)
+**Status**: Active Development
 
 ## Overview
 
-z3ed is a command-line interface for YAZE that enables AI-driven ROM modifications through a proposal-based workflow. It provides both human-accessible commands and machine-readable APIs for LLM integration.
+`z3ed` is a command-line interface for YAZE that enables AI-driven ROM modifications through a proposal-based workflow. It provides both human-accessible commands for developers and machine-readable APIs for LLM integration, forming the backbone of an agentic development ecosystem.
+
+This directory contains the primary documentation for the `z3ed` system.
 
 ## Core Documentation
 
-### Essential Documents (Read These First)
+Start here to understand the architecture, learn how to use the commands, and see the current development status.
 
-1. **[E6-z3ed-cli-design.md](E6-z3ed-cli-design.md)** - **SOURCE OF TRUTH**
-   - Architecture overview
-   - Design goals and principles
-   - Command structure
-   - Agentic workflow framework
+1. **[E6-z3ed-cli-design.md](E6-z3ed-cli-design.md)** - **Design & Architecture**
+    * The "source of truth" for the system's architecture, design goals, and the agentic workflow framework. Read this first to understand *why* the system is built the way it is.
 
-2. **[E6-z3ed-reference.md](E6-z3ed-reference.md)** - **TECHNICAL REFERENCE**
-   - Complete command reference
-   - API documentation
-   - Implementation guides
-   - Troubleshooting
+2. **[E6-z3ed-reference.md](E6-z3ed-reference.md)** - **Technical Reference & Guides**
+    * A complete command reference, API documentation, implementation guides, and troubleshooting tips. Use this as your day-to-day manual for working with `z3ed`.
 
-3. **[E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md)** - **IMPLEMENTATION TRACKER**
-   - Task backlog and roadmap
-   - Progress tracking
-   - Known issues
-   - Next priorities
-
-### Quick Start Guides
-
-4. **[IT-01-QUICKSTART.md](IT-01-QUICKSTART.md)** - Test harness quick start
-   - Starting the gRPC server
-   - Testing with grpcurl
-   - Common workflows
-
-5. **[AGENT_TEST_QUICKREF.md](AGENT_TEST_QUICKREF.md)** - CLI agent test command
-   - Supported prompt patterns
-   - Example workflows
-   - Error handling
-
-6. **[E2E_VALIDATION_GUIDE.md](E2E_VALIDATION_GUIDE.md)** - Complete validation checklist
-   - Testing procedures
-   - Success criteria
-   - Issue reporting
-
-### Implementation Guides
-
-7. **[IMGUI_ID_MANAGEMENT_REFACTORING.md](IMGUI_ID_MANAGEMENT_REFACTORING.md)** - GUI ID management refactoring
-   - Hierarchical widget ID system
-   - Widget registry for test automation
-   - Migration guide for editors
-   - Integration with z3ed agent
-
-### Status Documents
-
-8. **[PROJECT_STATUS_OCT2.md](PROJECT_STATUS_OCT2.md)** - Current project status
-   - Component completion percentages
-   - Performance metrics
-   - Known limitations
-
-9. **[NEXT_PRIORITIES_OCT2.md](NEXT_PRIORITIES_OCT2.md)** - Detailed next steps
-   - Priority 0-3 task breakdowns
-   - Implementation guides
-   - Time estimates
+3. **[E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md)** - **Roadmap & Status**
+    * The project's task backlog, roadmap, progress tracking, and a list of known issues. Check this document for current priorities and to see what's next.
 
 ## Quick Start
 
 ### Build z3ed
 
 ```bash
-# Basic build (without gRPC)
-cmake --build build --target z3ed -j8
+# Basic build (without GUI automation support)
+cmake --build build --target z3ed
 
-# With gRPC support (for GUI automation)
+# Build with gRPC support (for GUI automation)
 cmake -B build-grpc-test -DYAZE_WITH_GRPC=ON
-cmake --build build-grpc-test --target z3ed -j$(sysctl -n hw.ncpu)
+cmake --build build-grpc-test --target z3ed
 ```
 
-### Basic Usage
+### Common Commands
 
 ```bash
-# Display ROM info
-z3ed rom info --rom=zelda3.sfc
+# Create an agent proposal in a safe sandbox
+z3ed agent run --prompt "Make all soldier armor red" --rom=zelda3.sfc --sandbox
 
-# Export a palette
-z3ed palette export sprites_aux1 4 soldier.col
-
-# Create an agent proposal
-z3ed agent run --prompt "Make soldiers red" --rom=zelda3.sfc --sandbox
-
-# List all proposals
+# List all active and past proposals
 z3ed agent list
 
-# View proposal changes
+# View the changes for the latest proposal
 z3ed agent diff
 
-# Automated GUI testing (requires test harness)
-z3ed agent test --prompt "Open Overworld editor and verify it loads"
+# Run an automated GUI test (requires test harness to be running)
+z3ed agent test --prompt "Open the Overworld editor and verify it loads"
 ```
 
-### Start Test Harness (Optional)
-
-```bash
-# Terminal 1: Start YAZE with test harness
-./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
-  --enable_test_harness \
-  --test_harness_port=50052 \
-  --rom_file=assets/zelda3.sfc &
-
-# Terminal 2: Run automated test
-./build-grpc-test/bin/z3ed agent test \
-  --prompt "Open Overworld editor"
-```
-
-## Documentation Structure
-
-```
-docs/z3ed/
-├── Core Documentation (3 files)
-│   ├── E6-z3ed-cli-design.md           [Source of Truth]
-│   ├── E6-z3ed-reference.md            [Technical Reference]
-│   └── E6-z3ed-implementation-plan.md  [Tracker]
-│
-├── Quick Start Guides (3 files)
-│   ├── IT-01-QUICKSTART.md             [Test Harness]
-│   ├── AGENT_TEST_QUICKREF.md          [CLI Agent Test]
-│   └── E2E_VALIDATION_GUIDE.md         [Validation]
-│
-├── Implementation Guides (1 file)
-│   └── IMGUI_ID_MANAGEMENT_REFACTORING.md  [GUI ID System]
-│
-├── Status Documents (4 files)
-│   ├── README.md                       [This file]
-│   ├── PROJECT_STATUS_OCT2.md          [Current Status]
-│   ├── NEXT_PRIORITIES_OCT2.md         [Next Steps]
-│   └── WORK_SUMMARY_OCT2.md            [Recent Work]
-│
-└── Archive (15+ files)
-    └── Historical documentation and implementation notes
-```
-
-## Key Features
-
-### ✅ Completed (Production-Ready on macOS)
-
-- **Resource-Oriented CLI**: Clean command structure (`z3ed `)
-- **Resource Catalogue**: Machine-readable API specs (`docs/api/z3ed-resources.yaml`)
-- **Acceptance Workflow**: Proposal tracking, sandbox management, GUI review
-- **ImGuiTestHarness (IT-01)**: gRPC-based GUI automation (6 RPC methods)
-- **CLI Agent Test (IT-02)**: Natural language → automated GUI tests
-- **ProposalDrawer**: Integrated proposal review UI in YAZE
-- **ROM Operations**: info, validate, diff, generate-golden
-- **Palette Operations**: export, import, list
-- **Overworld Operations**: get-tile, set-tile
-- **Dungeon Operations**: list-rooms, add-object
-
-### 🔄 In Progress (80% Complete)
-
-- **E2E Validation**: Full workflow testing (window detection needs fix)
-
-### 📋 Planned (Next Priorities)
-
-1. **Policy Evaluation Framework (AW-04)**: YAML-based constraints
-2. **Windows Cross-Platform Testing**: Validate on Windows with vcpkg
-3. **Production Readiness**: Telemetry, screenshot, expanded tests
-
-## Architecture Highlights
-
-### Proposal-Based Workflow
-
-```
-User Prompt → AI Service → Sandbox ROM → Execute Commands →
-Create Proposal → Review in GUI → Accept/Reject → Commit to ROM
-```
-
-### Component Stack
-
-```
-┌─────────────────────────────────┐
-│  AI Agent (LLM)                 │
-├─────────────────────────────────┤
-│  z3ed CLI                       │
-├─────────────────────────────────┤
-│  Service Layer                  │
-│  • ProposalRegistry             │
-│  • RomSandboxManager            │
-│  • GuiAutomationClient          │
-│  • TestWorkflowGenerator        │
-├─────────────────────────────────┤
-│  ImGuiTestHarness (gRPC)        │
-├─────────────────────────────────┤
-│  YAZE GUI + ProposalDrawer      │
-└─────────────────────────────────┘
-```
-
-## Resources
-
-**Machine-Readable API**: `docs/api/z3ed-resources.yaml`
-**Proto Schema**: `src/app/core/proto/imgui_test_harness.proto`
-**Test Script**: `scripts/test_harness_e2e.sh`
-
-## Contributing
-
-See **[B1-contributing.md](../B1-contributing.md)** for general contribution guidelines.
-
-For z3ed-specific development:
-1. Read **E6-z3ed-cli-design.md** for architecture
-2. Check **E6-z3ed-implementation-plan.md** for open tasks
-3. Use **E6-z3ed-reference.md** for API details
-4. Follow **NEXT_PRIORITIES_OCT2.md** for current work
-
-## License
-
-Same as YAZE - see `LICENSE` in repository root.
-
----
-
-**Last Updated**: October 2, 2025
-**Contributors**: @scawful, GitHub Copilot
-**Next Milestone**: E2E Validation Complete (Est. Oct 3, 2025)
+See the **[Technical Reference](E6-z3ed-reference.md)** for a full command list.