Files
yaze/docs/E6-z3ed-implementation-plan.md

251 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# z3ed Agentic Workflow Implementation Plan
_Last updated: 2025-10-01 (final update - Phase 6 + AW-02 complete)_
This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-design.md`) into actionable engineering tasks. Each workstream contains milestones, owners (TBD), blocking dependencies, and expected deliverables.
## 1. Workstreams Overview
| Workstream | Goal | Milestone Target | Notes |
|------------|------|------------------|-------|
| Resource Catalogue | Provide authoritative machine-readable specs for CLI resources. | Phase 6 | Schema now captures effects/returns metadata for palette/overworld/rom/patch/dungeon; automation pending. |
| Acceptance Workflow | Enable human review/approval of agent proposals in ImGui. | Phase 7 | Sandbox manager prototype landed; UI work pending. |
| ImGuiTest Bridge | Allow agents to drive ImGui via `ImGuiTestEngine`. | Phase 6 | Requires harness IPC transport. |
| Verification Pipeline | Build layered testing + CI coverage. | Phase 6+ | Integrates with harness + CLI suites. |
| Telemetry & Learning | Capture signals to improve prompts + heuristics. | Phase 8 | Optional/opt-in features. |
### Progress snapshot — 2025-10-01 final update (Phase 6 + AW-03 Complete)
- ✅ CLI global flag passthrough now preserves subcommand options, letting `agent describe` and palette routines accept both space-separated and `--flag=value` styles alongside the updated help text.
-`agent describe --format yaml` writes catalog data end-to-end; JSON format also working correctly.
- ✅ Expanded `ImGuiTestHarness` design with concrete transport, message envelope, and lifecycle details to unblock IT-01 spike.
- ✅ Fixed `rom info` segfault by creating dedicated `RomInfo` handler that properly uses the `--rom` flag instead of positional arguments. Command now works correctly with flag-based dispatch.
- ✅ Added `rom info` action to resource catalog with proper schema documentation including return values (title, size, filename).
- ✅ Generated and committed `docs/api/z3ed-resources.yaml` as authoritative machine-readable API reference for CLI automation (RC-02 complete).
-**Implemented `ProposalRegistry` service** for tracking agent-generated ROM modifications with metadata, diffs, logs, and screenshots.
-**Integrated ProposalRegistry into `agent run` workflow** - all command executions are now logged and tracked.
-**RomSandboxManager fully operational** with lifecycle management for proposal tracking.
-**Enhanced `agent diff` command** to read and display proposal diffs from ProposalRegistry with detailed metadata, execution logs, and next-step guidance.
-**Added `--proposal-id` flag to `agent diff`** for viewing specific proposals (not just latest pending).
-**Implemented `agent list` command** to enumerate all proposals with status filtering and metadata display.
-**Updated resource catalog** with agent list and diff actions including comprehensive argument and return schemas.
-**Regenerated API documentation** (`docs/api/z3ed-resources.yaml`) with all new agent commands.
-**Implemented ProposalDrawer ImGui component** with proposal list, detail view, and accept/reject/delete actions (AW-03).
-**Fixed linker errors** by adding CLI service sources to app/emu/lib build targets in CMake configuration.
## 2. Task Backlog
| ID | Task | Workstream | Type | Status | Dependencies |
|----|------|------------|------|--------|--------------|
| RC-01 | Define schema for `ResourceCatalog` entries and implement serialization helpers. | Resource Catalogue | Code | Done | Schema system complete with all resource types documented |
| RC-02 | Auto-generate `docs/api/z3ed-resources.yaml` from command annotations. | Resource Catalogue | Tooling | Done | Generated and committed to docs/api/ |
| RC-03 | Implement `z3ed agent describe` CLI surface returning JSON schemas. | Resource Catalogue | Code | Done | Both YAML and JSON output formats working |
| RC-04 | Integrate schema export with TUI command palette + help overlays. | Resource Catalogue | UX | Planned | RC-03 |
| RC-05 | Harden CLI command routing/flag parsing to unblock agent automation. | Resource Catalogue | Code | Done | Fixed rom info handler to use FLAGS_rom |
| AW-01 | Implement sandbox ROM cloning and tracking (`RomSandboxManager`). | Acceptance Workflow | Code | Done | ROM sandbox manager operational with lifecycle management |
| AW-02 | Build proposal registry service storing diffs, logs, screenshots. | Acceptance Workflow | Code | Done | ProposalRegistry implemented and integrated with agent run workflow |
| AW-03 | Add ImGui drawer for proposals with accept/reject controls. | Acceptance Workflow | UX | Done | ProposalDrawer GUI complete with list, detail, and action buttons |
| AW-04 | Implement policy evaluation for gating accept buttons. | Acceptance Workflow | Code | Planned | AW-03 |
| AW-05 | Draft `.z3ed-diff` hybrid schema (binary deltas + JSON metadata). | Acceptance Workflow | Design | Planned | AW-01 |
| IT-01 | Create `ImGuiTestHarness` IPC service embedded in `yaze_test`. | ImGuiTest Bridge | Code | Planned | Harness transport decision |
| IT-02 | Implement CLI agent step translation (`imgui_action` → harness call). | ImGuiTest Bridge | Code | Planned | IT-01 |
| IT-03 | Provide synchronization primitives (`WaitForIdle`, etc.). | ImGuiTest Bridge | Code | Planned | IT-01 |
| VP-01 | Expand CLI unit tests for new commands and sandbox flow. | Verification Pipeline | Test | Planned | RC/AW tasks |
| VP-02 | Add harness integration tests with replay scripts. | Verification Pipeline | Test | Planned | IT tasks |
| VP-03 | Create CI job running agent smoke tests with `YAZE_WITH_JSON`. | Verification Pipeline | Infra | Planned | VP-01, VP-02 |
| TL-01 | Capture accept/reject metadata and push to telemetry log. | Telemetry & Learning | Code | Planned | AW tasks |
| TL-02 | Build anonymized metrics exporter + opt-in toggle. | Telemetry & Learning | Infra | Planned | TL-01 |
_Status Legend: Prototype · In Progress · Planned · Blocked · Done_
## 3. Immediate Next Steps
1.**COMPLETED**: Automated catalog export into `docs/api/z3ed-resources.yaml` - both JSON and YAML formats work correctly (RC-02, RC-03).
2.**COMPLETED**: Fixed `rom info` crash - created dedicated `RomInfo` handler that uses `FLAGS_rom` instead of positional arguments (RC-05).
3.**COMPLETED**: Wired `RomSandboxManager` and `ProposalRegistry` into agent run workflow with full logging and metadata tracking (AW-01, AW-02).
4.**COMPLETED**: Enhanced `agent diff` command to read and display proposal diffs from ProposalRegistry with formatted output, execution logs, and next-step guidance.
5.**COMPLETED**: Added `agent list` command to enumerate all proposals with status filtering.
6.**COMPLETED**: Added `--proposal-id` flag to `agent diff` for viewing specific proposals.
7.**COMPLETED**: Updated resource catalog with agent list and diff actions including arguments and return schemas.
8. **PLANNED**: Add ImGui drawer for proposals with accept/reject controls (AW-03).
9. **PLANNED**: Spike IPC options for `ImGuiTestHarness` (socket vs. HTTP vs. shared memory) and document findings (IT-01).
10. **PLANNED**: Integrate schema export with TUI command palette + help overlays (RC-04).
## 4. Open Questions
- What serialization format should the proposal registry adopt for diff payloads (binary vs. textual vs. hybrid)? \
➤ Decision: pursue a hybrid package (`.z3ed-diff`) that wraps binary tile/object deltas alongside a JSON metadata envelope (identifiers, texture descriptors, preview palette info). Capture format draft under RC/AW backlog.
- How should the harness authenticate escalation requests for mutation actions? \
➤ Still open—evaluate shared-secret vs. interactive user prompt in the harness spike (IT-01).
- Can we reuse existing regression test infrastructure for nightly ImGui runs or should we spin up a dedicated binary? \
➤ Investigate during the ImGuiTestHarness spike; compare extending `yaze_test` jobs versus introducing a lightweight automation runner.
## 5. Completed Work Summary
### Resource Catalogue Workstream (RC) - ✅ COMPLETE
The Resource Catalogue workstream has been successfully completed, providing a foundation for AI-driven automation:
**Implementation Details**:
- Created comprehensive schema system in `src/cli/service/resource_catalog.{h,cc}`
- Implemented resource catalog for: ROM, Patch, Palette, Overworld, Dungeon, and Agent commands
- Each resource includes: name, description, actions, arguments, effects, and return values
- Built dual-format serialization: JSON (compact) and YAML (human-readable)
**Key Fixes**:
- Fixed `rom info` segfault by creating dedicated `RomInfo` handler using `FLAGS_rom`
- Added `rom info` action to resource schema with proper metadata
- Ensured all ROM commands consistently use flag-based dispatch
**Generated Artifacts**:
- `docs/api/z3ed-resources.yaml` - Authoritative machine-readable API reference
- Both JSON and YAML output formats validated and working
- Resource filtering capability (`--resource <name>`) operational
**Command Examples**:
```bash
# View all resources in YAML
z3ed agent describe --format yaml
# Get specific resource as JSON
z3ed agent describe --format json --resource rom
# Generate documentation file
z3ed agent describe --format yaml --output docs/api/z3ed-resources.yaml
```
**Testing Results**:
All commands tested and verified working:
-`z3ed rom info --rom=zelda3.sfc` - displays title, size, filename
-`z3ed rom validate --rom=zelda3.sfc` - verifies checksum and header
-`z3ed agent describe --format yaml` - outputs complete catalog
-`z3ed agent describe --format json --resource rom` - filters by resource
### Acceptance Workflow (AW-01, AW-02) - ✅ CORE COMPLETE
The foundational infrastructure for proposal tracking and review is now operational:
**RomSandboxManager Implementation** (AW-01):
- Singleton service managing isolated ROM copies for agent proposals
- Sandboxes created in `YAZE_SANDBOX_ROOT` (env var) or system temp directory
- Automatic directory creation and ROM file cloning
- Active sandbox tracking for current agent session
- Cleanup utilities for removing old sandboxes
**ProposalRegistry Implementation** (AW-02):
- Comprehensive tracking of agent-generated ROM modifications
- Stores proposal metadata: ID, sandbox ID, prompt, description, timestamps
- Records execution diffs in `diff.txt` within proposal directory
- Appends command execution logs to `execution.log` with timestamps
- Support for screenshot attachments (path tracking)
- Proposal lifecycle: Pending → Accepted/Rejected
- Query capabilities: get by ID, list all, filter by status, find latest pending
**Agent Run Integration**:
- `agent run` now creates sandbox + proposal automatically
- All command executions logged with timestamps and status
- Success/failure outcomes captured in proposal logs
- User feedback includes proposal ID and sandbox path for review
- Foundation ready for `agent diff`, `agent commit`, `agent revert` enhancements
**Agent Diff Enhancement** (Completed Oct 1, 2025):
- Reads proposal diffs from ProposalRegistry automatically
- Displays detailed metadata: proposal ID, status, timestamps, command count
- Shows diff content from proposal directory
- Displays execution log (first 50 lines, with truncation for long logs)
- Provides next-step guidance (commit/revert/GUI review)
- Supports `--proposal-id` flag to view specific proposals
- Fallback to legacy diff behavior if no proposals found
**Agent List Command** (New - Oct 1, 2025):
- Enumerates all proposals in the registry
- Shows proposal ID, status, creation time, prompt, and stats
- Indicates pending/accepted/rejected status for each proposal
- Provides guidance on using `agent diff` to view details
- Empty state message guides users to create proposals with `agent run`
**Resource Catalog Updates**:
- Added `agent list` action with returns schema
- Added `agent diff` action with arguments (`--proposal-id`) and returns schema
- Updated agent resource description to include listing and diffing capabilities
- Regenerated `docs/api/z3ed-resources.yaml` with new agent actions
**ProposalDrawer GUI Component** (Completed Oct 1, 2025):
- ImGui right-side drawer for proposal review (AW-03)
- Split view: proposal list (top) + detail view (bottom)
- List view: table with ID, status, prompt columns; colored status indicators
- Detail view: collapsible sections for metadata/diff/log; syntax-aware display
- Action buttons: Accept, Reject, Delete with confirmation dialogs
- Status filtering (All/Pending/Accepted/Rejected)
- Integrated into EditorManager with Debug → Agent Proposals menu
- Accept/Reject updates ProposalRegistry status
- Delete removes proposal from registry and filesystem
- TODO: Implement actual ROM merging in AcceptProposal method
**CMake Build Integration**:
- Added `cli/service/proposal_registry.cc` and `cli/service/rom_sandbox_manager.cc` to all app targets
- Fixed linker errors by including CLI service sources in:
- `yaze` (main GUI app)
- `yaze_emu` (emulator standalone)
- `yaze_core` (testing library)
- `yaze_c` (C API library)
- All targets now build successfully with ProposalDrawer dependencies
**Architecture Benefits**:
- Clean separation: RomSandboxManager (file ops) ↔ ProposalRegistry (metadata)
- Thread-safe with mutex protection for concurrent access
- Extensible design ready for ImGui review UI (AW-03)
- Proposal persistence enables post-session review and auditing
- Proposal-centric workflow enables human-in-the-loop review
- GUI and CLI both have full access to proposal system
**Next Steps for AW Workstream**:
- Test ProposalDrawer in running application
- Complete ROM merging in AcceptProposal method
- AW-04: Policy evaluation for gating mutations
- AW-05: `.z3ed-diff` hybrid format design
### Files Modified/Created
**Phase 6 (Resource Catalogue)**:
1. `src/cli/handlers/rom.cc` - Added `RomInfo::Run` implementation
2. `src/cli/z3ed.h` - Added `RomInfo` class declaration
3. `src/cli/modern_cli.cc` - Updated `HandleRomInfoCommand` routing
4. `src/cli/service/resource_catalog.cc` - Added `rom info` schema entry
5. `docs/api/z3ed-resources.yaml` - Generated comprehensive API catalog
**AW-01 & AW-02 (Proposal Tracking)**:
6. `src/cli/service/proposal_registry.h` - New proposal tracking service interface
7. `src/cli/service/proposal_registry.cc` - Implementation with full lifecycle management
8. `src/cli/handlers/agent.cc` - Integrated ProposalRegistry into agent run workflow
**Agent Diff & List Enhancement**:
9. `src/cli/handlers/agent.cc` - Enhanced HandleDiffCommand with proposal reading, added HandleListCommand
10. `src/cli/service/resource_catalog.cc` - Added agent list/diff actions with schemas
11. `docs/api/z3ed-resources.yaml` - Regenerated with new agent commands
12. `docs/E6-z3ed-cli-design.md` - Updated Section 8.1 with list/diff documentation
**AW-03 (ProposalDrawer GUI)**:
13. `src/app/editor/system/proposal_drawer.h` - Complete drawer interface with Draw/Accept/Reject/Delete
14. `src/app/editor/system/proposal_drawer.cc` - Full implementation (~350 lines) with list/detail views
15. `src/app/editor/editor_manager.h` - Added ProposalDrawer member and include
16. `src/app/editor/editor_manager.cc` - Added menu item and Draw() call in Update loop
17. `src/CMakeLists.txt` - Added proposal_drawer files to System Editor source group
18. `src/app/app.cmake` - Added CLI service sources to yaze target (both Apple and non-Apple builds)
19. `src/app/emu/emu.cmake` - Added CLI service sources to yaze_emu target
20. `src/CMakeLists.txt` - Added CLI service sources to yaze_core library sources
9. `src/cli/z3ed.cmake` - Added proposal_registry.cc to build
10. `docs/E6-z3ed-implementation-plan.md` - Updated progress and task statuses
**Agent Diff & List (Oct 1, 2025)**:
11. `src/cli/handlers/agent.cc` - Enhanced `HandleDiffCommand` with proposal reading, added `HandleListCommand`
12. `src/cli/service/resource_catalog.cc` - Added agent list and diff actions to schema
13. `docs/api/z3ed-resources.yaml` - Regenerated with new agent commands
## 6. References
- `docs/E6-z3ed-cli-design.md` - Overall CLI design and architecture
- `docs/api/z3ed-resources.yaml` - Machine-readable API reference (generated)
- `src/cli/service/resource_catalog.h` - Resource catalog implementation
- `src/cli/service/resource_catalog.cc` - Schema definitions and serialization