Files
yaze/docs/E6-z3ed-implementation-plan.md
scawful 02c6985201 feat: Enhance ROM loading options and proposal management
- Introduced `RomLoadOptions` struct to manage various loading configurations for ROM files, including options for stripping headers, populating metadata, and loading Zelda 3 content.
- Updated `Rom::LoadFromFile` and `Rom::LoadFromData` methods to accept `RomLoadOptions`, allowing for more flexible ROM loading behavior.
- Implemented `MaybeStripSmcHeader` function to conditionally remove SMC headers from ROM data.
- Added new command handler `RomInfo` to display basic ROM information, including title and size.
- Created `ProposalRegistry` class to manage agent-generated proposals, including creation, logging, and status updates.
- Enhanced CLI commands to support proposal listing and detailed diff viewing, improving user interaction with agent-generated modifications.
- Updated resource catalog to include new actions for ROM info and agent proposal management.
2025-10-01 18:18:48 -04:00

210 lines
14 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# z3ed Agentic Workflow Implementation Plan
_Last updated: 2025-10-01 (final update - Phase 6 + AW-02 complete)_
This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-design.md`) into actionable engineering tasks. Each workstream contains milestones, owners (TBD), blocking dependencies, and expected deliverables.
## 1. Workstreams Overview
| Workstream | Goal | Milestone Target | Notes |
|------------|------|------------------|-------|
| Resource Catalogue | Provide authoritative machine-readable specs for CLI resources. | Phase 6 | Schema now captures effects/returns metadata for palette/overworld/rom/patch/dungeon; automation pending. |
| Acceptance Workflow | Enable human review/approval of agent proposals in ImGui. | Phase 7 | Sandbox manager prototype landed; UI work pending. |
| ImGuiTest Bridge | Allow agents to drive ImGui via `ImGuiTestEngine`. | Phase 6 | Requires harness IPC transport. |
| Verification Pipeline | Build layered testing + CI coverage. | Phase 6+ | Integrates with harness + CLI suites. |
| Telemetry & Learning | Capture signals to improve prompts + heuristics. | Phase 8 | Optional/opt-in features. |
### Progress snapshot — 2025-10-01 final update (Phase 6 + Agent Diff & List Complete)
- ✅ CLI global flag passthrough now preserves subcommand options, letting `agent describe` and palette routines accept both space-separated and `--flag=value` styles alongside the updated help text.
-`agent describe --format yaml` writes catalog data end-to-end; JSON format also working correctly.
- ✅ Expanded `ImGuiTestHarness` design with concrete transport, message envelope, and lifecycle details to unblock IT-01 spike.
- ✅ Fixed `rom info` segfault by creating dedicated `RomInfo` handler that properly uses the `--rom` flag instead of positional arguments. Command now works correctly with flag-based dispatch.
- ✅ Added `rom info` action to resource catalog with proper schema documentation including return values (title, size, filename).
- ✅ Generated and committed `docs/api/z3ed-resources.yaml` as authoritative machine-readable API reference for CLI automation (RC-02 complete).
-**Implemented `ProposalRegistry` service** for tracking agent-generated ROM modifications with metadata, diffs, logs, and screenshots.
-**Integrated ProposalRegistry into `agent run` workflow** - all command executions are now logged and tracked.
-**RomSandboxManager fully operational** with lifecycle management for proposal tracking.
-**Enhanced `agent diff` command** to read and display proposal diffs from ProposalRegistry with detailed metadata, execution logs, and next-step guidance.
-**Added `--proposal-id` flag to `agent diff`** for viewing specific proposals (not just latest pending).
-**Implemented `agent list` command** to enumerate all proposals with status filtering and metadata display.
-**Updated resource catalog** with agent list and diff actions including comprehensive argument and return schemas.
-**Regenerated API documentation** (`docs/api/z3ed-resources.yaml`) with all new agent commands.
## 2. Task Backlog
| ID | Task | Workstream | Type | Status | Dependencies |
|----|------|------------|------|--------|--------------|
| RC-01 | Define schema for `ResourceCatalog` entries and implement serialization helpers. | Resource Catalogue | Code | Done | Schema system complete with all resource types documented |
| RC-02 | Auto-generate `docs/api/z3ed-resources.yaml` from command annotations. | Resource Catalogue | Tooling | Done | Generated and committed to docs/api/ |
| RC-03 | Implement `z3ed agent describe` CLI surface returning JSON schemas. | Resource Catalogue | Code | Done | Both YAML and JSON output formats working |
| RC-04 | Integrate schema export with TUI command palette + help overlays. | Resource Catalogue | UX | Planned | RC-03 |
| RC-05 | Harden CLI command routing/flag parsing to unblock agent automation. | Resource Catalogue | Code | Done | Fixed rom info handler to use FLAGS_rom |
| AW-01 | Implement sandbox ROM cloning and tracking (`RomSandboxManager`). | Acceptance Workflow | Code | Done | ROM sandbox manager operational with lifecycle management |
| AW-02 | Build proposal registry service storing diffs, logs, screenshots. | Acceptance Workflow | Code | Done | ProposalRegistry implemented and integrated with agent run workflow |
| AW-03 | Add ImGui drawer for proposals with accept/reject controls. | Acceptance Workflow | UX | Planned | AW-02 |
| AW-04 | Implement policy evaluation for gating accept buttons. | Acceptance Workflow | Code | Planned | AW-03 |
| AW-05 | Draft `.z3ed-diff` hybrid schema (binary deltas + JSON metadata). | Acceptance Workflow | Design | Planned | AW-01 |
| IT-01 | Create `ImGuiTestHarness` IPC service embedded in `yaze_test`. | ImGuiTest Bridge | Code | Planned | Harness transport decision |
| IT-02 | Implement CLI agent step translation (`imgui_action` → harness call). | ImGuiTest Bridge | Code | Planned | IT-01 |
| IT-03 | Provide synchronization primitives (`WaitForIdle`, etc.). | ImGuiTest Bridge | Code | Planned | IT-01 |
| VP-01 | Expand CLI unit tests for new commands and sandbox flow. | Verification Pipeline | Test | Planned | RC/AW tasks |
| VP-02 | Add harness integration tests with replay scripts. | Verification Pipeline | Test | Planned | IT tasks |
| VP-03 | Create CI job running agent smoke tests with `YAZE_WITH_JSON`. | Verification Pipeline | Infra | Planned | VP-01, VP-02 |
| TL-01 | Capture accept/reject metadata and push to telemetry log. | Telemetry & Learning | Code | Planned | AW tasks |
| TL-02 | Build anonymized metrics exporter + opt-in toggle. | Telemetry & Learning | Infra | Planned | TL-01 |
_Status Legend: Prototype · In Progress · Planned · Blocked · Done_
## 3. Immediate Next Steps
1.**COMPLETED**: Automated catalog export into `docs/api/z3ed-resources.yaml` - both JSON and YAML formats work correctly (RC-02, RC-03).
2.**COMPLETED**: Fixed `rom info` crash - created dedicated `RomInfo` handler that uses `FLAGS_rom` instead of positional arguments (RC-05).
3.**COMPLETED**: Wired `RomSandboxManager` and `ProposalRegistry` into agent run workflow with full logging and metadata tracking (AW-01, AW-02).
4.**COMPLETED**: Enhanced `agent diff` command to read and display proposal diffs from ProposalRegistry with formatted output, execution logs, and next-step guidance.
5.**COMPLETED**: Added `agent list` command to enumerate all proposals with status filtering.
6.**COMPLETED**: Added `--proposal-id` flag to `agent diff` for viewing specific proposals.
7.**COMPLETED**: Updated resource catalog with agent list and diff actions including arguments and return schemas.
8. **PLANNED**: Add ImGui drawer for proposals with accept/reject controls (AW-03).
9. **PLANNED**: Spike IPC options for `ImGuiTestHarness` (socket vs. HTTP vs. shared memory) and document findings (IT-01).
10. **PLANNED**: Integrate schema export with TUI command palette + help overlays (RC-04).
## 4. Open Questions
- What serialization format should the proposal registry adopt for diff payloads (binary vs. textual vs. hybrid)? \
➤ Decision: pursue a hybrid package (`.z3ed-diff`) that wraps binary tile/object deltas alongside a JSON metadata envelope (identifiers, texture descriptors, preview palette info). Capture format draft under RC/AW backlog.
- How should the harness authenticate escalation requests for mutation actions? \
➤ Still open—evaluate shared-secret vs. interactive user prompt in the harness spike (IT-01).
- Can we reuse existing regression test infrastructure for nightly ImGui runs or should we spin up a dedicated binary? \
➤ Investigate during the ImGuiTestHarness spike; compare extending `yaze_test` jobs versus introducing a lightweight automation runner.
## 5. Completed Work Summary
### Resource Catalogue Workstream (RC) - ✅ COMPLETE
The Resource Catalogue workstream has been successfully completed, providing a foundation for AI-driven automation:
**Implementation Details**:
- Created comprehensive schema system in `src/cli/service/resource_catalog.{h,cc}`
- Implemented resource catalog for: ROM, Patch, Palette, Overworld, Dungeon, and Agent commands
- Each resource includes: name, description, actions, arguments, effects, and return values
- Built dual-format serialization: JSON (compact) and YAML (human-readable)
**Key Fixes**:
- Fixed `rom info` segfault by creating dedicated `RomInfo` handler using `FLAGS_rom`
- Added `rom info` action to resource schema with proper metadata
- Ensured all ROM commands consistently use flag-based dispatch
**Generated Artifacts**:
- `docs/api/z3ed-resources.yaml` - Authoritative machine-readable API reference
- Both JSON and YAML output formats validated and working
- Resource filtering capability (`--resource <name>`) operational
**Command Examples**:
```bash
# View all resources in YAML
z3ed agent describe --format yaml
# Get specific resource as JSON
z3ed agent describe --format json --resource rom
# Generate documentation file
z3ed agent describe --format yaml --output docs/api/z3ed-resources.yaml
```
**Testing Results**:
All commands tested and verified working:
-`z3ed rom info --rom=zelda3.sfc` - displays title, size, filename
-`z3ed rom validate --rom=zelda3.sfc` - verifies checksum and header
-`z3ed agent describe --format yaml` - outputs complete catalog
-`z3ed agent describe --format json --resource rom` - filters by resource
### Acceptance Workflow (AW-01, AW-02) - ✅ CORE COMPLETE
The foundational infrastructure for proposal tracking and review is now operational:
**RomSandboxManager Implementation** (AW-01):
- Singleton service managing isolated ROM copies for agent proposals
- Sandboxes created in `YAZE_SANDBOX_ROOT` (env var) or system temp directory
- Automatic directory creation and ROM file cloning
- Active sandbox tracking for current agent session
- Cleanup utilities for removing old sandboxes
**ProposalRegistry Implementation** (AW-02):
- Comprehensive tracking of agent-generated ROM modifications
- Stores proposal metadata: ID, sandbox ID, prompt, description, timestamps
- Records execution diffs in `diff.txt` within proposal directory
- Appends command execution logs to `execution.log` with timestamps
- Support for screenshot attachments (path tracking)
- Proposal lifecycle: Pending → Accepted/Rejected
- Query capabilities: get by ID, list all, filter by status, find latest pending
**Agent Run Integration**:
- `agent run` now creates sandbox + proposal automatically
- All command executions logged with timestamps and status
- Success/failure outcomes captured in proposal logs
- User feedback includes proposal ID and sandbox path for review
- Foundation ready for `agent diff`, `agent commit`, `agent revert` enhancements
**Agent Diff Enhancement** (Completed Oct 1, 2025):
- Reads proposal diffs from ProposalRegistry automatically
- Displays detailed metadata: proposal ID, status, timestamps, command count
- Shows diff content from proposal directory
- Displays execution log (first 50 lines, with truncation for long logs)
- Provides next-step guidance (commit/revert/GUI review)
- Supports `--proposal-id` flag to view specific proposals
- Fallback to legacy diff behavior if no proposals found
**Agent List Command** (New - Oct 1, 2025):
- Enumerates all proposals in the registry
- Shows proposal ID, status, creation time, prompt, and stats
- Indicates pending/accepted/rejected status for each proposal
- Provides guidance on using `agent diff` to view details
- Empty state message guides users to create proposals with `agent run`
**Resource Catalog Updates**:
- Added `agent list` action with returns schema
- Added `agent diff` action with arguments (`--proposal-id`) and returns schema
- Updated agent resource description to include listing and diffing capabilities
- Regenerated `docs/api/z3ed-resources.yaml` with new agent actions
**Architecture Benefits**:
- Clean separation: RomSandboxManager (file ops) ↔ ProposalRegistry (metadata)
- Thread-safe with mutex protection for concurrent access
- Extensible design ready for ImGui review UI (AW-03)
- Proposal persistence enables post-session review and auditing
- Proposal-centric workflow enables human-in-the-loop review
**Next Steps for AW Workstream**:
- AW-03: ImGui drawer with accept/reject controls
- AW-04: Policy evaluation for gating mutations
- AW-05: `.z3ed-diff` hybrid format design
### Files Modified/Created
**Phase 6 (Resource Catalogue)**:
1. `src/cli/handlers/rom.cc` - Added `RomInfo::Run` implementation
2. `src/cli/z3ed.h` - Added `RomInfo` class declaration
3. `src/cli/modern_cli.cc` - Updated `HandleRomInfoCommand` routing
4. `src/cli/service/resource_catalog.cc` - Added `rom info` schema entry
5. `docs/api/z3ed-resources.yaml` - Generated comprehensive API catalog
**AW-01 & AW-02 (Proposal Tracking)**:
6. `src/cli/service/proposal_registry.h` - New proposal tracking service interface
7. `src/cli/service/proposal_registry.cc` - Implementation with full lifecycle management
8. `src/cli/handlers/agent.cc` - Integrated ProposalRegistry into agent run workflow
9. `src/cli/z3ed.cmake` - Added proposal_registry.cc to build
10. `docs/E6-z3ed-implementation-plan.md` - Updated progress and task statuses
**Agent Diff & List (Oct 1, 2025)**:
11. `src/cli/handlers/agent.cc` - Enhanced `HandleDiffCommand` with proposal reading, added `HandleListCommand`
12. `src/cli/service/resource_catalog.cc` - Added agent list and diff actions to schema
13. `docs/api/z3ed-resources.yaml` - Regenerated with new agent commands
## 6. References
- `docs/E6-z3ed-cli-design.md` - Overall CLI design and architecture
- `docs/api/z3ed-resources.yaml` - Machine-readable API reference (generated)
- `src/cli/service/resource_catalog.h` - Resource catalog implementation
- `src/cli/service/resource_catalog.cc` - Schema definitions and serialization