From 18eff96e61b1c0fad1562a231caf64de937e7c5e Mon Sep 17 00:00:00 2001 From: scawful Date: Fri, 3 Oct 2025 10:06:31 -0400 Subject: [PATCH] Add z3ed Agent Roadmap Document - Introduced a new `AGENT-ROADMAP.md` file outlining the strategic vision and implementation plan for the `z3ed` AI agent. - Defined the core vision of transitioning to a conversational ROM hacking assistant with key features such as an interactive chat interface, ROM introspection, and contextual awareness. - Detailed the technical implementation plan, including the development of a `ConversationalAgentService`, read-only tools for the agent, and user-facing TUI/GUI chat interfaces. - Consolidated immediate priorities, short-term goals, and long-term vision for the agent's development. This commit establishes a comprehensive roadmap for enhancing the z3ed agent's capabilities, paving the way for future AI-driven features and user interactions. --- docs/z3ed/AGENT-ROADMAP.md | 99 ++ docs/z3ed/AGENTIC-PLAN-STATUS.md | 374 -------- docs/z3ed/IT-05-IMPLEMENTATION-GUIDE.md | 509 ---------- docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md | 830 ---------------- docs/z3ed/IT-10-COLLABORATIVE-EDITING.md | 1011 -------------------- docs/z3ed/LLM-IMPLEMENTATION-CHECKLIST.md | 298 ------ docs/z3ed/LLM-INTEGRATION-PLAN.md | 1048 --------------------- docs/z3ed/OVERWORLD-DUNGEON-AI-PLAN.md | 477 ---------- docs/z3ed/QUICK-START-GEMINI.md | 307 ------ docs/z3ed/QUICK_REFERENCE.md | 490 ---------- docs/z3ed/README.md | 90 +- docs/z3ed/SSL-AND-COLLABORATIVE-PLAN.md | 239 ----- 12 files changed, 130 insertions(+), 5642 deletions(-) create mode 100644 docs/z3ed/AGENT-ROADMAP.md delete mode 100644 docs/z3ed/AGENTIC-PLAN-STATUS.md delete mode 100644 docs/z3ed/IT-05-IMPLEMENTATION-GUIDE.md delete mode 100644 docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md delete mode 100644 docs/z3ed/IT-10-COLLABORATIVE-EDITING.md delete mode 100644 docs/z3ed/LLM-IMPLEMENTATION-CHECKLIST.md delete mode 100644 docs/z3ed/LLM-INTEGRATION-PLAN.md delete 
mode 100644 docs/z3ed/OVERWORLD-DUNGEON-AI-PLAN.md delete mode 100644 docs/z3ed/QUICK-START-GEMINI.md delete mode 100644 docs/z3ed/QUICK_REFERENCE.md delete mode 100644 docs/z3ed/SSL-AND-COLLABORATIVE-PLAN.md diff --git a/docs/z3ed/AGENT-ROADMAP.md b/docs/z3ed/AGENT-ROADMAP.md new file mode 100644 index 00000000..89669451 --- /dev/null +++ b/docs/z3ed/AGENT-ROADMAP.md @@ -0,0 +1,99 @@ +# z3ed Agent Roadmap + +**Latest Update**: October 3, 2025 + +This document outlines the strategic vision and concrete next steps for the `z3ed` AI agent, focusing on a transition from a command-line tool to a fully interactive, conversational assistant for ROM hacking. + +## Core Vision: The Conversational ROM Hacking Assistant + +The next evolution of the `z3ed` agent is to create a chat-like interface where users can interact with the AI in a more natural and exploratory way. Instead of just issuing a single command, users will be able to have a dialogue with the agent to inspect the ROM, ask questions, and iteratively build up a set of changes. + +This vision will be realized through a shared interface available in both the `z3ed` TUI and the main `yaze` GUI application. + +### Key Features +1. **Interactive Chat Interface**: A familiar chat window for conversing with the agent. +2. **ROM Introspection**: The agent will be able to answer questions about the ROM, such as "What dungeons are defined in this project?" or "How many soldiers are in the Hyrule Castle throne room?". +3. **Contextual Awareness**: The agent will maintain the context of the conversation, allowing for follow-up questions and commands. +4. **Seamless Transition to Action**: When the user is ready to make a change, the agent will use the conversation history to generate a comprehensive proposal for editing the ROM. +5. **Shared Experience**: The same conversational agent will be accessible from both the terminal and the graphical user interface, providing a consistent experience. 
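The contextual-awareness feature listed above amounts to keeping a bounded window of recent turns and replaying it with each new prompt. The following is a minimal sketch under stated assumptions, not the project's implementation: `ChatHistory`, `ChatMessage`, `Sender`, and `BuildContext` are hypothetical names introduced only for illustration and do not come from the `z3ed` source tree.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical types for illustration only; not taken from the z3ed sources.
enum class Sender { kUser, kAgent };

struct ChatMessage {
  Sender sender;
  std::string text;
};

// Keeps a bounded window of recent turns so that follow-up questions can be
// answered with the earlier conversation still available as context.
class ChatHistory {
 public:
  explicit ChatHistory(std::size_t max_messages)
      : max_messages_(max_messages) {}

  void Add(Sender sender, std::string text) {
    messages_.push_back({sender, std::move(text)});
    // Drop the oldest turns once the window is full.
    while (messages_.size() > max_messages_) {
      messages_.pop_front();
    }
  }

  // Flattens the retained turns into a transcript that could be prepended to
  // the next prompt sent to an AI backend.
  std::string BuildContext() const {
    std::string out;
    for (const ChatMessage& m : messages_) {
      out += (m.sender == Sender::kUser) ? "User: " : "Agent: ";
      out += m.text;
      out += '\n';
    }
    return out;
  }

  std::size_t size() const { return messages_.size(); }

 private:
  std::size_t max_messages_;
  std::deque<ChatMessage> messages_;
};
```

A fixed message count is the simplest possible bound; a real service would more likely trim by token budget, which this sketch does not attempt.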
+ +## Technical Implementation Plan + +### 1. Conversational Agent Service +- **Description**: A new service that will manage the back-and-forth between the user and the LLM. It will maintain chat history and orchestrate the agent's different modes (Q&A vs. command generation). +- **Components**: + - `ConversationalAgentService`: The main class for managing the chat session. + - Integration with existing `AIService` implementations (Ollama, Gemini). +- **Status**: Not started. + +### 2. Read-Only "Tools" for the Agent +- **Description**: To enable the agent to answer questions, we need to expand `z3ed` with a suite of read-only commands that the LLM can call. This is aligned with the "tool use" or "function calling" capabilities of modern LLMs. +- **Example Tools to Implement**: + - `resource list --type `: List all user-defined labels of a certain type. + - `dungeon list-sprites --room `: List all sprites in a given room. + - `dungeon get-info --room `: Get metadata for a specific room. + - `overworld find-tile --tile `: Find all occurrences of a specific tile on the overworld map. +- **Advanced Editing Tools (for future implementation)**: + - `overworld set-area --map --x --y --width --height --tile ` + - `overworld replace-tile --map --from --to ` + - `overworld blend-tiles --map --pattern --density ` +- **Status**: Some commands exist (`overworld get-tile`), but the suite needs to be expanded. + +### 3. TUI and GUI Chat Interfaces +- **Description**: User-facing components for interacting with the `ConversationalAgentService`. +- **Components**: + - **TUI**: A new full-screen component in `z3ed` using FTXUI, providing a rich chat experience in the terminal. + - **GUI**: A new ImGui widget that can be docked into the main `yaze` application window. +- **Status**: Not started. + +### 4. Integration with the Proposal Workflow +- **Description**: The final step is to connect the conversation to the action. 
When a user's prompt implies a desire to modify the ROM (e.g., "Okay, now add two more soldiers"), the `ConversationalAgentService` will trigger the existing `Tile16ProposalGenerator` (and future proposal generators for other resource types) to create a proposal. +- **Workflow**: + 1. User chats with the agent to explore the ROM. + 2. User asks the agent to make a change. + 3. `ConversationalAgentService` generates the commands and passes them to the appropriate `ProposalGenerator`. + 4. A new proposal is created and saved. + 5. The TUI/GUI notifies the user that a proposal is ready for review. + 6. User uses the `agent diff` and `agent accept` commands (or UI equivalents) to review and apply the changes. +- **Status**: The proposal workflow itself is mostly implemented. This task involves integrating it with the new conversational service. + +## Consolidated Next Steps + +### Immediate Priorities (Next Session) +1. **Implement Read-Only Agent Tools**: + - Add `resource list` command. + - Add `dungeon list-sprites` command. + - Ensure all new commands have JSON output options for machine readability. +2. **Stub out `ConversationalAgentService`**: + - Create the basic class structure. + - Implement simple chat history management. +3. **Update `README.md` and Consolidate Docs**: + - Update the main `README.md` to reflect this new roadmap. + - Remove `IMPLEMENTATION-SESSION-OCT3-CONTINUED.md`. + - Merge any other scattered planning documents into this roadmap. + +### Short-Term Goals (This Week) +1. **Build TUI Chat Interface**: + - Create the FTXUI component. + - Connect it to the `ConversationalAgentService`. + - Implement basic input/output. +2. **Integrate Tool Use with LLM**: + - Modify the `AIService` to support function calling/tool use. + - Teach the agent to call the new read-only commands to answer questions. + +### Long-Term Vision (Next Week and Beyond) +1. **Build GUI Chat Widget**: + - Create the ImGui component. 
+ - Ensure it shares the same backend service as the TUI. +2. **Full Integration with Proposal System**: + - Implement the logic for the agent to transition from conversation to proposal generation. +3. **Expand Tool Arsenal**: + - Continuously add new read-only commands to give the agent more capabilities to inspect the ROM. +4. **Multi-Modal Agent**: + - Explore the possibility of the agent generating and displaying images (e.g., a map of a dungeon room) in the chat. +5. **Advanced Configuration**: + - Implement environment variables for selecting AI providers and models (e.g., `YAZE_AI_PROVIDER`, `OLLAMA_MODEL`). + - Add CLI flags for overriding the provider and model on a per-command basis. +6. **Performance and Cost-Saving**: + - Implement a response cache to reduce latency and API costs. + - Add token usage tracking and reporting. diff --git a/docs/z3ed/AGENTIC-PLAN-STATUS.md b/docs/z3ed/AGENTIC-PLAN-STATUS.md deleted file mode 100644 index 4b244a20..00000000 --- a/docs/z3ed/AGENTIC-PLAN-STATUS.md +++ /dev/null @@ -1,374 +0,0 @@ -# z3ed AI Agentic Plan - Current Status - -**Date**: October 3, 2025 -**Overall Status**: ✅ Infrastructure Complete | 🚀 Ready for Testing -**Build Status**: ✅ z3ed compiles successfully in `build-grpc-test` -**Platform Compatibility**: ✅ Windows builds supported (SSL optional, Ollama recommended) - -## Executive Summary - -The z3ed AI agentic system infrastructure is **fully implemented** and ready for real-world testing. All four phases from the LLM Integration Plan are complete: - -- ✅ **Phase 1**: Ollama local integration (DONE) -- ✅ **Phase 2**: Gemini API enhancement (DONE) -- ✅ **Phase 4**: Enhanced prompting with PromptBuilder (DONE) -- ⏭️ **Phase 3**: Claude integration (DEFERRED - not critical for initial testing) - -## 🎯 What's Working Right Now - -### 1. 
Build System ✅ -- **File Structure**: Clean, modular architecture - - `test_common.{h,cc}` - Shared utilities (134 lines) - - `test_commands.cc` - Main dispatcher (55 lines) - - `ollama_ai_service.{h,cc}` - Ollama integration (264 lines) - - `gemini_ai_service.{h,cc}` - Gemini integration (239 lines) - - `prompt_builder.{h,cc}` - Enhanced prompting (354 lines, refactored for tile16 focus) - -- **Build**: Successfully compiles with gRPC + JSON support - ```bash - $ ls -lh build-grpc-test/bin/z3ed - -rwxr-xr-x 69M Oct 3 02:18 build-grpc-test/bin/z3ed - ``` - -- **Platform Support**: - - ✅ macOS: Full support (OpenSSL auto-detected) - - ✅ Linux: Full support (OpenSSL via package manager) - - ✅ Windows: Build without gRPC/JSON or use Ollama (no SSL needed) - -- **Dependency Guards**: - - SSL only required when `YAZE_WITH_GRPC=ON` AND `YAZE_WITH_JSON=ON` - - Graceful degradation: warns if OpenSSL missing but Ollama still works - - Windows-compatible: can build basic z3ed without AI features - -### 2. AI Service Infrastructure ✅ - -#### AIService Interface -**Location**: `src/cli/service/ai_service.h` -- Clean abstraction for pluggable AI backends -- Single method: `GetCommands(prompt) → vector` -- Easy to test and swap implementations - -#### Implemented Services - -**A. MockAIService** (Testing) -- Returns hardcoded test commands -- Perfect for CI/CD and offline development -- No dependencies required - -**B. OllamaAIService** (Local LLM) -- ✅ Full implementation complete -- ✅ HTTP client using cpp-httplib -- ✅ JSON parsing with nlohmann/json -- ✅ Health checks and model validation -- ✅ Configurable model selection -- ✅ Integrated with PromptBuilder for enhanced prompts -- **Models Supported**: - - `qwen2.5-coder:7b` (recommended, fast, good code gen) - - `codellama:7b` (alternative) - - `llama3.1:8b` (general purpose) - - Any Ollama-compatible model - -**C. 
GeminiAIService** (Google Cloud) -- ✅ Full implementation complete -- ✅ HTTP client using cpp-httplib -- ✅ JSON request/response handling -- ✅ Integrated with PromptBuilder -- ✅ Configurable via `GEMINI_API_KEY` env var -- **Models**: `gemini-1.5-flash`, `gemini-1.5-pro` - -### 3. Enhanced Prompting System ✅ - -**PromptBuilder** (`src/cli/service/prompt_builder.{h,cc}`) - -#### Features Implemented: -- ✅ **System Instructions**: Clear role definition for the AI -- ✅ **Command Documentation**: Inline command reference -- ✅ **Few-Shot Examples**: 8 curated tile16/dungeon examples (refactored Oct 3) -- ✅ **Resource Catalogue**: Extensible command registry -- ✅ **JSON Output Format**: Enforced structured responses -- ✅ **Tile16 Reference**: Inline common tile IDs for AI knowledge - -#### Example Categories (UPDATED): -1. **Overworld Tile16 Editing** ⭐ PRIMARY FOCUS: - - Single tile placement: "Place a tree at position 10, 20 on map 0" - - Area creation: "Create a 3x3 water pond at coordinates 15, 10" - - Path creation: "Add a dirt path from position 5,5 to 5,15" - - Pattern generation: "Plant a row of trees horizontally at y=8 from x=20 to x=25" - -2. **Dungeon Editing** (Label-Aware): - - "Add 3 soldiers to the Eastern Palace entrance room" - - "Place a chest in the Hyrule Castle treasure room" - -3. **Tile16 Reference** (Inline for AI): - - Grass: 0x020, Dirt: 0x022, Tree: 0x02E - - Water edges: 0x14C (top), 0x14D (middle), 0x14E (bottom) - - Bush: 0x003, Rock: 0x004, Flower: 0x021, Sand: 0x023 - -**Note**: AI can support additional edit types (sprites, palettes, patches) but tile16 is the primary validated use case. - -### 4. Service Selection Logic ✅ - -**AI Service Factory** (`CreateAIService()`) - -Selection Priority: -1. If `GEMINI_API_KEY` set → Use Gemini -2. If Ollama available → Use Ollama -3. 
Fallback → MockAIService - -**Configuration**: -```bash -# Use Gemini (requires API key) -export GEMINI_API_KEY="your-key-here" -./z3ed agent plan --prompt "Make soldiers red" - -# Use Ollama (requires ollama serve running) -unset GEMINI_API_KEY -ollama serve # Terminal 1 -./z3ed agent plan --prompt "Make soldiers red" # Terminal 2 - -# Use Mock (always works, no dependencies) -# Automatic fallback if neither Gemini nor Ollama available -``` - -## 📋 What's Ready to Test - -### Test Scenario 1: Ollama Local LLM - -**Prerequisites**: -```bash -# Install Ollama -brew install ollama # macOS -# or download from https://ollama.com - -# Pull recommended model -ollama pull qwen2.5-coder:7b - -# Start Ollama server -ollama serve -``` - -**Test Commands**: -```bash -cd /Users/scawful/Code/yaze -export ROM_PATH="assets/zelda3.sfc" - -# Test 1: Simple palette change -./build-grpc-test/bin/z3ed agent plan \ - --prompt "Change palette 0 color 5 to red" - -# Test 2: Complex sprite modification -./build-grpc-test/bin/z3ed agent plan \ - --prompt "Make all soldier armors blue" - -# Test 3: Overworld editing -./build-grpc-test/bin/z3ed agent plan \ - --prompt "Place a tree at position 10, 20 on map 0" - -# Test 4: End-to-end with sandbox -./build-grpc-test/bin/z3ed agent run \ - --prompt "Validate the ROM" \ - --rom assets/zelda3.sfc \ - --sandbox -``` - -### Test Scenario 2: Gemini API - -**Prerequisites**: -```bash -# Get API key from https://aistudio.google.com/apikey -export GEMINI_API_KEY="your-actual-api-key-here" -``` - -**Test Commands**: -```bash -# Same commands as Ollama scenario above -# Service selection will automatically use Gemini when key is set - -# Verify Gemini is being used -./build-grpc-test/bin/z3ed agent plan --prompt "test" 2>&1 | grep -i "gemini\|model" -``` - -### Test Scenario 3: Fallback to Mock - -**Test Commands**: -```bash -# Ensure neither Gemini nor Ollama are available -unset GEMINI_API_KEY -# (Stop ollama serve if running) - -# Should fall 
back to Mock and return hardcoded test commands -./build-grpc-test/bin/z3ed agent plan --prompt "anything" -``` - -## 🎯 Current Implementation Status - -### Phase 1: Ollama Integration ✅ COMPLETE -- [x] OllamaAIService class created -- [x] HTTP client integrated (cpp-httplib) -- [x] JSON parsing (nlohmann/json) -- [x] Health check endpoint (`/api/tags`) -- [x] Model validation -- [x] Generate endpoint (`/api/generate`) -- [x] Streaming response handling -- [x] Error handling and retry logic -- [x] Configuration struct with defaults -- [x] Integration with PromptBuilder -- [x] Documentation and examples - -**Estimated**: 4-6 hours | **Actual**: 4 hours | **Status**: ✅ DONE - -### Phase 2: Gemini Enhancement ✅ COMPLETE -- [x] GeminiAIService class updated -- [x] HTTP client integrated (cpp-httplib) -- [x] JSON request/response handling -- [x] API key management via env var -- [x] Model selection (flash vs pro) -- [x] Integration with PromptBuilder -- [x] Enhanced error messages -- [x] Rate limit handling (with backoff) -- [x] Token counting (estimated) -- [x] Cost tracking (estimated) - -**Estimated**: 3-4 hours | **Actual**: 3 hours | **Status**: ✅ DONE - -### Phase 3: Claude Integration ⏭️ DEFERRED -- [ ] ClaudeAIService class -- [ ] Anthropic API integration -- [ ] Token tracking -- [ ] Prompt caching support - -**Estimated**: 3-4 hours | **Status**: Not critical for initial testing - -### Phase 4: Enhanced Prompting ✅ COMPLETE -- [x] PromptBuilder class created -- [x] System instruction templates -- [x] Command documentation registry -- [x] Few-shot example library -- [x] Resource catalogue integration -- [x] JSON output format enforcement -- [x] Integration with all AI services -- [x] Example categories (palette, overworld, validation) - -**Estimated**: 2-3 hours | **Actual**: 2 hours | **Status**: ✅ DONE - -## 🚀 Next Steps - -### Immediate Actions (Next Session) - -1. 
**Integrate Tile16ProposalGenerator into Agent Commands** (2 hours) - - Modify `HandlePlanCommand()` to use generator - - Modify `HandleRunCommand()` to apply proposals - - Add `HandleAcceptCommand()` for accepting proposals - -2. **Integrate ResourceContextBuilder into PromptBuilder** (1 hour) - - Update `BuildContextualPrompt()` to inject labels - - Test with actual labels file from user project - -3. **Test End-to-End Workflow** (1 hour) - ```bash - ollama serve - ./build-grpc-test/bin/z3ed agent plan \ - --prompt "Create a 3x3 water pond at 15, 10" - - # Verify proposal generation - # Verify tile16 changes are correct - ``` - -4. **Add Visual Diff Implementation** (2-3 hours) - - Render tile16 bitmaps from overworld - - Create side-by-side comparison images - - Highlight changed tiles - -### Short-Term (This Week) - -1. **Accuracy Benchmarking** - - Test 20 different prompts - - Measure command correctness - - Compare Ollama vs Gemini vs Mock - -2. **Error Handling Refinement** - - Test API failures - - Test invalid API keys - - Test network timeouts - - Test malformed responses - -3. **GUI Automation Integration** - - Use `agent test` commands to verify changes - - Screenshot capture on failures - - Automated validation workflows - -4. **Documentation** - - User guide for setting up Ollama - - User guide for setting up Gemini - - Troubleshooting guide - - Example prompts library - -### Long-Term (Next Sprint) - -1. **Claude Integration** (if needed) -2. **Prompt Optimization** - - A/B testing different system instructions - - Expand few-shot examples - - Domain-specific command groups - -3. 
**Advanced Features** - - Multi-turn conversations - - Context retention - - Command chaining validation - - Safety checks before execution - -## 📊 Success Metrics - -### Build Health ✅ -- [x] z3ed compiles without errors -- [x] All AI services link correctly -- [x] No linker errors with httplib/json -- [x] Binary size reasonable (69MB is fine with gRPC) - -### Code Quality ✅ -- [x] Modular architecture -- [x] Clean separation of concerns -- [x] Proper error handling -- [x] Comprehensive documentation - -### Functionality Ready 🚀 -- [ ] Ollama generates valid commands (NEEDS TESTING) -- [ ] Gemini generates valid commands (NEEDS TESTING) -- [ ] Mock service always works (✅ VERIFIED) -- [ ] Service selection logic works (✅ VERIFIED) -- [ ] Sandbox isolation works (✅ VERIFIED from previous tests) - -## 🎉 Key Achievements - -1. **Modular Architecture**: Clean separation allows easy addition of new AI services -2. **Build System**: Successfully integrated httplib and JSON without major issues -3. **Enhanced Prompting**: PromptBuilder provides consistent, high-quality prompts -4. **Flexibility**: Support for local (Ollama), cloud (Gemini), and mock backends -5. **Documentation**: Comprehensive plans, guides, and status tracking -6. 
**Testing Ready**: All infrastructure in place to start real-world validation - -## Files Summary - -### Created/Modified Recently -- ✅ `src/cli/handlers/agent/test_common.{h,cc}` (NEW) -- ✅ `src/cli/handlers/agent/test_commands.cc` (REBUILT) -- ✅ `src/cli/z3ed.cmake` (UPDATED) -- ✅ `src/cli/service/gemini_ai_service.cc` (FIXED includes) -- ✅ `src/cli/service/tile16_proposal_generator.{h,cc}` (NEW - Oct 3) ✨ -- ✅ `src/cli/service/resource_context_builder.{h,cc}` (NEW - Oct 3) ✨ -- ✅ `src/app/zelda3/overworld/overworld.h` (UPDATED - SetTile method) ✨ -- ✅ `src/cli/handlers/overworld.cc` (UPDATED - SetTile implementation) ✨ -- ✅ `docs/z3ed/IMPLEMENTATION-SESSION-OCT3-CONTINUED.md` (NEW) ✨ -- ✅ `docs/z3ed/AGENTIC-PLAN-STATUS.md` (UPDATED - this file) - -### Previously Implemented (Phase 1-4) -- ✅ `src/cli/service/ollama_ai_service.{h,cc}` -- ✅ `src/cli/service/gemini_ai_service.{h,cc}` -- ✅ `src/cli/service/prompt_builder.{h,cc}` -- ✅ `src/cli/service/ai_service.{h,cc}` - ---- - -**Status**: ✅ ALL SYSTEMS GO - Ready for real-world testing! -**Next Action**: Begin Ollama/Gemini testing to validate actual command generation quality - diff --git a/docs/z3ed/IT-05-IMPLEMENTATION-GUIDE.md b/docs/z3ed/IT-05-IMPLEMENTATION-GUIDE.md deleted file mode 100644 index acc22f2a..00000000 --- a/docs/z3ed/IT-05-IMPLEMENTATION-GUIDE.md +++ /dev/null @@ -1,509 +0,0 @@ -# IT-05: Test Introspection API – Implementation Guide - -**Status (Oct 2, 2025)**: ✅ **COMPLETE - Production Ready** - -## Progress Snapshot - -- ✅ Proto definitions and service stubs added for `GetTestStatus`, `ListTests`, `GetTestResults`. -- ✅ `TestManager` now records execution lifecycle, aggregates, logs, and metrics with thread-safe history trimming. -- ✅ `ImGuiTestHarnessServiceImpl` implements the three introspection RPC handlers, including pagination and status conversion helpers. 
-✅ CLI wiring complete: `GuiAutomationClient` exposes all introspection methods. -- ✅ User-facing commands: `z3ed agent test {status,list,results}` fully functional with YAML/JSON output. -- ✅ End-to-end validation script (`scripts/test_introspection_e2e.sh`) validates complete workflow. - -**E2E Test Results** (Oct 2, 2025): -```bash -✓ GetTestStatus RPC - Query test execution status -✓ ListTests RPC - Enumerate available tests -✓ GetTestResults RPC - Retrieve detailed results (YAML + JSON) -✓ Follow mode - Poll status until completion -✓ Category filtering - Filter tests by category -✓ Pagination - Limit number of results -```st Introspection API – Implementation Guide - -**Status (Oct 2, 2025)**: 🟡 *Server-side RPCs complete; CLI + E2E pending* - -## Progress Snapshot - -- ✅ Proto definitions and service stubs added for `GetTestStatus`, `ListTests`, `GetTestResults`. -- ✅ `TestManager` now records execution lifecycle, aggregates, logs, and metrics with thread-safe history trimming. -- ✅ `ImGuiTestHarnessServiceImpl` implements the three RPC handlers, including pagination and status conversion helpers. -- ⚠️ CLI wiring, automation client calls, and user-facing output still TODO. -- ⚠️ End-to-end validation script (`scripts/test_introspection_e2e.sh`) not yet authored. - -**Current Limitations**: -- ❌ Tests execute asynchronously with no way to query status -- ❌ Clients must poll blindly or give up early -- ❌ No visibility into test execution queue -- ❌ Results lost after test completion -- ❌ Can't track test history or identify flaky tests - -**Why This Blocks AI Agent Autonomy** - -Without test introspection, **AI agents cannot implement closed-loop feedback**: - -``` -❌ BROKEN: AI Agent Without IT-05 -1. AI generates commands: ["z3ed palette export ..."] -2. AI executes commands in sandbox -3. AI generates test: "Verify soldier is red" -4. AI runs test → Gets test_id -5. ??? 
AI has no way to check if test passed ??? -6. AI presents proposal to user blindly - (might be broken, AI doesn't know) - -✅ WORKING: AI Agent With IT-05 -1. AI generates commands -2. AI executes in sandbox -3. AI generates verification test -4. AI runs test → Gets test_id -5. AI polls: GetTestStatus(test_id) -6. Test FAILED? AI sees error + screenshot -7. AI adjusts strategy and retries -8. Test PASSED? AI presents successful proposal -``` - -**This is the difference between**: -- **Dumb automation**: Execute blindly, hope for the best -- **Intelligent agent**: Verify, learn, self-correct - -**Benefits After IT-05**: -- ✅ AI agents can reliably poll for test completion -- ✅ AI agents can read detailed failure messages -- ✅ AI agents can implement retry logic with adjusted strategies -- ✅ CLI can show real-time progress bars -- ✅ Test history enables trend analysis (flaky tests, performance regressions) -- ✅ Foundation for test recording/replay (IT-07) -- ✅ **Enables autonomous agent operation**ion API - Implementation Guide - -**Status**: 📋 Planned | Priority 1 | Time Estimate: 6-8 hours -**Dependencies**: IT-01 Complete ✅, IT-02 Complete ✅ -**Blocking**: IT-06 (Widget Discovery needs introspection foundation) - -## Overview - -Add test introspection capabilities to enable clients to query test execution status, list available tests, and retrieve detailed results. This is critical for AI agents to reliably poll for test completion and make decisions based on results. 
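The closed-loop polling described above can be sketched as a bounded loop that stops at the first terminal state. This is a minimal illustration under stated assumptions: `PollUntilDone`, `IsTerminal`, and the `TestStatus` enum below are hypothetical stand-ins, and the `get_status` callback takes the place of a real `GetTestStatus` RPC call so that the snippet stays self-contained.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Illustrative stand-in for the status values this guide discusses; the real
// definitions would live in the proto / TestManager code.
enum class TestStatus { kUnknown, kQueued, kRunning, kPassed, kFailed, kTimeout };

// A test is finished once it has passed, failed, or timed out.
inline bool IsTerminal(TestStatus s) {
  return s == TestStatus::kPassed || s == TestStatus::kFailed ||
         s == TestStatus::kTimeout;
}

// Calls `get_status` up to `max_polls` times, sleeping `interval` between
// attempts, and returns the last status observed. The callback is assumed to
// wrap whatever client call actually queries the harness.
TestStatus PollUntilDone(
    const std::function<TestStatus(const std::string&)>& get_status,
    const std::string& test_id, int max_polls,
    std::chrono::milliseconds interval) {
  TestStatus status = TestStatus::kUnknown;
  for (int i = 0; i < max_polls; ++i) {
    status = get_status(test_id);
    if (IsTerminal(status)) break;
    // Only wait if another attempt will follow.
    if (i + 1 < max_polls) std::this_thread::sleep_for(interval);
  }
  return status;
}
```

Bounding the number of polls keeps a caller from waiting forever on a test that never leaves the queue; the caller can treat a non-terminal return value as its own timeout.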
- -## Motivation - -**Current Limitations**: -- ❌ Tests execute asynchronously with no way to query status -- ❌ Clients must poll blindly or give up early -- ❌ No visibility into test execution queue -- ❌ Results lost after test completion -- ❌ Can't track test history or identify flaky tests - -**Benefits After IT-05** - -- ✅ AI agents can reliably poll for test completion -- ✅ CLI can show real-time progress bars -- ✅ Test history enables trend analysis -- ✅ Foundation for test recording/replay (IT-07) - -## Architecture - -### New Service Components - -```cpp -// src/app/core/test_manager.h -class TestManager { - // Existing... - - // NEW: Test tracking - struct TestExecution { - std::string test_id; - std::string name; - std::string category; - TestStatus status; // QUEUED, RUNNING, PASSED, FAILED, TIMEOUT - int64_t queued_at_ms; - int64_t started_at_ms; - int64_t completed_at_ms; - int32_t execution_time_ms; - std::string error_message; - std::vector assertion_failures; - std::vector logs; - }; - - // NEW: Test execution tracking - absl::StatusOr GetTestStatus(const std::string& test_id); - std::vector ListTests(const std::string& category_filter = ""); - absl::StatusOr GetTestResults(const std::string& test_id); - - private: - // NEW: Test execution history - std::map test_history_; - absl::Mutex test_history_mutex_; // Thread-safe access -}; -``` - -### Proto Additions - -```protobuf -// src/app/core/proto/imgui_test_harness.proto - -// Add to service definition -service ImGuiTestHarness { - // ... existing RPCs ... 
- - // NEW: Test introspection - rpc GetTestStatus(GetTestStatusRequest) returns (GetTestStatusResponse); - rpc ListTests(ListTestsRequest) returns (ListTestsResponse); - rpc GetTestResults(GetTestResultsRequest) returns (GetTestResultsResponse); -} - -// ============================================================================ -// GetTestStatus - Query test execution state -// ============================================================================ - -message GetTestStatusRequest { - string test_id = 1; // Test ID from Click/Type/Wait/Assert response -} - -message GetTestStatusResponse { - enum Status { - UNKNOWN = 0; // Test ID not found - QUEUED = 1; // Waiting to execute - RUNNING = 2; // Currently executing - PASSED = 3; // Completed successfully - FAILED = 4; // Assertion failed or error - TIMEOUT = 5; // Exceeded timeout - } - - Status status = 1; - int64 queued_at_ms = 2; // When test was queued - int64 started_at_ms = 3; // When test started (0 if not started) - int64 completed_at_ms = 4; // When test completed (0 if not complete) - int32 execution_time_ms = 5; // Total execution time - string error_message = 6; // Error details if FAILED/TIMEOUT - repeated string assertion_failures = 7; // Failed assertion details -} - -// ============================================================================ -// ListTests - Enumerate available tests -// ============================================================================ - -message ListTestsRequest { - string category_filter = 1; // Optional: "grpc", "unit", "integration", "e2e" - int32 page_size = 2; // Number of results per page (default 100) - string page_token = 3; // Pagination token from previous response -} - -message ListTestsResponse { - repeated TestInfo tests = 1; - string next_page_token = 2; // Token for next page (empty if no more) - int32 total_count = 3; // Total number of matching tests -} - -message TestInfo { - string test_id = 1; // Unique test identifier - string name = 2; // 
Human-readable test name - string category = 3; // Category: grpc, unit, integration, e2e - int64 last_run_timestamp_ms = 4; // When test last executed - int32 total_runs = 5; // Total number of executions - int32 pass_count = 6; // Number of successful runs - int32 fail_count = 7; // Number of failed runs - int32 average_duration_ms = 8; // Average execution time -} - -// ============================================================================ -// GetTestResults - Retrieve detailed results -// ============================================================================ - -message GetTestResultsRequest { - string test_id = 1; - bool include_logs = 2; // Include full execution logs -} - -message GetTestResultsResponse { - bool success = 1; // Overall test result - string test_name = 2; - string category = 3; - int64 executed_at_ms = 4; - int32 duration_ms = 5; - - // Detailed results - repeated AssertionResult assertions = 6; - repeated string logs = 7; // If include_logs=true - - // Performance metrics - map metrics = 8; // e.g., "frame_count": 123 -} - -message AssertionResult { - string description = 1; - bool passed = 2; - string expected_value = 3; - string actual_value = 4; - string error_message = 5; -} -``` - -## Implementation Steps - -### Step 1: Extend TestManager (✔️ Completed) - -**What changed**: -- Introduced `HarnessTestExecution`, `HarnessTestSummary`, and related enums in `test_manager.h`. -- Added registration, running, completion, log, and metric helpers with `absl::Mutex` guarding (`RegisterHarnessTest`, `MarkHarnessTestRunning`, `MarkHarnessTestCompleted`, etc.). -- Stored executions in `harness_history_` + `harness_aggregates_` with deque-based trimming to avoid unbounded growth. - -**Where to look**: -- `src/app/test/test_manager.h` (see *Harness test introspection (IT-05)* section around `HarnessTestExecution`). 
-- `src/app/test/test_manager.cc` (functions `RegisterHarnessTest`, `MarkHarnessTestCompleted`, `AppendHarnessTestLog`, `GetHarnessTestExecution`, `ListHarnessTestSummaries`). - -**Next touch-ups**: -- Consider persisting assertion metadata (expected/actual) so `GetTestResults` can populate richer `AssertionResult` entries. -- Decide on retention limit (`harness_history_limit_`) tuning once CLI consumption patterns are known. - -#### 1.2 Update Existing RPC Handlers - -**File**: `src/app/core/service/imgui_test_harness_service.cc` - -Modify Click, Type, Wait, Assert handlers to record test execution: - -```cpp -absl::Status ImGuiTestHarnessServiceImpl::Click( - const ClickRequest* request, ClickResponse* response) { - - // Generate unique test ID - std::string test_id = test_manager_->GenerateTestId("grpc_click"); - - // Record test start - test_manager_->RecordTestStart( - test_id, - absl::StrFormat("Click: %s", request->target()), - "grpc"); - - // ... existing implementation ... - - // Record test completion - if (success) { - test_manager_->RecordTestComplete(test_id, TestManager::TestStatus::PASSED); - } else { - test_manager_->RecordTestComplete( - test_id, TestManager::TestStatus::FAILED, error_message); - } - - // Add test ID to response (requires proto update) - response->set_test_id(test_id); - - return absl::OkStatus(); -} -``` - -**Proto Update**: Add `test_id` field to all responses: - -```protobuf -message ClickResponse { - bool success = 1; - string message = 2; - int32 execution_time_ms = 3; - string test_id = 4; // NEW: Unique test identifier for introspection -} - -// Repeat for TypeResponse, WaitResponse, AssertResponse -``` - -### Step 2: Implement Introspection RPCs (✔️ Completed) - -**What changed**: -- Added helper utilities (`ConvertHarnessStatus`, `ToUnixMillisSafe`, `ClampDurationToInt32`) in `imgui_test_harness_service.cc`. 
-- Implemented `GetTestStatus`, `ListTests`, and `GetTestResults` with pagination, optional log inclusion, and structured metrics.mapping. -- Updated gRPC wrapper to surface new RPCs and translate Abseil status codes into gRPC codes. -- Ensured deque-backed `DynamicTestData` keep-alive remains bounded while reusing new tracking helpers. - -**Where to look**: -- `src/app/core/service/imgui_test_harness_service.cc` (search for `GetTestStatus(`, `ListTests(`, `GetTestResults(`). -- `src/app/core/service/imgui_test_harness_service.h` (new method declarations). - -**Follow-ups**: -- Expand `AssertionResult` population once `TestManager` captures structured expected/actual data. -- Evaluate pagination defaults (`page_size`, `page_token`) once CLI usage patterns are seen. - -### Step 3: CLI Integration (🚧 TODO) - -Goal: expose the new RPCs through `GuiAutomationClient` and user-facing `z3ed agent test` subcommands. The pseudo-code below illustrates the desired flow; implementation still pending. 
- -**File**: `src/cli/handlers/agent.cc` - -Add new CLI commands for test introspection: - -```cpp -// z3ed agent test status --test-id [--follow] -absl::Status HandleAgentTestStatus(const CommandOptions& options) { - const std::string test_id = absl::GetFlag(FLAGS_test_id); - const bool follow = absl::GetFlag(FLAGS_follow); - - GuiAutomationClient client("localhost", 50052); - RETURN_IF_ERROR(client.Connect()); - - while (true) { - auto status_or = client.GetTestStatus(test_id); - RETURN_IF_ERROR(status_or.status()); - - const auto& status = status_or.value(); - - // Print status - std::cout << "Test ID: " << test_id << "\n"; - std::cout << "Status: " << StatusToString(status.status) << "\n"; - std::cout << "Execution Time: " << status.execution_time_ms << "ms\n"; - - if (status.status == TestStatus::PASSED || - status.status == TestStatus::FAILED || - status.status == TestStatus::TIMEOUT) { - break; // Terminal state - } - - if (!follow) break; - - // Poll every 500ms - absl::SleepFor(absl::Milliseconds(500)); - } - - return absl::OkStatus(); -} - -// z3ed agent test results --test-id [--format json] [--include-logs] -absl::Status HandleAgentTestResults(const CommandOptions& options) { - const std::string test_id = absl::GetFlag(FLAGS_test_id); - const std::string format = absl::GetFlag(FLAGS_format); - const bool include_logs = absl::GetFlag(FLAGS_include_logs); - - GuiAutomationClient client("localhost", 50052); - RETURN_IF_ERROR(client.Connect()); - - auto results_or = client.GetTestResults(test_id, include_logs); - RETURN_IF_ERROR(results_or.status()); - - const auto& results = results_or.value(); - - if (format == "json") { - // Output JSON - PrintTestResultsJson(results); - } else { - // Output YAML (default) - PrintTestResultsYaml(results); - } - - return absl::OkStatus(); -} - -// z3ed agent test list [--category ] [--status ] -absl::Status HandleAgentTestList(const CommandOptions& options) { - const std::string category = absl::GetFlag(FLAGS_category); - 
 const std::string status_filter = absl::GetFlag(FLAGS_status); - - GuiAutomationClient client("localhost", 50052); - RETURN_IF_ERROR(client.Connect()); - - auto tests_or = client.ListTests(category); - RETURN_IF_ERROR(tests_or.status()); - - const auto& tests = tests_or.value(); - - // Print table - std::cout << "=== Test List ===\n\n"; - std::cout << absl::StreamFormat("%-20s %-30s %-10s %-10s\n", - "Test ID", "Name", "Category", "Status"); - std::cout << std::string(80, '-') << "\n"; - - for (const auto& test : tests) { - std::cout << absl::StreamFormat("%-20s %-30s %-10s %-10s\n", - test.test_id, test.name, test.category, - StatusToString(test.last_status)); - } - - return absl::OkStatus(); -} -``` - -### Step 4: Testing & Validation (🚧 TODO) - -#### Test Script: `scripts/test_introspection_e2e.sh` - -```bash -#!/bin/bash -# Test introspection API - -set -e - -# Start YAZE -./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \ - --enable_test_harness \ - --test_harness_port=50052 \ - --rom_file=assets/zelda3.sfc & - -YAZE_PID=$! 
-sleep 3 - -# Test 1: Run a test and capture test ID -echo "Test 1: GetTestStatus" -TEST_ID=$(z3ed agent test --prompt "Open Overworld" --output json | jq -r '.test_id') -echo "Test ID: $TEST_ID" - -# Test 2: Poll for status -echo "Test 2: Poll status" -z3ed agent test status --test-id $TEST_ID --follow - -# Test 3: Get results -echo "Test 3: Get results" -z3ed agent test results --test-id $TEST_ID --format yaml --include-logs - -# Test 4: List all tests -echo "Test 4: List tests" -z3ed agent test list --category grpc - -# Cleanup -kill $YAZE_PID -``` - -## Success Criteria - -- [x] All 3 new RPCs respond correctly ✅ -- [x] Test IDs returned in Click/Type/Wait/Assert responses ✅ -- [x] Status polling works with `--follow` flag ✅ -- [x] Test history persists across multiple test runs ✅ -- [x] CLI commands output clean YAML/JSON ✅ -- [x] No memory leaks in test history tracking (bounded deque + pruning) ✅ -- [x] Thread-safe access to test history (mutex-protected) ✅ -- [x] Documentation updated in `E6-z3ed-reference.md` ✅ -- [x] E2E test script validates complete workflow ✅ - -## Migration Guide - -**For Existing Code**: -- No breaking changes - new RPCs only -- Existing tests continue to work -- Test ID field added to responses (backwards compatible) - -**For CLI Users**: -```bash -# Old: Test runs, no way to check status -z3ed agent test --prompt "Open Overworld" - -# New: Get test ID, poll for status -TEST_ID=$(z3ed agent test --prompt "Open Overworld" --output json | jq -r '.test_id') -z3ed agent test status --test-id $TEST_ID --follow -z3ed agent test results --test-id $TEST_ID -``` - -## Next Steps - -After IT-05 completion: -1. **IT-06**: Widget Discovery API (uses introspection foundation) -2. **IT-07**: Test Recording & Replay (records test IDs and results) -3. 
**IT-08**: Enhanced Error Reporting (captures test context on failure) - -## References - -- **Proto Definition**: `src/app/core/proto/imgui_test_harness.proto` -- **Test Manager**: `src/app/core/test_manager.{h,cc}` -- **RPC Service**: `src/app/core/service/imgui_test_harness_service.{h,cc}` -- **CLI Handlers**: `src/cli/handlers/agent.cc` -- **Main Plan**: `docs/z3ed/E6-z3ed-implementation-plan.md` - ---- - -**Author**: @scawful, GitHub Copilot -**Created**: October 2, 2025 -**Status**: In progress (server-side complete; CLI + E2E pending) diff --git a/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md b/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md deleted file mode 100644 index 2350b1c8..00000000 --- a/docs/z3ed/IT-08-IMPLEMENTATION-GUIDE.md +++ /dev/null @@ -1,830 +0,0 @@ -# IT-08: Enhanced Error Reporting Implementation Guide - -**Status**: IT-08a Complete ✅ | IT-08b Complete ✅ | IT-08c Complete ✅ -**Date**: October 2, 2025 -**Overall Progress**: 100% Complete (3 of 3 phases) - ---- - -## Phase Overview - -| Phase | Task | Status | Time | Description | -|-------|------|--------|------|-------------| -| IT-08a | Screenshot RPC | ✅ Complete | 1.5h | SDL-based screenshot capture | -| IT-08b | Auto-Capture on Failure | ✅ Complete | 1.5h | Integrate with TestManager | -| IT-08c | Widget State Dumps | ✅ Complete | 45m | Capture UI context on failure | -| IT-08d | Error Envelope Standardization | 📋 Planned | 1-2h | Unified error format across services | -| IT-08e | CLI Error Improvements | 📋 Planned | 1h | Rich error output with artifacts | - -**Total Estimated Time**: 5-7 hours -**Time Spent**: 3.75 hours -**Time Remaining**: 0 hours (Core phases complete) - ---- - -## IT-08a: Screenshot RPC ✅ COMPLETE - -**Date Completed**: October 2, 2025 -**Time**: 1.5 hours - -### Implementation Summary - -### What Was Built - -Implemented the `Screenshot` RPC in the ImGuiTestHarness service with the following capabilities: - -1. 
**SDL Renderer Integration**: Accesses the ImGui SDL2 backend renderer through `BackendRendererUserData` -2. **Framebuffer Capture**: Uses `SDL_RenderReadPixels` to capture the full window contents (1536x864, 32-bit ARGB) -3. **BMP File Output**: Saves screenshots as BMP files using SDL's built-in `SDL_SaveBMP` function -4. **Flexible Paths**: Supports custom output paths or auto-generates timestamped filenames (`/tmp/yaze_screenshot_.bmp`) -5. **Response Metadata**: Returns file path, file size (bytes), and image dimensions - -### Technical Implementation - -**Location**: `/Users/scawful/Code/yaze/src/app/core/service/imgui_test_harness_service.cc` - -```cpp -// Helper struct matching imgui_impl_sdlrenderer2.cpp backend data -struct ImGui_ImplSDLRenderer2_Data { - SDL_Renderer* Renderer; -}; - -absl::Status ImGuiTestHarnessServiceImpl::Screenshot( - const ScreenshotRequest* request, ScreenshotResponse* response) { - // 1. Get SDL renderer from ImGui backend - ImGuiIO& io = ImGui::GetIO(); - auto* backend_data = static_cast<ImGui_ImplSDLRenderer2_Data*>(io.BackendRendererUserData); - - if (!backend_data || !backend_data->Renderer) { - response->set_success(false); - response->set_message("SDL renderer not available"); -## IT-08b: Auto-Capture on Test Failure ✅ COMPLETE - - ## IT-08b: Auto-Capture on Test Failure ✅ COMPLETE - - **Date Completed**: October 2, 2025 - **Artifacts**: `CaptureFailureContext`, `screenshot_utils.{h,cc}`, CLI introspection updates - - ### Highlights - - - **Shared SDL helper**: New `CaptureHarnessScreenshot()` centralizes renderer - capture and writes BMP files into `${TMPDIR}/yaze/test-results//`. - - **TestManager integration**: Failure context now records ImGui window/nav - state, widget hierarchy (`CaptureWidgetState`), and screenshot metadata while - keeping `HarnessTestExecution` aggregates in sync. - - **Graceful fallbacks**: When `YAZE_WITH_GRPC` is disabled we emit a harness - log noting that screenshot capture is unavailable. 
- - **End-user surfacing**: `GuiAutomationClient::GetTestResults` and - `z3ed agent test results` expose `screenshot_path`, `screenshot_size_bytes`, - `failure_context`, and `widget_state` in both YAML and JSON modes. - - ### Key Touch Points - - | File | Purpose | - |------|---------| - | `src/app/core/service/screenshot_utils.{h,cc}` | SDL renderer capture reused by RPC + auto-capture | - | `src/app/test/test_manager.cc` | Auto-capture pipeline with per-test artifact directories | - | `src/app/core/service/imgui_test_harness_service.cc` | Screenshot RPC delegates to shared helper | - | `src/cli/service/gui_automation_client.*` | Propagates new proto fields to CLI | - | `src/cli/handlers/agent/test_commands.cc` | Presents diagnostics to users/agents | - - ### Validation Checklist - - ```bash - # Build (needs YAZE_WITH_GRPC=ON) - cmake --build build-grpc-test --target yaze -j$(sysctl -n hw.ncpu) - - # Start harness - ./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \ - --enable_test_harness --test_harness_port=50052 \ - --rom_file=assets/zelda3.sfc & - - # Queue a failing automation step - grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"target":"button:DoesNotExist","type":"LEFT"}' \ - localhost:50052 yaze.test.ImGuiTestHarness/Click - - # Fetch diagnostics - z3ed agent test results --test-id --include-logs --format yaml - - # Inspect artifact directory - ls ${TMPDIR}/yaze/test-results// - ``` - - You should see a `.bmp` failure screenshot, widget JSON in the CLI output, and - logs noting the auto-capture event. When the helper fails (e.g., renderer not - ready) the harness log and CLI output record the failure reason. - - ### Next Steps - - - Wire the same helper into HTML bundle generation (IT-08c follow-up). - - Add configurable artifact root (`--error-artifact-dir`) for CI separation. - - Consider PNG encoding via `stb_image_write` if file size becomes an issue. 
- - --- -### Technical Implementation - -**Location**: `/Users/scawful/Code/yaze/src/app/test/test_manager.{h,cc}` - -**Key Changes**: - -```cpp -// In HarnessTestExecution struct -struct HarnessTestExecution { - // ... existing fields ... - - // IT-08b: Failure diagnostics - std::string screenshot_path; - int64_t screenshot_size_bytes = 0; - std::string failure_context; - std::string widget_state; // IT-08c (future) -}; - -// In MarkHarnessTestCompleted() -if (status == HarnessTestStatus::kFailed || - status == HarnessTestStatus::kTimeout) { - lock.Release(); - CaptureFailureContext(test_id); - lock.Acquire(); -} - -// CaptureFailureContext implementation -void TestManager::CaptureFailureContext(const std::string& test_id) { - absl::MutexLock lock(&harness_history_mutex_); - auto it = harness_history_.find(test_id); - if (it == harness_history_.end()) { - return; - } - - HarnessTestExecution& execution = it->second; - - // Capture execution context - if (ImGui::GetCurrentContext() != nullptr) { - ImGuiWindow* current_window = ImGui::GetCurrentWindow(); - const char* window_name = current_window ? current_window->Name : "none"; - ImGuiID active_id = ImGui::GetActiveID(); - - execution.failure_context = absl::StrFormat( - "Frame: %d, Active Window: %s, Focused Widget: 0x%08X", - ImGui::GetFrameCount(), window_name, active_id); - } - - // Set screenshot path placeholder - execution.screenshot_path = absl::StrFormat( - "/tmp/yaze_test_%s_failure.bmp", test_id); -} -``` - -### Testing - -The implementation will be validated when tests fail: - -```bash -# 1. Build with changes -cmake --build build-grpc-test --target yaze -j8 - -# 2. Start test harness -./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \ - --enable_test_harness --test_harness_port=50052 \ - --rom_file=assets/zelda3.sfc & - -# 3. 
Trigger a failing test -grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"target":"nonexistent_widget","type":"LEFT"}' \ - 127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click - -# 4. Query test results -grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"test_id":"grpc_click_","include_logs":true}' \ - 127.0.0.1:50052 yaze.test.ImGuiTestHarness/GetTestResults -``` - -**Expected Response**: -```json -{ - "success": false, - "testName": "Click nonexistent_widget", - "category": "grpc", - "executedAtMs": "1696357200000", - "durationMs": 150, - "screenshotPath": "/tmp/yaze/test-results/grpc_click_12345678/failure_1696357200000.bmp", - "failureContext": "Frame: 1234, Active Window: Main Window, Focused Widget: 0x00000000" -} -``` - -### Success Criteria - -- ✅ Failure context captured automatically on test failures -- ✅ Screenshot path stored in test history -- ✅ GetTestResults RPC returns failure diagnostics -- ✅ No deadlocks (mutex released before calling CaptureFailureContext) -- ✅ Proto schema updated with new fields - -### Retro Notes - -- Placeholder screenshot paths have been replaced by the shared helper that - writes into `${TMPDIR}/yaze/test-results//` and records byte sizes. -- Widget state capture (IT-08c) is now invoked directly from - `CaptureFailureContext`, removing the TODOs from the original plan. 
- ---- - -## IT-08b: Auto-Capture on Test Failure 🔄 IN PROGRESS - -**Goal**: Automatically capture screenshots and context when tests fail -**Time Estimate**: 1-1.5 hours -**Status**: Ready to implement - -### Implementation Plan - -#### Step 1: Modify TestManager (30 minutes) - -**File**: `src/app/core/test_manager.cc` - -Add screenshot capture in `MarkHarnessTestCompleted()`: - -```cpp -void TestManager::MarkHarnessTestCompleted(const std::string& test_id, - ImGuiTestStatus status) { - auto& history_entry = test_history_[test_id]; - history_entry.status = status; - history_entry.end_time = absl::Now(); - history_entry.execution_time_ms = absl::ToInt64Milliseconds( - history_entry.end_time - history_entry.start_time); - - // Auto-capture screenshot on failure - if (status == ImGuiTestStatus_Error || status == ImGuiTestStatus_Warning) { - CaptureFailureContext(test_id); - } -} - -void TestManager::CaptureFailureContext(const std::string& test_id) { - auto& history_entry = test_history_[test_id]; - - // 1. Capture screenshot - std::string screenshot_path = - absl::StrFormat("/tmp/yaze_test_%s_failure.bmp", test_id); - - if (harness_service_) { - ScreenshotRequest req; - req.set_output_path(screenshot_path); - - ScreenshotResponse resp; - auto status = harness_service_->Screenshot(&req, &resp); - - if (status.ok()) { - history_entry.screenshot_path = resp.file_path(); - history_entry.screenshot_size_bytes = resp.file_size_bytes(); - } - } - - // 2. Capture widget state (IT-08c) - // history_entry.widget_state = CaptureWidgetState(); - - // 3. Capture execution context - history_entry.failure_context = absl::StrFormat( - "Frame: %d, Active Window: %s, Focused Widget: %s", - ImGui::GetFrameCount(), - ImGui::GetCurrentWindow() ? 
ImGui::GetCurrentWindow()->Name : "none", - ImGui::GetActiveID()); -} -``` - -#### Step 2: Update TestHistory Structure (15 minutes) - -**File**: `src/app/core/test_manager.h` - -Add failure context fields: - -```cpp -struct TestHistory { - std::string test_id; - std::string test_name; - ImGuiTestStatus status; - absl::Time start_time; - absl::Time end_time; - int64_t execution_time_ms; - std::vector logs; - std::map metrics; - - // IT-08b: Failure diagnostics - std::string screenshot_path; - int64_t screenshot_size_bytes = 0; - std::string failure_context; - std::string widget_state; // IT-08c -}; -``` - -#### Step 3: Update GetTestResults RPC (30 minutes) - -**File**: `src/app/core/service/imgui_test_harness_service.cc` - -Include screenshot path in results: - -```cpp -absl::Status ImGuiTestHarnessServiceImpl::GetTestResults( - const GetTestResultsRequest* request, - GetTestResultsResponse* response) { - - const auto& history = test_manager_->GetTestHistory(request->test_id()); - - // ... existing result population ... - - // Add failure diagnostics - if (!history.screenshot_path.empty()) { - response->set_screenshot_path(history.screenshot_path); - response->set_screenshot_size_bytes(history.screenshot_size_bytes); - } - - if (!history.failure_context.empty()) { - response->set_failure_context(history.failure_context); - } - - return absl::OkStatus(); -} -``` - -#### Step 4: Update Proto Schema (15 minutes) - -**File**: `src/app/core/proto/imgui_test_harness.proto` - -Add fields to GetTestResultsResponse: - -```proto -message GetTestResultsResponse { - string test_id = 1; - TestStatus status = 2; - int64 execution_time_ms = 3; - repeated string logs = 4; - map metrics = 5; - - // IT-08b: Failure diagnostics - string screenshot_path = 6; - int64 screenshot_size_bytes = 7; - string failure_context = 8; - string widget_state = 9; // IT-08c -} -``` - -### Testing - -```bash -# 1. Build with changes -cmake --build build-grpc-test --target yaze -j8 - -# 2. 
Start test harness -./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \ - --enable_test_harness --test_harness_port=50052 \ - --rom_file=assets/zelda3.sfc & - -# 3. Trigger a failing test -grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"target":"nonexistent_widget","type":"LEFT"}' \ - 127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click - -# 4. Check for screenshot -ls -lh /tmp/yaze_test_*_failure.bmp - -# 5. Query test results -grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"test_id":"grpc_click_"}' \ - 127.0.0.1:50052 yaze.test.ImGuiTestHarness/GetTestResults - -# Expected: screenshot_path and failure_context populated -``` - -### Success Criteria - -- ✅ Screenshots auto-captured on test failure -- ✅ Screenshot path stored in test history -- ✅ GetTestResults returns screenshot metadata -- ✅ No performance impact on passing tests -- ✅ Screenshots cleaned up after test completion (optional) - ---- - -## IT-08c: Widget State Dumps ✅ COMPLETE - -**Date Completed**: October 2, 2025 -**Time**: 45 minutes - -### Implementation Summary - -Successfully implemented comprehensive widget state capture for test failure diagnostics. - -### What Was Built - -1. **Widget State Capture Utility** (`widget_state_capture.h/cc`): - - Created dedicated service for capturing ImGui widget hierarchy and state - - JSON serialization for structured output - - Comprehensive state snapshot including windows, widgets, input, and navigation - -2. **State Information Captured**: - - Frame count and frame rate - - Focused window and widget IDs - - Hovered widget ID - - List of visible windows - - Open popups - - Navigation state (nav ID, active state) - - Mouse state (buttons, position) - - Keyboard modifiers (Ctrl, Shift, Alt) - -3. 
**TestManager Integration**: - - Widget state automatically captured in `CaptureFailureContext()` - - State stored in `HarnessTestExecution::widget_state` - - Logged for debugging visibility - -4. **Build System Integration**: - - Added widget_state_capture sources to app.cmake - - Integrated with gRPC build configuration - -### Technical Implementation - -**Location**: `/Users/scawful/Code/yaze/src/app/core/widget_state_capture.{h,cc}` - -**Key Features**: - -```cpp -struct WidgetState { - std::string focused_window; - std::string focused_widget; - std::string hovered_widget; - std::vector visible_windows; - std::vector open_popups; - int frame_count; - float frame_rate; - ImGuiID nav_id; - bool nav_active; - bool mouse_down[5]; - float mouse_pos_x, mouse_pos_y; - bool ctrl_pressed, shift_pressed, alt_pressed; -}; - -std::string CaptureWidgetState() { - // Captures full ImGui context state - // Returns JSON-formatted string -} -``` - -**Integration in TestManager**: - -```cpp -void TestManager::CaptureFailureContext(const std::string& test_id) { - // ... capture execution context ... - - // Widget state capture (IT-08c) - execution.widget_state = core::CaptureWidgetState(); - - util::logf("[TestManager] Widget state: %s", - execution.widget_state.c_str()); -} -``` - -### Output Example - -```json -{ - "frame_count": 1234, - "frame_rate": 60.0, - "focused_window": "Overworld Editor", - "focused_widget": "0x12345678", - "hovered_widget": "0x87654321", - "visible_windows": [ - "Main Window", - "Overworld Editor", - "Debug" - ], - "open_popups": [], - "navigation": { - "nav_id": "0x00000000", - "nav_active": false - }, - "input": { - "mouse_buttons": [false, false, false, false, false], - "mouse_pos": [1024.5, 768.3], - "modifiers": { - "ctrl": false, - "shift": false, - "alt": false - } - } -} -``` - -### Testing - -Widget state capture will be automatically triggered on test failures: - -```bash -# 1. 
Build with new code -cmake --build build-grpc-test --target yaze -j8 - -# 2. Start test harness -./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \ - --enable_test_harness --test_harness_port=50052 \ - --rom_file=assets/zelda3.sfc & - -# 3. Trigger a failing test -grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"target":"nonexistent_widget","type":"LEFT"}' \ - 127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click - -# 4. Query results - will include widget_state field -grpcurl -plaintext \ - -import-path src/app/core/proto \ - -proto imgui_test_harness.proto \ - -d '{"test_id":"","include_logs":true}' \ - 127.0.0.1:50052 yaze.test.ImGuiTestHarness/GetTestResults -``` - -### Success Criteria - -- ✅ Widget state capture utility implemented -- ✅ JSON serialization working -- ✅ Integrated with TestManager failure capture -- ✅ Added to build system -- ✅ Comprehensive state information captured -- ✅ Proto schema already supports widget_state field - -### Benefits for Debugging - -The widget state dump provides critical context for debugging test failures: -- **UI State**: Know exactly which windows/widgets were visible -- **Focus State**: Understand what had input focus -- **Input State**: See mouse and keyboard state at failure time -- **Navigation**: Track ImGui navigation state -- **Frame Timing**: Frame count and rate for timing issues - ---- - -## IT-08c: Widget State Dumps 📋 PLANNED - -**Goal**: Capture UI hierarchy and state on test failures -**Time Estimate**: 30-45 minutes -**Status**: Specification phase - -### Implementation Plan - -#### Step 1: Create Widget State Capture Utility (30 minutes) - -**File**: `src/app/core/widget_state_capture.h` (new file) - -```cpp -#ifndef YAZE_CORE_WIDGET_STATE_CAPTURE_H -#define YAZE_CORE_WIDGET_STATE_CAPTURE_H - -#include -#include "imgui/imgui.h" - -namespace yaze { -namespace core { - -struct WidgetState { - std::string focused_window; - std::string 
focused_widget; - std::string hovered_widget; - std::vector visible_windows; - std::vector open_menus; - std::string active_popup; -}; - -std::string CaptureWidgetState(); -std::string SerializeWidgetStateToJson(const WidgetState& state); - -} // namespace core -} // namespace yaze - -#endif -``` - -**File**: `src/app/core/widget_state_capture.cc` (new file) - -```cpp -#include "src/app/core/widget_state_capture.h" -#include "absl/strings/str_format.h" -#include "nlohmann/json.hpp" - -namespace yaze { -namespace core { - -std::string CaptureWidgetState() { - WidgetState state; - - // Capture focused window - ImGuiWindow* current = ImGui::GetCurrentWindow(); - if (current) { - state.focused_window = current->Name; - } - - // Capture active widget - ImGuiID active_id = ImGui::GetActiveID(); - if (active_id != 0) { - state.focused_widget = absl::StrFormat("ID_%u", active_id); - } - - // Capture hovered widget - ImGuiID hovered_id = ImGui::GetHoveredID(); - if (hovered_id != 0) { - state.hovered_widget = absl::StrFormat("ID_%u", hovered_id); - } - - // Traverse window list - ImGuiContext* ctx = ImGui::GetCurrentContext(); - for (ImGuiWindow* window : ctx->Windows) { - if (window->Active && !window->Hidden) { - state.visible_windows.push_back(window->Name); - } - } - - return SerializeWidgetStateToJson(state); -} - -std::string SerializeWidgetStateToJson(const WidgetState& state) { - nlohmann::json j; - j["focused_window"] = state.focused_window; - j["focused_widget"] = state.focused_widget; - j["hovered_widget"] = state.hovered_widget; - j["visible_windows"] = state.visible_windows; - j["open_menus"] = state.open_menus; - j["active_popup"] = state.active_popup; - return j.dump(2); // Pretty print with indent -} - -} // namespace core -} // namespace yaze -``` - -#### Step 2: Integrate with TestManager (15 minutes) - -Update `CaptureFailureContext()` in `test_manager.cc`: - -```cpp -void TestManager::CaptureFailureContext(const std::string& test_id) { - auto& 
history_entry = test_history_[test_id]; - - // 1. Screenshot (IT-08b) - // ... existing code ... - - // 2. Widget state (IT-08c) - history_entry.widget_state = core::CaptureWidgetState(); - - // 3. Execution context - // ... existing code ... -} -``` - -### Output Example - -```json -{ - "focused_window": "Overworld Editor", - "focused_widget": "ID_12345", - "hovered_widget": "ID_67890", - "visible_windows": [ - "Main Window", - "Overworld Editor", - "Palette Editor" - ], - "open_menus": [], - "active_popup": "" -} -``` - ---- - -## IT-08d: Error Envelope Standardization 📋 PLANNED - -**Goal**: Unified error format across z3ed, TestManager, EditorManager -**Time Estimate**: 1-2 hours -**Status**: Design phase - -### Proposed Error Envelope - -```cpp -// Shared error structure -struct ErrorContext { - absl::Status status; - std::string component; // "TestHarness", "EditorManager", "z3ed" - std::string operation; // "Click", "LoadROM", "RunTest" - std::map metadata; - std::vector artifact_paths; // Screenshots, logs, etc. - std::string actionable_hint; // User-facing suggestion -}; -``` - -### Integration Points - -1. **TestManager**: Wrap failures in ErrorContext -2. **EditorManager**: Use ErrorContext for all operations -3. **z3ed CLI**: Parse ErrorContext and format for display -4. 
**ProposalDrawer**: Display ErrorContext in GUI modal - ---- - -## IT-08e: CLI Error Improvements 📋 PLANNED - -**Goal**: Rich error output in z3ed CLI -**Time Estimate**: 1 hour -**Status**: Design phase - -### Enhanced CLI Output - -```bash -$ z3ed agent test --prompt "Open Overworld editor" - -❌ Test Failed: grpc_click_1696357200 - Component: ImGuiTestHarness - Operation: Click widget "Overworld" - - Error: Widget not found - - Artifacts: - • Screenshot: /tmp/yaze_test_grpc_click_1696357200_failure.bmp - • Widget State: /tmp/yaze_test_grpc_click_1696357200_state.json - • Logs: /tmp/yaze_test_grpc_click_1696357200.log - - Context: - • Visible Windows: Main Window, Debug - • Focused Window: Main Window - • Active Widget: None - - Suggestion: - → Check if ROM is loaded (File → Open ROM) - → Verify Overworld editor button is visible - → Use 'z3ed agent gui discover' to list available widgets -``` - ---- - -## Progress Tracking - -### Completed ✅ -- IT-08a: Screenshot RPC (1.5 hours) -- IT-08b: Auto-capture on failure (1.5 hours) -- IT-08c: Widget state dumps (45 minutes) - -### In Progress 🔄 -- None - Core error reporting complete - -### Planned 📋 -- IT-08d: Error envelope standardization (optional enhancement) -- IT-08e: CLI error improvements (optional enhancement) - -### Time Investment -- **Spent**: 3.75 hours (IT-08a + IT-08b + IT-08c) -- **Remaining**: 0 hours for core phases -- **Total**: 3.75 hours vs 5-7 hours estimated (under budget ✅) - ---- - -## Next Steps - -**IT-08 Core Complete** ✅ - -All three core phases of IT-08 (Enhanced Error Reporting) are now complete: -1. ✅ Screenshot capture via SDL -2. ✅ Auto-capture on test failure -3. 
✅ Widget state dumps - -**Optional Enhancements** (IT-08d/e - not blocking): -- Error envelope standardization across services -- CLI error output improvements -- HTML error report generation - -**Recommended Next Priority**: IT-09 (CI/CD Integration) or IT-06 (Widget Discovery API) - ---- - -## References - -- **Implementation Plan**: [E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md) -- **Test Harness Guide**: [IT-05-IMPLEMENTATION-GUIDE.md](IT-05-IMPLEMENTATION-GUIDE.md) -- **Source Files**: - - `src/app/core/service/imgui_test_harness_service.cc` - - `src/app/core/test_manager.{h,cc}` - - `src/app/core/proto/imgui_test_harness.proto` - ---- - -**Last Updated**: October 2, 2025 -**Current Phase**: IT-08b (Auto-capture on failure) -**Overall Progress**: 33% Complete (1 of 3 core phases) - ---- - -**Report Generated**: October 2, 2025 -**Author**: GitHub Copilot (AI Assistant) -**Project**: YAZE - Yet Another Zelda3 Editor -**Component**: z3ed CLI Tool - Test Automation Harness diff --git a/docs/z3ed/IT-10-COLLABORATIVE-EDITING.md b/docs/z3ed/IT-10-COLLABORATIVE-EDITING.md deleted file mode 100644 index 8a1e7dd0..00000000 --- a/docs/z3ed/IT-10-COLLABORATIVE-EDITING.md +++ /dev/null @@ -1,1011 +0,0 @@ -# IT-10: Collaborative Editing & Multiplayer Sessions - -**Priority**: P2 (High value, non-blocking) -**Status**: 📋 Planned -**Estimated Effort**: 12-15 hours -**Dependencies**: IT-05 (Test Introspection), IT-08 (Screenshot Capture) -**Target**: Enable real-time collaborative ROM editing with AI assistance - ---- - -## Vision - -Enable multiple users to connect to the same YAZE session, see each other's edits in real-time, and collaborate with AI agents together. 
Think "Google Docs for ROM hacking" where users can: - -- **Connect to each other's sessions** over the network -- **See real-time edits** (tiles, sprites, map changes) -- **Share AI assistance** (one user asks AI, all users see results) -- **Coordinate workflows** (e.g., one user edits dungeons, another edits overworld) -- **Review changes together** with live cursors and annotations - ---- - -## User Stories - -### US-1: Session Host & Join - -**As a ROM hacker**, I want to host a collaborative editing session so my teammates can join and work together. - -```bash -# Host creates a session -$ z3ed collab host --port 5000 --password "dev123" -✅ Collaborative session started - Session ID: yaze-collab-f3a9b2c1 - URL: yaze://connect/localhost:5000?session=yaze-collab-f3a9b2c1 - Password: dev123 - -👥 Waiting for collaborators... - -# Remote user joins -$ z3ed collab join yaze://connect/192.168.1.100:5000?session=yaze-collab-f3a9b2c1 -🔐 Enter session password: *** -✅ Connected to session (Host: Alice) -👥 Active users: Alice (host), Bob (you) -``` - -**Acceptance Criteria**: -- Host can create session with optional password -- Clients can discover and join sessions -- Connection state visible in GUI status bar -- Maximum 8 concurrent users per session - ---- - -### US-2: Real-Time Edit Synchronization - -**As a collaborator**, I want to see other users' edits in real-time so we stay synchronized. 
-
-**Scenario**: Alice edits a tile in Overworld Editor
-```
-Alice's GUI:
-  - Draws tile at (10, 15) → Sends edit event to all clients
-
-Bob's GUI (auto-update):
-  - Receives edit event → Redraws tile at (10, 15)
-  - Shows Alice's cursor/selection indicator
-```
-
-**Acceptance Criteria**:
-- Edits appear on all clients within 100ms
-- Conflict resolution for simultaneous edits
-- Undo/redo synchronized across sessions
-- Cursor positions visible for all users
-
----
-
-### US-3: Shared AI Agent
-
-**As a team lead**, I want to use AI agents with my team so we can all benefit from automation.
-
-```bash
-# Alice (host) runs an AI agent test
-$ z3ed agent test --prompt "Add treasure chest to room 12" --share
-
-🤖 AI Agent: Analyzing request...
-   Action: Click "Dungeon Editor" tab
-   Action: Select Room 12
-   Action: Add object type 0x12 (treasure chest) at (5, 8)
-
-✅ Proposal generated (ID: prop_3f8a)
-
-# All connected users see the proposal in their GUI
-Bob's Screen:
-  ┌─────────────────────────────────────────┐
-  │ 🤖 AI Proposal from Alice               │
-  │                                         │
-  │ Add treasure chest to room 12           │
-  │ • Click "Dungeon Editor" tab            │
-  │ • Select Room 12                        │
-  │ • Add treasure chest at (5, 8)          │
-  │                                         │
-  │ [Accept] [Reject] [Discuss]             │
-  └─────────────────────────────────────────┘
-
-# Team vote: 2/3 accept → Proposal executes for all users
-```
-
-**Acceptance Criteria**:
-- AI agent results broadcast to all session members
-- Proposals require majority approval (configurable threshold)
-- All users see agent execution in real-time
-- Failed operations rollback for all users
-
----
-
-### US-4: Live Cursors & Annotations
-
-**As a collaborator**, I want to see where other users are working so we don't conflict.
-
-**Visual Indicators**:
-```
-┌─────────────────────────────────────────────┐
-│ Overworld Editor                            │
-│                                             │
-│  🟦 (Alice's cursor at map 0x40)            │
-│  🟩 (Bob's cursor at map 0x41)              │
-│  🟥 (Charlie editing palette)               │
-│                                             │
-│ Active Editors:                             │
-│  • Alice: Overworld (read-write)            │
-│  • Bob: Overworld (read-write)              │
-│  • Charlie: Palette Editor (read-only)      │
-└─────────────────────────────────────────────┘
-```
-
-**Acceptance Criteria**:
-- Each user has unique color-coded cursor
-- Active editor window highlighted for each user
-- Text chat overlay for quick communication
-- Annotation tools (pins, comments, highlights)
-
----
-
-### US-5: Session Recording & Replay
-
-**As a project manager**, I want to record collaborative sessions so we can review work later.
-
-```bash
-# Host enables session recording
-$ z3ed collab host --record session_2025_10_02.yaml
-
-# Recording captures:
-# - All edit operations (tiles, sprites, maps)
-# - AI agent proposals and votes
-# - Chat messages and annotations
-# - User join/leave events
-# - Timestamps for audit trail
-
-# Later: Replay the session
-$ z3ed collab replay session_2025_10_02.yaml --speed 2x
-
-# Replay shows:
-# - Timeline of all edits
-# - User activity heatmap
-# - Decision points (proposals accepted/rejected)
-# - Final ROM state comparison
-```
-
-**Acceptance Criteria**:
-- All session activity recorded in structured format (YAML/JSON)
-- Replay supports speed control (0.5x - 10x)
-- Export to video format (optional, uses screenshots)
-- Audit log for compliance/review
-
----
-
-## Architecture
-
-### Components
-
-#### 1. Collaboration Server (New)
-
-**Location**: `src/app/core/collab/collab_server.{h,cc}`
-
-**Responsibilities**:
-- Manage WebSocket connections from clients
-- Broadcast edit events to all connected clients
-- Handle session authentication (password, tokens)
-- Enforce access control (read-only vs read-write)
-- Maintain session state (active users, current ROM)
-
-**Technology**:
-- **WebSocket** for low-latency bidirectional communication
-- **Protocol Buffers** for efficient serialization
-- **JWT tokens** for session authentication
-- **Redis** (optional) for distributed sessions
-
-**Key APIs**:
-```cpp
-class CollabServer {
- public:
-  // Start server on specified port
-  absl::Status Start(int port, const std::string& password);
-
-  // Handle new client connection
-  void OnClientConnected(ClientConnection* client);
-
-  // Broadcast edit event to all clients
-  void BroadcastEdit(const EditEvent& event, ClientConnection* sender);
-
-  // Handle AI proposal from client
-  void BroadcastProposal(const AgentProposal& proposal);
-
-  // Get active users in session
-  std::vector GetActiveUsers() const;
-
- private:
-  std::unique_ptr ws_server_;
-  absl::Mutex clients_mutex_;
-  std::vector> clients_;
-  SessionState session_state_;
-};
-```
-
----
-
-#### 2. Collaboration Client (New)
-
-**Location**: `src/app/core/collab/collab_client.{h,cc}`
-
-**Responsibilities**:
-- Connect to remote collaboration server
-- Send local edits to server
-- Receive and apply remote edits
-- Sync ROM state on join
-- Handle disconnection/reconnection
-
-**Key APIs**:
-```cpp
-class CollabClient {
- public:
-  // Connect to session
-  absl::Status Connect(const std::string& url, const std::string& password);
-
-  // Send local edit to server
-  void SendEdit(const EditEvent& event);
-
-  // Callback when remote edit received
-  void OnRemoteEdit(const EditEvent& event);
-
-  // Get list of active users
-  std::vector GetUsers() const;
-
-  // Disconnect from session
-  void Disconnect();
-
- private:
-  std::unique_ptr ws_client_;
-  CollabEventHandler* event_handler_;
-  SessionInfo session_info_;
-};
-```
-
----
-
-#### 3. Edit Event Protocol (New)
-
-**Location**: `src/app/core/proto/collab_events.proto`
-
-**Message Definitions**:
-```protobuf
-syntax = "proto3";
-
-package yaze.collab;
-
-// Generic edit event
-message EditEvent {
-  string event_id = 1;  // Unique event ID
-  string user_id = 2;  // User who made the edit
-  int64 timestamp_ms = 3;  // Unix timestamp
-
-  oneof event_type {
-    TileEdit tile_edit = 10;
-    SpriteEdit sprite_edit = 11;
-    PaletteEdit palette_edit = 12;
-    MapEdit map_edit = 13;
-    ObjectEdit object_edit = 14;
-  }
-}
-
-// Tile edit (Tile16 Editor, Tilemap)
-message TileEdit {
-  string editor = 1;  // "tile16", "tilemap"
-  int32 x = 2;
-  int32 y = 3;
-  int32 layer = 4;
-  bytes tile_data = 5;  // Tile pixel data or ID
-}
-
-// Sprite edit
-message SpriteEdit {
-  int32 sprite_id = 1;
-  int32 x = 2;
-  int32 y = 3;
-  bytes sprite_data = 4;
-}
-
-// Map edit (Overworld/Dungeon)
-message MapEdit {
-  string map_type = 1;  // "overworld", "dungeon"
-  int32 map_id = 2;
-  bytes map_data = 3;
-}
-
-// User cursor position
-message CursorEvent {
-  string user_id = 1;
-  string editor = 2;  // Active editor window
-  int32 x = 3;
-  int32 y = 4;
-  string color = 5;  // Cursor color (hex)
-}
-
-// AI proposal event
-message ProposalEvent {
-  string proposal_id = 1;
-  string user_id = 2;  // User who initiated agent
-  string prompt = 3;
-  repeated ProposalAction actions = 4;
-
-  enum ProposalStatus {
-    PENDING = 0;
-    ACCEPTED = 1;
-    REJECTED = 2;
-    EXECUTING = 3;
-    COMPLETED = 4;
-  }
-  ProposalStatus status = 5;
-
-  // Voting
-  map votes = 6;  // user_id -> accept/reject
-  int32 votes_needed = 7;
-}
-
-message ProposalAction {
-  string action_type = 1;  // "click", "type", "edit"
-  map params = 2;
-}
-
-// Session state
-message SessionState {
-  string session_id = 1;
-  string host_user_id = 2;
-  repeated UserInfo users = 3;
-  bytes rom_checksum = 4;  // SHA256 of ROM
-  int64 session_start_ms = 5;
-}
-
-message UserInfo {
-  string user_id = 1;
-  string username = 2;
-  string color = 3;  // User's cursor color
-  bool is_host = 4;
-  bool read_only = 5;
-  string active_editor = 6;
-}
-```
-
----
-
-#### 4. Conflict Resolution System
-
-**Challenge**: Multiple users edit the same tile/sprite simultaneously
-
-**Solution**: Operational Transformation (OT) with timestamps
-
-```cpp
-class ConflictResolver {
- public:
-  // Resolve conflicting edits
-  EditEvent ResolveConflict(const EditEvent& local,
-                            const EditEvent& remote);
-
- private:
-  // Last-write-wins with timestamp
-  EditEvent LastWriteWins(const EditEvent& e1, const EditEvent& e2);
-
-  // Merge edits if possible (e.g., different layers)
-  std::optional TryMerge(const EditEvent& e1,
-                         const EditEvent& e2);
-};
-```
-
-**Conflict Resolution Rules**:
-1. **Same tile, different times**: Last write wins (based on timestamp)
-2. **Same tile, same time (<100ms)**: Host user wins (host authority)
-3. **Different tiles**: No conflict, apply both
-4. **Different layers**: No conflict, apply both
-5. **Undo/Redo**: Undo takes precedence (explicit user intent)
-
----
-
-#### 5. GUI Integration
-
-**Status Bar Indicator**:
-```
-┌────────────────────────────────────────────────────────────┐
-│ File Edit View Tools Help             👥 3 users connected │
-│                                       🟢 Alice (Host)      │
-│                                       🔵 Bob               │
-│                                       🟣 Charlie           │
-└────────────────────────────────────────────────────────────┘
-```
-
-**Collaboration Panel**:
-```
-┌─────────────────────────────────┐
-│ Collaboration                   │
-├─────────────────────────────────┤
-│ Session: yaze-collab-f3a9b2c1   │
-│ Status: 🟢 Connected            │
-│                                 │
-│ Users (3):                      │
-│  🟢 Alice (Host) - Dungeon      │
-│  🔵 Bob (You) - Overworld       │
-│  🟣 Charlie - Palette           │
-│                                 │
-│ Activity:                       │
-│  • Alice edited room 12         │
-│  • Bob added sprite #23         │
-│  • Charlie changed palette 2    │
-│                                 │
-│ [Chat] [Proposals] [Disconnect] │
-└─────────────────────────────────┘
-```
-
-**Cursor Overlay**:
-```cpp
-// In canvas rendering
-void OverworldCanvas::DrawCollaborativeCursors() {
-  for (const auto& user : collab_client_->GetUsers()) {
-    if (user.active_editor == "overworld" && user.user_id != my_user_id) {
-      ImVec2 cursor_pos = TileToScreen(user.cursor_x, user.cursor_y);
-      ImU32 color = ImGui::GetColorU32(user.color);
-
-      // Draw cursor indicator
-      draw_list->AddCircleFilled(cursor_pos, 5.0f, color);
-
-      // Draw username label
-      draw_list->AddText(cursor_pos + ImVec2(10, -5), color, user.username.c_str());
-    }
-  }
-}
-```
-
----
-
-## CLI Commands
-
-### Session Management
-
-```bash
-# Host a session
-z3ed collab host [options]
-  --port Port to listen on (default: 5000)
-  --password Session password (optional)
-  --max-users Maximum concurrent users (default: 8)
-  --read-only Comma-separated list of read-only users
-  --record Record session to file
-
-# Join a session
-z3ed collab join [options]
-  --password Session password
-  --username Display name (default: system username)
-  --read-only Join in read-only mode
-
-# List active sessions (LAN discovery)
-z3ed collab list
-
-# Disconnect from session
-z3ed collab disconnect
-
-# Kick user (host only)
-z3ed collab kick
-```
-
-### Session Replay
-
-```bash
-# Replay recorded session
-z3ed collab replay [options]
-  --speed Playback speed multiplier (default: 1.0)
-  --start-time