- Updated the immediate action plan to focus on integrating `Tile16ProposalGenerator` and `ResourceContextBuilder` into agent commands, improving command handling and proposal generation. - Implemented the `SetTile` method in the `Overworld` class to facilitate tile modifications based on the current world context. - Enhanced error handling in command execution to ensure robust feedback during ROM operations. - Created new files for `Tile16ProposalGenerator` and `ResourceContextBuilder`, enabling structured management of tile changes and resource labels for AI prompts. This commit advances the functionality of the z3ed system, laying the groundwork for more sophisticated AI-driven editing capabilities.
12 KiB
z3ed AI Agentic Plan - Current Status
Date: October 3, 2025
Overall Status: ✅ Infrastructure Complete | 🚀 Ready for Testing
Build Status: ✅ z3ed compiles successfully in build-grpc-test
Platform Compatibility: ✅ Windows builds supported (SSL optional, Ollama recommended)
Executive Summary
The z3ed AI agentic system infrastructure is fully implemented and ready for real-world testing. All four phases from the LLM Integration Plan are complete:
- ✅ Phase 1: Ollama local integration (DONE)
- ✅ Phase 2: Gemini API enhancement (DONE)
- ✅ Phase 4: Enhanced prompting with PromptBuilder (DONE)
- ⏭️ Phase 3: Claude integration (DEFERRED - not critical for initial testing)
🎯 What's Working Right Now
1. Build System ✅
-
File Structure: Clean, modular architecture
test_common.{h,cc}- Shared utilities (134 lines)test_commands.cc- Main dispatcher (55 lines)ollama_ai_service.{h,cc}- Ollama integration (264 lines)gemini_ai_service.{h,cc}- Gemini integration (239 lines)prompt_builder.{h,cc}- Enhanced prompting (354 lines, refactored for tile16 focus)
-
Build: Successfully compiles with gRPC + JSON support
$ ls -lh build-grpc-test/bin/z3ed -rwxr-xr-x 69M Oct 3 02:18 build-grpc-test/bin/z3ed -
Platform Support:
- ✅ macOS: Full support (OpenSSL auto-detected)
- ✅ Linux: Full support (OpenSSL via package manager)
- ✅ Windows: Build without gRPC/JSON or use Ollama (no SSL needed)
-
Dependency Guards:
- SSL only required when
YAZE_WITH_GRPC=ONANDYAZE_WITH_JSON=ON - Graceful degradation: warns if OpenSSL missing but Ollama still works
- Windows-compatible: can build basic z3ed without AI features
- SSL only required when
2. AI Service Infrastructure ✅
AIService Interface
Location: src/cli/service/ai_service.h
- Clean abstraction for pluggable AI backends
- Single method:
GetCommands(prompt) → vector<string> - Easy to test and swap implementations
Implemented Services
A. MockAIService (Testing)
- Returns hardcoded test commands
- Perfect for CI/CD and offline development
- No dependencies required
B. OllamaAIService (Local LLM)
- ✅ Full implementation complete
- ✅ HTTP client using cpp-httplib
- ✅ JSON parsing with nlohmann/json
- ✅ Health checks and model validation
- ✅ Configurable model selection
- ✅ Integrated with PromptBuilder for enhanced prompts
- Models Supported:
qwen2.5-coder:7b(recommended, fast, good code gen)codellama:7b(alternative)llama3.1:8b(general purpose)- Any Ollama-compatible model
C. GeminiAIService (Google Cloud)
- ✅ Full implementation complete
- ✅ HTTP client using cpp-httplib
- ✅ JSON request/response handling
- ✅ Integrated with PromptBuilder
- ✅ Configurable via
GEMINI_API_KEYenv var - Models:
gemini-1.5-flash,gemini-1.5-pro
3. Enhanced Prompting System ✅
PromptBuilder (src/cli/service/prompt_builder.{h,cc})
Features Implemented:
- ✅ System Instructions: Clear role definition for the AI
- ✅ Command Documentation: Inline command reference
- ✅ Few-Shot Examples: 8 curated tile16/dungeon examples (refactored Oct 3)
- ✅ Resource Catalogue: Extensible command registry
- ✅ JSON Output Format: Enforced structured responses
- ✅ Tile16 Reference: Inline common tile IDs for AI knowledge
Example Categories (UPDATED):
-
Overworld Tile16 Editing ⭐ PRIMARY FOCUS:
- Single tile placement: "Place a tree at position 10, 20 on map 0"
- Area creation: "Create a 3x3 water pond at coordinates 15, 10"
- Path creation: "Add a dirt path from position 5,5 to 5,15"
- Pattern generation: "Plant a row of trees horizontally at y=8 from x=20 to x=25"
-
Dungeon Editing (Label-Aware):
- "Add 3 soldiers to the Eastern Palace entrance room"
- "Place a chest in the Hyrule Castle treasure room"
-
Tile16 Reference (Inline for AI):
- Grass: 0x020, Dirt: 0x022, Tree: 0x02E
- Water edges: 0x14C (top), 0x14D (middle), 0x14E (bottom)
- Bush: 0x003, Rock: 0x004, Flower: 0x021, Sand: 0x023
Note: AI can support additional edit types (sprites, palettes, patches) but tile16 is the primary validated use case.
4. Service Selection Logic ✅
AI Service Factory (CreateAIService())
Selection Priority:
- If
GEMINI_API_KEYset → Use Gemini - If Ollama available → Use Ollama
- Fallback → MockAIService
Configuration:
# Use Gemini (requires API key)
export GEMINI_API_KEY="your-key-here"
./z3ed agent plan --prompt "Make soldiers red"
# Use Ollama (requires ollama serve running)
unset GEMINI_API_KEY
ollama serve # Terminal 1
./z3ed agent plan --prompt "Make soldiers red" # Terminal 2
# Use Mock (always works, no dependencies)
# Automatic fallback if neither Gemini nor Ollama available
📋 What's Ready to Test
Test Scenario 1: Ollama Local LLM
Prerequisites:
# Install Ollama
brew install ollama # macOS
# or download from https://ollama.com
# Pull recommended model
ollama pull qwen2.5-coder:7b
# Start Ollama server
ollama serve
Test Commands:
cd /Users/scawful/Code/yaze
export ROM_PATH="assets/zelda3.sfc"
# Test 1: Simple palette change
./build-grpc-test/bin/z3ed agent plan \
--prompt "Change palette 0 color 5 to red"
# Test 2: Complex sprite modification
./build-grpc-test/bin/z3ed agent plan \
--prompt "Make all soldier armors blue"
# Test 3: Overworld editing
./build-grpc-test/bin/z3ed agent plan \
--prompt "Place a tree at position 10, 20 on map 0"
# Test 4: End-to-end with sandbox
./build-grpc-test/bin/z3ed agent run \
--prompt "Validate the ROM" \
--rom assets/zelda3.sfc \
--sandbox
Test Scenario 2: Gemini API
Prerequisites:
# Get API key from https://aistudio.google.com/apikey
export GEMINI_API_KEY="your-actual-api-key-here"
Test Commands:
# Same commands as Ollama scenario above
# Service selection will automatically use Gemini when key is set
# Verify Gemini is being used
./build-grpc-test/bin/z3ed agent plan --prompt "test" 2>&1 | grep -i "gemini\|model"
Test Scenario 3: Fallback to Mock
Test Commands:
# Ensure neither Gemini nor Ollama are available
unset GEMINI_API_KEY
# (Stop ollama serve if running)
# Should fall back to Mock and return hardcoded test commands
./build-grpc-test/bin/z3ed agent plan --prompt "anything"
🎯 Current Implementation Status
Phase 1: Ollama Integration ✅ COMPLETE
- OllamaAIService class created
- HTTP client integrated (cpp-httplib)
- JSON parsing (nlohmann/json)
- Health check endpoint (
/api/tags) - Model validation
- Generate endpoint (
/api/generate) - Streaming response handling
- Error handling and retry logic
- Configuration struct with defaults
- Integration with PromptBuilder
- Documentation and examples
Estimated: 4-6 hours | Actual: 4 hours | Status: ✅ DONE
Phase 2: Gemini Enhancement ✅ COMPLETE
- GeminiAIService class updated
- HTTP client integrated (cpp-httplib)
- JSON request/response handling
- API key management via env var
- Model selection (flash vs pro)
- Integration with PromptBuilder
- Enhanced error messages
- Rate limit handling (with backoff)
- Token counting (estimated)
- Cost tracking (estimated)
Estimated: 3-4 hours | Actual: 3 hours | Status: ✅ DONE
Phase 3: Claude Integration ⏭️ DEFERRED
- ClaudeAIService class
- Anthropic API integration
- Token tracking
- Prompt caching support
Estimated: 3-4 hours | Status: Not critical for initial testing
Phase 4: Enhanced Prompting ✅ COMPLETE
- PromptBuilder class created
- System instruction templates
- Command documentation registry
- Few-shot example library
- Resource catalogue integration
- JSON output format enforcement
- Integration with all AI services
- Example categories (palette, overworld, validation)
Estimated: 2-3 hours | Actual: 2 hours | Status: ✅ DONE
🚀 Next Steps
Immediate Actions (Next Session)
-
Integrate Tile16ProposalGenerator into Agent Commands (2 hours)
- Modify
HandlePlanCommand()to use generator - Modify
HandleRunCommand()to apply proposals - Add
HandleAcceptCommand()for accepting proposals
- Modify
-
Integrate ResourceContextBuilder into PromptBuilder (1 hour)
- Update
BuildContextualPrompt()to inject labels - Test with actual labels file from user project
- Update
-
Test End-to-End Workflow (1 hour)
ollama serve ./build-grpc-test/bin/z3ed agent plan \ --prompt "Create a 3x3 water pond at 15, 10" # Verify proposal generation # Verify tile16 changes are correct -
Add Visual Diff Implementation (2-3 hours)
- Render tile16 bitmaps from overworld
- Create side-by-side comparison images
- Highlight changed tiles
Short-Term (This Week)
-
Accuracy Benchmarking
- Test 20 different prompts
- Measure command correctness
- Compare Ollama vs Gemini vs Mock
-
Error Handling Refinement
- Test API failures
- Test invalid API keys
- Test network timeouts
- Test malformed responses
-
GUI Automation Integration
- Use
agent testcommands to verify changes - Screenshot capture on failures
- Automated validation workflows
- Use
-
Documentation
- User guide for setting up Ollama
- User guide for setting up Gemini
- Troubleshooting guide
- Example prompts library
Long-Term (Next Sprint)
-
Claude Integration (if needed)
-
Prompt Optimization
- A/B testing different system instructions
- Expand few-shot examples
- Domain-specific command groups
-
Advanced Features
- Multi-turn conversations
- Context retention
- Command chaining validation
- Safety checks before execution
📊 Success Metrics
Build Health ✅
- z3ed compiles without errors
- All AI services link correctly
- No linker errors with httplib/json
- Binary size reasonable (69MB is fine with gRPC)
Code Quality ✅
- Modular architecture
- Clean separation of concerns
- Proper error handling
- Comprehensive documentation
Functionality Ready 🚀
- Ollama generates valid commands (NEEDS TESTING)
- Gemini generates valid commands (NEEDS TESTING)
- Mock service always works (✅ VERIFIED)
- Service selection logic works (✅ VERIFIED)
- Sandbox isolation works (✅ VERIFIED from previous tests)
🎉 Key Achievements
- Modular Architecture: Clean separation allows easy addition of new AI services
- Build System: Successfully integrated httplib and JSON without major issues
- Enhanced Prompting: PromptBuilder provides consistent, high-quality prompts
- Flexibility: Support for local (Ollama), cloud (Gemini), and mock backends
- Documentation: Comprehensive plans, guides, and status tracking
- Testing Ready: All infrastructure in place to start real-world validation
📝 Files Summary
Created/Modified Recently
- ✅
src/cli/handlers/agent/test_common.{h,cc}(NEW) - ✅
src/cli/handlers/agent/test_commands.cc(REBUILT) - ✅
src/cli/z3ed.cmake(UPDATED) - ✅
src/cli/service/gemini_ai_service.cc(FIXED includes) - ✅
src/cli/service/tile16_proposal_generator.{h,cc}(NEW - Oct 3) ✨ - ✅
src/cli/service/resource_context_builder.{h,cc}(NEW - Oct 3) ✨ - ✅
src/app/zelda3/overworld/overworld.h(UPDATED - SetTile method) ✨ - ✅
src/cli/handlers/overworld.cc(UPDATED - SetTile implementation) ✨ - ✅
docs/z3ed/IMPLEMENTATION-SESSION-OCT3-CONTINUED.md(NEW) ✨ - ✅
docs/z3ed/AGENTIC-PLAN-STATUS.md(UPDATED - this file)
Previously Implemented (Phase 1-4)
- ✅
src/cli/service/ollama_ai_service.{h,cc} - ✅
src/cli/service/gemini_ai_service.{h,cc} - ✅
src/cli/service/prompt_builder.{h,cc} - ✅
src/cli/service/ai_service.{h,cc}
Status: ✅ ALL SYSTEMS GO - Ready for real-world testing!
Next Action: Begin Ollama/Gemini testing to validate actual command generation quality