# LLM Integration Plan for z3ed Agent System

**Status**: Implementation Ready | Priority: High
**Created**: October 3, 2025
**Estimated Time**: 12-15 hours

## Executive Summary

This document outlines the practical implementation plan for integrating LLM capabilities into the z3ed agent system. The infrastructure is **already in place** with the `AIService` interface, `MockAIService` for testing, and a partially implemented `GeminiAIService`. This plan focuses on making LLM integration production-ready with both local (Ollama) and remote (Gemini, Claude) options.

**Current State**:
- ✅ `AIService` interface defined (`src/cli/service/ai_service.h`)
- ✅ `MockAIService` operational (returns hardcoded test commands)
- ✅ `GeminiAIService` skeleton implemented (needs fixes + proper prompting)
- ✅ Agent workflow fully functional with proposal system
- ✅ Resource catalogue (command schemas) ready for LLM consumption
- ✅ GUI automation harness operational for verification

**What's Missing**:
- 🔧 Ollama integration for local LLM support
- 🔧 Improved Gemini prompting with resource catalogue
- 🔧 Claude API integration as alternative remote option
- 🔧 AI service selection mechanism (env vars + CLI flags)
- 🔧 Proper prompt engineering with system instructions
- 🔧 Error handling and retry logic for API failures
- 🔧 Token usage monitoring and cost tracking

---

## 1. Implementation Priorities

### Phase 1: Ollama Local Integration (4-6 hours) 🎯 START HERE

**Rationale**: Ollama provides the fastest path to a working LLM agent with no API keys, costs, or rate limits. Perfect for development and testing.

**Benefits**:
- **Privacy**: All processing happens locally
- **Zero Cost**: No API charges or token limits
- **Offline**: Works without internet connection
- **Fast Iteration**: No rate limits for testing
- **Model Flexibility**: Easily swap between codellama, llama3, qwen2.5-coder, etc.
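For orientation, the two Ollama HTTP endpoints this phase wraps can be exercised directly with `curl` (this assumes Ollama is installed and `ollama serve` is running locally; the model name is only a suggestion and must already be pulled):

```bash
# List locally pulled models (the availability check below uses this endpoint)
curl -s http://localhost:11434/api/tags

# Single non-streaming generation; "format": "json" asks Ollama to constrain
# output to valid JSON, which the service relies on when parsing commands
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder:7b",
  "prompt": "Return a JSON array with the single string \"rom info --rom zelda3.sfc\"",
  "stream": false,
  "format": "json"
}'
```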
#### 1.1. Create OllamaAIService Class

**File**: `src/cli/service/ollama_ai_service.h`

```cpp
#ifndef YAZE_SRC_CLI_OLLAMA_AI_SERVICE_H_
#define YAZE_SRC_CLI_OLLAMA_AI_SERVICE_H_

#include <string>
#include <vector>

#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "cli/service/ai_service.h"

namespace yaze {
namespace cli {

// Ollama configuration for local LLM inference
struct OllamaConfig {
  std::string base_url = "http://localhost:11434";  // Default Ollama endpoint
  std::string model = "qwen2.5-coder:7b";  // Recommended for code generation
  float temperature = 0.1;                 // Low temp for deterministic commands
  int max_tokens = 2048;                   // Sufficient for command lists
  std::string system_prompt;               // Injected from resource catalogue
};

class OllamaAIService : public AIService {
 public:
  explicit OllamaAIService(const OllamaConfig& config);

  // Generate z3ed commands from natural language prompt
  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override;

  // Health check: verify Ollama server is running and model is available
  absl::Status CheckAvailability();

  // List available models on Ollama server
  absl::StatusOr<std::vector<std::string>> ListAvailableModels();

 private:
  OllamaConfig config_;

  // Build system prompt from resource catalogue
  std::string BuildSystemPrompt();

  // Parse JSON response from Ollama API
  absl::StatusOr<std::vector<std::string>> ParseOllamaResponse(
      const std::string& json_response);
};

}  // namespace cli
}  // namespace yaze

#endif  // YAZE_SRC_CLI_OLLAMA_AI_SERVICE_H_
```

**File**: `src/cli/service/ollama_ai_service.cc`

```cpp
#include "cli/service/ollama_ai_service.h"

#include <string>

#include "absl/strings/str_cat.h"
#include "absl/strings/str_format.h"

#ifdef YAZE_WITH_HTTPLIB
#include "incl/httplib.h"
#include "third_party/json/src/json.hpp"
#endif

namespace yaze {
namespace cli {

OllamaAIService::OllamaAIService(const OllamaConfig& config) : config_(config) {
  if (config_.system_prompt.empty()) {
    config_.system_prompt = BuildSystemPrompt();
  }
}

std::string OllamaAIService::BuildSystemPrompt() {
  // TODO: Read from docs/api/z3ed-resources.yaml
  return R"(You are an expert ROM hacking assistant for The Legend of Zelda: A Link to the Past.
Your role is to generate PRECISE z3ed CLI commands to fulfill user requests.

CRITICAL RULES:
1. Output ONLY a JSON array of command strings
2. Each command must follow exact z3ed syntax
3. Commands must be executable without modification
4. Use only commands from the available command set
5. Include all required arguments with proper flags

AVAILABLE COMMANDS:
- rom info --rom <file>
- rom validate --rom <file>
- palette export --group <group> --id <id> --to <file>
- palette import --group <group> --id <id> --from <file>
- palette set-color --file <file> --index <index> --color <hex>
- overworld get-tile --map <map> --x <x> --y <y>
- overworld set-tile --map <map> --x <x> --y <y> --tile <tile>
- dungeon export-room --room <room> --to <file>
- dungeon import-room --room <room> --from <file>

RESPONSE FORMAT:
["command1", "command2", "command3"]

EXAMPLE USER REQUEST: "Make all soldier armors red"
CORRECT RESPONSE:
["palette export --group sprites --id soldier --to /tmp/soldier.pal", "palette set-color --file /tmp/soldier.pal --index 5 --color FF0000", "palette import --group sprites --id soldier --from /tmp/soldier.pal"]

Begin your response now.)";
}

absl::Status OllamaAIService::CheckAvailability() {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError(
      "Ollama service requires httplib. "
      "Build with vcpkg or system httplib.");
#else
  try {
    httplib::Client cli(config_.base_url);
    cli.set_connection_timeout(5);  // 5 second timeout

    auto res = cli.Get("/api/tags");
    if (!res) {
      return absl::UnavailableError(absl::StrFormat(
          "Cannot connect to Ollama server at %s. "
          "Make sure Ollama is installed and running (ollama serve).",
          config_.base_url));
    }
    if (res->status != 200) {
      return absl::InternalError(
          absl::StrFormat("Ollama server error: HTTP %d", res->status));
    }

    // Check if requested model is available
    nlohmann::json models_json = nlohmann::json::parse(res->body);
    bool model_found = false;
    for (const auto& model : models_json["models"]) {
      if (model["name"].get<std::string>().find(config_.model) !=
          std::string::npos) {
        model_found = true;
        break;
      }
    }
    if (!model_found) {
      return absl::NotFoundError(absl::StrFormat(
          "Model '%s' not found. Pull it with: ollama pull %s",
          config_.model, config_.model));
    }

    return absl::OkStatus();
  } catch (const std::exception& e) {
    return absl::InternalError(absl::StrCat("Ollama check failed: ", e.what()));
  }
#endif
}

absl::StatusOr<std::vector<std::string>> OllamaAIService::ListAvailableModels() {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError("Requires httplib support");
#else
  httplib::Client cli(config_.base_url);
  auto res = cli.Get("/api/tags");
  if (!res || res->status != 200) {
    return absl::UnavailableError("Cannot list Ollama models");
  }

  nlohmann::json models_json = nlohmann::json::parse(res->body);
  std::vector<std::string> models;
  for (const auto& model : models_json["models"]) {
    models.push_back(model["name"].get<std::string>());
  }
  return models;
#endif
}

absl::StatusOr<std::vector<std::string>> OllamaAIService::GetCommands(
    const std::string& prompt) {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError(
      "Ollama service requires httplib. Build with vcpkg or system httplib.");
#else
  // Build request payload
  nlohmann::json request_body = {
      {"model", config_.model},
      {"prompt", config_.system_prompt + "\n\nUSER REQUEST: " + prompt},
      {"stream", false},
      {"temperature", config_.temperature},
      {"max_tokens", config_.max_tokens},
      {"format", "json"}  // Force JSON output
  };

  httplib::Client cli(config_.base_url);
  cli.set_read_timeout(60);  // Longer timeout for inference

  auto res = cli.Post("/api/generate", request_body.dump(), "application/json");
  if (!res) {
    return absl::UnavailableError(
        "Failed to connect to Ollama. Is 'ollama serve' running?");
  }
  if (res->status != 200) {
    return absl::InternalError(absl::StrFormat(
        "Ollama API error: HTTP %d - %s", res->status, res->body));
  }

  // Parse response
  try {
    nlohmann::json response_json = nlohmann::json::parse(res->body);
    std::string generated_text = response_json["response"].get<std::string>();

    // Parse the command array from generated text
    nlohmann::json commands_json = nlohmann::json::parse(generated_text);
    if (!commands_json.is_array()) {
      return absl::InvalidArgumentError(
          "LLM did not return a JSON array. Response: " + generated_text);
    }

    std::vector<std::string> commands;
    for (const auto& cmd : commands_json) {
      if (cmd.is_string()) {
        commands.push_back(cmd.get<std::string>());
      }
    }
    if (commands.empty()) {
      return absl::InvalidArgumentError(
          "LLM returned empty command list. Prompt may be unclear.");
    }
    return commands;
  } catch (const nlohmann::json::exception& e) {
    return absl::InternalError(absl::StrCat(
        "Failed to parse Ollama response: ", e.what(), "\nRaw: ", res->body));
  }
#endif
}

}  // namespace cli
}  // namespace yaze
```
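A minimal usage sketch of the class above (the prompt text and error handling are illustrative; it assumes the header compiles as listed and the httplib flag from section 1.2 is enabled):

```cpp
#include <iostream>

#include "cli/service/ollama_ai_service.h"

int main() {
  yaze::cli::OllamaConfig config;
  config.model = "qwen2.5-coder:7b";  // must already be pulled via `ollama pull`

  yaze::cli::OllamaAIService service(config);
  if (auto status = service.CheckAvailability(); !status.ok()) {
    std::cerr << "Ollama unavailable: " << status.message() << std::endl;
    return 1;
  }

  auto commands = service.GetCommands("Validate the ROM at zelda3.sfc");
  if (!commands.ok()) {
    std::cerr << commands.status().message() << std::endl;
    return 1;
  }
  for (const auto& cmd : *commands) {
    std::cout << cmd << std::endl;  // each entry is an executable z3ed command
  }
  return 0;
}
```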
#### 1.2. Add CMake Configuration

**File**: `CMakeLists.txt` (add to dependencies section)

```cmake
# Optional httplib for AI services (Ollama, Gemini)
option(YAZE_WITH_HTTPLIB "Enable HTTP client for AI services" ON)

if(YAZE_WITH_HTTPLIB)
  find_package(httplib CONFIG)
  if(httplib_FOUND)
    set(YAZE_WITH_HTTPLIB ON)
    add_compile_definitions(YAZE_WITH_HTTPLIB)
    message(STATUS "httplib found - AI services enabled")
  else()
    # Try to use bundled httplib from third_party
    if(EXISTS "${CMAKE_SOURCE_DIR}/third_party/httplib")
      set(YAZE_WITH_HTTPLIB ON)
      add_compile_definitions(YAZE_WITH_HTTPLIB)
      message(STATUS "Using bundled httplib - AI services enabled")
    else()
      set(YAZE_WITH_HTTPLIB OFF)
      message(WARNING "httplib not found - AI services disabled")
    endif()
  endif()
endif()
```

#### 1.3. Wire into Agent Commands

**File**: `src/cli/handlers/agent/general_commands.cc`

Replace hardcoded `MockAIService` usage with service selection:

```cpp
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <memory>

#include "cli/service/ollama_ai_service.h"
#include "cli/service/gemini_ai_service.h"

// Helper: Select AI service based on environment
std::unique_ptr<AIService> CreateAIService() {
  // Priority: Ollama (local) > Gemini (remote) > Mock (testing)
  const char* ollama_env = std::getenv("YAZE_AI_PROVIDER");
  const char* gemini_key = std::getenv("GEMINI_API_KEY");

  // Explicit provider selection
  if (ollama_env && std::string(ollama_env) == "ollama") {
    OllamaConfig config;
    // Allow model override via env
    if (const char* model = std::getenv("OLLAMA_MODEL")) {
      config.model = model;
    }

    auto service = std::make_unique<OllamaAIService>(config);

    // Health check
    if (auto status = service->CheckAvailability(); !status.ok()) {
      std::cerr << "⚠️ Ollama unavailable: " << status.message() << std::endl;
      std::cerr << "   Falling back to MockAIService" << std::endl;
      return std::make_unique<MockAIService>();
    }

    std::cout << "🤖 Using Ollama AI with model: " << config.model << std::endl;
    return service;
  }

  // Gemini if API key provided
  if (gemini_key && std::strlen(gemini_key) > 0) {
    std::cout << "🤖 Using Gemini AI (remote)" << std::endl;
    return std::make_unique<GeminiAIService>(gemini_key);
  }

  // Default: Mock service for testing
  std::cout << "🤖 Using MockAIService (no LLM configured)" << std::endl;
  std::cout << "   Set YAZE_AI_PROVIDER=ollama or GEMINI_API_KEY to enable LLM"
            << std::endl;
  return std::make_unique<MockAIService>();
}

// Update HandleRunCommand:
absl::Status HandleRunCommand(const std::vector<std::string>& arg_vec) {
  // ... existing setup code ...

  auto ai_service = CreateAIService();  // ← Replace MockAIService instantiation
  auto commands_or = ai_service->GetCommands(prompt);

  // ... rest of execution logic ...
}
```

#### 1.4. Testing & Validation

**Prerequisites**:

```bash
# Install Ollama (macOS)
brew install ollama

# Start Ollama server
ollama serve &

# Pull recommended model
ollama pull qwen2.5-coder:7b

# Test connectivity
curl http://localhost:11434/api/tags
```

**End-to-End Test Script** (`scripts/test_ollama_integration.sh`):

```bash
#!/bin/bash
set -e

echo "🧪 Testing Ollama AI Integration"

# 1. Check Ollama availability
echo "Checking Ollama server..."
if ! curl -s http://localhost:11434/api/tags > /dev/null; then
  echo "❌ Ollama not running. Start with: ollama serve"
  exit 1
fi

# 2. Check model availability
echo "Checking qwen2.5-coder:7b model..."
if ! ollama list | grep -q "qwen2.5-coder:7b"; then
  echo "⚠️ Model not found. Pulling..."
  ollama pull qwen2.5-coder:7b
fi

# 3. Test agent run with simple prompt
echo "Testing agent run command..."
export YAZE_AI_PROVIDER=ollama
export OLLAMA_MODEL=qwen2.5-coder:7b

./build/bin/z3ed agent run \
  --prompt "Export the first overworld palette to /tmp/test.pal" \
  --rom zelda3.sfc \
  --sandbox

# 4. Verify proposal created
echo "Checking proposal registry..."
if ! ./build/bin/z3ed agent list | grep -q "pending"; then
  echo "❌ No pending proposal found"
  exit 1
fi

# 5. Review generated commands
echo "✅ Reviewing generated commands..."
./build/bin/z3ed agent diff --format yaml

echo "✅ Ollama integration test passed!"
```

---

### Phase 2: Improve Gemini Integration (2-3 hours)

The existing `GeminiAIService` needs fixes and better prompting.

#### 2.1. Fix GeminiAIService Implementation

**File**: `src/cli/service/gemini_ai_service.cc`

```cpp
absl::StatusOr<std::vector<std::string>> GeminiAIService::GetCommands(
    const std::string& prompt) {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError(
      "Gemini AI service requires httplib. Build with vcpkg.");
#else
  if (api_key_.empty()) {
    return absl::FailedPreconditionError(
        "GEMINI_API_KEY not set. Get key from: "
        "https://makersuite.google.com/app/apikey");
  }

  // Build comprehensive system instruction
  std::string system_instruction = R"({
    "role": "system",
    "content": "You are an expert ROM hacking assistant for The Legend of Zelda: A Link to the Past. Generate ONLY a JSON array of z3ed CLI commands. Each command must be executable without modification. Available commands: rom info, rom validate, palette export/import/set-color, overworld get-tile/set-tile, dungeon export-room/import-room. Response format: [\"command1\", \"command2\"]"
  })";

  httplib::Client cli("https://generativelanguage.googleapis.com");
  cli.set_read_timeout(60);

  nlohmann::json request_body = {
      {"contents", {{
          {"role", "user"},
          {"parts", {{
              {"text", absl::StrFormat("System: %s\n\nUser: %s",
                                       system_instruction, prompt)}
          }}}
      }}},
      {"generationConfig", {
          {"temperature", 0.1},  // Low temp for deterministic output
          {"maxOutputTokens", 2048},
          {"topP", 0.8},
          {"topK", 10}
      }},
      {"safetySettings", {
          {{"category", "HARM_CATEGORY_DANGEROUS_CONTENT"},
           {"threshold", "BLOCK_NONE"}}
      }}
  };

  httplib::Headers headers = {
      {"Content-Type", "application/json"},
  };

  std::string endpoint = absl::StrFormat(
      "/v1beta/models/gemini-1.5-flash:generateContent?key=%s", api_key_);

  auto res = cli.Post(endpoint, headers, request_body.dump(), "application/json");
  if (!res) {
    return absl::UnavailableError(
        "Failed to connect to Gemini API. Check internet connection.");
  }
  if (res->status != 200) {
    return absl::InternalError(absl::StrFormat(
        "Gemini API error: HTTP %d - %s", res->status, res->body));
  }

  // Parse response
  try {
    nlohmann::json response_json = nlohmann::json::parse(res->body);

    // Extract text from nested structure
    std::string text_content =
        response_json["candidates"][0]["content"]["parts"][0]["text"]
            .get<std::string>();

    // Gemini may wrap JSON in markdown code blocks - strip them
    if (text_content.find("```json") != std::string::npos) {
      size_t start = text_content.find("[");
      size_t end = text_content.rfind("]");
      if (start != std::string::npos && end != std::string::npos) {
        text_content = text_content.substr(start, end - start + 1);
      }
    }

    nlohmann::json commands_array = nlohmann::json::parse(text_content);
    if (!commands_array.is_array()) {
      return absl::InvalidArgumentError(
          "Gemini did not return a JSON array. "
Response: " + text_content); } std::vector commands; for (const auto& cmd : commands_array) { if (cmd.is_string()) { commands.push_back(cmd.get()); } } return commands; } catch (const nlohmann::json::exception& e) { return absl::InternalError(absl::StrCat( "Failed to parse Gemini response: ", e.what(), "\nRaw: ", res->body)); } #endif } ``` --- ### Phase 3: Add Claude Integration (2-3 hours) Claude 3.5 Sonnet is excellent for code generation and has a generous free tier. #### 3.1. Create ClaudeAIService **File**: `src/cli/service/claude_ai_service.h` ```cpp #ifndef YAZE_SRC_CLI_CLAUDE_AI_SERVICE_H_ #define YAZE_SRC_CLI_CLAUDE_AI_SERVICE_H_ #include #include #include "absl/status/statusor.h" #include "cli/service/ai_service.h" namespace yaze { namespace cli { class ClaudeAIService : public AIService { public: explicit ClaudeAIService(const std::string& api_key); absl::StatusOr> GetCommands( const std::string& prompt) override; private: std::string api_key_; std::string model_ = "claude-3-5-sonnet-20241022"; // Latest version }; } // namespace cli } // namespace yaze #endif // YAZE_SRC_CLI_CLAUDE_AI_SERVICE_H_ ``` **File**: `src/cli/service/claude_ai_service.cc` ```cpp #include "cli/service/claude_ai_service.h" #include "absl/strings/str_format.h" #ifdef YAZE_WITH_HTTPLIB #include "incl/httplib.h" #include "third_party/json/src/json.hpp" #endif namespace yaze { namespace cli { ClaudeAIService::ClaudeAIService(const std::string& api_key) : api_key_(api_key) {} absl::StatusOr> ClaudeAIService::GetCommands( const std::string& prompt) { #ifndef YAZE_WITH_HTTPLIB return absl::UnimplementedError("Claude service requires httplib"); #else if (api_key_.empty()) { return absl::FailedPreconditionError( "CLAUDE_API_KEY not set. Get key from: https://console.anthropic.com/"); } httplib::Client cli("https://api.anthropic.com"); cli.set_read_timeout(60); nlohmann::json request_body = { {"model", model_}, {"max_tokens", 2048}, {"temperature", 0.1}, {"system", "You are an expert ROM hacking assistant. Generate ONLY a JSON array of z3ed commands. No explanations."}, {"messages", {{ {"role", "user"}, {"content", prompt} }}} }; httplib::Headers headers = { {"Content-Type", "application/json"}, {"x-api-key", api_key_}, {"anthropic-version", "2023-06-01"} }; auto res = cli.Post("/v1/messages", headers, request_body.dump(), "application/json"); if (!res) { return absl::UnavailableError("Failed to connect to Claude API"); } if (res->status != 200) { return absl::InternalError(absl::StrFormat( "Claude API error: HTTP %d - %s", res->status, res->body)); } try { nlohmann::json response_json = nlohmann::json::parse(res->body); std::string text_content = response_json["content"][0]["text"].get(); // Claude may wrap in markdown - strip if present if (text_content.find("```json") != std::string::npos) { size_t start = text_content.find("["); size_t end = text_content.rfind("]"); if (start != std::string::npos && end != std::string::npos) { text_content = text_content.substr(start, end - start + 1); } } nlohmann::json commands_json = nlohmann::json::parse(text_content); std::vector commands; for (const auto& cmd : commands_json) { if (cmd.is_string()) { commands.push_back(cmd.get()); } } return commands; } catch (const std::exception& e) { return absl::InternalError(absl::StrCat( "Failed to parse Claude response: ", e.what())); } #endif } } // namespace cli } // namespace yaze ``` --- ### Phase 4: Enhanced Prompt Engineering (3-4 hours) #### 4.1. 
---

### Phase 4: Enhanced Prompt Engineering (3-4 hours)

#### 4.1. Load Resource Catalogue into System Prompt

**File**: `src/cli/service/prompt_builder.h`

```cpp
#ifndef YAZE_SRC_CLI_PROMPT_BUILDER_H_
#define YAZE_SRC_CLI_PROMPT_BUILDER_H_

#include <string>

#include "absl/status/statusor.h"

namespace yaze {
namespace cli {

// Utility for building comprehensive LLM prompts from the resource catalogue
class PromptBuilder {
 public:
  // Load command schemas from docs/api/z3ed-resources.yaml
  static absl::StatusOr<std::string> LoadResourceCatalogue();

  // Build system prompt with full command documentation
  static std::string BuildSystemPrompt();

  // Build few-shot examples for better LLM performance
  static std::string BuildFewShotExamples();

  // Inject ROM context (current ROM info, loaded editors, etc.)
  static std::string BuildContextPrompt();
};

}  // namespace cli
}  // namespace yaze

#endif  // YAZE_SRC_CLI_PROMPT_BUILDER_H_
```

#### 4.2. Few-Shot Examples

Include proven examples in the system prompt:

```cpp
std::string PromptBuilder::BuildFewShotExamples() {
  return R"(
EXAMPLE 1:
User: "Make soldier armor red"
Response: ["palette export --group sprites --id soldier --to /tmp/soldier.pal", "palette set-color --file /tmp/soldier.pal --index 5 --color FF0000", "palette import --group sprites --id soldier --from /tmp/soldier.pal"]

EXAMPLE 2:
User: "Validate ROM integrity"
Response: ["rom validate --rom zelda3.sfc"]

EXAMPLE 3:
User: "Change tile at coordinates (10, 20) in Light World to grass"
Response: ["overworld set-tile --map 0 --x 10 --y 20 --tile 0x40"]
)";
}
```

---

## 2. Configuration & User Experience

### Environment Variables

```bash
# AI Provider Selection
export YAZE_AI_PROVIDER=ollama          # Options: ollama, gemini, claude, mock
export OLLAMA_MODEL=qwen2.5-coder:7b
export OLLAMA_URL=http://localhost:11434

# API Keys (remote providers)
export GEMINI_API_KEY=your_key_here
export CLAUDE_API_KEY=your_key_here

# Logging & Debugging
export YAZE_AI_DEBUG=1                        # Log full prompts and responses
export YAZE_AI_CACHE_DIR=/tmp/yaze_ai_cache   # Cache LLM responses
```

### CLI Flags

Add new flags to `z3ed agent run`:

```bash
# Override provider for single command
z3ed agent run --prompt "..." --ai-provider ollama

# Override model
z3ed agent run --prompt "..." --ai-model "llama3:70b"

# Dry run: show generated commands without executing
z3ed agent run --prompt "..." --dry-run

# Interactive mode: confirm each command before execution
z3ed agent run --prompt "..." --interactive
```
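As a rough sketch of how `--dry-run` could short-circuit execution inside `HandleRunCommand` (the surrounding handler comes from section 1.3; the `dry_run` variable and its flag parsing are assumed, not existing code):

```cpp
// Hypothetical continuation of HandleRunCommand; assumes a bool `dry_run`
// was already populated from the --dry-run flag by the CLI parser.
auto ai_service = CreateAIService();
auto commands_or = ai_service->GetCommands(prompt);
if (!commands_or.ok()) return commands_or.status();

if (dry_run) {
  std::cout << "Planned commands (not executed):" << std::endl;
  for (const auto& cmd : *commands_or) {
    std::cout << "  " << cmd << std::endl;
  }
  return absl::OkStatus();  // skip the proposal/sandbox execution path
}
// ... otherwise fall through to the existing execution logic ...
```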
Testing & Validation ### Unit Tests **File**: `test/cli/ai_service_test.cc` ```cpp #include "cli/service/ollama_ai_service.h" #include "cli/service/gemini_ai_service.h" #include "cli/service/claude_ai_service.h" #include TEST(OllamaAIServiceTest, CheckAvailability) { OllamaConfig config; config.base_url = "http://localhost:11434"; OllamaAIService service(config); // Should not crash, may return unavailable if Ollama not running auto status = service.CheckAvailability(); EXPECT_TRUE(status.ok() || absl::IsUnavailable(status) || absl::IsNotFound(status)); } TEST(OllamaAIServiceTest, GetCommands) { // Skip if Ollama not available OllamaConfig config; OllamaAIService service(config); if (!service.CheckAvailability().ok()) { GTEST_SKIP() << "Ollama not available"; } auto result = service.GetCommands("Validate the ROM"); ASSERT_TRUE(result.ok()) << result.status(); auto commands = result.value(); EXPECT_GT(commands.size(), 0); EXPECT_THAT(commands[0], testing::HasSubstr("rom validate")); } ``` ### Integration Tests **File**: `scripts/test_ai_services.sh` ```bash #!/bin/bash set -e echo "🧪 Testing AI Services Integration" # Test 1: Ollama (if available) if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then echo "✓ Ollama available - testing..." export YAZE_AI_PROVIDER=ollama ./build/bin/z3ed agent plan --prompt "Export first palette" else echo "⊘ Ollama not running - skipping" fi # Test 2: Gemini (if key set) if [ -n "$GEMINI_API_KEY" ]; then echo "✓ Gemini API key set - testing..." export YAZE_AI_PROVIDER=gemini ./build/bin/z3ed agent plan --prompt "Validate ROM" else echo "⊘ GEMINI_API_KEY not set - skipping" fi # Test 3: Claude (if key set) if [ -n "$CLAUDE_API_KEY" ]; then echo "✓ Claude API key set - testing..." export YAZE_AI_PROVIDER=claude ./build/bin/z3ed agent plan --prompt "Export dungeon room" else echo "⊘ CLAUDE_API_KEY not set - skipping" fi echo "✅ All available AI services tested successfully" ``` --- ## 4. Documentation Updates ### User Guide **File**: `docs/z3ed/AI-SERVICE-SETUP.md` ```markdown # Setting Up LLM Integration for z3ed ## Quick Start: Ollama (Recommended) 1. **Install Ollama**: ```bash # macOS brew install ollama # Linux curl -fsSL https://ollama.com/install.sh | sh ``` 2. **Start Server**: ```bash ollama serve ``` 3. **Pull Model**: ```bash ollama pull qwen2.5-coder:7b # Recommended: fast + accurate ``` 4. **Configure z3ed**: ```bash export YAZE_AI_PROVIDER=ollama ``` 5. **Test**: ```bash z3ed agent run --prompt "Validate my ROM" --rom zelda3.sfc ``` ## Alternative: Gemini API (Remote) 1. Get API key: https://makersuite.google.com/app/apikey 2. Configure: ```bash export GEMINI_API_KEY=your_key_here export YAZE_AI_PROVIDER=gemini ``` 3. Run: `z3ed agent run --prompt "..."` ## Alternative: Claude API (Remote) 1. Get API key: https://console.anthropic.com/ 2. Configure: ```bash export CLAUDE_API_KEY=your_key_here export YAZE_AI_PROVIDER=claude ``` 3. Run: `z3ed agent run --prompt "..."` ## Troubleshooting **Issue**: "Cannot connect to Ollama" **Solution**: Make sure `ollama serve` is running **Issue**: "Model not found" **Solution**: Run `ollama pull ` **Issue**: "LLM returned empty command list" **Solution**: Rephrase prompt to be more specific ``` --- ## 5. 
## 5. Implementation Timeline

### Week 1 (October 7-11)
- **Day 1-2**: Implement `OllamaAIService` class
- **Day 3**: Wire into agent commands with service selection
- **Day 4**: Testing and validation
- **Day 5**: Documentation and examples

### Week 2 (October 14-18)
- **Day 1**: Fix and improve `GeminiAIService`
- **Day 2**: Implement `ClaudeAIService`
- **Day 3**: Enhanced prompt engineering with resource catalogue
- **Day 4**: Integration tests across all services
- **Day 5**: User guide and troubleshooting docs

---

## 6. Success Criteria

✅ **Phase 1 Complete When**:
- Ollama service connects and generates valid commands
- `z3ed agent run` works end-to-end with a local LLM
- Health checks report clear error messages
- Test script passes on macOS with Ollama installed

✅ **Phase 2 Complete When**:
- Gemini API calls succeed with valid responses
- Markdown code block stripping works reliably
- Error messages are actionable (e.g., "API key invalid")

✅ **Phase 3 Complete When**:
- Claude service implemented with the same interface
- All three services (Ollama, Gemini, Claude) work interchangeably
- Service selection mechanism is transparent to the user

✅ **Phase 4 Complete When**:
- System prompts include the full resource catalogue
- Few-shot examples improve command accuracy to >90%
- LLM responses consistently match the expected command format

---

## 7. Future Enhancements (Post-MVP)

- **Response Caching**: Cache LLM responses by prompt hash to reduce costs/latency
- **Token Usage Tracking**: Monitor and report token consumption per session
- **Model Comparison**: A/B test different models for accuracy/cost trade-offs
- **Fine-Tuning**: Fine-tune local models on a z3ed command corpus
- **Multi-Turn Dialogue**: Support follow-up questions and clarifications
- **Agentic Loop**: LLM self-corrects based on execution results
- **GUI Integration**: In-app AI assistant panel in the YAZE editor

---

## Appendix A: Recommended Models

| Model | Provider | Size | Speed | Accuracy | Use Case |
|-------|----------|------|-------|----------|----------|
| qwen2.5-coder:7b | Ollama | 7B | Fast | High | **Recommended**: Best balance |
| codellama:13b | Ollama | 13B | Medium | Higher | Complex tasks |
| llama3:70b | Ollama | 70B | Slow | Highest | Maximum accuracy |
| gemini-1.5-flash | Gemini | N/A | Fast | High | Remote option, low cost |
| claude-3.5-sonnet | Claude | N/A | Medium | Highest | Premium remote option |

## Appendix B: Example Prompts

**Simple**:
- "Validate the ROM"
- "Export the first palette"
- "Show ROM info"

**Moderate**:
- "Make soldier armor red"
- "Change tile at (10, 20) in Light World to grass"
- "Export dungeon room 5 to /tmp/room5.bin"

**Complex**:
- "Find all palettes using color #FF0000 and change to #00FF00"
- "Export all dungeon rooms, modify object 3 in each, then reimport"
- "Generate a comparison report between two ROM versions"

---

## Next Steps

**👉 START HERE**: Implement Phase 1 (Ollama Integration) by following sections 1.1-1.4 above.

Once complete, update this document with:
- Actual time spent vs. estimates
- Issues encountered and solutions
- Model performance observations
- User feedback

**Questions? Blockers?** Open an issue or ping @scawful in Discord.