# LLM Integration Plan for z3ed Agent System

**Status**: Implementation Ready | Priority: High
**Created**: October 3, 2025
**Estimated Time**: 12-15 hours

## Executive Summary

This document outlines the practical implementation plan for integrating LLM capabilities into the z3ed agent system. The infrastructure is **already in place** with the `AIService` interface, `MockAIService` for testing, and a partially implemented `GeminiAIService`. This plan focuses on making LLM integration production-ready with both local (Ollama) and remote (Gemini, Claude) options.

**Current State**:

- ✅ `AIService` interface defined (`src/cli/service/ai_service.h`)
- ✅ `MockAIService` operational (returns hardcoded test commands)
- ✅ `GeminiAIService` skeleton implemented (needs fixes + proper prompting)
- ✅ Agent workflow fully functional with proposal system
- ✅ Resource catalogue (command schemas) ready for LLM consumption
- ✅ GUI automation harness operational for verification
**What's Missing**:

- 🔧 Ollama integration for local LLM support
- 🔧 Improved Gemini prompting with resource catalogue
- 🔧 Claude API integration as alternative remote option
- 🔧 AI service selection mechanism (env vars + CLI flags)
- 🔧 Proper prompt engineering with system instructions
- 🔧 Error handling and retry logic for API failures
- 🔧 Token usage monitoring and cost tracking

---
## 1. Implementation Priorities

### Phase 1: Ollama Local Integration (4-6 hours) 🎯 START HERE

**Rationale**: Ollama provides the fastest path to a working LLM agent with no API keys, costs, or rate limits. Perfect for development and testing.

**Benefits**:

- **Privacy**: All processing happens locally
- **Zero Cost**: No API charges or token limits
- **Offline**: Works without an internet connection
- **Fast Iteration**: No rate limits for testing
- **Model Flexibility**: Easily swap between codellama, llama3, qwen2.5-coder, etc.

#### 1.1. Create OllamaAIService Class

**File**: `src/cli/service/ollama_ai_service.h`
```cpp
#ifndef YAZE_SRC_CLI_OLLAMA_AI_SERVICE_H_
#define YAZE_SRC_CLI_OLLAMA_AI_SERVICE_H_

#include <string>
#include <vector>

#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "cli/service/ai_service.h"

namespace yaze {
namespace cli {

// Ollama configuration for local LLM inference
struct OllamaConfig {
  std::string base_url = "http://localhost:11434";  // Default Ollama endpoint
  std::string model = "qwen2.5-coder:7b";  // Recommended for code generation
  float temperature = 0.1f;                // Low temp for deterministic commands
  int max_tokens = 2048;                   // Sufficient for command lists
  std::string system_prompt;               // Injected from resource catalogue
};

class OllamaAIService : public AIService {
 public:
  explicit OllamaAIService(const OllamaConfig& config);

  // Generate z3ed commands from a natural-language prompt.
  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override;

  // Health check: verify the Ollama server is running and the model is
  // available.
  absl::Status CheckAvailability();

  // List available models on the Ollama server.
  absl::StatusOr<std::vector<std::string>> ListAvailableModels();

 private:
  OllamaConfig config_;

  // Build system prompt from resource catalogue.
  std::string BuildSystemPrompt();

  // Parse JSON response from the Ollama API.
  absl::StatusOr<std::string> ParseOllamaResponse(
      const std::string& json_response);
};

}  // namespace cli
}  // namespace yaze

#endif  // YAZE_SRC_CLI_OLLAMA_AI_SERVICE_H_
```

**File**: `src/cli/service/ollama_ai_service.cc`
```cpp
#include "cli/service/ollama_ai_service.h"

#include <cstdlib>

#include "absl/strings/str_cat.h"
#include "absl/strings/str_format.h"

#ifdef YAZE_WITH_HTTPLIB
#include "incl/httplib.h"
#include "third_party/json/src/json.hpp"
#endif

namespace yaze {
namespace cli {

OllamaAIService::OllamaAIService(const OllamaConfig& config) : config_(config) {
  if (config_.system_prompt.empty()) {
    config_.system_prompt = BuildSystemPrompt();
  }
}

std::string OllamaAIService::BuildSystemPrompt() {
  // TODO: Read from docs/api/z3ed-resources.yaml
  return R"(You are an expert ROM hacking assistant for The Legend of Zelda: A Link to the Past.
Your role is to generate PRECISE z3ed CLI commands to fulfill user requests.

CRITICAL RULES:
1. Output ONLY a JSON array of command strings
2. Each command must follow exact z3ed syntax
3. Commands must be executable without modification
4. Use only commands from the available command set
5. Include all required arguments with proper flags

AVAILABLE COMMANDS:
- rom info --rom <path>
- rom validate --rom <path>
- palette export --group <group> --id <id> --to <file>
- palette import --group <group> --id <id> --from <file>
- palette set-color --file <file> --index <index> --color <hex_color>
- overworld get-tile --map <map_id> --x <x> --y <y>
- overworld set-tile --map <map_id> --x <x> --y <y> --tile <tile_id>
- dungeon export-room --room <room_id> --to <file>
- dungeon import-room --room <room_id> --from <file>

RESPONSE FORMAT:
["command1", "command2", "command3"]

EXAMPLE USER REQUEST: "Make all soldier armors red"
CORRECT RESPONSE:
["palette export --group sprites --id soldier --to /tmp/soldier.pal",
 "palette set-color --file /tmp/soldier.pal --index 5 --color FF0000",
 "palette import --group sprites --id soldier --from /tmp/soldier.pal"]

Begin your response now.)";
}

absl::Status OllamaAIService::CheckAvailability() {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError(
      "Ollama service requires httplib. Build with vcpkg or system httplib.");
#else
  try {
    httplib::Client cli(config_.base_url);
    cli.set_connection_timeout(5);  // 5 second timeout

    auto res = cli.Get("/api/tags");
    if (!res) {
      return absl::UnavailableError(absl::StrFormat(
          "Cannot connect to Ollama server at %s. "
          "Make sure Ollama is installed and running (ollama serve).",
          config_.base_url));
    }

    if (res->status != 200) {
      return absl::InternalError(
          absl::StrFormat("Ollama server error: HTTP %d", res->status));
    }

    // Check if the requested model is available.
    nlohmann::json models_json = nlohmann::json::parse(res->body);
    bool model_found = false;
    for (const auto& model : models_json["models"]) {
      if (model["name"].get<std::string>().find(config_.model) !=
          std::string::npos) {
        model_found = true;
        break;
      }
    }

    if (!model_found) {
      return absl::NotFoundError(absl::StrFormat(
          "Model '%s' not found. Pull it with: ollama pull %s",
          config_.model, config_.model));
    }

    return absl::OkStatus();
  } catch (const std::exception& e) {
    return absl::InternalError(absl::StrCat("Ollama check failed: ", e.what()));
  }
#endif
}

absl::StatusOr<std::vector<std::string>> OllamaAIService::ListAvailableModels() {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError("Requires httplib support");
#else
  httplib::Client cli(config_.base_url);
  auto res = cli.Get("/api/tags");

  if (!res || res->status != 200) {
    return absl::UnavailableError("Cannot list Ollama models");
  }

  nlohmann::json models_json = nlohmann::json::parse(res->body);
  std::vector<std::string> models;
  for (const auto& model : models_json["models"]) {
    models.push_back(model["name"].get<std::string>());
  }
  return models;
#endif
}

absl::StatusOr<std::vector<std::string>> OllamaAIService::GetCommands(
    const std::string& prompt) {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError(
      "Ollama service requires httplib. Build with vcpkg or system httplib.");
#else
  // Build request payload. Note: Ollama expects sampling parameters under
  // "options", and its output-token limit is named "num_predict".
  nlohmann::json request_body = {
      {"model", config_.model},
      {"prompt", config_.system_prompt + "\n\nUSER REQUEST: " + prompt},
      {"stream", false},
      {"format", "json"},  // Force valid JSON output
      {"options", {
          {"temperature", config_.temperature},
          {"num_predict", config_.max_tokens}
      }}
  };

  httplib::Client cli(config_.base_url);
  cli.set_read_timeout(60);  // Longer timeout for inference

  auto res = cli.Post("/api/generate", request_body.dump(), "application/json");

  if (!res) {
    return absl::UnavailableError(
        "Failed to connect to Ollama. Is 'ollama serve' running?");
  }

  if (res->status != 200) {
    return absl::InternalError(absl::StrFormat(
        "Ollama API error: HTTP %d - %s", res->status, res->body));
  }

  // Parse response
  try {
    nlohmann::json response_json = nlohmann::json::parse(res->body);
    std::string generated_text = response_json["response"].get<std::string>();

    // Parse the command array from the generated text.
    nlohmann::json commands_json = nlohmann::json::parse(generated_text);

    if (!commands_json.is_array()) {
      return absl::InvalidArgumentError(
          "LLM did not return a JSON array. Response: " + generated_text);
    }

    std::vector<std::string> commands;
    for (const auto& cmd : commands_json) {
      if (cmd.is_string()) {
        commands.push_back(cmd.get<std::string>());
      }
    }

    if (commands.empty()) {
      return absl::InvalidArgumentError(
          "LLM returned empty command list. Prompt may be unclear.");
    }

    return commands;

  } catch (const nlohmann::json::exception& e) {
    return absl::InternalError(absl::StrCat(
        "Failed to parse Ollama response: ", e.what(), "\nRaw: ", res->body));
  }
#endif
}

}  // namespace cli
}  // namespace yaze
```

#### 1.2. Add CMake Configuration

**File**: `CMakeLists.txt` (add to dependencies section)
```cmake
# Optional httplib for AI services (Ollama, Gemini, Claude)
option(YAZE_WITH_HTTPLIB "Enable HTTP client for AI services" ON)

if(YAZE_WITH_HTTPLIB)
  find_package(httplib CONFIG)
  if(httplib_FOUND)
    add_compile_definitions(YAZE_WITH_HTTPLIB)
    message(STATUS "httplib found - AI services enabled")
  elseif(EXISTS "${CMAKE_SOURCE_DIR}/third_party/httplib")
    # Fall back to the bundled httplib in third_party
    add_compile_definitions(YAZE_WITH_HTTPLIB)
    message(STATUS "Using bundled httplib - AI services enabled")
  else()
    set(YAZE_WITH_HTTPLIB OFF)
    message(WARNING "httplib not found - AI services disabled")
  endif()
endif()
```

#### 1.3. Wire into Agent Commands

**File**: `src/cli/handlers/agent/general_commands.cc`

Replace the hardcoded `MockAIService` usage with service selection:
```cpp
#include <cstdlib>   // std::getenv
#include <cstring>   // std::strlen

#include "cli/service/gemini_ai_service.h"
#include "cli/service/ollama_ai_service.h"

// Helper: Select AI service based on environment.
std::unique_ptr<AIService> CreateAIService() {
  // Priority: Ollama (local) > Gemini (remote) > Mock (testing).
  const char* provider = std::getenv("YAZE_AI_PROVIDER");
  const char* gemini_key = std::getenv("GEMINI_API_KEY");

  // Explicit provider selection
  if (provider && std::string(provider) == "ollama") {
    OllamaConfig config;
    // Allow model override via env
    if (const char* model = std::getenv("OLLAMA_MODEL")) {
      config.model = model;
    }
    auto service = std::make_unique<OllamaAIService>(config);

    // Health check
    if (auto status = service->CheckAvailability(); !status.ok()) {
      std::cerr << "⚠️ Ollama unavailable: " << status.message() << std::endl;
      std::cerr << "   Falling back to MockAIService" << std::endl;
      return std::make_unique<MockAIService>();
    }

    std::cout << "🤖 Using Ollama AI with model: " << config.model << std::endl;
    return service;
  }

  // Gemini if an API key is provided
  if (gemini_key && std::strlen(gemini_key) > 0) {
    std::cout << "🤖 Using Gemini AI (remote)" << std::endl;
    return std::make_unique<GeminiAIService>(gemini_key);
  }

  // Default: mock service for testing
  std::cout << "🤖 Using MockAIService (no LLM configured)" << std::endl;
  std::cout << "   Set YAZE_AI_PROVIDER=ollama or GEMINI_API_KEY to enable LLM"
            << std::endl;
  return std::make_unique<MockAIService>();
}

// Update HandleRunCommand:
absl::Status HandleRunCommand(const std::vector<std::string>& arg_vec) {
  // ... existing setup code ...

  auto ai_service = CreateAIService();  // ← Replaces MockAIService instantiation
  auto commands_or = ai_service->GetCommands(prompt);

  // ... rest of execution logic ...
}
```

#### 1.4. Testing & Validation

**Prerequisites**:

```bash
# Install Ollama (macOS)
brew install ollama

# Start Ollama server
ollama serve &

# Pull recommended model
ollama pull qwen2.5-coder:7b

# Test connectivity
curl http://localhost:11434/api/tags
```

**End-to-End Test Script** (`scripts/test_ollama_integration.sh`):

```bash
#!/bin/bash
set -e

echo "🧪 Testing Ollama AI Integration"

# 1. Check Ollama availability
echo "Checking Ollama server..."
if ! curl -s http://localhost:11434/api/tags > /dev/null; then
  echo "❌ Ollama not running. Start with: ollama serve"
  exit 1
fi

# 2. Check model availability
echo "Checking qwen2.5-coder:7b model..."
if ! ollama list | grep -q "qwen2.5-coder:7b"; then
  echo "⚠️ Model not found. Pulling..."
  ollama pull qwen2.5-coder:7b
fi

# 3. Test agent run with simple prompt
echo "Testing agent run command..."
export YAZE_AI_PROVIDER=ollama
export OLLAMA_MODEL=qwen2.5-coder:7b

./build/bin/z3ed agent run \
  --prompt "Export the first overworld palette to /tmp/test.pal" \
  --rom zelda3.sfc \
  --sandbox

# 4. Verify proposal created
echo "Checking proposal registry..."
if ! ./build/bin/z3ed agent list | grep -q "pending"; then
  echo "❌ No pending proposal found"
  exit 1
fi

# 5. Review generated commands
echo "✅ Reviewing generated commands..."
./build/bin/z3ed agent diff --format yaml

echo "✅ Ollama integration test passed!"
```

---

### Phase 2: Improve Gemini Integration (2-3 hours)

The existing `GeminiAIService` needs fixes and better prompting:

#### 2.1. Fix GeminiAIService Implementation

**File**: `src/cli/service/gemini_ai_service.cc`
```cpp
absl::StatusOr<std::vector<std::string>> GeminiAIService::GetCommands(
    const std::string& prompt) {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError(
      "Gemini AI service requires httplib. Build with vcpkg.");
#else
  if (api_key_.empty()) {
    return absl::FailedPreconditionError(
        "GEMINI_API_KEY not set. Get key from: "
        "https://makersuite.google.com/app/apikey");
  }

  // Build a comprehensive system instruction as plain text (not a JSON blob)
  // so it can be prepended to the user prompt.
  const std::string system_instruction =
      "You are an expert ROM hacking assistant for The Legend of Zelda: "
      "A Link to the Past. Generate ONLY a JSON array of z3ed CLI commands. "
      "Each command must be executable without modification. Available "
      "commands: rom info, rom validate, palette export/import/set-color, "
      "overworld get-tile/set-tile, dungeon export-room/import-room. "
      "Response format: [\"command1\", \"command2\"]";

  // Note: HTTPS requires httplib built with CPPHTTPLIB_OPENSSL_SUPPORT.
  httplib::Client cli("https://generativelanguage.googleapis.com");
  cli.set_read_timeout(60);

  // Use explicit json::array() to avoid nlohmann brace-init ambiguity.
  nlohmann::json request_body = {
      {"contents", nlohmann::json::array({
           {{"role", "user"},
            {"parts", nlohmann::json::array({
                 {{"text", absl::StrFormat("System: %s\n\nUser: %s",
                                           system_instruction, prompt)}}
             })}}
       })},
      {"generationConfig", {
          {"temperature", 0.1},  // Low temp for deterministic output
          {"maxOutputTokens", 2048},
          {"topP", 0.8},
          {"topK", 10}
      }},
      {"safetySettings", nlohmann::json::array({
           {{"category", "HARM_CATEGORY_DANGEROUS_CONTENT"},
            {"threshold", "BLOCK_NONE"}}
       })}
  };

  httplib::Headers headers = {
      {"Content-Type", "application/json"},
  };

  std::string endpoint = absl::StrFormat(
      "/v1beta/models/gemini-2.5-flash:generateContent?key=%s", api_key_);

  auto res = cli.Post(endpoint, headers, request_body.dump(),
                      "application/json");

  if (!res) {
    return absl::UnavailableError(
        "Failed to connect to Gemini API. Check internet connection.");
  }

  if (res->status != 200) {
    return absl::InternalError(absl::StrFormat(
        "Gemini API error: HTTP %d - %s", res->status, res->body));
  }

  // Parse response
  try {
    nlohmann::json response_json = nlohmann::json::parse(res->body);

    // Extract text from the nested candidates structure.
    std::string text_content =
        response_json["candidates"][0]["content"]["parts"][0]["text"]
            .get<std::string>();

    // Gemini may wrap JSON in markdown code blocks - strip them.
    if (text_content.find("```json") != std::string::npos) {
      size_t start = text_content.find("[");
      size_t end = text_content.rfind("]");
      if (start != std::string::npos && end != std::string::npos) {
        text_content = text_content.substr(start, end - start + 1);
      }
    }

    nlohmann::json commands_array = nlohmann::json::parse(text_content);

    if (!commands_array.is_array()) {
      return absl::InvalidArgumentError(
          "Gemini did not return a JSON array. Response: " + text_content);
    }

    std::vector<std::string> commands;
    for (const auto& cmd : commands_array) {
      if (cmd.is_string()) {
        commands.push_back(cmd.get<std::string>());
      }
    }

    return commands;

  } catch (const nlohmann::json::exception& e) {
    return absl::InternalError(absl::StrCat(
        "Failed to parse Gemini response: ", e.what(), "\nRaw: ", res->body));
  }
#endif
}
```
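
The "error handling and retry logic" item from the missing-features list is not covered by the snippets above. A minimal sketch of one way to add it: a decorator over `AIService` that retries transient failures with exponential backoff. The `RetryingAIService` name and backoff constants are illustrative, not existing code:

```cpp
#include <chrono>
#include <memory>
#include <thread>

#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "cli/service/ai_service.h"

namespace yaze {
namespace cli {

// Hypothetical decorator: retries transient API failures with backoff.
class RetryingAIService : public AIService {
 public:
  explicit RetryingAIService(std::unique_ptr<AIService> inner,
                             int max_attempts = 3)
      : inner_(std::move(inner)), max_attempts_(max_attempts) {}

  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override {
    absl::Status last_status = absl::OkStatus();
    for (int attempt = 1; attempt <= max_attempts_; ++attempt) {
      auto result = inner_->GetCommands(prompt);
      if (result.ok()) return result;
      last_status = result.status();
      // Only retry transient errors (network down, rate limit, 5xx).
      if (!absl::IsUnavailable(last_status) &&
          !absl::IsInternal(last_status)) {
        break;
      }
      if (attempt < max_attempts_) {
        std::this_thread::sleep_for(std::chrono::seconds(1 << attempt));
      }
    }
    return last_status;
  }

 private:
  std::unique_ptr<AIService> inner_;
  int max_attempts_;
};

}  // namespace cli
}  // namespace yaze
```

Wrapping any provider (`RetryingAIService(std::make_unique<GeminiAIService>(key))`) keeps retry policy out of the individual services.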

---

### Phase 3: Add Claude Integration (2-3 hours)

Claude 3.5 Sonnet is excellent for code generation and serves as a strong premium remote option.

#### 3.1. Create ClaudeAIService

**File**: `src/cli/service/claude_ai_service.h`
```cpp
#ifndef YAZE_SRC_CLI_CLAUDE_AI_SERVICE_H_
#define YAZE_SRC_CLI_CLAUDE_AI_SERVICE_H_

#include <string>
#include <vector>

#include "absl/status/statusor.h"
#include "cli/service/ai_service.h"

namespace yaze {
namespace cli {

class ClaudeAIService : public AIService {
 public:
  explicit ClaudeAIService(const std::string& api_key);

  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override;

 private:
  std::string api_key_;
  std::string model_ = "claude-3-5-sonnet-20241022";  // Latest version
};

}  // namespace cli
}  // namespace yaze

#endif  // YAZE_SRC_CLI_CLAUDE_AI_SERVICE_H_
```

**File**: `src/cli/service/claude_ai_service.cc`
```cpp
#include "cli/service/claude_ai_service.h"

#include "absl/strings/str_cat.h"
#include "absl/strings/str_format.h"

#ifdef YAZE_WITH_HTTPLIB
#include "incl/httplib.h"
#include "third_party/json/src/json.hpp"
#endif

namespace yaze {
namespace cli {

ClaudeAIService::ClaudeAIService(const std::string& api_key)
    : api_key_(api_key) {}

absl::StatusOr<std::vector<std::string>> ClaudeAIService::GetCommands(
    const std::string& prompt) {
#ifndef YAZE_WITH_HTTPLIB
  return absl::UnimplementedError("Claude service requires httplib");
#else
  if (api_key_.empty()) {
    return absl::FailedPreconditionError(
        "CLAUDE_API_KEY not set. Get key from: https://console.anthropic.com/");
  }

  // Note: HTTPS requires httplib built with CPPHTTPLIB_OPENSSL_SUPPORT.
  httplib::Client cli("https://api.anthropic.com");
  cli.set_read_timeout(60);

  nlohmann::json request_body = {
      {"model", model_},
      {"max_tokens", 2048},
      {"temperature", 0.1},
      {"system",
       "You are an expert ROM hacking assistant. Generate ONLY a JSON array "
       "of z3ed commands. No explanations."},
      {"messages", nlohmann::json::array({
           {{"role", "user"}, {"content", prompt}}
       })}
  };

  httplib::Headers headers = {
      {"Content-Type", "application/json"},
      {"x-api-key", api_key_},
      {"anthropic-version", "2023-06-01"}
  };

  auto res = cli.Post("/v1/messages", headers, request_body.dump(),
                      "application/json");

  if (!res) {
    return absl::UnavailableError("Failed to connect to Claude API");
  }

  if (res->status != 200) {
    return absl::InternalError(absl::StrFormat(
        "Claude API error: HTTP %d - %s", res->status, res->body));
  }

  try {
    nlohmann::json response_json = nlohmann::json::parse(res->body);
    std::string text_content =
        response_json["content"][0]["text"].get<std::string>();

    // Claude may wrap the array in markdown - strip if present.
    if (text_content.find("```json") != std::string::npos) {
      size_t start = text_content.find("[");
      size_t end = text_content.rfind("]");
      if (start != std::string::npos && end != std::string::npos) {
        text_content = text_content.substr(start, end - start + 1);
      }
    }

    nlohmann::json commands_json = nlohmann::json::parse(text_content);

    if (!commands_json.is_array()) {
      return absl::InvalidArgumentError(
          "Claude did not return a JSON array. Response: " + text_content);
    }

    std::vector<std::string> commands;
    for (const auto& cmd : commands_json) {
      if (cmd.is_string()) {
        commands.push_back(cmd.get<std::string>());
      }
    }

    return commands;

  } catch (const std::exception& e) {
    return absl::InternalError(absl::StrCat(
        "Failed to parse Claude response: ", e.what()));
  }
#endif
}

}  // namespace cli
}  // namespace yaze
```
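
To make the new service selectable, the `CreateAIService()` helper from section 1.3 needs a Claude branch. A minimal sketch, mirroring the existing Gemini branch (the exact placement within `general_commands.cc` is an assumption):

```cpp
#include "cli/service/claude_ai_service.h"

// Inside CreateAIService(), after the Ollama branch. `provider` is the
// YAZE_AI_PROVIDER value read at the top of the helper (see section 1.3).
const char* claude_key = std::getenv("CLAUDE_API_KEY");
if (claude_key && std::strlen(claude_key) > 0 &&
    (!provider || std::string(provider) == "claude")) {
  std::cout << "🤖 Using Claude AI (remote)" << std::endl;
  return std::make_unique<ClaudeAIService>(claude_key);
}
```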

---

### Phase 4: Enhanced Prompt Engineering (3-4 hours)

#### 4.1. Load Resource Catalogue into System Prompt

**File**: `src/cli/service/prompt_builder.h`
```cpp
#ifndef YAZE_SRC_CLI_PROMPT_BUILDER_H_
#define YAZE_SRC_CLI_PROMPT_BUILDER_H_

#include <string>

#include "absl/status/statusor.h"

namespace yaze {
namespace cli {

// Utility for building comprehensive LLM prompts from the resource catalogue.
class PromptBuilder {
 public:
  // Load command schemas from docs/api/z3ed-resources.yaml
  static absl::StatusOr<std::string> LoadResourceCatalogue();

  // Build system prompt with full command documentation
  static std::string BuildSystemPrompt();

  // Build few-shot examples for better LLM performance
  static std::string BuildFewShotExamples();

  // Inject ROM context (current ROM info, loaded editors, etc.)
  static std::string BuildContextPrompt();
};

}  // namespace cli
}  // namespace yaze

#endif  // YAZE_SRC_CLI_PROMPT_BUILDER_H_
```
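
A minimal sketch of the corresponding implementation: `LoadResourceCatalogue()` can simply inline the YAML file as text, since LLMs handle raw YAML well without a parsing step. The file path and the concatenation strategy are assumptions; `BuildFewShotExamples()` is shown in section 4.2:

```cpp
#include "cli/service/prompt_builder.h"

#include <fstream>
#include <sstream>

#include "absl/status/status.h"
#include "absl/strings/str_cat.h"

namespace yaze {
namespace cli {

absl::StatusOr<std::string> PromptBuilder::LoadResourceCatalogue() {
  // Assumed location of the command-schema catalogue.
  std::ifstream file("docs/api/z3ed-resources.yaml");
  if (!file.is_open()) {
    return absl::NotFoundError("Cannot open docs/api/z3ed-resources.yaml");
  }
  std::ostringstream contents;
  contents << file.rdbuf();
  return contents.str();
}

std::string PromptBuilder::BuildSystemPrompt() {
  // Fall back to a placeholder if the catalogue is missing so the prompt
  // still degrades gracefully.
  auto catalogue = LoadResourceCatalogue();
  std::string command_docs =
      catalogue.ok() ? *catalogue : "(resource catalogue unavailable)";

  return absl::StrCat(
      "You are an expert ROM hacking assistant for The Legend of Zelda: "
      "A Link to the Past. Output ONLY a JSON array of z3ed CLI commands.\n\n"
      "AVAILABLE COMMANDS (from the resource catalogue):\n",
      command_docs, "\n", BuildFewShotExamples());
}

}  // namespace cli
}  // namespace yaze
```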

#### 4.2. Few-Shot Examples

Include proven examples in the system prompt:
```cpp
std::string PromptBuilder::BuildFewShotExamples() {
  return R"(
EXAMPLE 1:
User: "Make soldier armor red"
Response: ["palette export --group sprites --id soldier --to /tmp/soldier.pal",
           "palette set-color --file /tmp/soldier.pal --index 5 --color FF0000",
           "palette import --group sprites --id soldier --from /tmp/soldier.pal"]

EXAMPLE 2:
User: "Validate ROM integrity"
Response: ["rom validate --rom zelda3.sfc"]

EXAMPLE 3:
User: "Change tile at coordinates (10, 20) in Light World to grass"
Response: ["overworld set-tile --map 0 --x 10 --y 20 --tile 0x40"]
)";
}
```

---

## 2. Configuration & User Experience

### Environment Variables
```bash
# AI Provider Selection
export YAZE_AI_PROVIDER=ollama          # Options: ollama, gemini, claude, mock
export OLLAMA_MODEL=qwen2.5-coder:7b
export OLLAMA_URL=http://localhost:11434

# API Keys (remote providers)
export GEMINI_API_KEY=your_key_here
export CLAUDE_API_KEY=your_key_here

# Logging & Debugging
export YAZE_AI_DEBUG=1                       # Log full prompts and responses
export YAZE_AI_CACHE_DIR=/tmp/yaze_ai_cache  # Cache LLM responses
```
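
`YAZE_AI_DEBUG` is defined here but not consumed by any snippet above. A minimal sketch of the switch (the helper name is illustrative) that each service could consult before logging prompts and raw responses:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true when YAZE_AI_DEBUG=1 is set.
inline bool AIDebugEnabled() {
  const char* v = std::getenv("YAZE_AI_DEBUG");
  return v != nullptr && std::string(v) == "1";
}

// Example use inside a service, before dispatching the request:
//   if (AIDebugEnabled()) {
//     std::cerr << "[AI] prompt: " << prompt << std::endl;
//   }
```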

### CLI Flags

Add new flags to `z3ed agent run`:
```bash
# Override provider for a single command
z3ed agent run --prompt "..." --ai-provider ollama

# Override model
z3ed agent run --prompt "..." --ai-model "llama3:70b"

# Dry run: show generated commands without executing
z3ed agent run --prompt "..." --dry-run

# Interactive mode: confirm each command before execution
z3ed agent run --prompt "..." --interactive
```
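
These flags should take precedence over the environment variables. A minimal sketch of how `CreateAIService()` from section 1.3 could accept overrides (the `AIServiceOptions` struct is an assumption; actual flag parsing lives in the existing agent argument handling):

```cpp
// Hypothetical options populated from --ai-provider / --ai-model flags.
struct AIServiceOptions {
  std::string provider;  // empty = fall back to YAZE_AI_PROVIDER
  std::string model;     // empty = fall back to OLLAMA_MODEL / defaults
};

std::unique_ptr<AIService> CreateAIService(const AIServiceOptions& opts) {
  // CLI flag wins; environment variable is the fallback.
  std::string provider = opts.provider;
  if (provider.empty()) {
    if (const char* env = std::getenv("YAZE_AI_PROVIDER")) provider = env;
  }

  if (provider == "ollama") {
    OllamaConfig config;
    if (!opts.model.empty()) {
      config.model = opts.model;
    } else if (const char* model = std::getenv("OLLAMA_MODEL")) {
      config.model = model;
    }
    return std::make_unique<OllamaAIService>(config);
  }
  // ... gemini / claude / mock branches as in section 1.3 ...
  return std::make_unique<MockAIService>();
}
```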

---

## 3. Testing & Validation

### Unit Tests

**File**: `test/cli/ai_service_test.cc`
```cpp
#include "cli/service/claude_ai_service.h"
#include "cli/service/gemini_ai_service.h"
#include "cli/service/ollama_ai_service.h"

#include <gmock/gmock.h>
#include <gtest/gtest.h>

#include "absl/status/status.h"

using yaze::cli::OllamaAIService;
using yaze::cli::OllamaConfig;

TEST(OllamaAIServiceTest, CheckAvailability) {
  OllamaConfig config;
  config.base_url = "http://localhost:11434";
  OllamaAIService service(config);

  // Should not crash; may return unavailable if Ollama is not running.
  auto status = service.CheckAvailability();
  EXPECT_TRUE(status.ok() ||
              absl::IsUnavailable(status) ||
              absl::IsNotFound(status));
}

TEST(OllamaAIServiceTest, GetCommands) {
  // Skip if Ollama is not available.
  OllamaConfig config;
  OllamaAIService service(config);
  if (!service.CheckAvailability().ok()) {
    GTEST_SKIP() << "Ollama not available";
  }

  auto result = service.GetCommands("Validate the ROM");
  ASSERT_TRUE(result.ok()) << result.status();

  auto commands = result.value();
  EXPECT_GT(commands.size(), 0);
  EXPECT_THAT(commands[0], testing::HasSubstr("rom validate"));
}
```

### Integration Tests

**File**: `scripts/test_ai_services.sh`
```bash
#!/bin/bash
set -e

echo "🧪 Testing AI Services Integration"

# Test 1: Ollama (if available)
if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then
  echo "✓ Ollama available - testing..."
  export YAZE_AI_PROVIDER=ollama
  ./build/bin/z3ed agent plan --prompt "Export first palette"
else
  echo "⊘ Ollama not running - skipping"
fi

# Test 2: Gemini (if key set)
if [ -n "$GEMINI_API_KEY" ]; then
  echo "✓ Gemini API key set - testing..."
  export YAZE_AI_PROVIDER=gemini
  ./build/bin/z3ed agent plan --prompt "Validate ROM"
else
  echo "⊘ GEMINI_API_KEY not set - skipping"
fi

# Test 3: Claude (if key set)
if [ -n "$CLAUDE_API_KEY" ]; then
  echo "✓ Claude API key set - testing..."
  export YAZE_AI_PROVIDER=claude
  ./build/bin/z3ed agent plan --prompt "Export dungeon room"
else
  echo "⊘ CLAUDE_API_KEY not set - skipping"
fi

echo "✅ All available AI services tested successfully"
```

---

## 4. Documentation Updates

### User Guide

**File**: `docs/z3ed/AI-SERVICE-SETUP.md`
````markdown
# Setting Up LLM Integration for z3ed

## Quick Start: Ollama (Recommended)

1. **Install Ollama**:
   ```bash
   # macOS
   brew install ollama

   # Linux
   curl -fsSL https://ollama.com/install.sh | sh
   ```

2. **Start Server**:
   ```bash
   ollama serve
   ```

3. **Pull Model**:
   ```bash
   ollama pull qwen2.5-coder:7b  # Recommended: fast + accurate
   ```

4. **Configure z3ed**:
   ```bash
   export YAZE_AI_PROVIDER=ollama
   ```

5. **Test**:
   ```bash
   z3ed agent run --prompt "Validate my ROM" --rom zelda3.sfc
   ```

## Alternative: Gemini API (Remote)

1. Get API key: https://makersuite.google.com/app/apikey
2. Configure:
   ```bash
   export GEMINI_API_KEY=your_key_here
   export YAZE_AI_PROVIDER=gemini
   ```
3. Run: `z3ed agent run --prompt "..."`

## Alternative: Claude API (Remote)

1. Get API key: https://console.anthropic.com/
2. Configure:
   ```bash
   export CLAUDE_API_KEY=your_key_here
   export YAZE_AI_PROVIDER=claude
   ```
3. Run: `z3ed agent run --prompt "..."`

## Troubleshooting

**Issue**: "Cannot connect to Ollama"
**Solution**: Make sure `ollama serve` is running

**Issue**: "Model not found"
**Solution**: Run `ollama pull <model_name>`

**Issue**: "LLM returned empty command list"
**Solution**: Rephrase the prompt to be more specific
````

---

## 5. Implementation Timeline

### Week 1 (October 7-11)

- **Day 1-2**: Implement `OllamaAIService` class
- **Day 3**: Wire into agent commands with service selection
- **Day 4**: Testing and validation
- **Day 5**: Documentation and examples

### Week 2 (October 14-18)

- **Day 1**: Fix and improve `GeminiAIService`
- **Day 2**: Implement `ClaudeAIService`
- **Day 3**: Enhanced prompt engineering with resource catalogue
- **Day 4**: Integration tests across all services
- **Day 5**: User guide and troubleshooting docs

---

## 6. Success Criteria

✅ **Phase 1 Complete When**:

- Ollama service connects and generates valid commands
- `z3ed agent run` works end-to-end with a local LLM
- Health checks report clear error messages
- Test script passes on macOS with Ollama installed

✅ **Phase 2 Complete When**:

- Gemini API calls succeed with valid responses
- Markdown code block stripping works reliably
- Error messages are actionable (e.g., "API key invalid")

✅ **Phase 3 Complete When**:

- Claude service implemented with the same interface
- All three services (Ollama, Gemini, Claude) work interchangeably
- Service selection mechanism is transparent to the user

✅ **Phase 4 Complete When**:

- System prompts include the full resource catalogue
- Few-shot examples raise command accuracy above 90%
- LLM responses consistently match the expected command format

---

## 7. Future Enhancements (Post-MVP)

- **Response Caching**: Cache LLM responses by prompt hash to reduce costs/latency
- **Token Usage Tracking**: Monitor and report token consumption per session
- **Model Comparison**: A/B test different models for accuracy/cost trade-offs
- **Fine-Tuning**: Fine-tune local models on a z3ed command corpus
- **Multi-Turn Dialogue**: Support follow-up questions and clarifications
- **Agentic Loop**: LLM self-corrects based on execution results
- **GUI Integration**: In-app AI assistant panel in the YAZE editor

---

## Appendix A: Recommended Models

| Model | Provider | Size | Speed | Accuracy | Use Case |
|-------|----------|------|-------|----------|----------|
| qwen2.5-coder:7b | Ollama | 7B | Fast | High | **Recommended**: best balance |
| codellama:13b | Ollama | 13B | Medium | Higher | Complex tasks |
| llama3:70b | Ollama | 70B | Slow | Highest | Maximum accuracy |
| gemini-2.5-flash | Gemini | N/A | Fast | High | Remote option, low cost |
| claude-3.5-sonnet | Claude | N/A | Medium | Highest | Premium remote option |

## Appendix B: Example Prompts

**Simple**:

- "Validate the ROM"
- "Export the first palette"
- "Show ROM info"

**Moderate**:

- "Make soldier armor red"
- "Change tile at (10, 20) in Light World to grass"
- "Export dungeon room 5 to /tmp/room5.bin"

**Complex**:

- "Find all palettes using color #FF0000 and change them to #00FF00"
- "Export all dungeon rooms, modify object 3 in each, then reimport"
- "Generate a comparison report between two ROM versions"

---

## Next Steps

**👉 START HERE**: Implement Phase 1 (Ollama Integration) by following sections 1.1-1.4 above.

Once complete, update this document with:

- Actual time spent vs. estimates
- Issues encountered and solutions
- Model performance observations
- User feedback

**Questions? Blockers?** Open an issue or ping @scawful in Discord.