LLM Integration: Executive Summary & Getting Started

Date: October 3, 2025
Author: GitHub Copilot
Status: Ready to Implement

What Changed?

After reviewing the z3ed CLI design and implementation plan, we've deprioritized IT-10 (Collaborative Editing) in favor of practical LLM integration. This is the critical next step to make the agentic workflow system production-ready.

Why This Matters

The supporting z3ed infrastructure is already in place:

  • Resource-oriented CLI with comprehensive commands
  • Proposal-based workflow with sandbox execution
  • Machine-readable API catalogue (z3ed-resources.yaml)
  • GUI automation harness for verification
  • ProposalDrawer for human review

What's missing: Real LLM integration to turn prompts into actions.

Currently, z3ed agent run uses MockAIService, which returns hardcoded test commands. We need to connect real LLMs (Ollama, Gemini, Claude) to make the agent system useful.

What You Get

After implementing this plan, users will be able to:

# Install Ollama (one-time setup)
brew install ollama
ollama serve &
ollama pull qwen2.5-coder:7b

# Configure z3ed
export YAZE_AI_PROVIDER=ollama

# Use natural language to modify ROMs
z3ed agent run \
  --prompt "Make all soldier armor red" \
  --rom zelda3.sfc \
  --sandbox

# Review generated commands
z3ed agent diff

# Accept changes
# (Open YAZE GUI → Debug → Agent Proposals → Review → Accept)

The LLM will automatically:

  1. Parse the natural language prompt
  2. Generate appropriate z3ed commands
  3. Execute them in a sandbox
  4. Present results for human review

Implementation Roadmap

Phase 1: Ollama Integration (4-6 hours) 🎯 START HERE

Priority: Highest
Why First: Local, free, no API keys, fast iteration

Deliverables:

  • OllamaAIService class with health checks (sketched after the file list below)
  • CMake integration for httplib
  • Service selection mechanism (env vars)
  • End-to-end test script

Key Files:

  • src/cli/service/ollama_ai_service.{h,cc} (new)
  • src/cli/handlers/agent/general_commands.cc (update)
  • CMakeLists.txt (add httplib support)
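
To make the Phase 1 shape concrete, here is a minimal sketch of OllamaAIService, assuming cpp-httplib for HTTP and nlohmann::json for parsing. Ollama's GET /api/tags and POST /api/generate endpoints are its documented API; the class layout and helper names are illustrative, not the final implementation:

#include <string>
#include <utility>
#include <vector>

#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "httplib.h"          // cpp-httplib, single-header HTTP client
#include "nlohmann/json.hpp"  // JSON parsing (assumed for this sketch)

class OllamaAIService : public AIService {
 public:
  explicit OllamaAIService(const std::string& host = "http://localhost:11434",
                           std::string model = "qwen2.5-coder:7b")
      : client_(host), model_(std::move(model)) {}

  // Health check: Ollama lists installed models at GET /api/tags.
  absl::Status CheckHealth() {
    auto res = client_.Get("/api/tags");
    if (!res || res->status != 200) {
      return absl::UnavailableError(
          "Cannot connect to Ollama at http://localhost:11434");
    }
    return absl::OkStatus();
  }

  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override {
    nlohmann::json body = {
        {"model", model_}, {"prompt", prompt}, {"stream", false}};
    auto res = client_.Post("/api/generate", body.dump(), "application/json");
    if (!res || res->status != 200) {
      return absl::UnavailableError("Ollama request failed");
    }
    // /api/generate wraps the model's text in a "response" field; the system
    // prompt (Phase 4) asks the model to emit a JSON array of command strings.
    auto reply = nlohmann::json::parse(res->body, nullptr,
                                       /*allow_exceptions=*/false);
    if (!reply.is_object() || !reply["response"].is_string()) {
      return absl::InternalError("Unexpected Ollama response format");
    }
    auto commands = nlohmann::json::parse(reply["response"].get<std::string>(),
                                          nullptr, /*allow_exceptions=*/false);
    std::vector<std::string> out;
    if (commands.is_array()) {
      for (const auto& cmd : commands) {
        if (cmd.is_string()) out.push_back(cmd.get<std::string>());
      }
    }
    return out;
  }

 private:
  httplib::Client client_;
  std::string model_;
};

The health check doubles as the trigger for the fallback behavior described under Key Architectural Decisions below.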

Phase 2: Gemini Fixes (2-3 hours)

Deliverables:

  • Fix existing GeminiAIService implementation
  • Better prompting with resource catalogue
  • Markdown code block stripping (helper sketched below)
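
For the stripping step, a small helper along these lines could remove a ```json fence before JSON parsing (illustrative, not the existing implementation):

#include <string>

// Strips a leading ```json / ``` fence and a trailing ``` fence, if present,
// so the remaining text can be parsed as a JSON array. Sketch only.
std::string StripMarkdownFences(std::string text) {
  auto first = text.find("```");
  if (first == std::string::npos) return text;
  auto line_end = text.find('\n', first);  // skip past the "```json" line
  auto last = text.rfind("```");
  if (line_end == std::string::npos || last <= line_end) return text;
  return text.substr(line_end + 1, last - line_end - 1);
}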

Phase 3: Claude Integration (2-3 hours)

Deliverables:

  • ClaudeAIService class
  • Messages API integration (request shape sketched below)
  • Same interface as other services
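
For orientation, the sketch below shows the shape of a Messages API call. The endpoint, required headers, and body fields follow Anthropic's published API; the wrapper function and model id are illustrative:

#include <cstdlib>
#include <string>

#include "httplib.h"          // built with CPPHTTPLIB_OPENSSL_SUPPORT for https
#include "nlohmann/json.hpp"

// Illustrative sketch of one Messages API request; a real ClaudeAIService
// would wrap this behind the same AIService interface as the others.
nlohmann::json CallClaude(const std::string& system_prompt,
                          const std::string& prompt) {
  const char* key = std::getenv("CLAUDE_API_KEY");
  httplib::Client cli("https://api.anthropic.com");
  httplib::Headers headers = {
      {"x-api-key", key ? key : ""},
      {"anthropic-version", "2023-06-01"},
  };
  nlohmann::json body = {
      {"model", "claude-sonnet-4-20250514"},  // example model id
      {"max_tokens", 1024},
      {"system", system_prompt},  // command catalogue + output format rules
      {"messages",
       nlohmann::json::array({{{"role", "user"}, {"content", prompt}}})},
  };
  auto res = cli.Post("/v1/messages", headers, body.dump(), "application/json");
  if (!res || res->status != 200) return {};
  // The generated text lives at content[0].text in the response JSON.
  return nlohmann::json::parse(res->body, nullptr, /*allow_exceptions=*/false);
}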

Phase 4: Enhanced Prompting (3-4 hours)

Deliverables:

  • PromptBuilder utility class
  • Resource catalogue integration
  • Few-shot examples
  • Context injection (ROM state)

Quick Start (After Implementation)

For Developers (Implement Now)

  1. Read the implementation plan: LLM-INTEGRATION-PLAN.md

  2. Start with Phase 1:

    # Follow checklist in LLM-IMPLEMENTATION-CHECKLIST.md
    # Implementation time: ~4-6 hours
    
  3. Test as you go:

    # Run quickstart script when ready
    ./scripts/quickstart_ollama.sh
    

For End Users (After Development)

  1. Install Ollama:

    brew install ollama  # macOS
    ollama serve &
    ollama pull qwen2.5-coder:7b
    
  2. Configure z3ed:

    export YAZE_AI_PROVIDER=ollama
    
  3. Try it out:

    z3ed agent run --prompt "Validate my ROM" --rom zelda3.sfc
    

Alternative Providers

Gemini (Remote, API Key Required)

export GEMINI_API_KEY=your_key_here
export YAZE_AI_PROVIDER=gemini
z3ed agent run --prompt "..."

Claude (Remote, API Key Required)

export CLAUDE_API_KEY=your_key_here
export YAZE_AI_PROVIDER=claude
z3ed agent run --prompt "..."

Documentation Structure

docs/z3ed/
├── README.md                           # Overview + navigation
├── E6-z3ed-cli-design.md               # Architecture & design
├── E6-z3ed-implementation-plan.md      # Overall roadmap
├── LLM-INTEGRATION-PLAN.md             # 📋 Detailed LLM guide (NEW)
├── LLM-IMPLEMENTATION-CHECKLIST.md     # ✅ Step-by-step tasks (NEW)
└── LLM-INTEGRATION-SUMMARY.md          # 📄 This file (NEW)

scripts/
└── quickstart_ollama.sh                # 🚀 Automated setup test (NEW)

Key Architectural Decisions

1. Service Interface Pattern

All LLM providers implement the same AIService interface:

#include <string>
#include <vector>
#include "absl/status/statusor.h"

class AIService {
 public:
  virtual ~AIService() = default;
  virtual absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) = 0;
};

This allows swapping among Ollama, Gemini, Claude, and Mock implementations without touching call sites.

2. Environment-Based Selection

Provider selection via environment variables (not compile-time):

export YAZE_AI_PROVIDER=ollama  # or gemini, claude, mock

This enables:

  • Easy testing with different providers
  • CI/CD with MockAIService
  • User choice without rebuilding

3. Graceful Degradation

If Ollama, Gemini, or Claude is unavailable, z3ed falls back to MockAIService with a clear warning (a selection-and-fallback sketch follows the example below):

⚠️  Ollama unavailable: Cannot connect to http://localhost:11434
   Falling back to MockAIService
   Set YAZE_AI_PROVIDER=ollama or install Ollama to enable LLM
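
A minimal sketch of the selection and fallback logic, reusing the OllamaAIService sketch from Phase 1 (the factory name and wiring are illustrative):

#include <cstdlib>
#include <iostream>
#include <memory>
#include <string>

// Illustrative factory: reads YAZE_AI_PROVIDER and falls back to the
// existing MockAIService when the chosen provider is unreachable.
// Gemini/Claude branches would follow the same pattern as Ollama.
std::unique_ptr<AIService> CreateAIService() {
  const char* env = std::getenv("YAZE_AI_PROVIDER");
  const std::string provider = env ? env : "mock";
  if (provider == "ollama") {
    auto svc = std::make_unique<OllamaAIService>();
    if (svc->CheckHealth().ok()) return svc;
    std::cerr << "Ollama unavailable, falling back to MockAIService\n";
  }
  return std::make_unique<MockAIService>();
}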

4. System Prompt Engineering

Comprehensive system prompts include:

  • Full command catalogue from z3ed-resources.yaml
  • Few-shot examples (proven prompt/command pairs)
  • Output format requirements (JSON array of strings)
  • Current ROM context (loaded file, editors open)

With the full system prompt in place, command accuracy on standard tasks is expected to improve from roughly 60% to over 90%.
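
A sketch of how such a prompt could be assembled; PromptBuilder is the Phase 4 deliverable, but the method names here are illustrative:

#include <string>

#include "absl/strings/str_cat.h"

// Illustrative PromptBuilder: concatenates the command catalogue, few-shot
// examples, ROM context, and output format rules into one system prompt.
class PromptBuilder {
 public:
  PromptBuilder& AddCatalogue(const std::string& resources_yaml) {
    prompt_ += absl::StrCat("## Available commands\n", resources_yaml, "\n");
    return *this;
  }
  PromptBuilder& AddExample(const std::string& ask, const std::string& cmds) {
    prompt_ += absl::StrCat("User: ", ask, "\nCommands: ", cmds, "\n");
    return *this;
  }
  PromptBuilder& AddContext(const std::string& rom_state) {
    prompt_ += absl::StrCat("## Current ROM\n", rom_state, "\n");
    return *this;
  }
  std::string Build() const {
    return prompt_ + "Respond with ONLY a JSON array of z3ed command strings.";
  }

 private:
  std::string prompt_;
};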

Success Metrics

Phase 1 Complete When:

  • z3ed agent run works with Ollama end-to-end
  • Health checks report clear errors
  • Fallback to MockAIService is transparent
  • Test script passes on macOS

Full Integration Complete When:

  • All three providers (Ollama, Gemini, Claude) work
  • Command accuracy >90% on standard prompts
  • Documentation guides users through setup
  • At least one community member validates the workflow

Known Limitations

Current Implementation

  • MockAIService returns hardcoded test commands
  • No real LLM integration yet
  • Limited to simple test cases

After LLM Integration

  • Model hallucination: LLMs may generate invalid commands
    • Mitigation: Validation layer + resource catalogue (sketched after this list)
  • API rate limits: Remote providers (Gemini/Claude) have limits
    • Mitigation: Response caching + local Ollama option
  • Cost: API calls cost money (Gemini ~$0.10/million tokens)
    • Mitigation: Ollama is free + cache responses
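
As one example, the validation-layer mitigation could start by checking each generated command's leading resource token against the catalogue before anything executes (a sketch; loading the token set from z3ed-resources.yaml is assumed to happen elsewhere):

#include <set>
#include <string>
#include <vector>

#include "absl/status/status.h"
#include "absl/strings/str_split.h"

// Illustrative validation layer: rejects any generated command whose leading
// token is not a resource listed in z3ed-resources.yaml.
absl::Status ValidateCommands(const std::vector<std::string>& commands,
                              const std::set<std::string>& known_resources) {
  for (const auto& cmd : commands) {
    std::vector<std::string> tokens = absl::StrSplit(cmd, ' ');
    if (tokens.empty() || !known_resources.count(tokens[0])) {
      return absl::InvalidArgumentError("Unknown command: " + cmd);
    }
  }
  return absl::OkStatus();
}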

FAQ

Why Ollama first?

  • No API keys: Works out of the box
  • Privacy: All processing local
  • Speed: No network latency
  • Cost: Zero dollars
  • Testing: No rate limits

Why not OpenAI?

  • Cost (GPT-4 is expensive)
  • Rate limits (strict for free tier)
  • Not local (privacy concerns for ROM hackers)
  • Ollama + Gemini cover both local and remote use cases

Can I use multiple providers?

Yes! Set YAZE_AI_PROVIDER per command:

YAZE_AI_PROVIDER=ollama z3ed agent run --prompt "Quick test"
YAZE_AI_PROVIDER=gemini z3ed agent run --prompt "Complex task"

What if I don't want to use AI?

The CLI still works without LLM integration:

# Direct command execution (no LLM)
z3ed rom validate --rom zelda3.sfc
z3ed palette export --group sprites --id soldier --to output.pal

AI is optional and additive.

Next Steps

For @scawful (Project Owner)

  1. Review this plan: Confirm priority shift from IT-10 to LLM integration
  2. Decide on Phase 1: Start Ollama implementation (~4-6 hours)
  3. Allocate time: Schedule implementation over the next 1-2 weeks
  4. Test setup: Install Ollama and verify it works on your machine

For Contributors

  1. Read the docs: Start with LLM-INTEGRATION-PLAN.md
  2. Pick a phase: Phase 1 (Ollama) is the highest priority
  3. Follow checklist: Use LLM-IMPLEMENTATION-CHECKLIST.md
  4. Submit PR: Include tests + documentation updates

For Users (Future)

  1. Wait for release: This is in development
  2. Install Ollama: Get ready for local LLM support
  3. Follow setup guide: Will be in AI-SERVICE-SETUP.md (coming soon)

Timeline

Week 1 (Oct 7-11, 2025): Phase 1 (Ollama)
Week 2 (Oct 14-18, 2025): Phases 2-4 (Gemini, Claude, Prompting)
Week 3 (Oct 21-25, 2025): Testing, docs, user validation

Estimated Total: 11-16 hours of development time (sum of the four phase estimates)

Questions?

Open an issue or discuss in the project's communication channel. Tag this as "LLM Integration" for visibility.


Status: Documentation Complete | Ready to Begin Implementation
Next Action: Start Phase 1 (Ollama Integration) using the checklist