# LLM Integration: Executive Summary & Getting Started
**Date**: October 3, 2025
**Author**: GitHub Copilot
**Status**: Ready to Implement
## What Changed?
After reviewing the z3ed CLI design and implementation plan, we've **deprioritized IT-10 (Collaborative Editing)** in favor of **practical LLM integration**. This is the critical next step to make the agentic workflow system production-ready.
## Why This Matters
The z3ed infrastructure is **already complete**:
- ✅ Resource-oriented CLI with comprehensive commands
- ✅ Proposal-based workflow with sandbox execution
- ✅ Machine-readable API catalogue (`z3ed-resources.yaml`)
- ✅ GUI automation harness for verification
- ✅ ProposalDrawer for human review
**What's missing**: Real LLM integration to turn prompts into actions.
Currently, `z3ed agent run` uses `MockAIService` which returns hardcoded test commands. We need to connect real LLMs (Ollama, Gemini, Claude) to make the agent system useful.
## What You Get
After implementing this plan, users will be able to:
```bash
# Install Ollama (one-time setup)
brew install ollama
ollama serve &
ollama pull qwen2.5-coder:7b
# Configure z3ed
export YAZE_AI_PROVIDER=ollama
# Use natural language to modify ROMs
z3ed agent run \
  --prompt "Make all soldier armor red" \
  --rom zelda3.sfc \
  --sandbox
# Review generated commands
z3ed agent diff
# Accept changes
# (Open YAZE GUI → Debug → Agent Proposals → Review → Accept)
```
The LLM will automatically:
1. Parse the natural language prompt
2. Generate appropriate `z3ed` commands
3. Execute them in a sandbox
4. Present results for human review
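In code terms, that loop might look like the sketch below, where `RunAgentOnce`, `Sandbox`, `Execute()`, and `SubmitProposal()` are hypothetical stand-ins for the existing proposal machinery, and `AIService` is the interface described under Key Architectural Decisions further down:

```cpp
#include <string>
#include "absl/status/status.h"

// Hypothetical glue; the real agent handler lives in
// src/cli/handlers/agent/general_commands.cc.
absl::Status RunAgentOnce(AIService& ai, Sandbox& sandbox,
                          const std::string& prompt) {
  auto commands = ai.GetCommands(prompt);  // Steps 1-2: prompt -> z3ed commands.
  if (!commands.ok()) return commands.status();
  for (const std::string& cmd : *commands) {
    absl::Status status = sandbox.Execute(cmd);  // Step 3: sandboxed execution.
    if (!status.ok()) return status;
  }
  return sandbox.SubmitProposal();  // Step 4: queue results for human review.
}
```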
## Implementation Roadmap
### Phase 1: Ollama Integration (4-6 hours) 🎯 START HERE
**Priority**: Highest
**Why First**: Local, free, no API keys, fast iteration
**Deliverables**:
- `OllamaAIService` class with health checks
- CMake integration for httplib
- Service selection mechanism (env vars)
- End-to-end test script
**Key Files**:
- `src/cli/service/ollama_ai_service.{h,cc}` (new)
- `src/cli/handlers/agent/general_commands.cc` (update)
- `CMakeLists.txt` (add httplib support)
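For orientation, here is a rough sketch of what `OllamaAIService` could look like on top of cpp-httplib. The JSON helpers are elided, and everything beyond the plan above is an assumption, not final code:

```cpp
#include <string>
#include <vector>
#include "absl/status/status.h"
#include "absl/status/statusor.h"
#include "httplib.h"

// Sketch of the planned class for src/cli/service/ollama_ai_service.{h,cc}.
// BuildRequestJson/ParseCommands are elided helpers, not final names.
class OllamaAIService : public AIService {
 public:
  explicit OllamaAIService(const std::string& host = "http://localhost:11434")
      : client_(host) {}

  // Ollama answers GET / with "Ollama is running"; use that as a health probe.
  absl::Status CheckHealth() {
    auto res = client_.Get("/");
    if (!res || res->status != 200)
      return absl::UnavailableError("Cannot connect to http://localhost:11434");
    return absl::OkStatus();
  }

  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override {
    // POST /api/generate with {"model": ..., "prompt": ..., "stream": false}.
    auto res = client_.Post("/api/generate", BuildRequestJson(prompt),
                            "application/json");
    if (!res || res->status != 200)
      return absl::UnavailableError("Ollama request failed");
    return ParseCommands(res->body);  // Extract the "response" field, split.
  }

 private:
  httplib::Client client_;
};
```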
### Phase 2: Gemini Fixes (2-3 hours)
**Deliverables**:
- Fix existing `GeminiAIService` implementation
- Better prompting with resource catalogue
- Markdown code block stripping
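The stripping step is small; a possible helper (a sketch with assumed names, since LLMs often wrap their JSON output in ```` ```json ```` fences):

```cpp
#include <string>

// Strips a leading ```/```json fence line and a trailing ``` fence, if
// present, so the JSON payload can be parsed directly. Sketch only.
std::string StripMarkdownFences(std::string text) {
  const std::string fence = "```";
  size_t start = text.find(fence);
  if (start == std::string::npos) return text;
  start = text.find('\n', start);  // Skip the whole opening fence line.
  if (start == std::string::npos) return text;
  size_t end = text.rfind(fence);
  if (end <= start) return text;   // No separate closing fence found.
  return text.substr(start + 1, end - start - 1);
}
```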
### Phase 3: Claude Integration (2-3 hours)
**Deliverables**:
- `ClaudeAIService` class
- Messages API integration
- Same interface as other services
### Phase 4: Enhanced Prompting (3-4 hours)
**Deliverables**:
- `PromptBuilder` utility class
- Resource catalogue integration
- Few-shot examples
- Context injection (ROM state)
## Quick Start (After Implementation)
### For Developers (Implement Now)
1. **Read the implementation plan**:
- [LLM-INTEGRATION-PLAN.md](LLM-INTEGRATION-PLAN.md) - Complete technical guide
- [LLM-IMPLEMENTATION-CHECKLIST.md](LLM-IMPLEMENTATION-CHECKLIST.md) - Step-by-step tasks
2. **Start with Phase 1**:
```bash
# Follow checklist in LLM-IMPLEMENTATION-CHECKLIST.md
# Implementation time: ~4-6 hours
```
3. **Test as you go**:
```bash
# Run quickstart script when ready
./scripts/quickstart_ollama.sh
```
### For End Users (After Development)
1. **Install Ollama**:
```bash
brew install ollama # macOS
ollama serve &
ollama pull qwen2.5-coder:7b
```
2. **Configure z3ed**:
```bash
export YAZE_AI_PROVIDER=ollama
```
3. **Try it out**:
```bash
z3ed agent run --prompt "Validate my ROM" --rom zelda3.sfc
```
## Alternative Providers
### Gemini (Remote, API Key Required)
```bash
export GEMINI_API_KEY=your_key_here
export YAZE_AI_PROVIDER=gemini
z3ed agent run --prompt "..."
```
### Claude (Remote, API Key Required)
```bash
export CLAUDE_API_KEY=your_key_here
export YAZE_AI_PROVIDER=claude
z3ed agent run --prompt "..."
```
## Documentation Structure
```
docs/z3ed/
├── README.md # Overview + navigation
├── E6-z3ed-cli-design.md # Architecture & design
├── E6-z3ed-implementation-plan.md # Overall roadmap
├── LLM-INTEGRATION-PLAN.md # 📋 Detailed LLM guide (NEW)
├── LLM-IMPLEMENTATION-CHECKLIST.md # ✅ Step-by-step tasks (NEW)
└── LLM-INTEGRATION-SUMMARY.md # 📄 This file (NEW)
scripts/
└── quickstart_ollama.sh # 🚀 Automated setup test (NEW)
```
## Key Architectural Decisions
### 1. Service Interface Pattern
All LLM providers implement the same `AIService` interface:
```cpp
#include <string>
#include <vector>
#include "absl/status/statusor.h"

class AIService {
 public:
  virtual ~AIService() = default;
  // Translates a natural-language prompt into a list of z3ed commands.
  virtual absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) = 0;
};
```
This allows easy swapping between the Ollama, Gemini, Claude, and Mock implementations.
### 2. Environment-Based Selection
Provider selection via environment variables (not compile-time):
```bash
export YAZE_AI_PROVIDER=ollama # or gemini, claude, mock
```
This enables:
- Easy testing with different providers
- CI/CD with MockAIService
- User choice without rebuilding
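In code, this could be a small factory. A sketch: `MockAIService` and `GeminiAIService` already exist, the other class names follow the phases above:

```cpp
#include <cstdlib>
#include <memory>
#include <string>

// Reads YAZE_AI_PROVIDER at runtime; unknown or unset values fall back to mock.
std::unique_ptr<AIService> CreateAIService() {
  const char* env = std::getenv("YAZE_AI_PROVIDER");
  const std::string provider = env ? env : "mock";
  if (provider == "ollama") return std::make_unique<OllamaAIService>();
  if (provider == "gemini") return std::make_unique<GeminiAIService>();
  if (provider == "claude") return std::make_unique<ClaudeAIService>();
  return std::make_unique<MockAIService>();
}
```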
### 3. Graceful Degradation
If Ollama, Gemini, or Claude is unavailable, z3ed falls back to MockAIService with a clear warning:
```
⚠️ Ollama unavailable: Cannot connect to http://localhost:11434
Falling back to MockAIService
Set YAZE_AI_PROVIDER=ollama or install Ollama to enable LLM
```
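Built on the factory above, the fallback could be as simple as the following sketch, where `CheckHealth()` is the probe assumed in the Phase 1 sketch:

```cpp
#include <iostream>
#include <memory>
#include "absl/status/status.h"

// Probes the selected provider and degrades to the mock instead of failing.
std::unique_ptr<AIService> CreateAIServiceWithFallback() {
  auto service = CreateAIService();
  if (auto* ollama = dynamic_cast<OllamaAIService*>(service.get())) {
    if (absl::Status health = ollama->CheckHealth(); !health.ok()) {
      std::cerr << "⚠️  Ollama unavailable: " << health.message() << "\n"
                   "   Falling back to MockAIService\n";
      return std::make_unique<MockAIService>();
    }
  }
  return service;
}
```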
### 4. System Prompt Engineering
Comprehensive system prompts include:
- Full command catalogue from `z3ed-resources.yaml`
- Few-shot examples (proven prompt/command pairs)
- Output format requirements (JSON array of strings)
- Current ROM context (loaded file, editors open)
This improves accuracy from ~60% to >90% for standard tasks.
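A possible shape for the Phase 4 `PromptBuilder` that assembles those pieces (illustrative; the method names and prompt wording are assumptions):

```cpp
#include <sstream>
#include <string>

// Assembles catalogue, examples, and ROM context into one system prompt.
class PromptBuilder {
 public:
  PromptBuilder& WithCatalogue(const std::string& yaml) {
    catalogue_ = yaml;  // Contents of z3ed-resources.yaml.
    return *this;
  }
  PromptBuilder& WithExample(const std::string& prompt,
                             const std::string& commands) {
    examples_ += "User: " + prompt + "\nCommands: " + commands + "\n";
    return *this;
  }
  PromptBuilder& WithRomContext(const std::string& context) {
    context_ = context;  // e.g. loaded ROM path, open editors.
    return *this;
  }
  std::string Build() const {
    std::ostringstream out;
    out << "You are a z3ed command generator.\n"
        << "Available commands:\n" << catalogue_ << "\n"
        << "Examples:\n" << examples_
        << "Current context: " << context_ << "\n"
        << "Respond with a JSON array of command strings only.\n";
    return out.str();
  }

 private:
  std::string catalogue_, examples_, context_;
};
```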
## Success Metrics
### Phase 1 Complete When:
- ✅ `z3ed agent run` works with Ollama end-to-end
- ✅ Health checks report clear errors
- ✅ Fallback to MockAIService is transparent
- ✅ Test script passes on macOS
### Full Integration Complete When:
- ✅ All three providers (Ollama, Gemini, Claude) work
- ✅ Command accuracy >90% on standard prompts
- ✅ Documentation guides users through setup
- ✅ At least one community member validates workflow
## Known Limitations
### Current Implementation
- `MockAIService` returns hardcoded test commands
- No real LLM integration yet
- Limited to simple test cases
### After LLM Integration
- **Model hallucination**: LLMs may generate invalid commands
  - Mitigation: Validation layer + resource catalogue
- **API rate limits**: Remote providers (Gemini/Claude) have limits
  - Mitigation: Response caching + local Ollama option
- **Cost**: API calls cost money (Gemini ~$0.10/million tokens)
  - Mitigation: Ollama is free + cache responses
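The caching mitigation can be layered over any provider. A minimal decorator sketch (a real cache would also key on provider/model and persist to disk):

```cpp
#include <string>
#include <unordered_map>
#include <vector>
#include "absl/status/statusor.h"

// Wraps another AIService and memoizes successful responses by prompt.
class CachingAIService : public AIService {
 public:
  explicit CachingAIService(AIService& inner) : inner_(inner) {}

  absl::StatusOr<std::vector<std::string>> GetCommands(
      const std::string& prompt) override {
    if (auto it = cache_.find(prompt); it != cache_.end()) return it->second;
    auto result = inner_.GetCommands(prompt);
    if (result.ok()) cache_[prompt] = *result;  // Only cache good responses.
    return result;
  }

 private:
  AIService& inner_;
  std::unordered_map<std::string, std::vector<std::string>> cache_;
};
```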
## FAQ
### Why Ollama first?
- **No API keys**: Works out of the box
- **Privacy**: All processing local
- **Speed**: No network latency
- **Cost**: Zero dollars
- **Testing**: No rate limits
### Why not OpenAI?
- Cost (GPT-4 is expensive)
- Rate limits (strict for free tier)
- Not local (privacy concerns for ROM hackers)
- Ollama + Gemini cover both local and remote use cases
### Can I use multiple providers?
Yes! Set `YAZE_AI_PROVIDER` per command:
```bash
YAZE_AI_PROVIDER=ollama z3ed agent run --prompt "Quick test"
YAZE_AI_PROVIDER=gemini z3ed agent run --prompt "Complex task"
```
### What if I don't want to use AI?
The CLI still works without LLM integration:
```bash
# Direct command execution (no LLM)
z3ed rom validate --rom zelda3.sfc
z3ed palette export --group sprites --id soldier --to output.pal
```
AI is **optional** and additive.
## Next Steps
### For @scawful (Project Owner)
1. **Review this plan**: Confirm priority shift from IT-10 to LLM integration
2. **Decide on Phase 1**: Start Ollama implementation (~4-6 hours)
3. **Allocate time**: Schedule implementation over next 1-2 weeks
4. **Test setup**: Install Ollama and verify it works on your machine
### For Contributors
1. **Read the docs**: Start with [LLM-INTEGRATION-PLAN.md](LLM-INTEGRATION-PLAN.md)
2. **Pick a phase**: Phase 1 (Ollama) is the highest priority
3. **Follow checklist**: Use [LLM-IMPLEMENTATION-CHECKLIST.md](LLM-IMPLEMENTATION-CHECKLIST.md)
4. **Submit PR**: Include tests + documentation updates
### For Users (Future)
1. **Wait for release**: This is in development
2. **Install Ollama**: Get ready for local LLM support
3. **Follow setup guide**: Will be in `AI-SERVICE-SETUP.md` (coming soon)
## Timeline
**Week 1 (Oct 7-11, 2025)**: Phase 1 (Ollama)
**Week 2 (Oct 14-18, 2025)**: Phases 2-4 (Gemini, Claude, Prompting)
**Week 3 (Oct 21-25, 2025)**: Testing, docs, user validation
**Estimated Total**: 12-15 hours of development time
## Related Documents
- **[LLM-INTEGRATION-PLAN.md](LLM-INTEGRATION-PLAN.md)** - Complete technical implementation guide
- **[LLM-IMPLEMENTATION-CHECKLIST.md](LLM-IMPLEMENTATION-CHECKLIST.md)** - Step-by-step task list
- **[E6-z3ed-cli-design.md](E6-z3ed-cli-design.md)** - Overall architecture
- **[E6-z3ed-implementation-plan.md](E6-z3ed-implementation-plan.md)** - Project roadmap
## Questions?
Open an issue or discuss in the project's communication channel. Tag this as "LLM Integration" for visibility.
---
**Status**: Documentation Complete | Ready to Begin Implementation
**Next Action**: Start Phase 1 (Ollama Integration) using checklist