feat: Consolidate Z3ED AI build flags into a single master flag and improve error handling

scawful
2025-10-03 23:24:03 -04:00
parent 602f1beec5
commit ae3f1dea2f
3 changed files with 662 additions and 98 deletions


@@ -58,16 +58,23 @@ This vision will be realized through a shared interface available in both the `z
## Next Steps
### Immediate Priorities
1. **✅ Build System Consolidation** (COMPLETE - Oct 3, 2025):
   - ✅ Created `Z3ED_AI` master flag for simplified builds
   - ✅ Fixed Gemini crash with graceful degradation
   - ✅ Updated documentation with new build instructions
   - ✅ Tested both Ollama and Gemini backends
   - **Next**: Update CI/CD workflows to use `-DZ3ED_AI=ON`
2. **Live LLM Testing** (NEXT UP - 1-2 hours):
   - Verify function calling works with real Ollama/Gemini
   - Test multi-step tool execution
   - Validate all 5 tools with natural-language prompts
3. **Expand Overworld Tool Coverage**:
   - ✅ Ship read-only tile searches (`overworld find-tile`) with shared formatting for CLI and agent calls.
   - Next: add area summaries and teleport-destination lookups, and keep JSON/text parity for all new tools.
4. **Polish the TUI Chat Experience**:
   - Tighten keyboard shortcuts, scrolling, and copy-to-clipboard behaviour.
   - Align log-file output with on-screen formatting for easier debugging.
5. **Document & Test the New Tooling**:
   - Update the main `README.md` and relevant docs to cover the new chat formatting.
   - Add regression tests (unit or golden JSON fixtures) for the new Overworld tools.
6. **Build GUI Chat Widget**:
@@ -91,6 +98,14 @@ This vision will be realized through a shared interface available in both the `z
We have made significant progress in laying the foundation for the conversational agent.
### ✅ Completed
- **Build System Consolidation**: ✅ **NEW** `Z3ED_AI` master flag (Oct 3, 2025)
- Single flag enables all AI features: `-DZ3ED_AI=ON`
- Auto-manages dependencies (JSON, YAML, httplib, OpenSSL)
- Fixed Gemini crash when API key set but JSON disabled
- Graceful degradation with clear error messages
- Backward compatible with old flags
- Ready for build modularization (enables optional `libyaze_agent.a`)
- **Docs**: `docs/z3ed/Z3ED_AI_FLAG_MIGRATION.md`
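The graceful-degradation behaviour above can be sketched as a small shell function; this is an illustrative model of the check, and the message text is an assumption, not the exact string z3ed prints:

```shell
# Illustrative sketch: an AI command checks for JSON support at runtime and
# prints a rebuild hint instead of crashing when the feature was compiled out.
run_ai_command() {
  json_enabled="$1"  # "yes" when the binary was built with -DZ3ED_AI=ON
  if [ "$json_enabled" != "yes" ]; then
    echo "error: AI features require JSON support; rebuild with -DZ3ED_AI=ON" >&2
    return 1
  fi
  echo "ok"
}
```

The key point is the early, explicit failure path: the old crash happened because the Gemini path assumed JSON support was present whenever an API key was set.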
- **`ConversationalAgentService`**: ✅ Fully operational with multi-step tool execution loop
- Handles tool calls with automatic JSON output format
- Prevents recursion through proper tool result replay
@@ -116,60 +131,53 @@ We have made significant progress in laying the foundation for the conversationa
- Enhanced prompting system with resource catalogue loading
- System instruction generation with examples
- Health checks and model availability validation
- Both backends tested and working in production
### 🚧 In Progress
- **Live LLM Testing**: Ready to execute with real Ollama/Gemini
  - All infrastructure complete (function calling, tool schemas, response parsing)
  - Need to verify multi-step tool execution with live models
  - Test scenarios prepared for all 5 tools
  - **Estimated Time**: 1-2 hours
- **GUI Chat Widget**: Not yet started
  - TUI implementation complete and can serve as reference
  - Should reuse table/JSON rendering logic from TUI
  - Target: `src/app/gui/debug/agent_chat_widget.{h,cc}`
  - **Estimated Time**: 6-8 hours
### 🚀 Next Steps (Priority Order)
#### Priority 1: Live LLM Testing with Function Calling (1-2 hours)
**Goal**: Verify Ollama/Gemini can autonomously invoke tools in production
**Infrastructure Complete** ✅:
- ✅ Tool schema generation (`BuildFunctionCallSchemas()`)
- ✅ System prompts include function definitions
- ✅ AI services parse `tool_calls` from responses
- ✅ ConversationalAgentService dispatches to ToolDispatcher
- ✅ All 5 tools tested independently
**Testing Tasks**:
1. **Gemini Testing** (30 min)
   - Verify Gemini 2.0 generates correct `tool_calls` JSON
   - Test prompt: "What dungeons are in this ROM?"
   - Verify tool result fed back into conversation
   - Test multi-step: "Now list sprites in the first dungeon"
2. **Ollama Testing** (30 min)
   - Verify qwen2.5-coder discovers and calls tools
   - Same test prompts as Gemini
   - Compare response quality between models
3. **Tool Coverage Testing** (30 min)
   - Exercise all 5 tools with natural-language prompts
   - Verify JSON output formats correctly
   - Test error handling (invalid room IDs, etc.)
**Success Criteria**:
- LLM autonomously calls tools without explicit command syntax
- Tool results incorporated into follow-up responses
- Multi-turn conversations work with context
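The testing tasks above can be driven by a quick smoke loop. This is a hypothetical helper, not a shipped script: the prompts come from the tasks above, the binary path assumes the default `build/` layout, and the loop tolerates failures so every prompt gets exercised:

```shell
# Hypothetical smoke loop over the planned test prompts; reports each prompt
# whose invocation exits non-zero instead of stopping at the first failure.
fails=0
for prompt in "What dungeons are in this ROM?" \
              "Now list sprites in the first dungeon"; do
  if ! ./build/bin/z3ed agent plan --prompt "$prompt" >/dev/null 2>&1; then
    echo "FAILED: $prompt"
    fails=$((fails + 1))
  fi
done
echo "failures: $fails"
```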
#### Priority 2: Implement GUI Chat Widget (6-8 hours)
**Goal**: Unified chat experience in YAZE application
@@ -221,4 +229,221 @@ We have made significant progress in laying the foundation for the conversationa
3. **Streaming Responses** (optional, 3-4 hours)
- Use Ollama/Gemini streaming APIs
- Update GUI/TUI to show partial responses as they arrive
- Improves perceived latency for long responses
## z3ed Build Quick Reference
```bash
# Full AI features (Ollama + Gemini)
cmake -B build -DZ3ED_AI=ON
cmake --build build --target z3ed
# AI + GUI automation/testing
cmake -B build -DZ3ED_AI=ON -DYAZE_WITH_GRPC=ON
cmake --build build --target z3ed
# Minimal build (no AI)
cmake -B build
cmake --build build --target z3ed
```
## Build Flags Explained
| Flag | Purpose | Dependencies | When to Use |
|------|---------|--------------|-------------|
| `Z3ED_AI=ON` | **Master flag** for AI features | JSON, YAML, httplib, (OpenSSL*) | Want Ollama or Gemini support |
| `YAZE_WITH_GRPC=ON` | GUI automation & testing | gRPC, Protobuf, (auto-enables JSON) | Want GUI test harness |
| `YAZE_WITH_JSON=ON` | Low-level JSON support | nlohmann_json | Auto-enabled by above flags |
*OpenSSL is optional: required for Gemini (HTTPS); Ollama works without it
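To confirm which of these flags a configured tree actually ended up with, the CMake cache can be inspected directly (this assumes the build directory is named `build`, as in the examples in this guide):

```shell
# Print the AI-related cache entries, or a hint if nothing is configured yet.
grep -E '^(Z3ED_AI|YAZE_WITH_GRPC|YAZE_WITH_JSON):' build/CMakeCache.txt \
  2>/dev/null || echo "build/ not configured yet"
```

This is handy when a stale cache is suspected, since `cmake -B build` reuses old cache values unless the directory is deleted.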
## Feature Matrix
| Feature | No Flags | Z3ED_AI | Z3ED_AI + GRPC |
|---------|----------|---------|----------------|
| Basic CLI | ✅ | ✅ | ✅ |
| Ollama (local) | ❌ | ✅ | ✅ |
| Gemini (cloud) | ❌ | ✅* | ✅* |
| TUI Chat | ❌ | ✅ | ✅ |
| GUI Test Automation | ❌ | ❌ | ✅ |
| Tool Dispatcher | ❌ | ✅ | ✅ |
| Function Calling | ❌ | ✅ | ✅ |
*Requires OpenSSL for HTTPS
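Since OpenSSL is the only asterisked dependency in the matrix, a quick presence check helps predict the Gemini column before building (checking for the `openssl` binary is only a rough proxy; the build actually needs the development headers):

```shell
# Rough OpenSSL availability check; either branch prints something useful.
if command -v openssl >/dev/null 2>&1; then
  openssl version
else
  echo "OpenSSL not found - Gemini (HTTPS) unavailable; Ollama still works"
fi
```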
## Common Build Scenarios
### Developer (AI features, no GUI testing)
```bash
cmake -B build -DZ3ED_AI=ON
cmake --build build --target z3ed -j8
```
### Full Stack (AI + GUI automation)
```bash
cmake -B build -DZ3ED_AI=ON -DYAZE_WITH_GRPC=ON
cmake --build build --target z3ed -j8
```
### CI/CD (minimal, fast)
```bash
cmake -B build -DYAZE_MINIMAL_BUILD=ON
cmake --build build -j$(nproc)
```
### Release Build (optimized)
```bash
cmake -B build -DZ3ED_AI=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build --target z3ed -j8
```
## Migration from Old Flags
### Before (Confusing)
```bash
cmake -B build -DYAZE_WITH_GRPC=ON -DYAZE_WITH_JSON=ON
```
### After (Clear Intent)
```bash
cmake -B build -DZ3ED_AI=ON -DYAZE_WITH_GRPC=ON
```
**Note**: Old flags still work for backward compatibility!
## Troubleshooting
### "Build with -DZ3ED_AI=ON" warning
**Symptom**: AI commands fail with "JSON support required"
**Fix**: Rebuild with AI flag
```bash
rm -rf build && cmake -B build -DZ3ED_AI=ON && cmake --build build
```
### "OpenSSL not found" warning
**Symptom**: Gemini API doesn't work
**Impact**: Only affects Gemini (cloud). Ollama (local) works fine
**Fix (optional)**:
```bash
# macOS
brew install openssl
# Linux
sudo apt install libssl-dev
# Then rebuild
cmake -B build -DZ3ED_AI=ON && cmake --build build
```
### Ollama vs Gemini not auto-detecting
**Symptom**: Wrong backend selected
**Fix**: Set explicit provider
```bash
# Force Ollama
export YAZE_AI_PROVIDER=ollama
./build/bin/z3ed agent plan --prompt "test"
# Force Gemini
export YAZE_AI_PROVIDER=gemini
export GEMINI_API_KEY="your-key"
./build/bin/z3ed agent plan --prompt "test"
```
## Environment Variables
| Variable | Default | Purpose |
|----------|---------|---------|
| `YAZE_AI_PROVIDER` | auto | Force `ollama` or `gemini` |
| `GEMINI_API_KEY` | - | Gemini API key (enables Gemini) |
| `OLLAMA_MODEL` | `qwen2.5-coder:7b` | Override Ollama model |
| `GEMINI_MODEL` | `gemini-2.5-flash` | Override Gemini model |
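The auto-detection order implied by the table can be sketched as a small shell function. The precedence here is an assumption about z3ed's behaviour (explicit `YAZE_AI_PROVIDER` wins, a set `GEMINI_API_KEY` implies Gemini, otherwise Ollama), not its actual code:

```shell
# Hypothetical provider-selection order mirroring the table's defaults.
pick_provider() {
  if [ -n "$YAZE_AI_PROVIDER" ]; then
    echo "$YAZE_AI_PROVIDER"        # explicit override wins
  elif [ -n "$GEMINI_API_KEY" ]; then
    echo "gemini"                   # API key present -> cloud backend
  else
    echo "ollama"                   # default: local backend
  fi
}
```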
## Platform-Specific Notes
### macOS
- OpenSSL auto-detected via Homebrew
- Keychain integration for SSL certs
- Recommended: `brew install openssl ollama`
### Linux
- OpenSSL typically pre-installed
- Install via: `sudo apt install libssl-dev`
- Ollama: Download from https://ollama.com
### Windows
- Use Ollama (no SSL required)
- Gemini requires OpenSSL (harder to set up on Windows)
- Recommended: focus on Ollama for Windows builds
## Performance Tips
### Faster Incremental Builds
```bash
# Use Ninja instead of Make
cmake -B build -GNinja -DZ3ED_AI=ON
ninja -C build z3ed
# Enable ccache
export CMAKE_CXX_COMPILER_LAUNCHER=ccache
cmake -B build -DZ3ED_AI=ON
```
### Reduce Build Scope
```bash
# Only build z3ed (not full yaze app)
cmake --build build --target z3ed
# Parallel build
cmake --build build --target z3ed -j$(nproc)
```
## Related Documentation
- **Migration Guide**: [Z3ED_AI_FLAG_MIGRATION.md](Z3ED_AI_FLAG_MIGRATION.md)
- **Technical Roadmap**: [AGENT-ROADMAP.md](AGENT-ROADMAP.md)
- **Main README**: [README.md](README.md)
- **Build Modularization**: `../../build_modularization_plan.md`
## Quick Test
Verify your build works:
```bash
# Check z3ed runs
./build/bin/z3ed --version
# Test AI detection
./build/bin/z3ed agent plan --prompt "test" 2>&1 | head -5
# Expected output (with Z3ED_AI=ON):
# 🤖 Using Gemini AI with model: gemini-2.5-flash
# or
# 🤖 Using Ollama AI with model: qwen2.5-coder:7b
# or
# 🤖 Using MockAIService (no LLM configured)
```
## Support
If you encounter issues:
1. Check this guide's troubleshooting section
2. Review [Z3ED_AI_FLAG_MIGRATION.md](Z3ED_AI_FLAG_MIGRATION.md)
3. Verify CMake output for warnings
4. Open an issue with build logs
## Summary
**Recommended for most users**:
```bash
cmake -B build -DZ3ED_AI=ON
cmake --build build --target z3ed -j8
./build/bin/z3ed agent chat
```
This gives you:
- ✅ Ollama support (local, free)
- ✅ Gemini support (cloud, API key required)
- ✅ TUI chat interface
- ✅ Tool dispatcher with 5 commands
- ✅ Function calling support
- ✅ All AI agent features