Files
yaze/docs/internal/AI_API_ENHANCEMENT_HANDOFF.md
scawful e36d81f357 fix(linux): add missing yaze_gfx_render dependency to yaze_gfx_debug
Fixes linker error on Linux where yaze_gfx_debug.a (performance_dashboard.cc)
was calling AtlasRenderer::Get() and AtlasRenderer::GetStats() but wasn't
linking against yaze_gfx_render which contains atlas_renderer.cc.

Root cause: yaze_gfx_debug was only linking to yaze_gfx_types and
yaze_gfx_resource, missing the yaze_gfx_render dependency.

This also fixes the undefined reference errors for HttpServer methods
which were already properly included in the agent.cmake source list.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 01:38:55 -05:00

11 KiB

AI API & Agentic Workflow Enhancement - Handoff Document

Date: 2025-01-XX
Status: Phase 1 Complete, Phase 2-4 Pending
Branch: (to be determined)

Executive Summary

This document tracks progress on transforming Yaze into an AI-native platform with unified model management, API interface, and enhanced agentic workflows. Phase 1 (Unified Model Management) is complete. Phases 2-4 require implementation.

Completed Work (Phase 1)

1. Unified AI Model Management

Core Infrastructure

  • ModelInfo struct (src/cli/service/ai/common.h)

    • Standardized model representation across all providers
    • Fields: name, display_name, provider, description, family, parameter_size, quantization, size_bytes, is_local
  • ModelRegistry class (src/cli/service/ai/model_registry.h/.cc)

    • Singleton pattern for managing multiple AIService instances
    • RegisterService() - Add service instances
    • ListAllModels() - Aggregate models from all registered services
    • Thread-safe with mutex protection

AIService Interface Updates

  • AIService::ListAvailableModels() - Virtual method returning std::vector<ModelInfo>
  • AIService::GetProviderName() - Virtual method returning provider identifier
  • Default implementations provided in base class

Provider Implementations

  • OllamaAIService::ListAvailableModels()

    • Queries /api/tags endpoint
    • Maps Ollama's model structure to ModelInfo
    • Handles size, quantization, family metadata
  • GeminiAIService::ListAvailableModels()

    • Queries Gemini API /v1beta/models endpoint
    • Falls back to known defaults if API key missing
    • Filters for gemini* models

UI Integration

  • AgentChatWidget::RefreshModels()

    • Registers Ollama and Gemini services with ModelRegistry
    • Aggregates models from all providers
    • Caches results in model_info_cache_
  • Header updates (agent_chat_widget.h)

    • Replaced ollama_model_info_cache_ with unified model_info_cache_
    • Replaced ollama_model_cache_ with model_name_cache_
    • Replaced ollama_models_loading_ with models_loading_

Files Modified

  • src/cli/service/ai/common.h - Added ModelInfo struct
  • src/cli/service/ai/ai_service.h - Added ListAvailableModels() and GetProviderName()
  • src/cli/service/ai/ollama_ai_service.h/.cc - Implemented model listing
  • src/cli/service/ai/gemini_ai_service.h/.cc - Implemented model listing
  • src/cli/service/ai/model_registry.h/.cc - New registry class
  • src/app/editor/agent/agent_chat_widget.h/.cc - Updated to use registry

In Progress

UI Rendering Updates (Partial)

The RenderModelConfigControls() function in agent_chat_widget.cc still references old Ollama-specific code. It needs to be updated to:

  • Use unified model_info_cache_ instead of ollama_model_info_cache_
  • Display models from all providers in a single list
  • Filter by provider when a specific provider is selected
  • Show provider badges/indicators for each model

Location: src/app/editor/agent/agent_chat_widget.cc:2083-2318

Current State: Function still has provider-specific branches that should be unified.

Remaining Work

Phase 2: API Interface & Headless Mode

2.1 HTTP Server Implementation

Goal: Expose Yaze functionality via REST API for external agents

Tasks:

  1. Create HttpServer class in src/cli/service/api/

    • Use httplib (already in tree)
    • Start on configurable port (default 8080)
    • Handle CORS if needed
  2. Implement endpoints:

    • GET /api/v1/models - List all available models (delegate to ModelRegistry)
    • POST /api/v1/chat - Send prompt to agent
      • Request: { "prompt": "...", "provider": "ollama", "model": "...", "history": [...] }
      • Response: { "text_response": "...", "tool_calls": [...], "commands": [...] }
    • POST /api/v1/tool/{tool_name} - Execute specific tool
      • Request: { "args": {...} }
      • Response: { "result": "...", "status": "ok|error" }
    • GET /api/v1/health - Health check
    • GET /api/v1/rom/status - ROM loading status
  3. Integration points:

    • Initialize server in yaze.cc main() or via CLI flag
    • Share Rom* context with API handlers
    • Use ConversationalAgentService for chat endpoint
    • Use ToolDispatcher for tool endpoint

Files to Create:

  • src/cli/service/api/http_server.h
  • src/cli/service/api/http_server.cc
  • src/cli/service/api/api_handlers.h
  • src/cli/service/api/api_handlers.cc

Dependencies: httplib, nlohmann/json (already available)

Phase 3: Enhanced Agentic Workflows

3.1 Tool Expansion

FileSystemTool (src/cli/handlers/tools/filesystem_commands.h/.cc)

  • Purpose: Allow agent to read/write files outside ROM (e.g., src/ directory)
  • Safety: Require user confirmation or explicit scope configuration
  • Commands:
    • filesystem-read <path> - Read file contents
    • filesystem-write <path> <content> - Write file (with confirmation)
    • filesystem-list <directory> - List directory contents
    • filesystem-search <pattern> - Search for files matching pattern

BuildTool (src/cli/handlers/tools/build_commands.h/.cc)

  • Purpose: Trigger builds from within agent
  • Commands:
    • build-cmake <build_dir> - Run cmake configuration
    • build-ninja <build_dir> - Run ninja build
    • build-status - Check build status
    • build-errors - Parse and return compilation errors

Integration:

  • Add to ToolDispatcher::ToolCallType enum
  • Register in ToolDispatcher::CreateHandler()
  • Add to ToolDispatcher::ToolPreferences struct
  • Update UI toggles in AgentChatWidget::RenderToolingControls()

3.2 Editor State Context

Goal: Feed editor state (open files, compilation errors) into agent context

Tasks:

  1. Create EditorState struct capturing:

    • Open file paths
    • Active editor type
    • Compilation errors (if any)
    • Recent changes
  2. Inject into agent prompts:

    • Add to PromptBuilder::BuildPromptFromHistory()
    • Include in system prompt when editor state changes
  3. Update ConversationalAgentService:

    • Add SetEditorState(EditorState*) method
    • Pass to PromptBuilder when building prompts

Files to Create/Modify:

  • src/cli/service/agent/editor_state.h (new)
  • src/cli/service/ai/prompt_builder.h/.cc (modify)

Phase 4: Refactoring

4.1 ToolDispatcher Structured Output

Goal: Return JSON instead of capturing stdout

Current State: ToolDispatcher::Dispatch() returns absl::StatusOr<std::string> by capturing stdout from command handlers.

Proposed Changes:

  1. Create ToolResult struct:

    struct ToolResult {
      std::string output;  // Human-readable output
      nlohmann::json data;  // Structured data (if applicable)
      bool success;
      std::vector<std::string> warnings;
    };
    
  2. Update command handlers to return ToolResult:

    • Modify base CommandHandler interface
    • Update each handler implementation
    • Keep backward compatibility with OutputFormatter for CLI
  3. Update ToolDispatcher::Dispatch():

    • Return absl::StatusOr<ToolResult>
    • Convert to JSON for API responses
    • Keep string output for CLI compatibility

Files to Modify:

  • src/cli/service/agent/tool_dispatcher.h/.cc
  • src/cli/handlers/*/command_handlers.h/.cc (all handlers)
  • src/cli/service/agent/command_handler.h (base interface)

Migration Strategy:

  • Add new ExecuteStructured() method alongside existing Execute()
  • Gradually migrate handlers
  • Keep old path for CLI until migration complete

Technical Notes

Model Registry Usage Pattern

// Register services
auto& registry = cli::ModelRegistry::GetInstance();
registry.RegisterService(std::make_shared<OllamaAIService>(ollama_config));
registry.RegisterService(std::make_shared<GeminiAIService>(gemini_config));

// List all models
auto models_or = registry.ListAllModels();
// Returns unified list sorted by name

API Key Management

  • Gemini API key: Currently stored in AgentConfigState::gemini_api_key
  • Consider: Environment variable fallback, secure storage
  • Future: Support multiple API keys for different providers

Thread Safety

  • ModelRegistry uses mutex for thread-safe access
  • HttpServer should handle concurrent requests (httplib supports this)
  • ToolDispatcher may need locking if shared across threads

Testing Checklist

Phase 1 (Model Management)

  • Verify Ollama models appear in unified list
  • Verify Gemini models appear in unified list
  • Test model refresh with multiple providers
  • Test provider filtering in UI
  • Test model selection and configuration

Phase 2 (API)

  • Test /api/v1/models endpoint
  • Test /api/v1/chat with different providers
  • Test /api/v1/tool/* endpoints
  • Test error handling (missing ROM, invalid tool, etc.)
  • Test concurrent requests
  • Test CORS if needed

Phase 3 (Tools)

  • Test FileSystemTool with read operations
  • Test FileSystemTool write confirmation flow
  • Test BuildTool cmake/ninja execution
  • Test BuildTool error parsing
  • Test editor state injection into prompts

Phase 4 (Refactoring)

  • Verify all handlers return structured output
  • Test API endpoints with new format
  • Verify CLI still works with old format
  • Performance test (no regressions)

Known Issues

  1. UI Rendering: RenderModelConfigControls() still has provider-specific code that should be unified
  2. Model Info Display: Some fields from ModelInfo (like quantization, modified_at) are not displayed in unified view
  3. Error Handling: Model listing failures are logged but don't prevent other providers from loading

Next Steps (Priority Order)

  1. Complete UI unification - Update RenderModelConfigControls() to use unified model list
  2. Implement HTTP Server - Start with basic server and /api/v1/models endpoint
  3. Add chat endpoint - Wire up ConversationalAgentService to API
  4. Add tool endpoint - Expose ToolDispatcher via API
  5. Implement FileSystemTool - Start with read-only operations
  6. Implement BuildTool - Basic cmake/ninja execution
  7. Refactor ToolDispatcher - Begin structured output migration

References

  • Plan document: plan-yaze-api-agentic-workflow-enhancement.plan.md
  • Model Registry: src/cli/service/ai/model_registry.h
  • AIService interface: src/cli/service/ai/ai_service.h
  • ToolDispatcher: src/cli/service/agent/tool_dispatcher.h
  • httplib docs: (in ext/httplib/)

Questions for Next Developer

  1. Should the HTTP server be enabled by default or require a flag?
  2. What port should be used? (8080 suggested, but configurable?)
  3. Should FileSystemTool require explicit user approval per operation or a "trusted scope"?
  4. Should BuildTool be limited to specific directories (e.g., build/) for safety?
  5. How should API authentication work? (API key? Localhost-only? None?)

Last Updated: 2025-01-XX
Contact: (to be filled)