fix(linux): add missing yaze_gfx_render dependency to yaze_gfx_debug
Fixes linker error on Linux where yaze_gfx_debug.a (performance_dashboard.cc) was calling AtlasRenderer::Get() and AtlasRenderer::GetStats() but wasn't linking against yaze_gfx_render which contains atlas_renderer.cc. Root cause: yaze_gfx_debug was only linking to yaze_gfx_types and yaze_gfx_resource, missing the yaze_gfx_render dependency. This also fixes the undefined reference errors for HttpServer methods which were already properly included in the agent.cmake source list. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
289
docs/internal/AI_API_ENHANCEMENT_HANDOFF.md
Normal file
289
docs/internal/AI_API_ENHANCEMENT_HANDOFF.md
Normal file
@@ -0,0 +1,289 @@
|
||||
# AI API & Agentic Workflow Enhancement - Handoff Document
|
||||
|
||||
**Date**: 2025-01-XX
|
||||
**Status**: Phase 1 Complete, Phase 2-4 Pending
|
||||
**Branch**: (to be determined)
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This document tracks progress on transforming Yaze into an AI-native platform with unified model management, API interface, and enhanced agentic workflows. Phase 1 (Unified Model Management) is complete. Phases 2-4 require implementation.
|
||||
|
||||
## Completed Work (Phase 1)
|
||||
|
||||
### 1. Unified AI Model Management ✅
|
||||
|
||||
#### Core Infrastructure
|
||||
- **`ModelInfo` struct** (`src/cli/service/ai/common.h`)
|
||||
- Standardized model representation across all providers
|
||||
- Fields: `name`, `display_name`, `provider`, `description`, `family`, `parameter_size`, `quantization`, `size_bytes`, `is_local`
|
||||
|
||||
- **`ModelRegistry` class** (`src/cli/service/ai/model_registry.h/.cc`)
|
||||
- Singleton pattern for managing multiple `AIService` instances
|
||||
- `RegisterService()` - Add service instances
|
||||
- `ListAllModels()` - Aggregate models from all registered services
|
||||
- Thread-safe with mutex protection
|
||||
|
||||
#### AIService Interface Updates
|
||||
- **`AIService::ListAvailableModels()`** - Virtual method returning `std::vector<ModelInfo>`
|
||||
- **`AIService::GetProviderName()`** - Virtual method returning provider identifier
|
||||
- Default implementations provided in base class
|
||||
|
||||
#### Provider Implementations
|
||||
- **`OllamaAIService::ListAvailableModels()`**
|
||||
- Queries `/api/tags` endpoint
|
||||
- Maps Ollama's model structure to `ModelInfo`
|
||||
- Handles size, quantization, family metadata
|
||||
|
||||
- **`GeminiAIService::ListAvailableModels()`**
|
||||
- Queries Gemini API `/v1beta/models` endpoint
|
||||
- Falls back to known defaults if API key missing
|
||||
- Filters for `gemini*` models
|
||||
|
||||
#### UI Integration
|
||||
- **`AgentChatWidget::RefreshModels()`**
|
||||
- Registers Ollama and Gemini services with `ModelRegistry`
|
||||
- Aggregates models from all providers
|
||||
- Caches results in `model_info_cache_`
|
||||
|
||||
- **Header updates** (`agent_chat_widget.h`)
|
||||
- Replaced `ollama_model_info_cache_` with unified `model_info_cache_`
|
||||
- Replaced `ollama_model_cache_` with `model_name_cache_`
|
||||
- Replaced `ollama_models_loading_` with `models_loading_`
|
||||
|
||||
### Files Modified
|
||||
- `src/cli/service/ai/common.h` - Added `ModelInfo` struct
|
||||
- `src/cli/service/ai/ai_service.h` - Added `ListAvailableModels()` and `GetProviderName()`
|
||||
- `src/cli/service/ai/ollama_ai_service.h/.cc` - Implemented model listing
|
||||
- `src/cli/service/ai/gemini_ai_service.h/.cc` - Implemented model listing
|
||||
- `src/cli/service/ai/model_registry.h/.cc` - New registry class
|
||||
- `src/app/editor/agent/agent_chat_widget.h/.cc` - Updated to use registry
|
||||
|
||||
## In Progress
|
||||
|
||||
### UI Rendering Updates (Partial)
|
||||
The `RenderModelConfigControls()` function in `agent_chat_widget.cc` still references old Ollama-specific code. It needs to be updated to:
|
||||
- Use unified `model_info_cache_` instead of `ollama_model_info_cache_`
|
||||
- Display models from all providers in a single list
|
||||
- Filter by provider when a specific provider is selected
|
||||
- Show provider badges/indicators for each model
|
||||
|
||||
**Location**: `src/app/editor/agent/agent_chat_widget.cc:2083-2318`
|
||||
|
||||
**Current State**: Function still has provider-specific branches that should be unified.
|
||||
|
||||
## Remaining Work
|
||||
|
||||
### Phase 2: API Interface & Headless Mode
|
||||
|
||||
#### 2.1 HTTP Server Implementation
|
||||
**Goal**: Expose Yaze functionality via REST API for external agents
|
||||
|
||||
**Tasks**:
|
||||
1. Create `HttpServer` class in `src/cli/service/api/`
|
||||
- Use `httplib` (already in tree)
|
||||
- Start on configurable port (default 8080)
|
||||
- Handle CORS if needed
|
||||
|
||||
2. Implement endpoints:
|
||||
- `GET /api/v1/models` - List all available models (delegate to `ModelRegistry`)
|
||||
- `POST /api/v1/chat` - Send prompt to agent
|
||||
- Request: `{ "prompt": "...", "provider": "ollama", "model": "...", "history": [...] }`
|
||||
- Response: `{ "text_response": "...", "tool_calls": [...], "commands": [...] }`
|
||||
- `POST /api/v1/tool/{tool_name}` - Execute specific tool
|
||||
- Request: `{ "args": {...} }`
|
||||
- Response: `{ "result": "...", "status": "ok|error" }`
|
||||
- `GET /api/v1/health` - Health check
|
||||
- `GET /api/v1/rom/status` - ROM loading status
|
||||
|
||||
3. Integration points:
|
||||
- Initialize server in `yaze.cc` main() or via CLI flag
|
||||
- Share `Rom*` context with API handlers
|
||||
- Use `ConversationalAgentService` for chat endpoint
|
||||
- Use `ToolDispatcher` for tool endpoint
|
||||
|
||||
**Files to Create**:
|
||||
- `src/cli/service/api/http_server.h`
|
||||
- `src/cli/service/api/http_server.cc`
|
||||
- `src/cli/service/api/api_handlers.h`
|
||||
- `src/cli/service/api/api_handlers.cc`
|
||||
|
||||
**Dependencies**: `httplib`, `nlohmann/json` (already available)
|
||||
|
||||
### Phase 3: Enhanced Agentic Workflows
|
||||
|
||||
#### 3.1 Tool Expansion
|
||||
|
||||
**FileSystemTool** (`src/cli/handlers/tools/filesystem_commands.h/.cc`)
|
||||
- **Purpose**: Allow agent to read/write files outside ROM (e.g., `src/` directory)
|
||||
- **Safety**: Require user confirmation or explicit scope configuration
|
||||
- **Commands**:
|
||||
- `filesystem-read <path>` - Read file contents
|
||||
- `filesystem-write <path> <content>` - Write file (with confirmation)
|
||||
- `filesystem-list <directory>` - List directory contents
|
||||
- `filesystem-search <pattern>` - Search for files matching pattern
|
||||
|
||||
**BuildTool** (`src/cli/handlers/tools/build_commands.h/.cc`)
|
||||
- **Purpose**: Trigger builds from within agent
|
||||
- **Commands**:
|
||||
- `build-cmake <build_dir>` - Run cmake configuration
|
||||
- `build-ninja <build_dir>` - Run ninja build
|
||||
- `build-status` - Check build status
|
||||
- `build-errors` - Parse and return compilation errors
|
||||
|
||||
**Integration**:
|
||||
- Add to `ToolDispatcher::ToolCallType` enum
|
||||
- Register in `ToolDispatcher::CreateHandler()`
|
||||
- Add to `ToolDispatcher::ToolPreferences` struct
|
||||
- Update UI toggles in `AgentChatWidget::RenderToolingControls()`
|
||||
|
||||
#### 3.2 Editor State Context
|
||||
**Goal**: Feed editor state (open files, compilation errors) into agent context
|
||||
|
||||
**Tasks**:
|
||||
1. Create `EditorState` struct capturing:
|
||||
- Open file paths
|
||||
- Active editor type
|
||||
- Compilation errors (if any)
|
||||
- Recent changes
|
||||
|
||||
2. Inject into agent prompts:
|
||||
- Add to `PromptBuilder::BuildPromptFromHistory()`
|
||||
- Include in system prompt when editor state changes
|
||||
|
||||
3. Update `ConversationalAgentService`:
|
||||
- Add `SetEditorState(EditorState*)` method
|
||||
- Pass to `PromptBuilder` when building prompts
|
||||
|
||||
**Files to Create/Modify**:
|
||||
- `src/cli/service/agent/editor_state.h` (new)
|
||||
- `src/cli/service/ai/prompt_builder.h/.cc` (modify)
|
||||
|
||||
### Phase 4: Refactoring
|
||||
|
||||
#### 4.1 ToolDispatcher Structured Output
|
||||
**Goal**: Return JSON instead of capturing stdout
|
||||
|
||||
**Current State**: `ToolDispatcher::Dispatch()` returns `absl::StatusOr<std::string>` by capturing stdout from command handlers.
|
||||
|
||||
**Proposed Changes**:
|
||||
1. Create `ToolResult` struct:
|
||||
```cpp
|
||||
struct ToolResult {
|
||||
std::string output; // Human-readable output
|
||||
nlohmann::json data; // Structured data (if applicable)
|
||||
bool success;
|
||||
std::vector<std::string> warnings;
|
||||
};
|
||||
```
|
||||
|
||||
2. Update command handlers to return `ToolResult`:
|
||||
- Modify base `CommandHandler` interface
|
||||
- Update each handler implementation
|
||||
- Keep backward compatibility with `OutputFormatter` for CLI
|
||||
|
||||
3. Update `ToolDispatcher::Dispatch()`:
|
||||
- Return `absl::StatusOr<ToolResult>`
|
||||
- Convert to JSON for API responses
|
||||
- Keep string output for CLI compatibility
|
||||
|
||||
**Files to Modify**:
|
||||
- `src/cli/service/agent/tool_dispatcher.h/.cc`
|
||||
- `src/cli/handlers/*/command_handlers.h/.cc` (all handlers)
|
||||
- `src/cli/service/agent/command_handler.h` (base interface)
|
||||
|
||||
**Migration Strategy**:
|
||||
- Add new `ExecuteStructured()` method alongside existing `Execute()`
|
||||
- Gradually migrate handlers
|
||||
- Keep old path for CLI until migration complete
|
||||
|
||||
## Technical Notes
|
||||
|
||||
### Model Registry Usage Pattern
|
||||
```cpp
|
||||
// Register services
|
||||
auto& registry = cli::ModelRegistry::GetInstance();
|
||||
registry.RegisterService(std::make_shared<OllamaAIService>(ollama_config));
|
||||
registry.RegisterService(std::make_shared<GeminiAIService>(gemini_config));
|
||||
|
||||
// List all models
|
||||
auto models_or = registry.ListAllModels();
|
||||
// Returns unified list sorted by name
|
||||
```
|
||||
|
||||
### API Key Management
|
||||
- Gemini API key: Currently stored in `AgentConfigState::gemini_api_key`
|
||||
- Consider: Environment variable fallback, secure storage
|
||||
- Future: Support multiple API keys for different providers
|
||||
|
||||
### Thread Safety
|
||||
- `ModelRegistry` uses mutex for thread-safe access
|
||||
- `HttpServer` should handle concurrent requests (httplib supports this)
|
||||
- `ToolDispatcher` may need locking if shared across threads
|
||||
|
||||
## Testing Checklist
|
||||
|
||||
### Phase 1 (Model Management)
|
||||
- [ ] Verify Ollama models appear in unified list
|
||||
- [ ] Verify Gemini models appear in unified list
|
||||
- [ ] Test model refresh with multiple providers
|
||||
- [ ] Test provider filtering in UI
|
||||
- [ ] Test model selection and configuration
|
||||
|
||||
### Phase 2 (API)
|
||||
- [ ] Test `/api/v1/models` endpoint
|
||||
- [ ] Test `/api/v1/chat` with different providers
|
||||
- [ ] Test `/api/v1/tool/*` endpoints
|
||||
- [ ] Test error handling (missing ROM, invalid tool, etc.)
|
||||
- [ ] Test concurrent requests
|
||||
- [ ] Test CORS if needed
|
||||
|
||||
### Phase 3 (Tools)
|
||||
- [ ] Test FileSystemTool with read operations
|
||||
- [ ] Test FileSystemTool write confirmation flow
|
||||
- [ ] Test BuildTool cmake/ninja execution
|
||||
- [ ] Test BuildTool error parsing
|
||||
- [ ] Test editor state injection into prompts
|
||||
|
||||
### Phase 4 (Refactoring)
|
||||
- [ ] Verify all handlers return structured output
|
||||
- [ ] Test API endpoints with new format
|
||||
- [ ] Verify CLI still works with old format
|
||||
- [ ] Performance test (no regressions)
|
||||
|
||||
## Known Issues
|
||||
|
||||
1. **UI Rendering**: `RenderModelConfigControls()` still has provider-specific code that should be unified
|
||||
2. **Model Info Display**: Some fields from `ModelInfo` (like `quantization`, `modified_at`) are not displayed in unified view
|
||||
3. **Error Handling**: Model listing failures are logged but don't prevent other providers from loading
|
||||
|
||||
## Next Steps (Priority Order)
|
||||
|
||||
1. **Complete UI unification** - Update `RenderModelConfigControls()` to use unified model list
|
||||
2. **Implement HTTP Server** - Start with basic server and `/api/v1/models` endpoint
|
||||
3. **Add chat endpoint** - Wire up `ConversationalAgentService` to API
|
||||
4. **Add tool endpoint** - Expose `ToolDispatcher` via API
|
||||
5. **Implement FileSystemTool** - Start with read-only operations
|
||||
6. **Implement BuildTool** - Basic cmake/ninja execution
|
||||
7. **Refactor ToolDispatcher** - Begin structured output migration
|
||||
|
||||
## References
|
||||
|
||||
- Plan document: `plan-yaze-api-agentic-workflow-enhancement.plan.md`
|
||||
- Model Registry: `src/cli/service/ai/model_registry.h`
|
||||
- AIService interface: `src/cli/service/ai/ai_service.h`
|
||||
- ToolDispatcher: `src/cli/service/agent/tool_dispatcher.h`
|
||||
- httplib docs: (in `ext/httplib/`)
|
||||
|
||||
## Questions for Next Developer
|
||||
|
||||
1. Should the HTTP server be enabled by default or require a flag?
|
||||
2. What port should be used? (8080 suggested, but configurable?)
|
||||
3. Should FileSystemTool require explicit user approval per operation or a "trusted scope"?
|
||||
4. Should BuildTool be limited to specific directories (e.g., `build/`) for safety?
|
||||
5. How should API authentication work? (API key? Localhost-only? None?)
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-01-XX
|
||||
**Contact**: (to be filled)
|
||||
|
||||
Reference in New Issue
Block a user