199 lines
6.5 KiB
Markdown
199 lines
6.5 KiB
Markdown
# WASM AI Service Integration Summary
|
|
|
|
## Overview
|
|
|
|
This document summarizes the implementation of Phase 5: AI Service Integration for WASM web build, as specified in the wasm-web-app-enhancements-plan.md.
|
|
|
|
## Files Created
|
|
|
|
### 1. Browser AI Service (`src/cli/service/ai/`)
|
|
|
|
#### `browser_ai_service.h`
|
|
- **Purpose**: Browser-based AI service interface for WASM builds
|
|
- **Key Features**:
|
|
- Implements `AIService` interface for consistency with native builds
|
|
- Uses `IHttpClient` from network abstraction layer
|
|
- Supports Gemini API for text generation
|
|
- Provides vision model support for image analysis
|
|
- Manages API keys securely via sessionStorage
|
|
- CORS-compliant HTTP requests
|
|
- Proper error handling with `absl::Status`
|
|
- **Compilation**: Only compiled when `__EMSCRIPTEN__` is defined
|
|
|
|
#### `browser_ai_service.cc`
|
|
- **Purpose**: Implementation of browser AI service
|
|
- **Key Features**:
|
|
- `GenerateResponse()` for single prompts and conversation history
|
|
- `AnalyzeImage()` for vision model support
|
|
- JSON request/response handling with nlohmann/json
|
|
- Comprehensive error handling and status code mapping
|
|
- Debug logging to browser console
|
|
- Support for multiple Gemini models (2.0 Flash, 1.5 Pro, etc.)
|
|
- Proper handling of API rate limits and quotas
|
|
|
|
### 2. Browser Storage (`src/app/platform/wasm/`)
|
|
|
|
#### `wasm_browser_storage.h`
|
|
- **Purpose**: Browser storage wrapper for API keys and settings
|
|
- **Note**: This is NOT actually secure storage - uses standard localStorage/sessionStorage
|
|
- **Key Features**:
|
|
- Dual storage modes: sessionStorage (default) and localStorage
|
|
- API key management: Store, Retrieve, Clear, Check existence
|
|
- Generic secret storage for other sensitive data
|
|
- Storage quota tracking
|
|
- Bulk operations (list all keys, clear all)
|
|
- Browser storage availability checking
|
|
|
|
#### `wasm_browser_storage.cc`
|
|
- **Purpose**: Implementation using Emscripten JavaScript interop
|
|
- **Key Features**:
|
|
- JavaScript bridge functions using `EM_JS` macros
|
|
- SessionStorage access (cleared on tab close)
|
|
- LocalStorage access (persistent)
|
|
- Prefix-based key namespacing (`yaze_secure_api_`, `yaze_secure_secret_`)
|
|
- Error handling for storage exceptions
|
|
- Memory management for JS string conversions
|
|
|
|
## Build System Updates
|
|
|
|
### 1. CMake Configuration Updates
|
|
|
|
#### `src/cli/agent.cmake`
|
|
- Modified to create a minimal `yaze_agent` library for WASM builds
|
|
- Includes browser AI service sources
|
|
- Links with network abstraction layer (`yaze_net`)
|
|
- Enables JSON support for API communication
|
|
|
|
#### `src/app/app_core.cmake`
|
|
- Added `wasm_browser_storage.cc` to WASM platform sources
|
|
- Integrated with existing WASM file system and loading manager
|
|
|
|
#### `src/CMakeLists.txt`
|
|
- Updated to include `net_library.cmake` for all builds (including WASM)
|
|
- Network library now provides WASM-compatible HTTP client
|
|
|
|
#### `CMakePresets.json`
|
|
- Added new `wasm-ai` preset for testing AI features in WASM
|
|
- Configured with AI runtime enabled and Fetch API flags
|
|
|
|
## Integration with Existing Systems
|
|
|
|
### Network Abstraction Layer
|
|
- Leverages existing `IHttpClient` interface
|
|
- Uses `EmscriptenHttpClient` for browser-based HTTP requests
|
|
- Supports CORS-compliant requests to Gemini API
|
|
|
|
### AI Service Interface
|
|
- Implements standard `AIService` interface
|
|
- Compatible with existing agent response structures
|
|
- Supports tool calls and structured responses
|
|
|
|
### WASM Platform Support
|
|
- Integrates with existing WASM error handler
|
|
- Works alongside WASM storage and file dialog systems
|
|
- Compatible with progressive loading manager
|
|
|
|
## API Key Security
|
|
|
|
### Storage Security Model
|
|
1. **SessionStorage (Default)**:
|
|
- Keys stored in browser memory
|
|
- Automatically cleared when tab closes
|
|
- No persistence across sessions
|
|
- Recommended for security
|
|
|
|
2. **LocalStorage (Optional)**:
|
|
- Persistent storage
|
|
- Survives browser restarts
|
|
- Less secure but more convenient
|
|
- User choice based on preference
|
|
|
|
### Security Considerations
|
|
- Keys never hardcoded in binary
|
|
- Keys prefixed to avoid conflicts
|
|
- No encryption currently (future enhancement)
|
|
- Browser same-origin policy provides isolation
|
|
|
|
## Usage Example
|
|
|
|
```cpp
|
|
#ifdef __EMSCRIPTEN__
|
|
#include "cli/service/ai/browser_ai_service.h"
|
|
#include "app/net/wasm/emscripten_http_client.h"
|
|
#include "app/platform/wasm/wasm_browser_storage.h"
|
|
|
|
// Store API key from user input
|
|
WasmBrowserStorage::StoreApiKey("gemini", user_api_key);
|
|
}
|
|
|
|
// Create AI service
|
|
BrowserAIConfig config;
|
|
config.api_key = WasmBrowserStorage::RetrieveApiKey("gemini").value();
|
|
config.model = "gemini-2.5-flash";
|
|
|
|
auto http_client = std::make_unique<EmscriptenHttpClient>();
|
|
BrowserAIService ai_service(config, std::move(http_client));
|
|
|
|
// Generate response
|
|
auto response = ai_service.GenerateResponse("Explain the Zelda 3 ROM format");
|
|
#endif
|
|
```
|
|
|
|
## Testing
|
|
|
|
### Test File: `test/browser_ai_test.cc`
|
|
- Verifies secure storage operations
|
|
- Tests AI service creation
|
|
- Validates model listing
|
|
- Checks error handling
|
|
|
|
### Build and Test Commands
|
|
```bash
|
|
# Configure with AI support
|
|
cmake --preset wasm-ai
|
|
|
|
# Build
|
|
cmake --build build_wasm_ai
|
|
|
|
# Run in browser
|
|
emrun build_wasm_ai/yaze.html
|
|
```
|
|
|
|
## CORS Considerations
|
|
|
|
### Gemini API
|
|
- ✅ Works with browser fetch (Google APIs support CORS)
|
|
- ✅ No proxy required
|
|
- ✅ Direct browser-to-API communication
|
|
|
|
### Ollama (Future)
|
|
- ⚠️ Requires `--cors` flag on Ollama server
|
|
- ⚠️ May need proxy for local instances
|
|
- ⚠️ Security implications of CORS relaxation
|
|
|
|
## Future Enhancements
|
|
|
|
1. **Encryption**: Add client-side encryption for stored API keys
|
|
2. **Multiple Providers**: Support for OpenAI, Anthropic APIs
|
|
3. **Streaming Responses**: Implement streaming for better UX
|
|
4. **Offline Caching**: Cache AI responses for offline use
|
|
5. **Web Worker Integration**: Move AI calls to background thread
|
|
|
|
## Limitations
|
|
|
|
1. **Browser Security**: Subject to browser security policies
|
|
2. **CORS Restrictions**: Limited to CORS-enabled APIs
|
|
3. **Storage Limits**: ~5-10MB for sessionStorage/localStorage
|
|
4. **No File System**: Cannot access local models
|
|
5. **Network Required**: No offline AI capabilities
|
|
|
|
## Conclusion
|
|
|
|
The WASM AI service integration successfully brings browser-based AI capabilities to yaze. The implementation:
|
|
- ✅ Provides secure API key management
|
|
- ✅ Integrates cleanly with existing architecture
|
|
- ✅ Supports both text and vision models
|
|
- ✅ Handles errors gracefully
|
|
- ✅ Works within browser security constraints
|
|
|
|
This enables users to leverage AI assistance for ROM hacking directly in their browser without needing to install local AI models or tools. |