WASM AI Service Integration Summary
Overview
This document summarizes the implementation of Phase 5: AI Service Integration for the WASM web build, as specified in wasm-web-app-enhancements-plan.md.
Files Created
1. Browser AI Service (src/cli/service/ai/)
browser_ai_service.h
- Purpose: Browser-based AI service interface for WASM builds
- Key Features:
- Implements the AIService interface for consistency with native builds
- Uses IHttpClient from the network abstraction layer
- Supports the Gemini API for text generation
- Provides vision model support for image analysis
- Manages API keys via sessionStorage (unencrypted; see API Key Security)
- CORS-compliant HTTP requests
- Proper error handling with absl::Status
- Compilation: Only compiled when __EMSCRIPTEN__ is defined
browser_ai_service.cc
- Purpose: Implementation of browser AI service
- Key Features:
- GenerateResponse() for single prompts and conversation history
- AnalyzeImage() for vision model support
- JSON request/response handling with nlohmann/json
- Comprehensive error handling and status code mapping
- Debug logging to browser console
- Support for multiple Gemini models (2.0 Flash, 1.5 Pro, etc.)
- Proper handling of API rate limits and quotas
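The status code mapping mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual code: AiErrorKind, MapHttpStatus, and IsRetryable are hypothetical names, and the enum only mirrors absl::StatusCode names so the example compiles without Abseil.

```cpp
// Hypothetical error categories. The names mirror absl::StatusCode values,
// but this enum is a stand-in so the sketch has no Abseil dependency.
enum class AiErrorKind {
  kOk,
  kInvalidArgument,
  kUnauthenticated,
  kPermissionDenied,
  kNotFound,
  kResourceExhausted,
  kUnavailable,
  kInternal,
  kUnknown,
};

// Maps an HTTP status code returned by the API to an error category.
AiErrorKind MapHttpStatus(int http_status) {
  if (http_status >= 200 && http_status < 300) return AiErrorKind::kOk;
  switch (http_status) {
    case 400: return AiErrorKind::kInvalidArgument;
    case 401: return AiErrorKind::kUnauthenticated;   // missing or bad API key
    case 403: return AiErrorKind::kPermissionDenied;
    case 404: return AiErrorKind::kNotFound;          // e.g. unknown model name
    case 429: return AiErrorKind::kResourceExhausted; // rate limit or quota
    case 503: return AiErrorKind::kUnavailable;
    default: break;
  }
  if (http_status >= 500) return AiErrorKind::kInternal;
  return AiErrorKind::kUnknown;
}

// In this sketch only rate-limit and temporary-unavailability errors are
// treated as worth retrying.
bool IsRetryable(AiErrorKind kind) {
  return kind == AiErrorKind::kResourceExhausted ||
         kind == AiErrorKind::kUnavailable;
}
```

Keeping the mapping in one small function means rate-limit handling (429) can be decided in a single place, for example by checking IsRetryable(MapHttpStatus(status)) before scheduling a retry.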
2. Browser Storage (src/app/platform/wasm/)
wasm_browser_storage.h
- Purpose: Browser storage wrapper for API keys and settings
- Note: This is NOT actually secure storage; it uses standard localStorage/sessionStorage with no encryption
- Key Features:
- Dual storage modes: sessionStorage (default) and localStorage
- API key management: Store, Retrieve, Clear, Check existence
- Generic secret storage for other sensitive data
- Storage quota tracking
- Bulk operations (list all keys, clear all)
- Browser storage availability checking
wasm_browser_storage.cc
- Purpose: Implementation using Emscripten JavaScript interop
- Key Features:
- JavaScript bridge functions using EM_JS macros
- SessionStorage access (cleared on tab close)
- LocalStorage access (persistent)
- Prefix-based key namespacing (yaze_secure_api_, yaze_secure_secret_)
- Error handling for storage exceptions
- Memory management for JS string conversions
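To illustrate the prefix-based namespacing and the two storage modes, here is a small native sketch. It is an assumption-laden model rather than the real wrapper: FakeBrowserStorage, StorageMode, ApiKeyStorageKey, SecretStorageKey, and SimulateTabClose are hypothetical names, and two in-memory maps stand in for the browser stores that the WASM build reaches through EM_JS. Only the two prefixes come from the list above.

```cpp
#include <map>
#include <optional>
#include <string>

// Prefixes taken from the feature list above.
constexpr char kApiKeyPrefix[] = "yaze_secure_api_";
constexpr char kSecretPrefix[] = "yaze_secure_secret_";

enum class StorageMode { kSession, kLocal };

// Builds the namespaced storage key for a service's API key.
std::string ApiKeyStorageKey(const std::string& service) {
  return std::string(kApiKeyPrefix) + service;
}

// Builds the namespaced storage key for a generic secret.
std::string SecretStorageKey(const std::string& name) {
  return std::string(kSecretPrefix) + name;
}

// In-memory stand-in for sessionStorage and localStorage (hypothetical).
class FakeBrowserStorage {
 public:
  void StoreApiKey(const std::string& service, const std::string& key,
                   StorageMode mode = StorageMode::kSession) {
    Select(mode)[ApiKeyStorageKey(service)] = key;
  }

  std::optional<std::string> RetrieveApiKey(
      const std::string& service,
      StorageMode mode = StorageMode::kSession) const {
    const auto& store = Select(mode);
    auto it = store.find(ApiKeyStorageKey(service));
    if (it == store.end()) return std::nullopt;
    return it->second;
  }

  // Models a tab close: session data is wiped, local data survives.
  void SimulateTabClose() { session_.clear(); }

 private:
  std::map<std::string, std::string>& Select(StorageMode mode) {
    return mode == StorageMode::kSession ? session_ : local_;
  }
  const std::map<std::string, std::string>& Select(StorageMode mode) const {
    return mode == StorageMode::kSession ? session_ : local_;
  }

  std::map<std::string, std::string> session_;
  std::map<std::string, std::string> local_;
};
```

With this model, a key stored in the default session mode disappears after SimulateTabClose(), while one stored with StorageMode::kLocal is still retrievable, which is the trade-off between the two modes.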
Build System Updates
1. CMake Configuration Updates
src/cli/agent.cmake
- Modified to create a minimal yaze_agent library for WASM builds
- Includes browser AI service sources
- Links with network abstraction layer (yaze_net)
- Enables JSON support for API communication
src/app/app_core.cmake
- Added wasm_browser_storage.cc to WASM platform sources
- Integrated with existing WASM file system and loading manager
src/CMakeLists.txt
- Updated to include net_library.cmake for all builds (including WASM)
- Network library now provides WASM-compatible HTTP client
CMakePresets.json
- Added new wasm-ai preset for testing AI features in WASM
- Configured with AI runtime enabled and Fetch API flags
Integration with Existing Systems
Network Abstraction Layer
- Leverages existing IHttpClient interface
- Uses EmscriptenHttpClient for browser-based HTTP requests
- Supports CORS-compliant requests to Gemini API
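The value of routing requests through an HTTP client interface is that the service can be exercised without a network. The sketch below shows the idea under several assumptions: HttpClientSketch, HttpResult, RecordingClient, and SendPrompt are hypothetical stand-ins (the real IHttpClient signature is not shown in this document), and the URL and body shape follow the public Gemini generateContent REST format as I understand it. The real implementation uses nlohmann/json; manual string building is used here only to keep the example dependency-free.

```cpp
#include <string>

struct HttpResult {
  int status = 0;
  std::string body;
};

// Hypothetical stand-in for the project's IHttpClient interface.
class HttpClientSketch {
 public:
  virtual ~HttpClientSketch() = default;
  virtual HttpResult Post(const std::string& url, const std::string& api_key,
                          const std::string& json_body) = 0;
};

// Escapes quotes, backslashes, and common whitespace escapes. Other control
// characters are not handled in this sketch.
std::string EscapeJson(const std::string& in) {
  std::string out;
  for (char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:   out += c;
    }
  }
  return out;
}

// Assumed endpoint layout for the Gemini REST API.
std::string BuildGenerateContentUrl(const std::string& model) {
  return "https://generativelanguage.googleapis.com/v1beta/models/" + model +
         ":generateContent";
}

// Wraps a single text prompt in the assumed request body shape.
std::string BuildGenerateContentBody(const std::string& prompt) {
  return "{\"contents\":[{\"parts\":[{\"text\":\"" + EscapeJson(prompt) +
         "\"}]}]}";
}

// Depends only on the interface, so a browser-backed client or a test fake
// can be injected.
HttpResult SendPrompt(HttpClientSketch& client, const std::string& model,
                      const std::string& api_key, const std::string& prompt) {
  return client.Post(BuildGenerateContentUrl(model), api_key,
                     BuildGenerateContentBody(prompt));
}

// Test double that records the last request instead of using the network.
class RecordingClient : public HttpClientSketch {
 public:
  HttpResult Post(const std::string& url, const std::string& api_key,
                  const std::string& json_body) override {
    last_url = url;
    last_api_key = api_key;
    last_body = json_body;
    return HttpResult{200, "{}"};
  }
  std::string last_url;
  std::string last_api_key;
  std::string last_body;
};
```

In a WASM build the injected client would be the Emscripten-backed implementation, while unit tests can pass a RecordingClient and inspect last_url and last_body.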
AI Service Interface
- Implements standard AIService interface
- Compatible with existing agent response structures
- Supports tool calls and structured responses
WASM Platform Support
- Integrates with existing WASM error handler
- Works alongside WASM storage and file dialog systems
- Compatible with progressive loading manager
API Key Security
Storage Security Model
SessionStorage (Default):
- Keys stored per tab in the browser's session storage
- Automatically cleared when tab closes
- No persistence across sessions
- Recommended for security
LocalStorage (Optional):
- Persistent storage
- Survives browser restarts
- Less secure but more convenient
- User choice based on preference
Security Considerations
- Keys never hardcoded in binary
- Keys prefixed to avoid conflicts
- No encryption currently (future enhancement)
- Browser same-origin policy provides isolation
Usage Example
#ifdef __EMSCRIPTEN__
#include <memory>  // std::make_unique
#include "cli/service/ai/browser_ai_service.h"
#include "app/net/wasm/emscripten_http_client.h"
#include "app/platform/wasm/wasm_browser_storage.h"
// Store API key from user input
WasmBrowserStorage::StoreApiKey("gemini", user_api_key);
// Create AI service
BrowserAIConfig config;
// .value() assumes the key stored above is present; check the result before unwrapping in real code
config.api_key = WasmBrowserStorage::RetrieveApiKey("gemini").value();
config.model = "gemini-2.5-flash";
auto http_client = std::make_unique<EmscriptenHttpClient>();
BrowserAIService ai_service(config, std::move(http_client));
// Generate response
auto response = ai_service.GenerateResponse("Explain the Zelda 3 ROM format");
#endif
Testing
Test File: test/browser_ai_test.cc
- Verifies browser storage operations
- Tests AI service creation
- Validates model listing
- Checks error handling
Build and Test Commands
# Configure with AI support
cmake --preset wasm-ai
# Build
cmake --build build_wasm_ai
# Run in browser
emrun build_wasm_ai/yaze.html
CORS Considerations
Gemini API
- ✅ Works with browser fetch (Google APIs support CORS)
- ✅ No proxy required
- ✅ Direct browser-to-API communication
Ollama (Future)
- ⚠️ Requires the Ollama server to allow the page's origin (configured through the OLLAMA_ORIGINS environment variable)
- ⚠️ May need proxy for local instances
- ⚠️ Security implications of CORS relaxation
Future Enhancements
- Encryption: Add client-side encryption for stored API keys
- Multiple Providers: Support for OpenAI, Anthropic APIs
- Streaming Responses: Implement streaming for better UX
- Offline Caching: Cache AI responses for offline use
- Web Worker Integration: Move AI calls to background thread
Limitations
- Browser Security: Subject to browser security policies
- CORS Restrictions: Limited to CORS-enabled APIs
- Storage Limits: ~5-10MB for sessionStorage/localStorage
- No File System: Cannot access local models
- Network Required: No offline AI capabilities
Conclusion
The WASM AI service integration successfully brings browser-based AI capabilities to yaze. The implementation:
- ✅ Provides API key management with session-scoped storage by default (unencrypted)
- ✅ Integrates cleanly with existing architecture
- ✅ Supports both text and vision models
- ✅ Handles errors gracefully
- ✅ Works within browser security constraints
This enables users to leverage AI assistance for ROM hacking directly in their browser without needing to install local AI models or tools.