WASM AI Service Integration Summary

Overview

This document summarizes the implementation of Phase 5 (AI Service Integration) for the WASM web build, as specified in wasm-web-app-enhancements-plan.md.

Files Created

1. Browser AI Service (src/cli/service/ai/)

browser_ai_service.h

  • Purpose: Browser-based AI service interface for WASM builds
  • Key Features:
    • Implements AIService interface for consistency with native builds
    • Uses IHttpClient from network abstraction layer
    • Supports Gemini API for text generation
    • Provides vision model support for image analysis
    • Stores API keys in sessionStorage by default (unencrypted; see the Browser Storage note)
    • CORS-compliant HTTP requests
    • Proper error handling with absl::Status
  • Compilation: Only compiled when __EMSCRIPTEN__ is defined

browser_ai_service.cc

  • Purpose: Implementation of browser AI service
  • Key Features:
    • GenerateResponse() for single prompts and conversation history
    • AnalyzeImage() for vision model support
    • JSON request/response handling with nlohmann/json
    • Comprehensive error handling and status code mapping
    • Debug logging to browser console
    • Support for multiple Gemini models (2.0 Flash, 1.5 Pro, etc.)
    • Proper handling of API rate limits and quotas
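
The status code mapping and rate-limit handling listed above might look roughly like the following. This is a minimal sketch, not the actual browser_ai_service.cc code: the AiError enum, MapHttpStatus, and IsRetryable are hypothetical names, and the enum only mirrors the naming of absl::StatusCode so the sketch compiles without Abseil.

```cpp
// Hypothetical error categories. The real service reports absl::Status;
// this enum only borrows the naming so the sketch has no dependencies.
enum class AiError {
  kOk,
  kInvalidArgument,
  kUnauthenticated,
  kPermissionDenied,
  kNotFound,
  kResourceExhausted,
  kUnavailable,
  kUnknown,
};

// Maps an HTTP status code from the API to an error category.
// (Assumed mapping; the source does not list the exact table.)
AiError MapHttpStatus(int http_status) {
  if (http_status >= 200 && http_status < 300) return AiError::kOk;
  switch (http_status) {
    case 400: return AiError::kInvalidArgument;    // malformed request
    case 401: return AiError::kUnauthenticated;    // missing or bad API key
    case 403: return AiError::kPermissionDenied;   // key lacks access
    case 404: return AiError::kNotFound;           // unknown model or path
    case 429: return AiError::kResourceExhausted;  // rate limit or quota
    default: break;
  }
  if (http_status >= 500 && http_status < 600) return AiError::kUnavailable;
  return AiError::kUnknown;
}

// Rate-limit and server-side failures are usually worth retrying later;
// authentication and request errors are not.
bool IsRetryable(AiError error) {
  return error == AiError::kResourceExhausted ||
         error == AiError::kUnavailable;
}
```

A caller could check IsRetryable(MapHttpStatus(status)) before deciding whether to show a "try again later" message or ask the user for a new API key.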

2. Browser Storage (src/app/platform/wasm/)

wasm_browser_storage.h

  • Purpose: Browser storage wrapper for API keys and settings
  • Note: This is NOT actually secure storage; it uses standard localStorage/sessionStorage without encryption
  • Key Features:
    • Dual storage modes: sessionStorage (default) and localStorage
    • API key management: Store, Retrieve, Clear, Check existence
    • Generic secret storage for other sensitive data
    • Storage quota tracking
    • Bulk operations (list all keys, clear all)
    • Browser storage availability checking

wasm_browser_storage.cc

  • Purpose: Implementation using Emscripten JavaScript interop
  • Key Features:
    • JavaScript bridge functions using EM_JS macros
    • SessionStorage access (cleared on tab close)
    • LocalStorage access (persistent)
    • Prefix-based key namespacing (yaze_secure_api_, yaze_secure_secret_)
    • Error handling for storage exceptions
    • Memory management for JS string conversions
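
To illustrate how the EM_JS bridge, key prefixing, and string memory management fit together, here is a hedged sketch. The function names (js_session_set, js_session_get, NamespacedApiKey, StoreApiKeySketch, RetrieveApiKeySketch) are hypothetical and are not the WasmBrowserStorage API; only the yaze_secure_api_ prefix comes from this document. Outside Emscripten the sketch falls back to an in-memory map so it can be compiled and tried natively.

```cpp
#include <map>
#include <optional>
#include <string>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#include <cstdlib>

// JavaScript bridge: write one entry to sessionStorage.
EM_JS(void, js_session_set, (const char* key, const char* value), {
  sessionStorage.setItem(UTF8ToString(key), UTF8ToString(value));
});

// JavaScript bridge: read one entry. Returns a malloc'ed UTF-8 copy that
// the C++ caller must free(), or 0 when the key is absent. Depending on
// the Emscripten version, stringToNewUTF8 may need to be declared as a
// dependency (for example with EM_JS_DEPS).
EM_JS(char*, js_session_get, (const char* key), {
  var value = sessionStorage.getItem(UTF8ToString(key));
  return value === null ? 0 : stringToNewUTF8(value);
});
#else
// Native stand-in for sessionStorage, used only by this sketch.
inline std::map<std::string, std::string>& FakeSessionStorage() {
  static std::map<std::string, std::string> storage;
  return storage;
}
#endif

// Prefix taken from this document; keeps yaze entries from colliding
// with other keys stored under the same origin.
inline std::string NamespacedApiKey(const std::string& service) {
  return "yaze_secure_api_" + service;
}

inline void StoreApiKeySketch(const std::string& service,
                              const std::string& api_key) {
#ifdef __EMSCRIPTEN__
  js_session_set(NamespacedApiKey(service).c_str(), api_key.c_str());
#else
  FakeSessionStorage()[NamespacedApiKey(service)] = api_key;
#endif
}

inline std::optional<std::string> RetrieveApiKeySketch(
    const std::string& service) {
#ifdef __EMSCRIPTEN__
  char* raw = js_session_get(NamespacedApiKey(service).c_str());
  if (raw == nullptr) return std::nullopt;
  std::string value(raw);
  free(raw);  // release the buffer allocated on the JavaScript side
  return value;
#else
  auto it = FakeSessionStorage().find(NamespacedApiKey(service));
  if (it == FakeSessionStorage().end()) return std::nullopt;
  return it->second;
#endif
}
```

Returning std::optional makes the "no key stored yet" case explicit, so the caller can prompt the user instead of dereferencing a missing value.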

Build System Updates

1. CMake Configuration Updates

src/cli/agent.cmake

  • Modified to create a minimal yaze_agent library for WASM builds
  • Includes browser AI service sources
  • Links with network abstraction layer (yaze_net)
  • Enables JSON support for API communication

src/app/app_core.cmake

  • Added wasm_browser_storage.cc to WASM platform sources
  • Integrated with existing WASM file system and loading manager

src/CMakeLists.txt

  • Updated to include net_library.cmake for all builds (including WASM)
  • Network library now provides WASM-compatible HTTP client

CMakePresets.json

  • Added new wasm-ai preset for testing AI features in WASM
  • Configured with AI runtime enabled and Fetch API flags

Integration with Existing Systems

Network Abstraction Layer

  • Leverages existing IHttpClient interface
  • Uses EmscriptenHttpClient for browser-based HTTP requests
  • Supports CORS-compliant requests to Gemini API
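
As a rough illustration of what a text-generation request to the Gemini API contains, the sketch below builds the URL and JSON body for the public generateContent REST endpoint. The helper names (JsonEscape, GeminiRequest, BuildGenerateContentRequest) are hypothetical, and the body is assembled by hand only to keep the sketch dependency-free; the actual service uses nlohmann/json and sends the request through IHttpClient.

```cpp
#include <cstdio>
#include <string>

// Escapes a string for embedding inside a JSON string literal.
std::string JsonEscape(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          // Remaining control characters use the \u00XX form.
          char buf[7];
          std::snprintf(buf, sizeof(buf), "\\u%04x",
                        static_cast<unsigned>(c));
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// Hypothetical container for the pieces an HTTP client would need.
struct GeminiRequest {
  std::string url;
  std::string body;
};

// Builds a single-prompt generateContent request. The API key is not part
// of the URL here; it would be sent separately, for example in the
// x-goog-api-key request header.
GeminiRequest BuildGenerateContentRequest(const std::string& model,
                                          const std::string& prompt) {
  GeminiRequest request;
  request.url =
      "https://generativelanguage.googleapis.com/v1beta/models/" + model +
      ":generateContent";
  request.body = "{\"contents\":[{\"parts\":[{\"text\":\"" +
                 JsonEscape(prompt) + "\"}]}]}";
  return request;
}
```

Sending the key in a header rather than a query parameter keeps it out of URLs that might end up in logs or browser history.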

AI Service Interface

  • Implements standard AIService interface
  • Compatible with existing agent response structures
  • Supports tool calls and structured responses

WASM Platform Support

  • Integrates with existing WASM error handler
  • Works alongside WASM storage and file dialog systems
  • Compatible with progressive loading manager

API Key Security

Storage Security Model

  1. SessionStorage (Default):

    • Keys stored per tab in the browser's session storage
    • Automatically cleared when tab closes
    • No persistence across sessions
    • Recommended for security
  2. LocalStorage (Optional):

    • Persistent storage
    • Survives browser restarts
    • Less secure but more convenient
    • User choice based on preference

Security Considerations

  • Keys never hardcoded in binary
  • Keys prefixed to avoid conflicts
  • No encryption currently (future enhancement)
  • Browser same-origin policy provides isolation

Usage Example

#ifdef __EMSCRIPTEN__
#include <memory>
#include <utility>

#include "cli/service/ai/browser_ai_service.h"
#include "app/net/wasm/emscripten_http_client.h"
#include "app/platform/wasm/wasm_browser_storage.h"

// Store API key from user input
WasmBrowserStorage::StoreApiKey("gemini", user_api_key);

// Create AI service (assumes the key stored above can be retrieved)
BrowserAIConfig config;
config.api_key = WasmBrowserStorage::RetrieveApiKey("gemini").value();
config.model = "gemini-2.5-flash";

// The service takes ownership of the HTTP client
auto http_client = std::make_unique<EmscriptenHttpClient>();
BrowserAIService ai_service(config, std::move(http_client));

// Generate response
auto response = ai_service.GenerateResponse("Explain the Zelda 3 ROM format");
#endif

Testing

Test File: test/browser_ai_test.cc

  • Verifies browser storage operations
  • Tests AI service creation
  • Validates model listing
  • Checks error handling

Build and Test Commands

# Configure with AI support
cmake --preset wasm-ai

# Build
cmake --build build_wasm_ai

# Run in browser
emrun build_wasm_ai/yaze.html

CORS Considerations

Gemini API

  • Works with browser fetch (Google APIs support CORS)
  • No proxy required
  • Direct browser-to-API communication

Ollama (Future)

  • ⚠️ Requires the Ollama server to allow the page's origin (set via the OLLAMA_ORIGINS environment variable; there is no --cors flag)
  • ⚠️ May need proxy for local instances
  • ⚠️ Security implications of CORS relaxation

Future Enhancements

  1. Encryption: Add client-side encryption for stored API keys
  2. Multiple Providers: Support for OpenAI, Anthropic APIs
  3. Streaming Responses: Implement streaming for better UX
  4. Offline Caching: Cache AI responses for offline use
  5. Web Worker Integration: Move AI calls to background thread

Limitations

  1. Browser Security: Subject to browser security policies
  2. CORS Restrictions: Limited to CORS-enabled APIs
  3. Storage Limits: Roughly 5-10 MB per origin for sessionStorage/localStorage, depending on the browser
  4. No File System: Cannot access local models
  5. Network Required: No offline AI capabilities

Conclusion

The WASM AI service integration successfully brings browser-based AI capabilities to yaze. The implementation:

  • Provides API key management through browser storage (unencrypted for now)
  • Integrates cleanly with existing architecture
  • Supports both text and vision models
  • Handles errors gracefully
  • Works within browser security constraints

This enables users to leverage AI assistance for ROM hacking directly in their browser without needing to install local AI models or tools.