
# z3ed Command-Line Interface

**Version**: 0.1.0-alpha
**Last Updated**: October 5, 2025

## 1. Overview

`z3ed` is a command-line companion to YAZE. It surfaces editor functionality, test harness tooling, and automation endpoints for scripting and AI-driven workflows.

### Core Capabilities

1.  Conversational agent interfaces (Ollama or Gemini) for planning and review.
2.  gRPC test harness for widget discovery, replay, and automated verification.
3.  Proposal workflow that records changes for manual review and acceptance.
4.  Resource-oriented commands (`z3ed <resource> <action>`) suitable for scripting.

## 2. Quick Start

### Build

A single CMake flag, `-DZ3ED_AI=ON`, enables all AI features and pulls in their JSON, YAML, and httplib dependencies, which simplifies the build.

```bash
# Build with AI features (RECOMMENDED)
cmake -B build -DZ3ED_AI=ON
cmake --build build --target z3ed

# For GUI automation features, also include gRPC
cmake -B build -DZ3ED_AI=ON -DYAZE_WITH_GRPC=ON
cmake --build build --target z3ed
```

### AI Setup

**Ollama (Recommended for Development)**:
```bash
brew install ollama              # macOS
ollama pull qwen2.5-coder:7b    # Pull recommended model
ollama serve                     # Start server
```

**Gemini (Cloud API)**:
```bash
# Get API key from https://aistudio.google.com/apikey
export GEMINI_API_KEY="your-key-here"
```

### Example Commands

**Conversational Agent**:
```bash
# Interactive chat (FTXUI)
z3ed agent chat --rom zelda3.sfc

# Simple text mode (better for AI/automation)
z3ed agent simple-chat --rom zelda3.sfc

# Batch mode
z3ed agent simple-chat --file queries.txt --rom zelda3.sfc
```
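Batch mode reads its queries from a file. The sketch below shows one way to generate such a file from a script. It assumes one plain-text query per line, which this guide does not confirm, and `write_queries` is a hypothetical helper name.

```bash
# Sketch: write a batch file for `z3ed agent simple-chat --file`.
# Assumption: the file holds one plain-text query per line.
write_queries() {
  out="$1"                 # path of the batch file to create
  shift                    # remaining arguments are the queries
  : > "$out"               # create or truncate the file
  for query in "$@"; do
    printf '%s\n' "$query" >> "$out"
  done
}

# Usage:
#   write_queries queries.txt "What is room 5?" "Which sprites are in room 5?"
#   z3ed agent simple-chat --file queries.txt --rom zelda3.sfc
```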

**Proposal Workflow**:
```bash
# Generate from prompt
z3ed agent run --prompt "Place tree at 10,10" --rom zelda3.sfc --sandbox

# List proposals
z3ed agent list

# Review
z3ed agent diff --proposal-id <id>

# Accept
z3ed agent accept --proposal-id <id>
```

### Hybrid CLI ↔ GUI Workflow

1. Build with `-DZ3ED_AI=ON -DYAZE_WITH_GRPC=ON` so the CLI, editor widget, and test harness share the same feature set.
2. Use `z3ed agent plan --prompt "Describe overworld tile 10,10"` against a sandboxed ROM to preview actions.
3. Apply the plan with `z3ed agent run ... --sandbox`, then open **Debug → Agent Chat** in YAZE to inspect proposals and logs.
4. Re-run or replay from either surface; proposals stay synchronized through the shared registry.

## 3. Architecture

The z3ed system is composed of several layers, from the high-level AI agent down to the YAZE GUI and test harness.

### System Components Diagram

```
┌─────────────────────────────────────────────────────────┐
│ AI Agent Layer (LLM: Ollama, Gemini)                    │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│ z3ed CLI (Command-Line Interface)                       │
│  ├─ agent run/plan/diff/test/list/describe              │
│  └─ rom/palette/overworld/dungeon commands              │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│ Service Layer (Singleton Services)                      │
│  ├─ ProposalRegistry (Proposal Tracking)                │
│  ├─ RomSandboxManager (Isolated ROM Copies)             │
│  ├─ ResourceCatalog (Machine-Readable API Specs)        │
│  └─ ConversationalAgentService (Chat & Tool Dispatch)   │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│ ImGuiTestHarness (gRPC Server in YAZE)                  │
│  ├─ Ping, Click, Type, Wait, Assert, Screenshot         │
│  ├─ Introspection & Discovery RPCs                      │
│  └─ Automation API shared by CLI & Agent Chat           │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│ YAZE GUI (ImGui Application)                            │
│  └─ ProposalDrawer & Editor Windows                     │
└─────────────────────────────────────────────────────────┘
```

### Command Abstraction Layer (v0.2.1)

The CLI command architecture has been refactored to eliminate code duplication and provide consistent patterns:

```
┌─────────────────────────────────────────────────────────┐
│ Tool Command Handler (e.g., resource-list)              │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│ Command Abstraction Layer                               │
│  ├─ ArgumentParser (Unified arg parsing)                │
│  ├─ CommandContext (ROM loading & labels)               │
│  ├─ OutputFormatter (JSON/Text output)                  │
│  └─ CommandHandler (Optional base class)                │
└────────────────────┬────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────┐
│ Business Logic Layer                                    │
│  ├─ ResourceContextBuilder                              │
│  ├─ OverworldInspector                                  │
│  └─ DungeonAnalyzer                                     │
└─────────────────────────────────────────────────────────┘
```

Key benefits:
- Removes roughly 1300 lines of duplicated command code.
- Cuts individual command implementations by about half.
- Establishes consistent patterns across the CLI for easier testing and automation.

See [Command Abstraction Guide](C5-z3ed-command-abstraction.md) for migration details.

## 4. Agentic & Generative Workflow (MCP)

The `z3ed` CLI is the foundation for an AI-driven Model-Code-Program (MCP) loop, where the AI agent's "program" is a script of `z3ed` commands.

1.  **Model (Planner)**: The agent receives a natural language prompt and leverages an LLM to create a plan, which is a sequence of `z3ed` commands.
2.  **Code (Generation)**: The LLM returns the plan as a structured JSON object containing actions.
3.  **Program (Execution)**: `z3ed agent` parses the plan and executes each command sequentially in a sandboxed ROM environment.
4.  **Verification (Tester)**: The `ImGuiTestHarness` is used to run automated GUI tests to verify that the changes were applied correctly.

## 5. Command Reference

### Agent Commands

-   `agent run --prompt "..."`: Executes an AI-driven ROM modification in a sandbox.
-   `agent plan --prompt "..."`: Shows the sequence of commands the AI plans to execute.
-   `agent list`: Shows all proposals and their status.
-   `agent diff [--proposal-id <id>]`: Shows the changes, logs, and metadata for a proposal.
-   `agent describe [--resource <name>]`: Exports machine-readable API specifications for AI consumption.
-   `agent chat`: Opens an interactive terminal chat (TUI) with the AI agent.
-   `agent simple-chat`: A lightweight, non-TUI chat mode for scripting and automation.
-   `agent test ...`: Commands for running and managing automated GUI tests.
-   `agent learn ...`: **NEW**: Manage learned knowledge (preferences, ROM patterns, project context, conversation memory).
-   `agent todo create "Description" [--category=<category>] [--priority=<n>]`
-   `agent todo list [--status=<status>] [--category=<category>]`
-   `agent todo update <id> --status=<status>`
-   `agent todo show <id>`
-   `agent todo delete <id>`
-   `agent todo clear-completed`
-   `agent todo next`
-   `agent todo plan`

### Resource Commands

-   `rom info|validate|diff`: Commands for ROM file inspection and comparison.
-   `palette export|import|list`: Commands for palette manipulation.
-   `overworld get-tile|find-tile|set-tile`: Commands for overworld editing.
-   `dungeon list-sprites|list-rooms`: Commands for dungeon inspection.

#### `agent test`: Live Harness Automation

-   **Discover widgets**: `z3ed agent test discover --rom zelda3.sfc --grpc localhost:50051` enumerates ImGui widget IDs through the gRPC-backed harness for later scripting.
-   **Record interactions**: `z3ed agent test record --suite harness/tests/overworld_entry.jsonl` launches YAZE, mirrors your clicks/keystrokes, and persists an editable JSONL trace.
-   **Replay & assert**: `z3ed agent test replay harness/tests/overworld_entry.jsonl --watch` drives the GUI in real time and streams pass/fail telemetry back to both the CLI and the telemetry panel of the Agent Chat widget.
-   **Integrate with proposals**: `z3ed agent test verify --proposal-id <id>` links a recorded scenario with a proposal to guarantee UI state after sandboxed edits.
-   **Debug in the editor**: While a replay is running, open **Debug → Agent Chat → Harness Monitor** to step through events, capture screenshots, or restart the scenario without leaving ImGui.
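
The replay and verify steps above can be chained in a small script. This is a minimal sketch: `replay_and_verify` and the `Z3ED` override variable are hypothetical helpers, and the early return assumes `z3ed` exits non-zero when a replay fails, which this guide does not state.

```bash
# Dry-run friendly: set Z3ED=echo to print the commands instead of running them.
Z3ED="${Z3ED:-z3ed}"

replay_and_verify() {
  suite="$1"         # JSONL trace recorded with `agent test record --suite`
  proposal_id="$2"   # proposal to verify after the replay
  # Stop early if the replay fails (assumes a non-zero exit code on failure).
  "$Z3ED" agent test replay "$suite" --watch || return 1
  "$Z3ED" agent test verify --proposal-id "$proposal_id"
}

# Usage:
#   replay_and_verify harness/tests/overworld_entry.jsonl <id>
```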

## 6. Chat Modes

### FTXUI Chat (`agent chat`)
Full-screen interactive terminal with table rendering, syntax highlighting, and scrollable history. Best for manual exploration.

**Features:**
- **Autocomplete**: Real-time command suggestions as you type
- **Fuzzy matching**: Intelligent command completion with scoring
- **Context-aware help**: Suggestions adapt based on command prefix
- **History navigation**: Up/down arrows to cycle through previous commands
- **Syntax highlighting**: Color-coded responses and tables
- **Metrics display**: Real-time performance stats and turn counters

### Simple Chat (`agent simple-chat`)
Lightweight, scriptable text-based REPL that supports single messages, interactive sessions, piped input, and batch files.
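
As a rough illustration of the non-interactive input styles, the wrappers below cover a single message, piped input, and a batch file. The function names and the `Z3ED`/`ROM` variables are hypothetical, and the exact piped invocation is an assumption based on the description above.

```bash
# Dry-run friendly: set Z3ED=echo to print the commands instead of running them.
Z3ED="${Z3ED:-z3ed}"
ROM="${ROM:-zelda3.sfc}"

# Single message passed as an argument.
ask_once()  { "$Z3ED" agent simple-chat "$1" --rom "$ROM"; }

# Message supplied on standard input (assumed form of "piped input").
ask_piped() { printf '%s\n' "$1" | "$Z3ED" agent simple-chat --rom "$ROM"; }

# Queries read from a batch file.
ask_batch() { "$Z3ED" agent simple-chat --file "$1" --rom "$ROM"; }
```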

**Vim Mode**
Enable vim-style line editing with `--vim`:
- **Normal mode** (`ESC`): Navigate with `hjkl`, `w`/`b` word movement, `0`/`$` line start/end
- **Insert mode** (`i`, `a`, `o`): Regular text input with vim keybindings
- **Editing**: `x` delete char, `dd` delete line, `yy` yank line, `p`/`P` paste
- **History**: Navigate with `Ctrl+P`/`Ctrl+N` or `j`/`k` in normal mode
- **Autocomplete**: Press `Tab` in insert mode for command suggestions
- **Undo/Redo**: `u` to undo changes in normal mode

```bash
# Enable vim mode in simple chat
z3ed agent simple-chat --rom zelda3.sfc --vim

# Example workflow:
# 1. Start in INSERT mode, type your message
# 2. Press ESC to enter NORMAL mode
# 3. Use hjkl to navigate, w/b for word movement
# 4. Press i to return to INSERT mode
# 5. Press Enter to send message
```

### GUI Chat Widget (Editor Integration)
Accessible from **Debug → Agent Chat** inside YAZE. Provides the same conversation loop as the CLI, including streaming history, JSON/table inspection, and ROM-aware tool dispatch.

Recent additions:
- Persistent chat history across sessions
- Collaborative sessions with shared history
- Screenshot capture for Gemini analysis

## 7. AI Provider Configuration

Z3ED supports multiple AI providers. Configuration is resolved with command-line flags taking precedence over environment variables.

-   `--ai_provider=<provider>`: Selects the AI provider (`mock`, `ollama`, `gemini`).
-   `--ai_model=<model>`: Specifies the model name (e.g., `qwen2.5-coder:7b`, `gemini-2.5-flash`).
-   `--gemini_api_key=<key>`: Your Gemini API key.
-   `--ollama_host=<url>`: The URL for your Ollama server (default: `http://localhost:11434`).
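
One way to use these flags from a script is to pick the provider based on whether a Gemini key is present. The sketch below is illustrative only: `provider_flags` is a hypothetical helper, and the model names are the examples given in the list above.

```bash
# Print provider flags: Gemini when GEMINI_API_KEY is set, local Ollama otherwise.
provider_flags() {
  if [ -n "${GEMINI_API_KEY:-}" ]; then
    printf '%s\n' "--ai_provider=gemini --ai_model=gemini-2.5-flash"
  else
    printf '%s\n' "--ai_provider=ollama --ai_model=qwen2.5-coder:7b --ollama_host=http://localhost:11434"
  fi
}

# Usage (the unquoted substitution is intentional so the flags split into words):
#   z3ed agent simple-chat --rom zelda3.sfc $(provider_flags)
```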

### System Prompt Versions

Z3ED includes multiple system prompt versions for different use cases:

-   **v1 (default)**: Original reactive prompt with basic tool calling
-   **v2**: Enhanced with better JSON formatting and error handling
-   **v3 (latest)**: Proactive prompt with intelligent tool chaining and implicit iteration - **RECOMMENDED**

To use the v3 prompt, set the environment variable `Z3ED_PROMPT_VERSION=v3`. When the variable is not set, v3 is auto-selected for Gemini 2.0+ models.

## 8. Learn Command - Knowledge Management

The learn command enables the AI agent to remember preferences, patterns, and context across sessions.

### Basic Usage

```bash
# Store a preference
z3ed agent learn --preference "default_palette=2"

# Get a preference
z3ed agent learn --get-preference default_palette

# List all preferences
z3ed agent learn --list-preferences

# View statistics
z3ed agent learn --stats

# Export all learned data
z3ed agent learn --export my_learned_data.json

# Import learned data
z3ed agent learn --import my_learned_data.json
```

### Project Context

Store project-specific information that the agent can reference:

```bash
# Save project context
z3ed agent learn --project "myrom" --context "Vanilla+ difficulty hack, focus on dungeon redesign"

# List projects
z3ed agent learn --list-projects

# Get project details
z3ed agent learn --get-project "myrom"
```

### Conversation Memory

The agent automatically stores summaries of conversations for future reference:

```bash
# View recent memories
z3ed agent learn --recent-memories 10

# Search memories by topic
z3ed agent learn --search-memories "room 5"
```

### Storage Location

All learned data is stored in `~/.yaze/agent/`:
- `preferences.json`: User preferences
- `patterns.json`: Learned ROM patterns
- `projects.json`: Project contexts
- `memories.json`: Conversation summaries
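
To check which of these files exist on a given machine, a short script can walk the directory. This is a sketch under the assumption that the layout matches the list above; `list_learned_files` and the `AGENT_DIR` override are hypothetical names.

```bash
# Directory that holds the learned data; override AGENT_DIR to inspect a copy.
AGENT_DIR="${AGENT_DIR:-$HOME/.yaze/agent}"

list_learned_files() {
  for name in preferences patterns projects memories; do
    file="$AGENT_DIR/$name.json"
    if [ -f "$file" ]; then
      # Report the size in bytes of each file that is present.
      printf '%s: %s bytes\n' "$name" "$(wc -c < "$file" | tr -d ' ')"
    else
      printf '%s: missing\n' "$name"
    fi
  done
}
```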

## 9. TODO Management System

The TODO Management System enables the z3ed AI agent to create, track, and execute complex multi-step tasks with dependency management and prioritization.

### Core Capabilities
- Create TODO items with priorities.
- Track task status (pending, in_progress, completed, blocked, cancelled).
- Manage dependencies between tasks.
- Generate execution plans.
- Persist data in JSON.
- Organize by category.
- Record tool/function usage per task.

### Storage Location
TODOs are persisted to: `~/.yaze/agent/todos.json` (macOS/Linux) or `%APPDATA%/yaze/agent/todos.json` (Windows)
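
A typical session with the `agent todo` commands from the Command Reference might look like the sketch below. The `queue_task` helper, the `Z3ED` override, and the example category and priority values are hypothetical; only the command syntax comes from this guide.

```bash
# Dry-run friendly: set Z3ED=echo to print the commands instead of running them.
Z3ED="${Z3ED:-z3ed}"

queue_task() {
  # $1 = description, $2 = category, $3 = numeric priority
  "$Z3ED" agent todo create "$1" --category="$2" --priority="$3"
}

# Usage:
#   queue_task "Redesign room 5 layout" dungeon 1
#   z3ed agent todo next                          # pick the next task
#   z3ed agent todo update <id> --status=completed
#   z3ed agent todo clear-completed
```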

## 10. CLI Output & Help System

The `z3ed` CLI features a modernized output system designed to be clean for users and informative for developers.

### Verbose Logging

By default, `z3ed` provides clean, user-facing output. For detailed debugging, including API calls and internal state, use the `--verbose` flag.

**Default (Clean):**
```bash
AI Provider: gemini
Model: gemini-2.5-flash
Waiting for response...
Calling tool: resource-list (type=room)
Tool executed successfully
```

**Verbose Mode:**
```bash
# z3ed agent simple-chat "What is room 5?" --verbose
AI Provider: gemini
Model: gemini-2.5-flash
[DEBUG] Initializing Gemini service...
[DEBUG] Function calling: disabled
[DEBUG] Using curl for HTTPS request...
Waiting for response...
[DEBUG] Parsing response...
Calling tool: resource-list (type=room)
Tool executed successfully
```

### Hierarchical Help System

The help system is organized by category for easy navigation.

-   **Main Help**: `z3ed --help` or `z3ed -h` shows a high-level overview of command categories.
-   **Category Help**: `z3ed help <category>` provides detailed information for a specific group of commands (e.g., `agent`, `patch`, `rom`).

## 11. Collaborative Sessions & Multimodal Vision

### Overview

YAZE supports real-time collaboration for ROM hacking in two modes. **Local** mode is filesystem-based and intended for same-machine collaboration. **Network** mode is WebSocket-based (via yaze-server v2.0) and supports collaboration over the internet, with ROM synchronization, snapshot sharing, and AI agent integration.

---

### Local Collaboration Mode

Perfect for multiple YAZE instances on the same machine or cloud-synced folders (Dropbox, iCloud).

#### How to Use

1. Open YAZE → **Debug → Agent Chat**
2. Select **"Local"** mode
3. **Host a Session:**
   - Enter session name: `Evening ROM Hack`
   - Click **"Host Session"**
   - Share the 6-character code (e.g., `ABC123`)
4. **Join a Session:**
   - Enter the session code
   - Click **"Join Session"**
   - Chat history syncs automatically

#### Features

- **Shared History**: `~/.yaze/agent/sessions/<code>_history.json`
- **Auto-Sync**: 2-second polling for new messages
- **Participant Tracking**: Real-time participant list
- **Toast Notifications**: Get notified when collaborators send messages
- **Zero Setup**: No server required
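
The auto-sync behaviour can be pictured as a simple polling loop over the shared history file. The sketch below is not YAZE's implementation; it only illustrates the idea with hypothetical helpers (`history_fingerprint`, `watch_history`) that compare a checksum at a fixed interval.

```bash
# Print a checksum of the file, or nothing if it does not exist yet.
history_fingerprint() {
  [ -f "$1" ] && cksum < "$1"
  return 0
}

# Poll the file and report each time its content changes.
watch_history() {
  file="$1"
  interval="${2:-2}"   # seconds between polls (2 matches the documented interval)
  rounds="${3:-5}"     # number of polls before returning
  last="$(history_fingerprint "$file")"
  i=0
  while [ "$i" -lt "$rounds" ]; do
    sleep "$interval"
    now="$(history_fingerprint "$file")"
    if [ "$now" != "$last" ]; then
      echo "history changed"
      last="$now"
    fi
    i=$((i + 1))
  done
}

# Usage:
#   watch_history ~/.yaze/agent/sessions/ABC123_history.json
```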

#### Cloud Folder Workaround

Enable internet collaboration without a server:

```bash
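# Link your sessions directory to Dropbox/iCloud.
# Move any existing ~/.yaze/agent/sessions aside first, or ln will create the link inside it.
mkdir -p ~/Dropbox/yaze-sessions ~/.yaze/agent
ln -s ~/Dropbox/yaze-sessions ~/.yaze/agent/sessions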

# Have your collaborator do the same
# Now you can collaborate through cloud sync!
```

---

### Network Collaboration Mode (yaze-server v2.0)

Real-time collaboration over the internet with advanced features powered by yaze-server v2.0.

#### Requirements

- **Server**: Node.js 18+ with yaze-server running
- **Client**: YAZE built with `-DYAZE_WITH_GRPC=ON` and `-DZ3ED_AI=ON`
- **Network**: Connectivity between collaborators

#### Server Setup

**Option 1: Using z3ed CLI**
```bash
z3ed collab start [--port=8765]
```

**Option 2: Manual Launch**
```bash
cd /path/to/yaze-server
npm install
npm start

# Server starts on http://localhost:8765
# Health check: curl http://localhost:8765/health
```

**Option 3: Docker**
```bash
docker build -t yaze-server .
docker run -p 8765:8765 yaze-server
```

#### Client Connection

1. Open YAZE → **Debug → Agent Chat**
2. Select **"Network"** mode
3. Enter server URL: `ws://localhost:8765` (or remote server)
4. Click **"Connect to Server"**
5. Host or join sessions like local mode

#### Core Features

**Session Management:**
- Unique 6-character session codes
- Participant tracking with join/leave notifications
- Real-time message broadcasting
- Persistent chat history

**Connection Management:**
- Health monitoring endpoints (`/health`, `/metrics`)
- Graceful shutdown notifications
- Automatic cleanup of inactive sessions
- Rate limiting (100 messages/minute per IP)

#### Advanced Features (v2.0)

**ROM Synchronization**
Share ROM edits in real-time:
- Send base64-encoded diffs to all participants
- Automatic ROM hash tracking
- Size limit: 5MB per diff
- Conflict detection via hash comparison
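
Before sending a diff, a client can compute the ROM hash and estimate the encoded payload size. This is a sketch with hypothetical helper names. It assumes the 5MB limit applies to the Base64-encoded payload and that 1MB means 1024 × 1024 bytes; neither is confirmed here.

```bash
# SHA-256 of a ROM file, using whichever tool is available.
rom_sha256() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d ' ' -f 1
  else
    shasum -a 256 "$1" | cut -d ' ' -f 1
  fi
}

# Length of the padded Base64 encoding of N raw bytes: 4 * ceil(N / 3).
base64_len() {
  echo $(( ($1 + 2) / 3 * 4 ))
}

# Succeeds when a raw diff of N bytes stays within the assumed 5MB encoded limit.
fits_diff_limit() {
  [ "$(base64_len "$1")" -le $((5 * 1024 * 1024)) ]
}
```

Base64 grows data by about one third, so under these assumptions the largest raw diff that fits is roughly 3.75MB.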

**Multimodal Snapshot Sharing**
Share screenshots and images:
- Capture and share specific editor views
- Support for multiple snapshot types (overworld, dungeon, sprite, etc.)
- Base64 encoding so image data can travel inside JSON messages
- Size limit: 10MB per snapshot

**Proposal Management**
Collaborative proposal workflow:
- Share AI-generated proposals with all participants
- Track proposal status: pending, accepted, rejected
- Real-time status updates broadcast to all users
- Proposal history tracked in server database

**AI Agent Integration**
Server-routed AI queries:
- Send queries through the collaboration server
- Shared AI responses visible to all participants
- Query history tracked in database
- Optional: Disable AI per session

#### Protocol Reference

The server exchanges JSON messages over a WebSocket connection (`ws://`); the `/health` and `/metrics` endpoints are served over plain HTTP.

**Client → Server Messages:**

```json
// Host Session (v2.0 with optional ROM hash and AI control)
{
  "type": "host_session",
  "payload": {
    "session_name": "My Session",
    "username": "alice",
    "rom_hash": "abc123...",  // optional
    "ai_enabled": true         // optional, default true
  }
}

// Join Session
{
  "type": "join_session",
  "payload": {
    "session_code": "ABC123",
    "username": "bob"
  }
}

// Chat Message (v2.0 with metadata support)
{
  "type": "chat_message",
  "payload": {
    "sender": "alice",
    "message": "Hello!",
    "message_type": "chat",    // optional: chat, system, ai
    "metadata": {...}          // optional metadata
  }
}

// ROM Sync (NEW in v2.0)
{
  "type": "rom_sync",
  "payload": {
    "sender": "alice",
    "diff_data": "base64_encoded_diff...",
    "rom_hash": "sha256_hash"
  }
}

// Snapshot Share (NEW in v2.0)
{
  "type": "snapshot_share",
  "payload": {
    "sender": "alice",
    "snapshot_data": "base64_encoded_image...",
    "snapshot_type": "overworld_editor"
  }
}

// Proposal Share (NEW in v2.0)
{
  "type": "proposal_share",
  "payload": {
    "sender": "alice",
    "proposal_data": {
      "title": "Add new sprite",
      "description": "...",
      "changes": [...]
    }
  }
}

// Proposal Update (NEW in v2.0)
{
  "type": "proposal_update",
  "payload": {
    "proposal_id": "uuid",
    "status": "accepted"  // pending, accepted, rejected
  }
}

// AI Query (NEW in v2.0)
{
  "type": "ai_query",
  "payload": {
    "username": "alice",
    "query": "What enemies are in the eastern palace?"
  }
}

// Leave Session
{ "type": "leave_session" }

// Ping
{ "type": "ping" }
```
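
For scripted clients, messages like these can be assembled with `printf`. The sketch below builds a `join_session` message and rejects inputs that would need JSON escaping. `join_message` is a hypothetical helper, and the alphanumeric restriction on codes and usernames is an assumption, not a documented server rule.

```bash
join_message() {
  code="$1"
  user="$2"
  # Session codes are 6 characters; alphanumeric is assumed from the ABC123 example.
  case "$code" in
    [A-Za-z0-9][A-Za-z0-9][A-Za-z0-9][A-Za-z0-9][A-Za-z0-9][A-Za-z0-9]) ;;
    *) echo "session code must be 6 alphanumeric characters" >&2; return 1 ;;
  esac
  # Restrict usernames so that no JSON escaping is required.
  case "$user" in
    ''|*[!A-Za-z0-9_-]*) echo "username must use only A-Z, a-z, 0-9, _ or -" >&2; return 1 ;;
  esac
  printf '{"type":"join_session","payload":{"session_code":"%s","username":"%s"}}\n' "$code" "$user"
}

# Usage: paste the output of `join_message ABC123 bob` at the wscat prompt.
```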

**Server → Client Messages:**

```json
// Session Hosted
{
  "type": "session_hosted",
  "payload": {
    "session_id": "uuid",
    "session_code": "ABC123",
    "session_name": "My Session",
    "participants": ["alice"],
    "rom_hash": "abc123...",
    "ai_enabled": true
  }
}

// Session Joined
{
  "type": "session_joined",
  "payload": {
    "session_id": "uuid",
    "session_code": "ABC123",
    "session_name": "My Session",
    "participants": ["alice", "bob"],
    "messages": [...]
  }
}

// Chat Message (broadcast)
{
  "type": "chat_message",
  "payload": {
    "sender": "alice",
    "message": "Hello!",
    "timestamp": 1709567890123,
    "message_type": "chat",
    "metadata": null
  }
}

// ROM Sync (broadcast, NEW in v2.0)
{
  "type": "rom_sync",
  "payload": {
    "sync_id": "uuid",
    "sender": "alice",
    "diff_data": "base64...",
    "rom_hash": "sha256...",
    "timestamp": 1709567890123
  }
}

// Snapshot Shared (broadcast, NEW in v2.0)
{
  "type": "snapshot_shared",
  "payload": {
    "snapshot_id": "uuid",
    "sender": "alice",
    "snapshot_data": "base64...",
    "snapshot_type": "overworld_editor",
    "timestamp": 1709567890123
  }
}

// Proposal Shared (broadcast, NEW in v2.0)
{
  "type": "proposal_shared",
  "payload": {
    "proposal_id": "uuid",
    "sender": "alice",
    "proposal_data": {...},
    "status": "pending",
    "timestamp": 1709567890123
  }
}

// Proposal Updated (broadcast, NEW in v2.0)
{
  "type": "proposal_updated",
  "payload": {
    "proposal_id": "uuid",
    "status": "accepted",
    "timestamp": 1709567890123
  }
}

// AI Response (broadcast, NEW in v2.0)
{
  "type": "ai_response",
  "payload": {
    "query_id": "uuid",
    "username": "alice",
    "query": "What enemies are in the eastern palace?",
    "response": "The eastern palace contains...",
    "timestamp": 1709567890123
  }
}

// Participant Events
{
  "type": "participant_joined",  // or "participant_left"
  "payload": {
    "username": "bob",
    "participants": ["alice", "bob"]
  }
}

// Server Shutdown (NEW in v2.0)
{
  "type": "server_shutdown",
  "payload": {
    "message": "Server is shutting down. Please reconnect later."
  }
}

// Pong
{
  "type": "pong",
  "payload": { "timestamp": 1709567890123 }
}

// Error
{
  "type": "error",
  "payload": { "error": "Session ABC123 not found" }
}
```

#### Server Configuration

**Environment Variables:**
- `PORT` - Server port (default: 8765)
- `ENABLE_AI_AGENT` - Enable AI agent integration (default: true)
- `AI_AGENT_ENDPOINT` - External AI agent endpoint URL

**Rate Limiting:**
- Window: 60 seconds
- Max messages: 100 per IP per window
- Max snapshot size: 10 MB
- Max ROM diff size: 5 MB
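
A scripted client can throttle itself to stay under the message limit. The sketch below uses a fixed-window counter with the numbers listed above. Whether the server uses a fixed or a sliding window is not stated here, so treat this as an approximation; all names are hypothetical.

```bash
WINDOW_SECONDS=60
MAX_MESSAGES=100
window_start=0   # timestamp (seconds) at which the current window began
count=0          # messages sent in the current window

# Returns 0 if a message may be sent at time $1 (seconds), 1 otherwise.
# Call it directly, not inside $(...), so the counters persist.
allow_message() {
  now="$1"
  if [ $((now - window_start)) -ge "$WINDOW_SECONDS" ]; then
    window_start="$now"   # start a new window
    count=0
  fi
  if [ "$count" -ge "$MAX_MESSAGES" ]; then
    return 1
  fi
  count=$((count + 1))
}

# Usage:
#   allow_message "$(date +%s)" && echo "ok to send"
```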

#### Database Schema (Server v2.0)

The server uses SQLite with the following tables:

- **sessions**: Session metadata, ROM hash, AI enabled flag
- **participants**: User tracking with last_seen timestamps
- **messages**: Chat history with message types and metadata
- **rom_syncs**: ROM diff history with hashes
- **snapshots**: Shared screenshots and images
- **proposals**: AI proposal tracking with status
- **agent_interactions**: AI query and response history

#### Deployment

**Heroku:**
```bash
cd /path/to/yaze-server
heroku create yaze-collab
git push heroku main
heroku config:set ENABLE_AI_AGENT=true
```

**VPS (with PM2):**
```bash
git clone https://github.com/scawful/yaze-server
cd yaze-server
npm install
npm install -g pm2
pm2 start server.js --name yaze-collab
pm2 startup
pm2 save
```

**Docker:**
```bash
docker build -t yaze-server .
docker run -p 8765:8765 -e ENABLE_AI_AGENT=true yaze-server
```

#### Testing

**Health Check:**
```bash
curl http://localhost:8765/health
curl http://localhost:8765/metrics
```
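
In startup scripts it can help to wait until the health endpoint answers before connecting clients. The following is a minimal sketch: `wait_for_health` is a hypothetical helper, and it assumes `/health` returns a 2xx status once the server is ready.

```bash
# Retry the health endpoint until it responds or the attempts run out.
wait_for_health() {
  url="${1:-http://localhost:8765/health}"
  attempts="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_for_health && wscat -c ws://localhost:8765
```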

**Test with wscat:**
```bash
npm install -g wscat
wscat -c ws://localhost:8765

# Host session
> {"type":"host_session","payload":{"session_name":"Test","username":"alice","ai_enabled":true}}

# Join session (in another terminal)
> {"type":"join_session","payload":{"session_code":"ABC123","username":"bob"}}

# Send message
> {"type":"chat_message","payload":{"sender":"alice","message":"Hello!"}}
```

#### Security Considerations

**Current Implementation:**
**Warning:** Basic security only; suitable for trusted networks.
- No authentication or encryption by default
- Plain text message transmission
- Session codes are the only access control

**Recommended for Production:**
1. **SSL/TLS**: Use `wss://` with valid certificates
2. **Authentication**: Implement JWT tokens or OAuth
3. **Session Passwords**: Optional per-session passwords
4. **Persistent Storage**: Use PostgreSQL/MySQL for production
5. **Monitoring**: Add logging to CloudWatch/Datadog
6. **Backup**: Regular database backups

---

### Multimodal Vision (Gemini)

Analyze screenshots of your ROM editor using Gemini's vision capabilities for visual feedback and suggestions.

#### Requirements

- `GEMINI_API_KEY` environment variable set
- YAZE built with `-DYAZE_WITH_GRPC=ON` and `-DZ3ED_AI=ON`

#### Capture Modes

**Full Window**: Captures the entire YAZE application window

**Active Editor** (default): Captures only the currently focused editor window

**Specific Window**: Captures a named window (e.g., "Overworld Editor")

#### How to Use

1. Open **Debug → Agent Chat**
2. Expand **"Gemini Multimodal (Preview)"** panel
3. Select capture mode:
   - Full Window
   - Active Editor (default)
   - Specific Window
4. If Specific Window, enter window name: `Overworld Editor`
5. Click **"Capture Snapshot"**
6. Enter prompt: `"What issues do you see with this layout?"`
7. Click **"Send to Gemini"**

#### Example Prompts

- "Analyze the tile placement in this overworld screen"
- "What's wrong with the palette colors in this screenshot?"
- "Suggest improvements for this dungeon room layout"
- "Does this screen follow good level design practices?"
- "Are there any visual glitches or tile conflicts?"
- "How can I improve the composition of this room?"

The AI response appears in your chat history and can reference specific details from the screenshot. In network collaboration mode, multimodal snapshots can be shared with all participants.

---

### Architecture

```
┌──────────────────────────────────────────────────────┐
│                    YAZE Editor                       │
│                                                      │
│  ┌─────────────────────────────────────────────┐   │
│  │         Agent Chat Widget (ImGui)           │   │
│  │                                             │   │
│  │  [Collaboration Panel]                      │   │
│  │  ├─ Local Mode (filesystem)   Working     │   │
│  │  └─ Network Mode (websocket)  Working     │   │
│  │                                             │   │
│  │  [Multimodal Panel]                         │   │
│  │  ├─ Capture Mode Selection    Working     │   │
│  │  ├─ Screenshot Capture         Working     │   │
│  │  └─ Send to Gemini            Working     │   │
│  └─────────────────────────────────────────────┘   │
│           │                    │                    │
│           ▼                    ▼                    │
│  ┌──────────────────┐  ┌──────────────────┐       │
│  │  Collaboration   │  │  Screenshot      │       │
│  │  Coordinators    │  │  Utils           │       │
│  └──────────────────┘  └──────────────────┘       │
│           │                    │                    │
└───────────┼────────────────────┼────────────────────┘
            │                    │
            ▼                    ▼
┌──────────────────┐    ┌──────────────────┐
│  ~/.yaze/agent/  │    │  Gemini Vision   │
│    sessions/     │    │      API         │
└──────────────────┘    └──────────────────┘
            │
            ▼
┌──────────────────────────────────────────┐
│         yaze-server v2.0                 │
│  - WebSocket Server (Node.js)            │
│  - SQLite Database                       │
│  - Session Management                    │
│  - ROM Sync                              │
│  - Snapshot Sharing                      │
│  - Proposal Management                   │
│  - AI Agent Integration                  │
└──────────────────────────────────────────┘
```

---

### Troubleshooting

**"Failed to start collaboration server"**
- Ensure Node.js is installed: `node --version`
- Check port availability: `lsof -i :8765`
- Verify server directory exists

**"Not connected to collaboration server"**
- Verify server is running: `curl http://localhost:8765/health`
- Check firewall settings
- Confirm server URL is correct

**"Harness client cannot reach gRPC"**
- Confirm YAZE was built with `-DYAZE_WITH_GRPC=ON` and the harness server is enabled via **Debug → Preferences → Automation**.
- Run `z3ed agent test ping --grpc localhost:50051` to verify the CLI can reach the embedded harness endpoint; restart YAZE if the ping fails.
- Inspect the Agent Chat **Harness Monitor** panel for connection status; use **Reconnect** to re-bind if the harness server was restarted.

**"Widget discovery returns empty"**
- Ensure the target ImGui window is open; the harness only indexes visible widgets.
- Toggle **Automation → Enable Introspection** in YAZE to allow the gRPC server to expose widget metadata.
- Run `z3ed agent test discover --window "ProposalDrawer"` to scope discovery to the window you have open.

**"Session not found"**
- Verify session code is correct (case-insensitive)
- Check if session expired (server restart clears sessions)
- Try hosting a new session

**"Rate limit exceeded"**
- Server enforces 100 messages per minute per IP
- Wait 60 seconds and try again

**Participants not updating**
- Click "Refresh Session" button
- Check network connectivity
- Verify server logs for errors

**Messages not broadcasting**
- Ensure all clients are in the same session
- Check session code matches exactly
- Verify network connectivity between client and server

---

### References

- **Server Repository**: [yaze-server](https://github.com/scawful/yaze-server)
- **Agent Editor Docs**: `src/app/editor/agent/README.md`
- **Integration Guide**: `docs/z3ed/YAZE_SERVER_V2_INTEGRATION.md`

## 12. Roadmap & Implementation Status

**Last Updated**: October 11, 2025

### Completed

-   **Core Infrastructure**: Resource-oriented CLI, proposal workflow, sandbox manager, and resource catalog are all production-ready.
-   **AI Backends**: Both Ollama (local) and Gemini (cloud) are operational.
-   **Conversational Agent**: The agent service, tool dispatcher (with 5 read-only tools), TUI/simple chat interfaces, and ImGui editor chat widget with persistent history are implemented.
-   **GUI Test Harness**: A comprehensive GUI testing platform with introspection, widget discovery, recording/replay, and CI integration support.
-   **Collaborative Sessions**:
    - Local filesystem-based collaborative editing with shared chat history
    - Network WebSocket-based collaboration via yaze-server v2.0
    - Dual-mode support (Local/Network) with seamless switching
-   **Multimodal Vision**: Gemini vision API integration with multiple capture modes (Full Window, Active Editor, Specific Window).
-   **yaze-server v2.0**: Production-ready Node.js WebSocket server with:
    - ROM synchronization with diff broadcasting
    - Multimodal snapshot sharing
    - Collaborative proposal management
    - AI agent integration and query routing
    - Health monitoring and metrics endpoints
    - Rate limiting and security features

### 📌 Current Progress Highlights (October 5, 2025)

-   **Agent Platform Expansion**: AgentEditor now delivers full bot lifecycle controls, live prompt editing, multi-session management, and metrics synchronized with chat history and popup views.
-   **Enhanced Chat Popup**: Left-side AgentChatHistoryPopup evolved into a theme-aware, fully interactive mini-chat with inline sending, multimodal capture, filtering, and proposal indicators to minimize context switching.
-   **Proposal Workflow**: Sandbox-backed proposal review is end-to-end with inline quick actions, ProposalDrawer tie-ins, ROM version protections, and collaboration-aware approvals.
-   **Collaboration & Networking**: yaze-server v2.0 protocol, cross-platform WebSocket client, collaboration panel, and gRPC ROM service unlock real-time edits, diff sharing, and remote automation.
-   **AI & Automation Stack**: Proactive prompt v3, native Gemini function calling, learn/TODO systems, GUI automation planners, multimodal vision suite, and dashboard-surfaced test harness coverage broaden intelligent tooling.

### Active & Next Steps

1.  **CLI Command Refactoring (Phase 2)**: Complete the migration of `tool_commands.cc` to the new abstraction layer. Refactor 15+ commands to eliminate ~1300 lines of duplication. Add comprehensive unit tests. (See [Command Abstraction Guide](C5-z3ed-command-abstraction.md))
2.  **Harden Live LLM Tooling**: Finalize native function-calling loops with Ollama/Gemini and broaden safe read-only tool coverage for dialogue, sprite, and region introspection.
3.  **Real-Time Transport Upgrade**: Replace HTTP polling with full WebSocket support across CLI/editor and expose ROM sync, snapshot, and proposal voting controls directly inside the AgentChat widget.
4.  **Cross-Platform Certification**: Complete Windows validation for AI, gRPC, collaboration, and build presets leveraging the documented vcpkg workflow.
5.  **UI/UX Roadmap Delivery**: Advance EditorManager menu refactors, enhanced hex/palette tooling, Vim-mode terminal chat, and richer popup affordances such as search, export, and resizing.
6.  **Collaboration Safeguards**: Layer encrypted sessions, conflict resolution flows, AI-assisted proposal review, and deeper gRPC ROM service integrations to strengthen multi-user safety.
7.  **Testing & Observability**: Automate multimodal/GUI harness scenarios, add performance benchmarks, and enable export/replay pipelines for the Test Dashboard.
8.  **Hybrid Workflow Examples**: Document and dogfood end-to-end CLI→GUI automation loops (plan/run/diff + harness replay) with screenshots and recorded sessions.
9.  **Automation API Unification**: Extract a reusable harness automation API consumed by both CLI `agent test` commands and the Agent Chat widget to prevent serialization drift.
10. **UI Abstraction Cleanup**: Introduce dedicated presenter/controller layers so `editor_manager.cc` delegates to automation and collaboration services, keeping ImGui widgets declarative.

### Recently Completed (v0.2.2-alpha - October 12, 2025)

#### Emulator Debugging Infrastructure (NEW) 🔍
-   **Advanced Debugging Service**: Complete gRPC EmulatorService implementation with breakpoints, memory inspection, step execution, and CPU state access
-   **Breakpoint Management**: Set execute/read/write/access breakpoints with conditional support for systematic debugging
-   **Memory Introspection**: Read/write WRAM, hardware registers ($4xxx), and ROM from running emulator without rebuilds
-   **Execution Control**: Step instruction-by-instruction, run to breakpoint, pause/resume with full CPU state capture
-   **AI-Driven Debugging**: Function schemas for 12 new emulator tools enabling natural language debugging sessions
-   **Reproducible Scripts**: AI can generate bash scripts with breakpoint sequences for regression testing
-   **Documentation**: Comprehensive [Emulator Debugging Guide](emulator-debugging-guide.md) with real-world examples

#### Benefits for AI Agents
-   **15 min vs. 3 hr debugging**: A systematic, tool-based approach replaces manual print-debug cycles
-   **No rebuilds required**: Set breakpoints and read state without recompiling
-   **Precise observation**: Pause at exact addresses, read memory at critical moments
-   **Collaborative debugging**: Share tool call sequences and findings in chat
-   **Example**: Debugging an ALTTP input issue went from 15 rebuild cycles to 6 tool calls (see `docs/examples/ai-debug-input-issue.md`)

### Previously Completed (v0.2.1-alpha - October 11, 2025)

#### CLI Architecture Improvements
-   **Command Abstraction Layer**: Three-tier abstraction system (`CommandContext`, `ArgumentParser`, `OutputFormatter`) to eliminate code duplication across CLI commands
-   **CommandHandler Base Class**: Structured base class for consistent command implementation with automatic context management
-   **Refactoring Framework**: Complete migration guide and examples showing 50-60% code reduction per command
-   **Documentation**: Comprehensive [Command Abstraction Guide](C5-z3ed-command-abstraction.md) with migration checklist and testing strategies

#### Code Quality & Maintainability
-   **Duplication Elimination**: The new abstraction layer targets ~1300 lines of duplicated code across tool commands; the migration itself is tracked under Active & Next Steps
-   **Consistent Patterns**: All commands now follow unified structure for argument parsing, ROM loading, and output formatting
-   **Better Testing**: Each component (context, parser, formatter) can be unit tested independently
-   **AI-Friendly**: Predictable command structure makes it easier for AI to generate and validate tool calls

### Previously Completed (v0.2.0-alpha - October 5, 2025)

#### Core AI Features
-   **Enhanced System Prompt (v3)**: Proactive tool chaining with implicit iteration to minimize back-and-forth conversations
-   **Learn Command**: Full implementation with preferences, ROM patterns, project context, and conversation memory storage
-   **Native Gemini Function Calling**: Upgraded from manual curl to native function calling API with automatic tool schema generation
-   **Multimodal Vision Testing**: Comprehensive test suite for Gemini vision capabilities with screenshot integration
-   **AI-Controlled GUI Automation**: Natural language parsing (`AIActionParser`) and test script generation (`GuiActionGenerator`) for automated tile placement
-   **TODO Management System**: Full `TodoManager` class with CRUD operations, CLI commands, dependency tracking, execution planning, and JSON persistence.

#### Version Management & Protection
-   **ROM Version Management System**: `RomVersionManager` with automatic snapshots, safe points, corruption detection, and rollback capabilities
-   **Proposal Approval Framework**: `ProposalApprovalManager` with host/majority/unanimous voting modes to protect ROM from unwanted changes
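
The three voting modes named above can be illustrated with a small shell sketch. This is not the `ProposalApprovalManager` implementation: the function name `proposal_passes`, its arguments, and the exact thresholds (host decides alone, majority means strictly more than half, unanimous means every voter approves) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of host/majority/unanimous approval rules.
# Usage: proposal_passes <mode> <approvals> <total_voters> [host_approved]
# Returns 0 if the proposal passes, 1 if it fails, 2 on an unknown mode.
proposal_passes() {
  local mode="$1" approvals="$2" total="$3" host_approved="${4:-no}"
  case "$mode" in
    # Assumption: in host mode only the host's vote matters.
    host)      [ "$host_approved" = "yes" ] ;;
    # Assumption: majority means strictly more than half of all voters.
    majority)  [ "$total" -gt 0 ] && [ $((approvals * 2)) -gt "$total" ] ;;
    # Assumption: unanimous means every voter approved.
    unanimous) [ "$total" -gt 0 ] && [ "$approvals" -eq "$total" ] ;;
    *)         echo "unknown mode: $mode" >&2; return 2 ;;
  esac
}

proposal_passes majority 2 3 && echo "accepted"   # prints "accepted"
```

A stricter mode trades convenience for safety: under these assumed rules, the same 2-of-3 vote that passes in `majority` mode fails in `unanimous` mode, so the ROM stays unchanged until every participant agrees.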

#### Networking & Collaboration (NEW)
-   **Cross-Platform WebSocket Client**: `WebSocketClient` with Windows/macOS/Linux support using httplib
-   **Collaboration Service**: `CollaborationService` integrating version management with real-time networking
-   **yaze-server v2.0 Protocol**: Extended with proposal voting (`proposal_vote`, `proposal_vote_received`)
-   **z3ed Network Commands**: CLI commands for remote collaboration (`net connect`, `net join`, `proposal submit/wait`)
-   **Collaboration UI Panel**: `CollaborationPanel` widget with version history, ROM sync tracking, snapshot gallery, and approval workflow
-   **gRPC ROM Service**: Complete protocol buffer and implementation for remote ROM manipulation (pending build integration)

#### UI/UX Enhancements
-   **Welcome Screen Enhancement**: Dynamic theme integration, Zelda-themed animations, and project cards.
-   **Component Refactoring**: `PaletteWidget` renamed and moved, UI organization improved (`app/editor/ui/` for welcome_screen, editor_selection_dialog, background_renderer).

#### Build System & Infrastructure
-   **gRPC Windows Build Optimization**: vcpkg integration for 10-20x faster Windows builds, removed abseil-cpp submodule
-   **Cross-Platform Networking**: Native socket support (ws2_32 on Windows, BSD sockets on Unix)
-   **Namespace Refactoring**: Created `app/net` namespace for networking components
-   **Improved Documentation**: Consolidated architecture, enhancement plans, networking guide, and build instructions with JSON-first approach
-   **Build System Improvements**: `mac-ai` preset, proto fixes, and updated GEMINI.md with AI build policies.

## 12. Troubleshooting

-   **"Build with -DZ3ED_AI=ON" warning**: AI features are disabled. Rebuild with the flag to enable them.
-   **"gRPC not available" error**: GUI testing is disabled. Rebuild with `-DYAZE_WITH_GRPC=ON`.
-   **AI generates invalid commands**: The prompt may be vague. Use specific coordinates, tile IDs, and map context.
-   **Chat mode freezes**: Use `agent simple-chat` instead of the FTXUI-based `agent chat` for better stability, especially in scripts.
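
The first two items above share one fix: reconfigure with both flags and rebuild. A minimal sketch, assuming the `build` directory and `z3ed` target used in Quick Start; the wrapper name `rebuild_z3ed` is hypothetical and exists only to keep the two commands together.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reconfigure and rebuild z3ed with AI and gRPC enabled.
# The build directory defaults to "build", matching the Quick Start commands.
rebuild_z3ed() {
  local build_dir="${1:-build}"
  # Only build if configuration succeeded.
  cmake -B "$build_dir" -DZ3ED_AI=ON -DYAZE_WITH_GRPC=ON &&
    cmake --build "$build_dir" --target z3ed
}
```

Call `rebuild_z3ed` from the repository root, or pass a different directory (for example `rebuild_z3ed build-ai`) to leave an existing build tree untouched.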