feat: Expand network collaboration capabilities and enhance agent functionality

- Updated NetworkCollaborationCoordinator to support advanced features including ROM synchronization, snapshot sharing, and proposal management.
- Enhanced message handling for chat, ROM sync, snapshots, proposals, and AI queries with structured payloads.
- Introduced new callback mechanisms for handling incoming events related to ROM sync, snapshots, proposals, and AI responses.
- Improved documentation in the agent directory to reflect new functionalities and usage examples for network collaboration.
- Added comprehensive README for the Agent Editor module detailing architecture, usage, and advanced features.
scawful
2025-10-04 19:57:27 -04:00
parent 588db01df6
commit 3cdeeac5a3
4 changed files with 1178 additions and 73 deletions


@@ -207,89 +207,582 @@ The help system is organized by category for easy navigation.
## 9. Collaborative Sessions & Multimodal Vision
-### Collaborative Sessions
-Z3ED supports both local (filesystem-based) and network (WebSocket-based) collaborative sessions for sharing chat conversations and working together on ROM hacks.
-#### Local Collaboration Mode
-**How to Use:**
-1. Open YAZE and go to **Debug → Agent Chat**
-2. In the Agent Chat widget, select **"Local"** mode
### Overview
YAZE supports real-time collaboration for ROM hacking through dual modes: **Local** (filesystem-based) for same-machine collaboration, and **Network** (WebSocket-based via yaze-server v2.0) for internet-based collaboration with advanced features including ROM synchronization, snapshot sharing, and AI agent integration.
---
### Local Collaboration Mode
Perfect for multiple YAZE instances on the same machine or cloud-synced folders (Dropbox, iCloud).
#### How to Use
1. Open YAZE → **Debug → Agent Chat**
2. Select **"Local"** mode
3. **Host a Session:**
-- Enter a session name (e.g., "Evening ROM Hack")
-- Click "Host"
-- Share the generated 6-character code (e.g., `ABC123`) with collaborators on the same machine
- Enter session name: `Evening ROM Hack`
- Click **"Host Session"**
- Share the 6-character code (e.g., `ABC123`)
4. **Join a Session:**
-- Enter the session code provided by the host
-- Click "Join"
-- Your chat will now sync with others in the session
- Enter the session code
- Click **"Join Session"**
- Chat history syncs automatically
-**Features:**
-- Shared chat history stored in `~/.yaze/agent/sessions/<code>_history.json`
-- Automatic synchronization when sending/receiving messages (2-second polling)
-- Participant list shows all connected users
-- Perfect for multiple YAZE instances on the same machine
-#### Network Collaboration Mode (NEW!)
#### Features
- **Shared History**: `~/.yaze/agent/sessions/<code>_history.json`
- **Auto-Sync**: 2-second polling for new messages
- **Participant Tracking**: Real-time participant list
- **Toast Notifications**: Get notified when collaborators send messages
- **Zero Setup**: No server required
-**Requirements:**
-- Node.js installed on the server machine
-- `yaze-collab-server` repository cloned alongside `yaze`
-- Network connectivity between collaborators
-**Setup:**
-1. **Start the Collaboration Server:**
#### Cloud Folder Workaround
Enable internet collaboration without a server:
```bash
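# Point your sessions directory at a cloud-synced folder (Dropbox/iCloud).
# If ~/.yaze/agent/sessions already exists, move it aside first; otherwise
# ln creates the link inside that directory instead of replacing it.
mkdir -p ~/Dropbox/yaze-sessions
ln -s ~/Dropbox/yaze-sessions ~/.yaze/agent/sessions
# Have your collaborator do the same
# Now you can collaborate through cloud sync!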
```
---
### Network Collaboration Mode (yaze-server v2.0)
Real-time collaboration over the internet with advanced features powered by the yaze-server v2.0.
#### Requirements
- **Server**: Node.js 18+ with yaze-server running
- **Client**: YAZE built with `-DYAZE_WITH_GRPC=ON` and `-DZ3ED_AI=ON`
- **Network**: Connectivity between collaborators
#### Server Setup
**Option 1: Using z3ed CLI**
```bash
-# From z3ed CLI:
z3ed collab start [--port=8765]
```
-# Or manually:
-cd yaze-collab-server
**Option 2: Manual Launch**
```bash
cd /path/to/yaze-server
npm install
npm start
# Server starts on http://localhost:8765
# Health check: curl http://localhost:8765/health
```
**Option 3: Docker**
```bash
docker build -t yaze-server .
docker run -p 8765:8765 yaze-server
```
#### Client Connection
1. Open YAZE → **Debug → Agent Chat**
2. Select **"Network"** mode
3. Enter server URL: `ws://localhost:8765` (or remote server)
4. Click **"Connect to Server"**
5. Host or join sessions like local mode
#### Core Features
**Session Management:**
- Unique 6-character session codes
- Participant tracking with join/leave notifications
- Real-time message broadcasting
- Persistent chat history
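As a rough sketch of how 6-character session codes might be produced and matched, assuming an alphabet of `A-Z` and `0-9` as suggested by the `ABC123` example. `GenerateSessionCode` and `NormalizeSessionCode` are hypothetical names, and the real server may use a different alphabet or generator.

```cpp
#include <algorithm>
#include <cctype>
#include <random>
#include <string>

// Hypothetical: builds a 6-character code from A-Z and 0-9. Random codes can
// collide, so a server would still need to check the result against the
// codes of active sessions before handing it out.
std::string GenerateSessionCode(std::mt19937& rng) {
  static constexpr char kAlphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  // sizeof includes the trailing '\0', so the last valid index is size - 2.
  std::uniform_int_distribution<int> pick(
      0, static_cast<int>(sizeof(kAlphabet)) - 2);
  std::string code(6, ' ');
  for (char& c : code) c = kAlphabet[pick(rng)];
  return code;
}

// Upper-cases user input so that "abc123" and "ABC123" find the same session.
std::string NormalizeSessionCode(std::string code) {
  std::transform(code.begin(), code.end(), code.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return code;
}
```

With 36 symbols and 6 positions there are 36^6 (about 2.2 billion) possible codes, which is enough to make guessing impractical for casual use but is not a substitute for authentication.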
**Connection Management:**
- Health monitoring endpoints (`/health`, `/metrics`)
- Graceful shutdown notifications
- Automatic cleanup of inactive sessions
- Rate limiting (100 messages/minute per IP)
#### Advanced Features (v2.0)
**🎮 ROM Synchronization**
Share ROM edits in real-time:
- Send base64-encoded diffs to all participants
- Automatic ROM hash tracking
- Size limit: 5MB per diff
- Conflict detection via hash comparison
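The size limit and hash comparison listed above could be checked along these lines. This is a sketch under stated assumptions: `CheckRomSync` and `RomSyncCheck` are hypothetical names, the 5 MB limit is taken as 5 × 1024 × 1024 bytes of encoded diff data, and `rom_hash` is assumed to identify the ROM state the diff was produced against (the source does not say whether it is the pre- or post-edit hash).

```cpp
#include <cstddef>
#include <string>

enum class RomSyncCheck { kOk, kTooLarge, kHashConflict };

// Assumption: the limit applies to the base64-encoded payload.
constexpr std::size_t kMaxRomDiffBytes = 5 * 1024 * 1024;

// Hypothetical check: a diff is accepted only if it fits the size limit and
// was produced against the same ROM state the receiver currently has.
RomSyncCheck CheckRomSync(const std::string& diff_data,
                          const std::string& base_rom_hash,
                          const std::string& local_rom_hash) {
  if (diff_data.size() > kMaxRomDiffBytes) return RomSyncCheck::kTooLarge;
  if (base_rom_hash != local_rom_hash) return RomSyncCheck::kHashConflict;
  return RomSyncCheck::kOk;
}
```

A receiver that gets `kHashConflict` has diverged from the sender and would need some resolution step, such as requesting a full resync, rather than applying the diff blindly.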
**📸 Multimodal Snapshot Sharing**
Share screenshots and images:
- Capture and share specific editor views
- Support for multiple snapshot types (overworld, dungeon, sprite, etc.)
- Base64 encoding so binary image data can be embedded in JSON messages (adds roughly 33% to the payload size)
- Size limit: 10MB per snapshot
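For reference, standard Base64 (RFC 4648) turns every 3 input bytes into 4 output characters. The sketch below is a generic encoder, not the one YAZE uses; `Base64Encode` and `Base64EncodedSize` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Size of the encoded text for `raw` input bytes, including '=' padding.
constexpr std::size_t Base64EncodedSize(std::size_t raw) {
  return (raw + 2) / 3 * 4;
}

// Generic RFC 4648 encoder (standard alphabet, with padding).
std::string Base64Encode(const std::vector<std::uint8_t>& bytes) {
  static constexpr char kTable[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(Base64EncodedSize(bytes.size()));
  std::size_t i = 0;
  // Full 3-byte groups become four 6-bit symbols.
  for (; i + 2 < bytes.size(); i += 3) {
    const std::uint32_t n =
        (bytes[i] << 16) | (bytes[i + 1] << 8) | bytes[i + 2];
    out += kTable[(n >> 18) & 63];
    out += kTable[(n >> 12) & 63];
    out += kTable[(n >> 6) & 63];
    out += kTable[n & 63];
  }
  // One or two leftover bytes are padded with '='.
  if (i + 1 == bytes.size()) {
    const std::uint32_t n = bytes[i] << 16;
    out += kTable[(n >> 18) & 63];
    out += kTable[(n >> 12) & 63];
    out += "==";
  } else if (i + 2 == bytes.size()) {
    const std::uint32_t n = (bytes[i] << 16) | (bytes[i + 1] << 8);
    out += kTable[(n >> 18) & 63];
    out += kTable[(n >> 12) & 63];
    out += kTable[(n >> 6) & 63];
    out += '=';
  }
  return out;
}
```

If the 10 MB limit is applied to the encoded string (an assumption; the source does not specify), the largest raw image that fits is about 7.5 MB, since encoding inflates the data by a factor of 4/3.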
**💡 Proposal Management**
Collaborative proposal workflow:
- Share AI-generated proposals with all participants
- Track proposal status: pending, accepted, rejected
- Real-time status updates broadcast to all users
- Proposal history tracked in server database
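The three proposal states can be modeled as a small enum. The sketch below is hypothetical: `ProposalStatus`, `ParseProposalStatus`, and `CanTransition` are illustrative names, and the rule that only pending proposals may change state is an assumption, not something the server is documented to enforce.

```cpp
#include <optional>
#include <string>

enum class ProposalStatus { kPending, kAccepted, kRejected };

// Maps the wire-format status string to the enum; unknown strings yield
// std::nullopt so callers can reject malformed updates.
std::optional<ProposalStatus> ParseProposalStatus(const std::string& s) {
  if (s == "pending") return ProposalStatus::kPending;
  if (s == "accepted") return ProposalStatus::kAccepted;
  if (s == "rejected") return ProposalStatus::kRejected;
  return std::nullopt;
}

// Assumption: a proposal leaves the pending state at most once.
bool CanTransition(ProposalStatus from, ProposalStatus to) {
  return from == ProposalStatus::kPending && to != ProposalStatus::kPending;
}
```

Validating the status string before sending a `proposal_update` keeps typos such as `"accept"` from reaching other participants.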
**🤖 AI Agent Integration**
Server-routed AI queries:
- Send queries through the collaboration server
- Shared AI responses visible to all participants
- Query history tracked in database
- Optional: Disable AI per session
#### Protocol Reference
The server exchanges JSON messages over a WebSocket connection; the `/health` and `/metrics` endpoints are served over plain HTTP.
**Client → Server Messages:**
```json
// Host Session (v2.0 with optional ROM hash and AI control)
{
"type": "host_session",
"payload": {
"session_name": "My Session",
"username": "alice",
"rom_hash": "abc123...", // optional
"ai_enabled": true // optional, default true
}
}
// Join Session
{
"type": "join_session",
"payload": {
"session_code": "ABC123",
"username": "bob"
}
}
// Chat Message (v2.0 with metadata support)
{
"type": "chat_message",
"payload": {
"sender": "alice",
"message": "Hello!",
"message_type": "chat", // optional: chat, system, ai
"metadata": {...} // optional metadata
}
}
// ROM Sync (NEW in v2.0)
{
"type": "rom_sync",
"payload": {
"sender": "alice",
"diff_data": "base64_encoded_diff...",
"rom_hash": "sha256_hash"
}
}
// Snapshot Share (NEW in v2.0)
{
"type": "snapshot_share",
"payload": {
"sender": "alice",
"snapshot_data": "base64_encoded_image...",
"snapshot_type": "overworld_editor"
}
}
// Proposal Share (NEW in v2.0)
{
"type": "proposal_share",
"payload": {
"sender": "alice",
"proposal_data": {
"title": "Add new sprite",
"description": "...",
"changes": [...]
}
}
}
// Proposal Update (NEW in v2.0)
{
"type": "proposal_update",
"payload": {
"proposal_id": "uuid",
"status": "accepted" // pending, accepted, rejected
}
}
// AI Query (NEW in v2.0)
{
"type": "ai_query",
"payload": {
"username": "alice",
"query": "What enemies are in the eastern palace?"
}
}
// Leave Session
{ "type": "leave_session" }
// Ping
{ "type": "ping" }
```
**Server → Client Messages:**
```json
// Session Hosted
{
"type": "session_hosted",
"payload": {
"session_id": "uuid",
"session_code": "ABC123",
"session_name": "My Session",
"participants": ["alice"],
"rom_hash": "abc123...",
"ai_enabled": true
}
}
// Session Joined
{
"type": "session_joined",
"payload": {
"session_id": "uuid",
"session_code": "ABC123",
"session_name": "My Session",
"participants": ["alice", "bob"],
"messages": [...]
}
}
// Chat Message (broadcast)
{
"type": "chat_message",
"payload": {
"sender": "alice",
"message": "Hello!",
"timestamp": 1709567890123,
"message_type": "chat",
"metadata": null
}
}
// ROM Sync (broadcast, NEW in v2.0)
{
"type": "rom_sync",
"payload": {
"sync_id": "uuid",
"sender": "alice",
"diff_data": "base64...",
"rom_hash": "sha256...",
"timestamp": 1709567890123
}
}
// Snapshot Shared (broadcast, NEW in v2.0)
{
"type": "snapshot_shared",
"payload": {
"snapshot_id": "uuid",
"sender": "alice",
"snapshot_data": "base64...",
"snapshot_type": "overworld_editor",
"timestamp": 1709567890123
}
}
// Proposal Shared (broadcast, NEW in v2.0)
{
"type": "proposal_shared",
"payload": {
"proposal_id": "uuid",
"sender": "alice",
"proposal_data": {...},
"status": "pending",
"timestamp": 1709567890123
}
}
// Proposal Updated (broadcast, NEW in v2.0)
{
"type": "proposal_updated",
"payload": {
"proposal_id": "uuid",
"status": "accepted",
"timestamp": 1709567890123
}
}
// AI Response (broadcast, NEW in v2.0)
{
"type": "ai_response",
"payload": {
"query_id": "uuid",
"username": "alice",
"query": "What enemies are in the eastern palace?",
"response": "The eastern palace contains...",
"timestamp": 1709567890123
}
}
// Participant Events
{
"type": "participant_joined", // or "participant_left"
"payload": {
"username": "bob",
"participants": ["alice", "bob"]
}
}
// Server Shutdown (NEW in v2.0)
{
"type": "server_shutdown",
"payload": {
"message": "Server is shutting down. Please reconnect later."
}
}
// Pong
{
"type": "pong",
"payload": { "timestamp": 1709567890123 }
}
// Error
{
"type": "error",
"payload": { "error": "Session ABC123 not found" }
}
```
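Because every message carries a `type` field, a client can route incoming messages to per-type callbacks. The following is a minimal sketch of that idea, not the coordinator's actual callback API: `MessageDispatcher`, `On`, and `Dispatch` are hypothetical names, and the payload is passed through as raw JSON text so the example needs no JSON library.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical dispatcher: routes an incoming message to the callback
// registered for its "type" field. Parsing the payload is left to the handler.
class MessageDispatcher {
 public:
  using Handler = std::function<void(const std::string& payload_json)>;

  // Registers (or replaces) the handler for one message type.
  void On(const std::string& type, Handler handler) {
    handlers_[type] = std::move(handler);
  }

  // Returns false when no handler is registered for the type, so callers can
  // log or ignore message types they do not understand.
  bool Dispatch(const std::string& type,
                const std::string& payload_json) const {
    const auto it = handlers_.find(type);
    if (it == handlers_.end()) return false;
    it->second(payload_json);
    return true;
  }

 private:
  std::unordered_map<std::string, Handler> handlers_;
};
```

A client might register handlers for `rom_sync`, `snapshot_shared`, `proposal_shared`, `proposal_updated`, and `ai_response`, and treat a `false` return as a message from a newer protocol version.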
#### Server Configuration
**Environment Variables:**
- `PORT` - Server port (default: 8765)
- `ENABLE_AI_AGENT` - Enable AI agent integration (default: true)
- `AI_AGENT_ENDPOINT` - External AI agent endpoint URL
**Rate Limiting:**
- Window: 60 seconds
- Max messages: 100 per IP per window
- Max snapshot size: 10 MB
- Max ROM diff size: 5 MB
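The message limit above behaves like a per-IP counter that resets every window. The sketch below assumes a fixed (not sliding) 60-second window; `RateLimiter` and `Allow` are hypothetical names and the server's real bookkeeping may differ.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>

// Hypothetical fixed-window limiter: each IP gets `max_messages` per window.
class RateLimiter {
 public:
  using Clock = std::chrono::steady_clock;

  RateLimiter(int max_messages, std::chrono::seconds window)
      : max_messages_(max_messages), window_(window) {}

  // Returns true if a message from `ip` at time `now` is within the limit.
  // The time is passed in explicitly so the logic is easy to test.
  bool Allow(const std::string& ip, Clock::time_point now) {
    Bucket& b = buckets_[ip];
    // Start a new window on first use or once the old window has elapsed.
    if (b.count == 0 || now - b.window_start >= window_) {
      b.window_start = now;
      b.count = 0;
    }
    if (b.count >= max_messages_) return false;
    ++b.count;
    return true;
  }

 private:
  struct Bucket {
    Clock::time_point window_start{};
    int count = 0;
  };

  int max_messages_;
  std::chrono::seconds window_;
  std::unordered_map<std::string, Bucket> buckets_;
};
```

With `RateLimiter limiter(100, std::chrono::seconds(60))`, the 101st message from one IP inside a window is refused while other IPs are unaffected. A fixed window allows short bursts across a window boundary; a sliding window or token bucket would smooth that out at the cost of more state.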
#### Database Schema (Server v2.0)
The server uses SQLite with the following tables:
- **sessions**: Session metadata, ROM hash, AI enabled flag
- **participants**: User tracking with last_seen timestamps
- **messages**: Chat history with message types and metadata
- **rom_syncs**: ROM diff history with hashes
- **snapshots**: Shared screenshots and images
- **proposals**: AI proposal tracking with status
- **agent_interactions**: AI query and response history
#### Deployment
**Heroku:**
```bash
cd /path/to/yaze-server
heroku create yaze-collab
git push heroku main
heroku config:set ENABLE_AI_AGENT=true
```
**VPS (with PM2):**
```bash
git clone https://github.com/scawful/yaze-server
cd yaze-server
npm install
-node server.js
-```
npm install -g pm2
pm2 start server.js --name yaze-collab
pm2 startup
pm2 save
```
-2. **Connect from YAZE:**
-- Open YAZE and go to **Debug → Agent Chat**
-- Select **"Network"** mode
-- Enter server URL (e.g., `ws://localhost:8765`)
-- Click "Connect to Server"
-3. **Collaborate:**
-- Host or join sessions just like local mode
-- Collaborate with anyone who can reach your server
-- Real-time message broadcasting via WebSockets
-**Features:**
-- Real-time collaboration over the internet
-- Session management with unique codes
-- Participant tracking and notifications
-- Persistent message history
-- Perfect for remote pair programming
**Docker:**
```bash
docker build -t yaze-server .
docker run -p 8765:8765 -e ENABLE_AI_AGENT=true yaze-server
```
#### Testing
**Health Check:**
```bash
curl http://localhost:8765/health
curl http://localhost:8765/metrics
```
**Test with wscat:**
```bash
npm install -g wscat
wscat -c ws://localhost:8765
# Host session
> {"type":"host_session","payload":{"session_name":"Test","username":"alice","ai_enabled":true}}
# Join session (in another terminal)
> {"type":"join_session","payload":{"session_code":"ABC123","username":"bob"}}
# Send message
> {"type":"chat_message","payload":{"sender":"alice","message":"Hello!"}}
```
#### Security Considerations
**Current Implementation:**
⚠️ Basic security - suitable for trusted networks
- No authentication or encryption by default
- Plain text message transmission
- Session codes are the only access control
**Recommended for Production:**
1. **SSL/TLS**: Use `wss://` with valid certificates
2. **Authentication**: Implement JWT tokens or OAuth
3. **Session Passwords**: Optional per-session passwords
4. **Persistent Storage**: Use PostgreSQL/MySQL for production
5. **Monitoring**: Add logging to CloudWatch/Datadog
6. **Backup**: Regular database backups
---
### Multimodal Vision (Gemini)
-Ask Gemini to analyze screenshots of your ROM editor to get visual feedback and suggestions.
Analyze screenshots of your ROM editor using Gemini's vision capabilities for visual feedback and suggestions.
#### Requirements
-**Requirements:**
- `GEMINI_API_KEY` environment variable set
- YAZE built with `-DYAZE_WITH_GRPC=ON` and `-DZ3ED_AI=ON`
-**How to Use:**
-1. Open the Agent Chat widget (**Debug → Agent Chat**)
-2. Expand the **"Gemini Multimodal (Preview)"** panel
-3. Click **"Capture Map Snapshot"** to take a screenshot of the current view
-4. Enter a prompt in the text box (e.g., "What issues do you see with this overworld layout?")
-5. Click **"Send to Gemini"** to get visual analysis
#### Capture Modes
**Full Window**: Captures the entire YAZE application window
**Active Editor** (default): Captures only the currently focused editor window
**Specific Window**: Captures a named window (e.g., "Overworld Editor")
#### How to Use
1. Open **Debug → Agent Chat**
2. Expand **"Gemini Multimodal (Preview)"** panel
3. Select capture mode:
- ○ Full Window
- ● Active Editor (default)
- ○ Specific Window
4. If Specific Window, enter window name: `Overworld Editor`
5. Click **"Capture Snapshot"**
6. Enter prompt: `"What issues do you see with this layout?"`
7. Click **"Send to Gemini"**
#### Example Prompts
-**Example Prompts:**
- "Analyze the tile placement in this overworld screen"
- "What's wrong with the palette colors in this screenshot?"
- "Suggest improvements for this dungeon room layout"
- "Does this screen follow good level design practices?"
- "Are there any visual glitches or tile conflicts?"
- "How can I improve the composition of this room?"
-The AI response will appear in your chat history and can reference specific details from the screenshot.
The AI response appears in your chat history and can reference specific details from the screenshot. In network collaboration mode, multimodal snapshots can be shared with all participants.
---
### Architecture
```
┌──────────────────────────────────────────────────────┐
│ YAZE Editor │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Agent Chat Widget (ImGui) │ │
│ │ │ │
│ │ [Collaboration Panel] │ │
│ │ ├─ Local Mode (filesystem) ✓ Working │ │
│ │ └─ Network Mode (websocket) ✓ Working │ │
│ │ │ │
│ │ [Multimodal Panel] │ │
│ │ ├─ Capture Mode Selection ✓ Working │ │
│ │ ├─ Screenshot Capture ✓ Working │ │
│ │ └─ Send to Gemini ✓ Working │ │
│ └─────────────────────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Collaboration │ │ Screenshot │ │
│ │ Coordinators │ │ Utils │ │
│ └──────────────────┘ └──────────────────┘ │
│ │ │ │
└───────────┼────────────────────┼────────────────────┘
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ ~/.yaze/agent/ │ │ Gemini Vision │
│ sessions/ │ │ API │
└──────────────────┘ └──────────────────┘
┌──────────────────────────────────────────┐
│ yaze-server v2.0 │
│ - WebSocket Server (Node.js) │
│ - SQLite Database │
│ - Session Management │
│ - ROM Sync │
│ - Snapshot Sharing │
│ - Proposal Management │
│ - AI Agent Integration │
└──────────────────────────────────────────┘
```
---
### Troubleshooting
**"Failed to start collaboration server"**
- Ensure Node.js is installed: `node --version`
- Check port availability: `lsof -i :8765`
- Verify server directory exists
**"Not connected to collaboration server"**
- Verify server is running: `curl http://localhost:8765/health`
- Check firewall settings
- Confirm server URL is correct
**"Session not found"**
- Verify session code is correct (case-insensitive)
- Check if session expired (server restart clears sessions)
- Try hosting a new session
**"Rate limit exceeded"**
- Server enforces 100 messages per minute per IP
- Wait 60 seconds and try again
**Participants not updating**
- Click "Refresh Session" button
- Check network connectivity
- Verify server logs for errors
**Messages not broadcasting**
- Ensure all clients are in the same session
- Check that every client entered the same session code (codes are case-insensitive)
- Verify network connectivity between client and server
---
### References
- **Server Repository**: [yaze-server](https://github.com/scawful/yaze-server)
- **Agent Editor Docs**: `src/app/editor/agent/README.md`
- **Integration Guide**: `docs/z3ed/YAZE_SERVER_V2_INTEGRATION.md`
## 10. Roadmap & Implementation Status
@@ -301,15 +794,26 @@ The AI response will appear in your chat history and can reference specific deta
- **AI Backends**: Both Ollama (local) and Gemini (cloud) are operational.
- **Conversational Agent**: The agent service, tool dispatcher (with 5 read-only tools), TUI/simple chat interfaces, and ImGui editor chat widget with persistent history.
- **GUI Test Harness**: A comprehensive GUI testing platform with introspection, widget discovery, recording/replay, and CI integration support.
-- **Collaborative Sessions**: Local filesystem-based collaborative editing with shared chat history.
-- **Multimodal Vision**: Gemini vision API integration for analyzing ROM editor screenshots.
- **Collaborative Sessions**:
- Local filesystem-based collaborative editing with shared chat history
- Network WebSocket-based collaboration via yaze-server v2.0
- Dual-mode support (Local/Network) with seamless switching
- **Multimodal Vision**: Gemini vision API integration with multiple capture modes (Full Window, Active Editor, Specific Window).
- **yaze-server v2.0**: Production-ready Node.js WebSocket server with:
- ROM synchronization with diff broadcasting
- Multimodal snapshot sharing
- Collaborative proposal management
- AI agent integration and query routing
- Health monitoring and metrics endpoints
- Rate limiting and security features
### 🚧 Active & Next Steps
1. **Live LLM Testing (1-2h)**: Verify function calling with real models (Ollama/Gemini).
2. **Expand Tool Coverage (8-10h)**: Add new read-only tools for inspecting dialogue, sprites, and regions.
-3. **Network-Based Collaboration**: Upgrade the filesystem-based collaboration to support remote connections via WebSockets or gRPC.
-4. **Windows Cross-Platform Testing (8-10h)**: Validate `z3ed` and the test harness on Windows.
3. **Full WebSocket Protocol (2-3 days)**: Upgrade from HTTP polling to true WebSocket frames using ixwebsocket or websocketpp.
4. **Collaboration UI Enhancements (1 day)**: Add UI elements for ROM sync, snapshot sharing, and proposal management in the Agent Chat widget.
5. **Windows Cross-Platform Testing (8-10h)**: Validate `z3ed` and the test harness on Windows.
## 11. Troubleshooting


@@ -0,0 +1,286 @@
# Agent Editor Module
This directory contains all agent and network collaboration functionality for yaze and [yaze-server](https://github.com/scawful/yaze-server).
## Overview
The Agent Editor module provides AI-powered assistance and collaborative editing features for ROM hacking projects. It integrates conversational AI agents, local and network-based collaboration, and multimodal (vision) capabilities.
## Architecture
### Core Components
#### AgentEditor (`agent_editor.h/cc`)
The main manager class that coordinates all agent-related functionality:
- Manages the chat widget lifecycle
- Coordinates local and network collaboration modes
- Provides high-level API for session management
- Handles multimodal callbacks (screenshot capture, Gemini integration)
**Key Features:**
- Unified interface for all agent functionality
- Mode switching between local and network collaboration
- ROM context management for agent queries
- Integration with toast notifications and proposal drawer
#### AgentChatWidget (`agent_chat_widget.h/cc`)
ImGui-based chat interface for interacting with AI agents:
- Real-time conversation with AI assistant
- Message history with persistence
- Proposal preview and quick actions
- Collaboration panel with session controls
- Multimodal panel for screenshot capture and Gemini queries
**Features:**
- Split-panel layout (session details + chat history)
- Auto-scrolling chat with timestamps
- JSON response formatting
- Table data visualization
- Proposal metadata display
#### AgentChatHistoryCodec (`agent_chat_history_codec.h/cc`)
Serialization/deserialization for chat history:
- JSON-based persistence (when built with `YAZE_WITH_JSON`)
- Graceful degradation when JSON support unavailable
- Saves collaboration state, multimodal state, and full chat history
- Shared history support for collaborative sessions
### Collaboration Coordinators
#### AgentCollaborationCoordinator (`agent_collaboration_coordinator.h/cc`)
Local filesystem-based collaboration:
- Creates session files in `~/.yaze/agent/sessions/`
- Generates shareable session codes
- Participant tracking via file system
- Polling-based synchronization
**Use Case:** Same-machine collaboration or cloud-folder syncing (Dropbox, iCloud)
#### NetworkCollaborationCoordinator (`network_collaboration_coordinator.h/cc`)
WebSocket-based network collaboration (requires `YAZE_WITH_GRPC` and `YAZE_WITH_JSON`):
- Real-time connection to collaboration server
- Message broadcasting to all session participants
- Live participant updates
- Session management (host/join/leave)
**Advanced Features (v2.0):**
- **ROM Synchronization** - Share ROM edits and diffs across all participants
- **Multimodal Snapshot Sharing** - Share screenshots and images with session members
- **Proposal Management** - Share and track AI-generated proposals with status updates
- **AI Agent Integration** - Route queries to AI agents for ROM analysis
**Use Case:** Remote collaboration across networks
**Server:** See `yaze-server` repository for the Node.js WebSocket server v2.0
## Usage
### Initialization
```cpp
// In EditorManager or main application:
agent_editor_.Initialize(&toast_manager_, &proposal_drawer_);
// Set up ROM context
agent_editor_.SetRomContext(current_rom_);
// Optional: Configure multimodal callbacks
AgentChatWidget::MultimodalCallbacks callbacks;
callbacks.capture_snapshot = [](std::filesystem::path* out) { /* ... */ };
callbacks.send_to_gemini = [](const std::filesystem::path& img, const std::string& prompt) { /* ... */ };
agent_editor_.GetChatWidget()->SetMultimodalCallbacks(callbacks);
```
### Drawing
```cpp
// In main render loop:
agent_editor_.Draw();
```
### Session Management
```cpp
// Host a local session
auto session = agent_editor_.HostSession("My ROM Hack",
AgentEditor::CollaborationMode::kLocal);
// Join a session by code (named differently so both calls can share a scope)
auto joined = agent_editor_.JoinSession("ABC123",
AgentEditor::CollaborationMode::kLocal);
// Leave session
agent_editor_.LeaveSession();
```
### Network Mode (requires YAZE_WITH_GRPC and YAZE_WITH_JSON)
```cpp
// Connect to collaboration server
agent_editor_.ConnectToServer("ws://localhost:8765");
// Host network session with optional ROM hash and AI support
auto session = agent_editor_.HostSession("Network Session",
AgentEditor::CollaborationMode::kNetwork);
// Using advanced features (v2.0)
// Send ROM sync
network_coordinator->SendRomSync(username, base64_diff_data, rom_hash);
// Share snapshot
network_coordinator->SendSnapshot(username, base64_image_data, "overworld_editor");
// Share proposal
network_coordinator->SendProposal(username, proposal_json);
// Send AI query
network_coordinator->SendAIQuery(username, "What enemies are in room 5?");
```
## File Structure
```
agent/
├── README.md (this file)
├── agent_editor.h Main manager class
├── agent_editor.cc
├── agent_chat_widget.h ImGui chat interface
├── agent_chat_widget.cc
├── agent_chat_history_codec.h History serialization
├── agent_chat_history_codec.cc
├── agent_collaboration_coordinator.h Local file-based collaboration
├── agent_collaboration_coordinator.cc
├── network_collaboration_coordinator.h WebSocket collaboration
└── network_collaboration_coordinator.cc
```
## Build Configuration
### Required
- `YAZE_WITH_JSON` - Enables chat history persistence (via nlohmann/json)
### Optional
- `YAZE_WITH_GRPC` - Enables all agent features including network collaboration
- Without this flag, agent functionality is completely disabled
## Data Files
### Local Storage
- **Chat History:** `~/.yaze/agent/chat_history.json`
- **Shared Sessions:** `~/.yaze/agent/sessions/<session_id>_history.json`
- **Session Metadata:** `~/.yaze/agent/sessions/<code>.session`
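The per-session file locations can be derived from the home directory and the session code. This is a sketch only: `SessionPaths` and `MakeSessionPaths` are hypothetical names, and it assumes the history file is keyed by the same 6-character code as the metadata file (the list above writes `<session_id>` for one and `<code>` for the other).

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical pair of files belonging to one local session.
struct SessionPaths {
  fs::path history;   // <code>_history.json
  fs::path metadata;  // <code>.session
};

// Builds both paths under <home>/.yaze/agent/sessions/.
// Assumption: both files are keyed by the same session code.
SessionPaths MakeSessionPaths(const fs::path& home, const std::string& code) {
  const fs::path dir = home / ".yaze" / "agent" / "sessions";
  return {dir / (code + "_history.json"), dir / (code + ".session")};
}
```

Keeping path construction in one place makes it easier to redirect the sessions directory, for example when it is a symlink into a cloud-synced folder.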
### Session File Format
```json
{
"session_name": "My ROM Hack",
"session_code": "ABC123",
"host": "username",
"participants": ["username", "friend1", "friend2"]
}
```
## Integration with EditorManager
The `AgentEditor` is instantiated as a member of `EditorManager` and integrated into the main UI:
```cpp
class EditorManager {
#ifdef YAZE_WITH_GRPC
AgentEditor agent_editor_;
#endif
};
```
Menu integration:
```cpp
{ICON_MD_CHAT " Agent Chat", "",
[this]() { agent_editor_.ToggleChat(); },
[this]() { return agent_editor_.IsChatActive(); }}
```
## Dependencies
### Internal
- `cli::agent::ConversationalAgentService` - AI agent backend
- `cli::GeminiAIService` - Gemini API for multimodal queries
- `yaze::test::*` - Screenshot capture utilities
- `ProposalDrawer` - Displays agent proposals
- `ToastManager` - User notifications
### External (when enabled)
- nlohmann/json - Chat history serialization
- httplib - WebSocket client implementation
- Abseil - Status handling, time utilities
## Advanced Features (v2.0)
The network collaboration coordinator now supports:
### ROM Synchronization
Share ROM edits in real-time:
- Send diff data (base64 encoded) to all participants
- Automatic ROM hash tracking
- Size limits enforced by server (5MB max)
### Multimodal Snapshot Sharing
Share screenshots and images:
- Capture and share specific editor views
- Support for multiple snapshot types (overworld, dungeon, sprite, etc.)
- Base64 encoding so binary image data can be embedded in JSON messages (adds roughly 33% to the payload size)
- Size limits enforced by server (10MB max)
### Proposal Management
Collaborative proposal workflow:
- Share AI-generated proposals with all participants
- Track proposal status (pending, accepted, rejected)
- Real-time status updates broadcast to all users
### AI Agent Integration
Server-side AI routing:
- Send queries through the collaboration server
- Shared AI responses visible to all participants
- Query history tracked in server database
### Health Monitoring
Server health and metrics:
- `/health` endpoint for server status
- `/metrics` endpoint for usage statistics
- Graceful shutdown notifications
## Future Enhancements
1. **Voice chat integration** - Audio channels for remote collaboration
2. **Shared cursor/viewport** - See what collaborators are editing
3. **Conflict resolution UI** - Handle concurrent edits gracefully
4. **Session replay** - Record and playback editing sessions
5. **Agent memory** - Persistent context across sessions
6. **Real-time cursor tracking** - See where collaborators are working
## Server Protocol
The server uses JSON WebSocket messages. Key message types:
### Client → Server
- `host_session` - Create new session (v2.0: supports `rom_hash`, `ai_enabled`)
- `join_session` - Join existing session
- `leave_session` - Leave current session
- `chat_message` - Send message (v2.0: supports `message_type`, `metadata`)
- `rom_sync` - **New in v2.0** - Share ROM diff
- `snapshot_share` - **New in v2.0** - Share screenshot/image
- `proposal_share` - **New in v2.0** - Share proposal
- `proposal_update` - **New in v2.0** - Update proposal status
- `ai_query` - **New in v2.0** - Query AI agent
### Server → Client
- `session_hosted` - Session created confirmation
- `session_joined` - Joined session confirmation
- `chat_message` - Broadcast message
- `participant_joined` / `participant_left` - Participant changes
- `rom_sync` - **New in v2.0** - ROM diff broadcast
- `snapshot_shared` - **New in v2.0** - Snapshot broadcast
- `proposal_shared` - **New in v2.0** - Proposal broadcast
- `proposal_updated` - **New in v2.0** - Proposal status update
- `ai_response` - **New in v2.0** - AI agent response
- `server_shutdown` - **New in v2.0** - Server shutting down
- `error` - Error message


@@ -143,20 +143,29 @@ void NetworkCollaborationCoordinator::DisconnectWebSocket() {
absl::StatusOr<NetworkCollaborationCoordinator::SessionInfo>
NetworkCollaborationCoordinator::HostSession(const std::string& session_name,
-                                             const std::string& username) {
                                             const std::string& username,
                                             const std::string& rom_hash,
                                             bool ai_enabled) {
  if (!connected_) {
    return absl::FailedPreconditionError("Not connected to collaboration server");
  }
  username_ = username;
-  // Build host_session message
  // Build host_session message with v2.0 fields
  Json payload = {
    {"session_name", session_name},
    {"username", username},
    {"ai_enabled", ai_enabled}
  };
  if (!rom_hash.empty()) {
    payload["rom_hash"] = rom_hash;
  }
  Json message = {
    {"type", "host_session"},
-    {"payload", {
-      {"session_name", session_name},
-      {"username", username}
-    }}
    {"payload", payload}
  };
  SendWebSocketMessage("host_session", message["payload"].dump());
@@ -221,20 +230,122 @@ absl::Status NetworkCollaborationCoordinator::LeaveSession() {
}
absl::Status NetworkCollaborationCoordinator::SendMessage(
-    const std::string& sender, const std::string& message) {
    const std::string& sender, const std::string& message,
    const std::string& message_type, const std::string& metadata) {
  if (!in_session_) {
    return absl::FailedPreconditionError("Not in a session");
  }
  Json payload = {
    {"sender", sender},
    {"message", message},
    {"message_type", message_type}
  };
  if (!metadata.empty()) {
    payload["metadata"] = Json::parse(metadata);
  }
  Json msg = {
    {"type", "chat_message"},
    {"payload", payload}
  };
  SendWebSocketMessage("chat_message", msg["payload"].dump());
  return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendRomSync(
    const std::string& sender, const std::string& diff_data,
    const std::string& rom_hash) {
  if (!in_session_) {
    return absl::FailedPreconditionError("Not in a session");
  }
  Json msg = {
-    {"type", "chat_message"},
    {"type", "rom_sync"},
    {"payload", {
      {"sender", sender},
-      {"message", message}
      {"diff_data", diff_data},
      {"rom_hash", rom_hash}
    }}
  };
-  SendWebSocketMessage("chat_message", msg["payload"].dump());
  SendWebSocketMessage("rom_sync", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendSnapshot(
const std::string& sender, const std::string& snapshot_data,
const std::string& snapshot_type) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "snapshot_share"},
{"payload", {
{"sender", sender},
{"snapshot_data", snapshot_data},
{"snapshot_type", snapshot_type}
}}
};
SendWebSocketMessage("snapshot_share", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendProposal(
const std::string& sender, const std::string& proposal_data_json) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "proposal_share"},
{"payload", {
{"sender", sender},
{"proposal_data", Json::parse(proposal_data_json)}
}}
};
SendWebSocketMessage("proposal_share", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::UpdateProposal(
const std::string& proposal_id, const std::string& status) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "proposal_update"},
{"payload", {
{"proposal_id", proposal_id},
{"status", status}
}}
};
SendWebSocketMessage("proposal_update", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendAIQuery(
const std::string& username, const std::string& query) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "ai_query"},
{"payload", {
{"username", username},
{"query", query}
}}
};
SendWebSocketMessage("ai_query", msg["payload"].dump());
return absl::OkStatus(); return absl::OkStatus();
} }
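Reviewer note: the event names sent by the methods in this hunk are not always the names the message handler later in this diff listens for (for example, `snapshot_share` goes out but `snapshot_shared` is handled). The sketch below records that pairing in one place. It assumes the server answers each outbound event with the similarly named inbound event, which the diff suggests but does not state; `ExpectedInboundType` is a hypothetical helper and is not part of this commit.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical helper (not in the commit): maps an event name the client
// sends to the event name the handler in this diff checks for.
// The pairing itself is an assumption inferred from the two lists of names.
std::optional<std::string> ExpectedInboundType(std::string_view outbound) {
  static const std::unordered_map<std::string_view, std::string_view> kPairs = {
      {"rom_sync", "rom_sync"},
      {"snapshot_share", "snapshot_shared"},
      {"proposal_share", "proposal_shared"},
      {"proposal_update", "proposal_updated"},
      {"ai_query", "ai_response"},
  };
  auto it = kPairs.find(outbound);
  if (it == kPairs.end()) {
    return std::nullopt;  // Unknown outbound type: no expectation recorded.
  }
  return std::string(it->second);
}
```

A table like this could be useful in tests that check the client and server agree on names, since a mismatch would otherwise fail silently.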
@@ -258,6 +369,31 @@ void NetworkCollaborationCoordinator::SetErrorCallback(ErrorCallback callback) {
   error_callback_ = std::move(callback);
 }
+void NetworkCollaborationCoordinator::SetRomSyncCallback(RomSyncCallback callback) {
+  absl::MutexLock lock(&mutex_);
+  rom_sync_callback_ = std::move(callback);
+}
+void NetworkCollaborationCoordinator::SetSnapshotCallback(SnapshotCallback callback) {
+  absl::MutexLock lock(&mutex_);
+  snapshot_callback_ = std::move(callback);
+}
+void NetworkCollaborationCoordinator::SetProposalCallback(ProposalCallback callback) {
+  absl::MutexLock lock(&mutex_);
+  proposal_callback_ = std::move(callback);
+}
+void NetworkCollaborationCoordinator::SetProposalUpdateCallback(ProposalUpdateCallback callback) {
+  absl::MutexLock lock(&mutex_);
+  proposal_update_callback_ = std::move(callback);
+}
+void NetworkCollaborationCoordinator::SetAIResponseCallback(AIResponseCallback callback) {
+  absl::MutexLock lock(&mutex_);
+  ai_response_callback_ = std::move(callback);
+}
 void NetworkCollaborationCoordinator::SendWebSocketMessage(
     const std::string& type, const std::string& payload_json) {
   if (!ws_client_ || !connected_) {
@@ -310,11 +446,87 @@ void NetworkCollaborationCoordinator::HandleWebSocketMessage(
     msg.sender = payload["sender"];
     msg.message = payload["message"];
     msg.timestamp = payload["timestamp"];
+    msg.message_type = payload.value("message_type", "chat");
+    if (payload.contains("metadata") && !payload["metadata"].is_null()) {
+      msg.metadata = payload["metadata"].dump();
+    }
     absl::MutexLock lock(&mutex_);
     if (message_callback_) {
       message_callback_(msg);
     }
+  } else if (type == "rom_sync") {
+    Json payload = message["payload"];
+    RomSync sync;
+    sync.sync_id = payload["sync_id"];
+    sync.sender = payload["sender"];
+    sync.diff_data = payload["diff_data"];
+    sync.rom_hash = payload["rom_hash"];
+    sync.timestamp = payload["timestamp"];
+    absl::MutexLock lock(&mutex_);
+    if (rom_sync_callback_) {
+      rom_sync_callback_(sync);
+    }
+  } else if (type == "snapshot_shared") {
+    Json payload = message["payload"];
+    Snapshot snapshot;
+    snapshot.snapshot_id = payload["snapshot_id"];
+    snapshot.sender = payload["sender"];
+    snapshot.snapshot_data = payload["snapshot_data"];
+    snapshot.snapshot_type = payload["snapshot_type"];
+    snapshot.timestamp = payload["timestamp"];
+    absl::MutexLock lock(&mutex_);
+    if (snapshot_callback_) {
+      snapshot_callback_(snapshot);
+    }
+  } else if (type == "proposal_shared") {
+    Json payload = message["payload"];
+    Proposal proposal;
+    proposal.proposal_id = payload["proposal_id"];
+    proposal.sender = payload["sender"];
+    proposal.proposal_data = payload["proposal_data"].dump();
+    proposal.status = payload["status"];
+    proposal.timestamp = payload["timestamp"];
+    absl::MutexLock lock(&mutex_);
+    if (proposal_callback_) {
+      proposal_callback_(proposal);
+    }
+  } else if (type == "proposal_updated") {
+    Json payload = message["payload"];
+    std::string proposal_id = payload["proposal_id"];
+    std::string status = payload["status"];
+    absl::MutexLock lock(&mutex_);
+    if (proposal_update_callback_) {
+      proposal_update_callback_(proposal_id, status);
+    }
+  } else if (type == "ai_response") {
+    Json payload = message["payload"];
+    AIResponse response;
+    response.query_id = payload["query_id"];
+    response.username = payload["username"];
+    response.query = payload["query"];
+    response.response = payload["response"];
+    response.timestamp = payload["timestamp"];
+    absl::MutexLock lock(&mutex_);
+    if (ai_response_callback_) {
+      ai_response_callback_(response);
+    }
+  } else if (type == "server_shutdown") {
+    Json payload = message["payload"];
+    std::string error = "Server shutdown: " + payload["message"].get<std::string>();
+    absl::MutexLock lock(&mutex_);
+    if (error_callback_) {
+      error_callback_(error);
+    }
+    // Disconnect
+    connected_ = false;
   } else if (type == "participant_joined" || type == "participant_left") {
     Json payload = message["payload"];
     if (payload.contains("participants")) {
@@ -361,7 +573,8 @@ NetworkCollaborationCoordinator::NetworkCollaborationCoordinator(
 NetworkCollaborationCoordinator::~NetworkCollaborationCoordinator() = default;
 absl::StatusOr<NetworkCollaborationCoordinator::SessionInfo>
-NetworkCollaborationCoordinator::HostSession(const std::string&, const std::string&) {
+NetworkCollaborationCoordinator::HostSession(const std::string&, const std::string&,
+                                             const std::string&, bool) {
   return absl::UnimplementedError("Network collaboration requires JSON support");
 }
@@ -375,6 +588,31 @@ absl::Status NetworkCollaborationCoordinator::LeaveSession() {
 }
 absl::Status NetworkCollaborationCoordinator::SendMessage(
+    const std::string&, const std::string&, const std::string&, const std::string&) {
+  return absl::UnimplementedError("Network collaboration requires JSON support");
+}
+absl::Status NetworkCollaborationCoordinator::SendRomSync(
+    const std::string&, const std::string&, const std::string&) {
+  return absl::UnimplementedError("Network collaboration requires JSON support");
+}
+absl::Status NetworkCollaborationCoordinator::SendSnapshot(
+    const std::string&, const std::string&, const std::string&) {
+  return absl::UnimplementedError("Network collaboration requires JSON support");
+}
+absl::Status NetworkCollaborationCoordinator::SendProposal(
+    const std::string&, const std::string&) {
+  return absl::UnimplementedError("Network collaboration requires JSON support");
+}
+absl::Status NetworkCollaborationCoordinator::UpdateProposal(
+    const std::string&, const std::string&) {
+  return absl::UnimplementedError("Network collaboration requires JSON support");
+}
+absl::Status NetworkCollaborationCoordinator::SendAIQuery(
     const std::string&, const std::string&) {
   return absl::UnimplementedError("Network collaboration requires JSON support");
 }
@@ -384,6 +622,11 @@ bool NetworkCollaborationCoordinator::IsConnected() const { return false; }
 void NetworkCollaborationCoordinator::SetMessageCallback(MessageCallback) {}
 void NetworkCollaborationCoordinator::SetParticipantCallback(ParticipantCallback) {}
 void NetworkCollaborationCoordinator::SetErrorCallback(ErrorCallback) {}
+void NetworkCollaborationCoordinator::SetRomSyncCallback(RomSyncCallback) {}
+void NetworkCollaborationCoordinator::SetSnapshotCallback(SnapshotCallback) {}
+void NetworkCollaborationCoordinator::SetProposalCallback(ProposalCallback) {}
+void NetworkCollaborationCoordinator::SetProposalUpdateCallback(ProposalUpdateCallback) {}
+void NetworkCollaborationCoordinator::SetAIResponseCallback(AIResponseCallback) {}
 void NetworkCollaborationCoordinator::ConnectWebSocket() {}
 void NetworkCollaborationCoordinator::DisconnectWebSocket() {}
 void NetworkCollaborationCoordinator::SendWebSocketMessage(const std::string&, const std::string&) {}
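Reviewer note: in the JSON-enabled half of this file, each handler invokes its callback while still holding `mutex_`, and every `Set...Callback` method takes the same mutex. `absl::Mutex` is not reentrant, so a callback that re-registers or clears a callback from inside itself would try to lock a mutex its own thread already holds. One common alternative is to copy the callback under the lock and call it after the lock is released. The sketch below illustrates that pattern with `std::mutex` as a stand-in; `EventDispatcher` and `DispatchError` are hypothetical names, not code from this commit.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the coordinator's callback plumbing.
class EventDispatcher {
 public:
  using ErrorCallback = std::function<void(const std::string&)>;

  void SetErrorCallback(ErrorCallback callback) {
    std::lock_guard<std::mutex> lock(mutex_);
    error_callback_ = std::move(callback);
  }

  void DispatchError(const std::string& error) {
    ErrorCallback callback;
    {
      // Hold the lock only long enough to copy the callback.
      std::lock_guard<std::mutex> lock(mutex_);
      callback = error_callback_;
    }
    // Invoke outside the lock so the callback may call SetErrorCallback.
    if (callback) {
      callback(error);
    }
  }

 private:
  std::mutex mutex_;
  ErrorCallback error_callback_;
};
```

The trade-off is that a callback replaced concurrently may still run once more from the copy, which callers would need to tolerate.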


@@ -36,25 +36,87 @@ class NetworkCollaborationCoordinator {
     std::string sender;
     std::string message;
     int64_t timestamp;
+    std::string message_type; // "chat", "system", "ai"
+    std::string metadata; // JSON metadata
+  };
+  struct RomSync {
+    std::string sync_id;
+    std::string sender;
+    std::string diff_data; // Base64 encoded
+    std::string rom_hash;
+    int64_t timestamp;
+  };
+  struct Snapshot {
+    std::string snapshot_id;
+    std::string sender;
+    std::string snapshot_data; // Base64 encoded
+    std::string snapshot_type;
+    int64_t timestamp;
+  };
+  struct Proposal {
+    std::string proposal_id;
+    std::string sender;
+    std::string proposal_data; // JSON data
+    std::string status; // "pending", "accepted", "rejected"
+    int64_t timestamp;
+  };
+  struct AIResponse {
+    std::string query_id;
+    std::string username;
+    std::string query;
+    std::string response;
+    int64_t timestamp;
   };
   // Callbacks for handling incoming events
   using MessageCallback = std::function<void(const ChatMessage&)>;
   using ParticipantCallback = std::function<void(const std::vector<std::string>&)>;
   using ErrorCallback = std::function<void(const std::string&)>;
+  using RomSyncCallback = std::function<void(const RomSync&)>;
+  using SnapshotCallback = std::function<void(const Snapshot&)>;
+  using ProposalCallback = std::function<void(const Proposal&)>;
+  using ProposalUpdateCallback = std::function<void(const std::string&, const std::string&)>;
+  using AIResponseCallback = std::function<void(const AIResponse&)>;
   explicit NetworkCollaborationCoordinator(const std::string& server_url);
   ~NetworkCollaborationCoordinator();
   // Session management
   absl::StatusOr<SessionInfo> HostSession(const std::string& session_name,
-                                          const std::string& username);
+                                          const std::string& username,
+                                          const std::string& rom_hash = "",
+                                          bool ai_enabled = true);
   absl::StatusOr<SessionInfo> JoinSession(const std::string& session_code,
                                           const std::string& username);
   absl::Status LeaveSession();
-  // Send chat message to current session
-  absl::Status SendMessage(const std::string& sender, const std::string& message);
+  // Communication methods
+  absl::Status SendMessage(const std::string& sender,
+                           const std::string& message,
+                           const std::string& message_type = "chat",
+                           const std::string& metadata = "");
+  // Advanced features
+  absl::Status SendRomSync(const std::string& sender,
+                           const std::string& diff_data,
+                           const std::string& rom_hash);
+  absl::Status SendSnapshot(const std::string& sender,
+                            const std::string& snapshot_data,
+                            const std::string& snapshot_type);
+  absl::Status SendProposal(const std::string& sender,
+                            const std::string& proposal_data_json);
+  absl::Status UpdateProposal(const std::string& proposal_id,
+                              const std::string& status);
+  absl::Status SendAIQuery(const std::string& username,
+                           const std::string& query);
   // Connection status
   bool IsConnected() const;
@@ -66,6 +128,11 @@ class NetworkCollaborationCoordinator {
   void SetMessageCallback(MessageCallback callback);
   void SetParticipantCallback(ParticipantCallback callback);
   void SetErrorCallback(ErrorCallback callback);
+  void SetRomSyncCallback(RomSyncCallback callback);
+  void SetSnapshotCallback(SnapshotCallback callback);
+  void SetProposalCallback(ProposalCallback callback);
+  void SetProposalUpdateCallback(ProposalUpdateCallback callback);
+  void SetAIResponseCallback(AIResponseCallback callback);
  private:
   void ConnectWebSocket();
@@ -90,6 +157,11 @@ class NetworkCollaborationCoordinator {
   MessageCallback message_callback_ ABSL_GUARDED_BY(mutex_);
   ParticipantCallback participant_callback_ ABSL_GUARDED_BY(mutex_);
   ErrorCallback error_callback_ ABSL_GUARDED_BY(mutex_);
+  RomSyncCallback rom_sync_callback_ ABSL_GUARDED_BY(mutex_);
+  SnapshotCallback snapshot_callback_ ABSL_GUARDED_BY(mutex_);
+  ProposalCallback proposal_callback_ ABSL_GUARDED_BY(mutex_);
+  ProposalUpdateCallback proposal_update_callback_ ABSL_GUARDED_BY(mutex_);
+  AIResponseCallback ai_response_callback_ ABSL_GUARDED_BY(mutex_);
 };
 }  // namespace editor
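Reviewer note: the header documents three `Proposal::status` strings ("pending", "accepted", "rejected"), while `ProposalUpdateCallback` receives only a `(proposal_id, status)` pair. A consumer therefore has to keep its own table of known proposals. The sketch below shows one way that might look; `ProposalTracker` and its methods are hypothetical, and rejecting unknown statuses or unknown ids is an assumption about desired behavior rather than something this commit specifies.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical consumer-side bookkeeping for proposal events.
class ProposalTracker {
 public:
  // Intended to be fed from a ProposalCallback (proposal_id and status fields).
  void OnProposal(const std::string& proposal_id, const std::string& status) {
    if (IsKnownStatus(status)) {
      statuses_[proposal_id] = status;
    }
  }

  // Matches the ProposalUpdateCallback argument order: (proposal_id, status).
  // Returns false for an unknown proposal or an undocumented status string.
  bool OnProposalUpdate(const std::string& proposal_id,
                        const std::string& status) {
    auto it = statuses_.find(proposal_id);
    if (it == statuses_.end() || !IsKnownStatus(status)) {
      return false;
    }
    it->second = status;
    return true;
  }

  // Returns an empty string when the proposal has not been seen.
  std::string StatusOf(const std::string& proposal_id) const {
    auto it = statuses_.find(proposal_id);
    return it == statuses_.end() ? std::string() : it->second;
  }

 private:
  static bool IsKnownStatus(const std::string& status) {
    return status == "pending" || status == "accepted" || status == "rejected";
  }

  std::unordered_map<std::string, std::string> statuses_;
};
```

Such a tracker could be wired up by passing a lambda that forwards to `OnProposalUpdate` into `SetProposalUpdateCallback`, whose signature is taken from the header above.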