feat: Expand network collaboration capabilities and enhance agent functionality

- Updated NetworkCollaborationCoordinator to support advanced features including ROM synchronization, snapshot sharing, and proposal management.
- Enhanced message handling for chat, ROM sync, snapshots, proposals, and AI queries with structured payloads.
- Introduced new callback mechanisms for handling incoming events related to ROM sync, snapshots, proposals, and AI responses.
- Improved documentation in the agent directory to reflect new functionalities and usage examples for network collaboration.
- Added comprehensive README for the Agent Editor module detailing architecture, usage, and advanced features.
scawful
2025-10-04 19:57:27 -04:00
parent 588db01df6
commit 3cdeeac5a3
4 changed files with 1178 additions and 73 deletions


@@ -0,0 +1,286 @@
# Agent Editor Module
This directory contains all agent and network collaboration functionality for yaze and [yaze-server](https://github.com/scawful/yaze-server).
## Overview
The Agent Editor module provides AI-powered assistance and collaborative editing features for ROM hacking projects. It integrates conversational AI agents, local and network-based collaboration, and multimodal (vision) capabilities.
## Architecture
### Core Components
#### AgentEditor (`agent_editor.h/cc`)
The main manager class that coordinates all agent-related functionality:
- Manages the chat widget lifecycle
- Coordinates local and network collaboration modes
- Provides high-level API for session management
- Handles multimodal callbacks (screenshot capture, Gemini integration)
**Key Features:**
- Unified interface for all agent functionality
- Mode switching between local and network collaboration
- ROM context management for agent queries
- Integration with toast notifications and proposal drawer
#### AgentChatWidget (`agent_chat_widget.h/cc`)
ImGui-based chat interface for interacting with AI agents:
- Real-time conversation with AI assistant
- Message history with persistence
- Proposal preview and quick actions
- Collaboration panel with session controls
- Multimodal panel for screenshot capture and Gemini queries
**Features:**
- Split-panel layout (session details + chat history)
- Auto-scrolling chat with timestamps
- JSON response formatting
- Table data visualization
- Proposal metadata display
#### AgentChatHistoryCodec (`agent_chat_history_codec.h/cc`)
Serialization/deserialization for chat history:
- JSON-based persistence (when built with `YAZE_WITH_JSON`)
- Graceful degradation when JSON support unavailable
- Saves collaboration state, multimodal state, and full chat history
- Shared history support for collaborative sessions
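The graceful degradation listed above can be sketched with a compile-time guard. This is an assumption about the general pattern, not the codec's actual code: `SaveResult`, `SaveChatHistory`, and the plain write in the JSON branch are hypothetical stand-ins, and only the `YAZE_WITH_JSON` flag comes from this README.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical result type; the real codec may report errors differently.
struct SaveResult {
  bool ok;
  std::string error;
};

// Hypothetical entry point. With YAZE_WITH_JSON defined it writes the
// already-serialized history; without it, it reports why nothing was saved
// instead of failing to compile or crashing.
SaveResult SaveChatHistory(const std::string& path,
                           const std::string& serialized_history) {
  assert(!path.empty());
#ifdef YAZE_WITH_JSON
  std::ofstream out(path, std::ios::trunc);
  if (!out) return {false, "Cannot open " + path};
  out << serialized_history;
  if (!out) return {false, "Write failed for " + path};
  return {true, ""};
#else
  (void)serialized_history;  // Unused when JSON support is compiled out.
  return {false, "Chat history persistence requires JSON support"};
#endif
}
```

Callers can then surface `error` as a notification and keep the chat usable in memory.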
### Collaboration Coordinators
#### AgentCollaborationCoordinator (`agent_collaboration_coordinator.h/cc`)
Local filesystem-based collaboration:
- Creates session files in `~/.yaze/agent/sessions/`
- Generates shareable session codes
- Participant tracking via file system
- Polling-based synchronization
**Use Case:** Same-machine collaboration or cloud-folder syncing (Dropbox, iCloud)
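A minimal sketch of the three ideas above (shareable code, session file path, polling), using only the standard library. The helper names, the 6-character uppercase alphanumeric code format, and the modification-time check are assumptions for illustration; the coordinator's real implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <random>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: builds a short uppercase alphanumeric session code.
std::string GenerateSessionCode(std::mt19937& rng, std::size_t length = 6) {
  assert(length > 0);
  static constexpr char kAlphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  // sizeof includes the trailing NUL, so the last valid index is sizeof - 2.
  std::uniform_int_distribution<std::size_t> pick(0, sizeof(kAlphabet) - 2);
  std::string code;
  code.reserve(length);
  for (std::size_t i = 0; i < length; ++i) code.push_back(kAlphabet[pick(rng)]);
  return code;
}

// Hypothetical helper: maps a code to "<sessions_dir>/<code>.session".
fs::path SessionFilePath(const fs::path& sessions_dir, const std::string& code) {
  return sessions_dir / (code + ".session");
}

// Polling check: returns true when the file was modified since the last call
// and records the new timestamp. Initialize last_seen to
// fs::file_time_type::min() so the first successful poll reports a change.
bool SessionChanged(const fs::path& file, fs::file_time_type& last_seen) {
  std::error_code ec;
  const fs::file_time_type mtime = fs::last_write_time(file, ec);
  if (ec || mtime <= last_seen) return false;  // Missing file or no change.
  last_seen = mtime;
  return true;
}
```

Polling on modification time keeps the sketch portable across synced folders, at the cost of latency bounded by the poll interval.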
#### NetworkCollaborationCoordinator (`network_collaboration_coordinator.h/cc`)
WebSocket-based network collaboration (requires `YAZE_WITH_GRPC` and `YAZE_WITH_JSON`):
- Real-time connection to collaboration server
- Message broadcasting to all session participants
- Live participant updates
- Session management (host/join/leave)
**Advanced Features (v2.0):**
- **ROM Synchronization** - Share ROM edits and diffs across all participants
- **Multimodal Snapshot Sharing** - Share screenshots and images with session members
- **Proposal Management** - Share and track AI-generated proposals with status updates
- **AI Agent Integration** - Route queries to AI agents for ROM analysis
**Use Case:** Remote collaboration across networks
**Server:** See `yaze-server` repository for the Node.js WebSocket server v2.0
## Usage
### Initialization
```cpp
// In EditorManager or main application:
agent_editor_.Initialize(&toast_manager_, &proposal_drawer_);
// Set up ROM context
agent_editor_.SetRomContext(current_rom_);
// Optional: Configure multimodal callbacks
AgentChatWidget::MultimodalCallbacks callbacks;
callbacks.capture_snapshot = [](std::filesystem::path* out) { /* ... */ };
callbacks.send_to_gemini = [](const std::filesystem::path& img, const std::string& prompt) { /* ... */ };
agent_editor_.GetChatWidget()->SetMultimodalCallbacks(callbacks);
```
### Drawing
```cpp
// In main render loop:
agent_editor_.Draw();
```
### Session Management
```cpp
// Host a local session
auto session = agent_editor_.HostSession("My ROM Hack",
AgentEditor::CollaborationMode::kLocal);
// Join a session by code (distinct variable name so the snippet compiles as one block)
auto joined = agent_editor_.JoinSession("ABC123",
AgentEditor::CollaborationMode::kLocal);
// Leave session
agent_editor_.LeaveSession();
```
### Network Mode (requires `YAZE_WITH_GRPC` and `YAZE_WITH_JSON`)
```cpp
// Connect to collaboration server
agent_editor_.ConnectToServer("ws://localhost:8765");
// Host a network session. NetworkCollaborationCoordinator::HostSession also
// accepts an optional ROM hash and an AI flag (default: enabled).
auto session = agent_editor_.HostSession("Network Session",
AgentEditor::CollaborationMode::kNetwork);
// Using advanced features (v2.0)
// Send ROM sync
network_coordinator->SendRomSync(username, base64_diff_data, rom_hash);
// Share snapshot
network_coordinator->SendSnapshot(username, base64_image_data, "overworld_editor");
// Share proposal
network_coordinator->SendProposal(username, proposal_json);
// Send AI query
network_coordinator->SendAIQuery(username, "What enemies are in room 5?");
```
## File Structure
```
agent/
├── README.md (this file)
├── agent_editor.h Main manager class
├── agent_editor.cc
├── agent_chat_widget.h ImGui chat interface
├── agent_chat_widget.cc
├── agent_chat_history_codec.h History serialization
├── agent_chat_history_codec.cc
├── agent_collaboration_coordinator.h Local file-based collaboration
├── agent_collaboration_coordinator.cc
├── network_collaboration_coordinator.h WebSocket collaboration
└── network_collaboration_coordinator.cc
```
## Build Configuration
### Required
- `YAZE_WITH_JSON` - Enables chat history persistence (via nlohmann/json)
### Optional
- `YAZE_WITH_GRPC` - Enables all agent features including network collaboration
- Without this flag, agent functionality is completely disabled
## Data Files
### Local Storage
- **Chat History:** `~/.yaze/agent/chat_history.json`
- **Shared Sessions:** `~/.yaze/agent/sessions/<session_id>_history.json`
- **Session Metadata:** `~/.yaze/agent/sessions/<code>.session`
### Session File Format
```json
{
"session_name": "My ROM Hack",
"session_code": "ABC123",
"host": "username",
"participants": ["username", "friend1", "friend2"]
}
```
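For illustration, the format above can be produced with a small hand-rolled serializer. The module itself relies on nlohmann/json; this dependency-free sketch only exists to make the field layout concrete, and `SessionFile`, `Quote`, and `ToJson` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical in-memory mirror of the session file fields shown above.
struct SessionFile {
  std::string session_name;
  std::string session_code;
  std::string host;
  std::vector<std::string> participants;
};

// Wraps a string in quotes and escapes characters JSON does not allow raw.
std::string Quote(const std::string& in) {
  std::string out = "\"";
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n"; break;
      case '\t': out += "\\t"; break;
      default:
        if (c < 0x20) {  // Remaining control characters use \u00XX.
          char buf[7];
          std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
          out += buf;
        } else {
          out.push_back(static_cast<char>(c));
        }
    }
  }
  out += '"';
  return out;
}

// Serializes the session in the same key order as the example above.
std::string ToJson(const SessionFile& s) {
  assert(!s.session_code.empty());
  std::string out = "{\n";
  out += "  \"session_name\": " + Quote(s.session_name) + ",\n";
  out += "  \"session_code\": " + Quote(s.session_code) + ",\n";
  out += "  \"host\": " + Quote(s.host) + ",\n";
  out += "  \"participants\": [";
  for (std::size_t i = 0; i < s.participants.size(); ++i) {
    if (i > 0) out += ", ";
    out += Quote(s.participants[i]);
  }
  out += "]\n}\n";
  return out;
}
```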
## Integration with EditorManager
The `AgentEditor` is instantiated as a member of `EditorManager` and integrated into the main UI:
```cpp
class EditorManager {
#ifdef YAZE_WITH_GRPC
AgentEditor agent_editor_;
#endif
};
```
Menu integration:
```cpp
{ICON_MD_CHAT " Agent Chat", "",
[this]() { agent_editor_.ToggleChat(); },
[this]() { return agent_editor_.IsChatActive(); }}
```
## Dependencies
### Internal
- `cli::agent::ConversationalAgentService` - AI agent backend
- `cli::GeminiAIService` - Gemini API for multimodal queries
- `yaze::test::*` - Screenshot capture utilities
- `ProposalDrawer` - Displays agent proposals
- `ToastManager` - User notifications
### External (when enabled)
- nlohmann/json - Chat history serialization
- httplib - WebSocket client implementation
- Abseil - Status handling, time utilities
## Advanced Features (v2.0)
The network collaboration coordinator now supports:
### ROM Synchronization
Share ROM edits in real-time:
- Send diff data (base64 encoded) to all participants
- Automatic ROM hash tracking
- Size limits enforced by server (5MB max)
### Multimodal Snapshot Sharing
Share screenshots and images:
- Capture and share specific editor views
- Support for multiple snapshot types (overworld, dungeon, sprite, etc.)
- Base64 encoding so binary image data can be embedded in JSON messages (adds roughly 33% size overhead)
- Size limits enforced by server (10MB max)
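Since the server rejects oversized payloads, a client can check the size before sending. The sketch below is a standard Base64 encoder plus a limit check. The constant names are hypothetical, and it assumes the limits are binary megabytes applied to the encoded string; the server may measure them differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed limits, taken from the 5MB / 10MB figures in this README.
constexpr std::size_t kMaxRomDiffBytes = 5u * 1024u * 1024u;
constexpr std::size_t kMaxSnapshotBytes = 10u * 1024u * 1024u;

// Standard Base64 (RFC 4648 alphabet, '=' padding).
std::string Base64Encode(const std::vector<std::uint8_t>& data) {
  static constexpr char kTable[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(((data.size() + 2) / 3) * 4);
  std::size_t i = 0;
  for (; i + 2 < data.size(); i += 3) {  // Full 3-byte groups.
    const std::uint32_t n = (static_cast<std::uint32_t>(data[i]) << 16) |
                            (static_cast<std::uint32_t>(data[i + 1]) << 8) |
                            static_cast<std::uint32_t>(data[i + 2]);
    out += kTable[(n >> 18) & 63];
    out += kTable[(n >> 12) & 63];
    out += kTable[(n >> 6) & 63];
    out += kTable[n & 63];
  }
  if (i + 1 == data.size()) {  // One trailing byte: two symbols, two pads.
    const std::uint32_t n = static_cast<std::uint32_t>(data[i]) << 16;
    out += kTable[(n >> 18) & 63];
    out += kTable[(n >> 12) & 63];
    out += "==";
  } else if (i + 2 == data.size()) {  // Two trailing bytes: three symbols, one pad.
    const std::uint32_t n = (static_cast<std::uint32_t>(data[i]) << 16) |
                            (static_cast<std::uint32_t>(data[i + 1]) << 8);
    out += kTable[(n >> 18) & 63];
    out += kTable[(n >> 12) & 63];
    out += kTable[(n >> 6) & 63];
    out += '=';
  }
  assert(out.size() == ((data.size() + 2) / 3) * 4);
  return out;
}

// True when the encoded payload is within the given limit.
bool FitsLimit(const std::string& encoded, std::size_t max_bytes) {
  return encoded.size() <= max_bytes;
}
```

A caller would encode the diff or image, test it with `FitsLimit(encoded, kMaxRomDiffBytes)` or `kMaxSnapshotBytes`, and only then pass it to `SendRomSync` or `SendSnapshot`.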
### Proposal Management
Collaborative proposal workflow:
- Share AI-generated proposals with all participants
- Track proposal status (pending, accepted, rejected)
- Real-time status updates broadcast to all users
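The three status values can be modeled as an enum with string conversion, which keeps typos out of `UpdateProposal` calls. This is a sketch only: the enum, the helper names, and especially the rule that only pending proposals may change state are assumptions, since the coordinator in this commit passes the status as a plain string.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

enum class ProposalStatus { kPending, kAccepted, kRejected };

// Converts a wire string to a status; unknown strings yield std::nullopt.
std::optional<ProposalStatus> ParseProposalStatus(std::string_view text) {
  if (text == "pending") return ProposalStatus::kPending;
  if (text == "accepted") return ProposalStatus::kAccepted;
  if (text == "rejected") return ProposalStatus::kRejected;
  return std::nullopt;
}

// Converts a status back to the string used in protocol messages.
std::string_view ToString(ProposalStatus status) {
  switch (status) {
    case ProposalStatus::kPending:  return "pending";
    case ProposalStatus::kAccepted: return "accepted";
    case ProposalStatus::kRejected: return "rejected";
  }
  assert(false && "unknown ProposalStatus");
  return "pending";
}

// Assumed rule: a proposal can only move from pending to a final state.
bool CanTransition(ProposalStatus from, ProposalStatus to) {
  return from == ProposalStatus::kPending && to != ProposalStatus::kPending;
}
```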
### AI Agent Integration
Server-side AI routing:
- Send queries through the collaboration server
- Shared AI responses visible to all participants
- Query history tracked in server database
### Health Monitoring
Server health and metrics:
- `/health` endpoint for server status
- `/metrics` endpoint for usage statistics
- Graceful shutdown notifications
## Future Enhancements
1. **Voice chat integration** - Audio channels for remote collaboration
2. **Shared cursor/viewport** - See in real time where collaborators are working and what they are editing
3. **Conflict resolution UI** - Handle concurrent edits gracefully
4. **Session replay** - Record and playback editing sessions
5. **Agent memory** - Persistent context across sessions
## Server Protocol
The server uses JSON WebSocket messages. Key message types:
### Client → Server
- `host_session` - Create new session (v2.0: supports `rom_hash`, `ai_enabled`)
- `join_session` - Join existing session
- `leave_session` - Leave current session
- `chat_message` - Send message (v2.0: supports `message_type`, `metadata`)
- `rom_sync` - **New in v2.0** - Share ROM diff
- `snapshot_share` - **New in v2.0** - Share screenshot/image
- `proposal_share` - **New in v2.0** - Share proposal
- `proposal_update` - **New in v2.0** - Update proposal status
- `ai_query` - **New in v2.0** - Query AI agent
### Server → Client
- `session_hosted` - Session created confirmation
- `session_joined` - Joined session confirmation
- `chat_message` - Broadcast message
- `participant_joined` / `participant_left` - Participant changes
- `rom_sync` - **New in v2.0** - ROM diff broadcast
- `snapshot_shared` - **New in v2.0** - Snapshot broadcast
- `proposal_shared` - **New in v2.0** - Proposal broadcast
- `proposal_updated` - **New in v2.0** - Proposal status update
- `ai_response` - **New in v2.0** - AI agent response
- `server_shutdown` - **New in v2.0** - Server shutting down
- `error` - Error message
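With this many server-to-client message types, one way to route them is a lookup table keyed by the `type` field. The coordinator in this commit uses an `if`/`else if` chain instead, so the sketch below is an alternative design. `MessageRouter` and its methods are hypothetical, and the payload is passed as an unparsed JSON string to keep the example dependency-free.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical table-driven dispatcher for incoming protocol messages.
class MessageRouter {
 public:
  using Handler = std::function<void(const std::string& payload_json)>;

  // Registers (or replaces) the handler for one message type.
  void On(const std::string& type, Handler handler) {
    assert(handler && "handler must be callable");
    handlers_[type] = std::move(handler);
  }

  // Invokes the handler for `type`; returns false for unknown types so the
  // caller can log or ignore them.
  bool Dispatch(const std::string& type, const std::string& payload_json) const {
    const auto it = handlers_.find(type);
    if (it == handlers_.end()) return false;
    it->second(payload_json);
    return true;
  }

 private:
  std::unordered_map<std::string, Handler> handlers_;
};
```

A client would register one handler per type listed above (for example `rom_sync` or `proposal_updated`) and call `Dispatch` from its receive loop. Unknown types return `false`, which helps when server and client versions differ.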


@@ -143,20 +143,29 @@ void NetworkCollaborationCoordinator::DisconnectWebSocket() {
absl::StatusOr<NetworkCollaborationCoordinator::SessionInfo>
NetworkCollaborationCoordinator::HostSession(const std::string& session_name,
const std::string& username) {
const std::string& username,
const std::string& rom_hash,
bool ai_enabled) {
if (!connected_) {
return absl::FailedPreconditionError("Not connected to collaboration server");
}
username_ = username;
// Build host_session message
// Build host_session message with v2.0 fields
Json payload = {
{"session_name", session_name},
{"username", username},
{"ai_enabled", ai_enabled}
};
if (!rom_hash.empty()) {
payload["rom_hash"] = rom_hash;
}
Json message = {
{"type", "host_session"},
{"payload", {
{"session_name", session_name},
{"username", username}
}}
{"payload", payload}
};
SendWebSocketMessage("host_session", message["payload"].dump());
@@ -221,20 +230,122 @@ absl::Status NetworkCollaborationCoordinator::LeaveSession() {
}
absl::Status NetworkCollaborationCoordinator::SendMessage(
const std::string& sender, const std::string& message) {
const std::string& sender, const std::string& message,
const std::string& message_type, const std::string& metadata) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json payload = {
{"sender", sender},
{"message", message},
{"message_type", message_type}
};
if (!metadata.empty()) {
payload["metadata"] = Json::parse(metadata);
}
Json msg = {
{"type", "chat_message"},
{"payload", payload}
};
SendWebSocketMessage("chat_message", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendRomSync(
const std::string& sender, const std::string& diff_data,
const std::string& rom_hash) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "chat_message"},
{"type", "rom_sync"},
{"payload", {
{"sender", sender},
{"message", message}
{"diff_data", diff_data},
{"rom_hash", rom_hash}
}}
};
SendWebSocketMessage("chat_message", msg["payload"].dump());
SendWebSocketMessage("rom_sync", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendSnapshot(
const std::string& sender, const std::string& snapshot_data,
const std::string& snapshot_type) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "snapshot_share"},
{"payload", {
{"sender", sender},
{"snapshot_data", snapshot_data},
{"snapshot_type", snapshot_type}
}}
};
SendWebSocketMessage("snapshot_share", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendProposal(
const std::string& sender, const std::string& proposal_data_json) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "proposal_share"},
{"payload", {
{"sender", sender},
{"proposal_data", Json::parse(proposal_data_json)}
}}
};
SendWebSocketMessage("proposal_share", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::UpdateProposal(
const std::string& proposal_id, const std::string& status) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "proposal_update"},
{"payload", {
{"proposal_id", proposal_id},
{"status", status}
}}
};
SendWebSocketMessage("proposal_update", msg["payload"].dump());
return absl::OkStatus();
}
absl::Status NetworkCollaborationCoordinator::SendAIQuery(
const std::string& username, const std::string& query) {
if (!in_session_) {
return absl::FailedPreconditionError("Not in a session");
}
Json msg = {
{"type", "ai_query"},
{"payload", {
{"username", username},
{"query", query}
}}
};
SendWebSocketMessage("ai_query", msg["payload"].dump());
return absl::OkStatus();
}
@@ -258,6 +369,31 @@ void NetworkCollaborationCoordinator::SetErrorCallback(ErrorCallback callback) {
error_callback_ = std::move(callback);
}
void NetworkCollaborationCoordinator::SetRomSyncCallback(RomSyncCallback callback) {
absl::MutexLock lock(&mutex_);
rom_sync_callback_ = std::move(callback);
}
void NetworkCollaborationCoordinator::SetSnapshotCallback(SnapshotCallback callback) {
absl::MutexLock lock(&mutex_);
snapshot_callback_ = std::move(callback);
}
void NetworkCollaborationCoordinator::SetProposalCallback(ProposalCallback callback) {
absl::MutexLock lock(&mutex_);
proposal_callback_ = std::move(callback);
}
void NetworkCollaborationCoordinator::SetProposalUpdateCallback(ProposalUpdateCallback callback) {
absl::MutexLock lock(&mutex_);
proposal_update_callback_ = std::move(callback);
}
void NetworkCollaborationCoordinator::SetAIResponseCallback(AIResponseCallback callback) {
absl::MutexLock lock(&mutex_);
ai_response_callback_ = std::move(callback);
}
void NetworkCollaborationCoordinator::SendWebSocketMessage(
const std::string& type, const std::string& payload_json) {
if (!ws_client_ || !connected_) {
@@ -310,11 +446,87 @@ void NetworkCollaborationCoordinator::HandleWebSocketMessage(
msg.sender = payload["sender"];
msg.message = payload["message"];
msg.timestamp = payload["timestamp"];
msg.message_type = payload.value("message_type", "chat");
if (payload.contains("metadata") && !payload["metadata"].is_null()) {
msg.metadata = payload["metadata"].dump();
}
absl::MutexLock lock(&mutex_);
if (message_callback_) {
message_callback_(msg);
}
} else if (type == "rom_sync") {
Json payload = message["payload"];
RomSync sync;
sync.sync_id = payload["sync_id"];
sync.sender = payload["sender"];
sync.diff_data = payload["diff_data"];
sync.rom_hash = payload["rom_hash"];
sync.timestamp = payload["timestamp"];
absl::MutexLock lock(&mutex_);
if (rom_sync_callback_) {
rom_sync_callback_(sync);
}
} else if (type == "snapshot_shared") {
Json payload = message["payload"];
Snapshot snapshot;
snapshot.snapshot_id = payload["snapshot_id"];
snapshot.sender = payload["sender"];
snapshot.snapshot_data = payload["snapshot_data"];
snapshot.snapshot_type = payload["snapshot_type"];
snapshot.timestamp = payload["timestamp"];
absl::MutexLock lock(&mutex_);
if (snapshot_callback_) {
snapshot_callback_(snapshot);
}
} else if (type == "proposal_shared") {
Json payload = message["payload"];
Proposal proposal;
proposal.proposal_id = payload["proposal_id"];
proposal.sender = payload["sender"];
proposal.proposal_data = payload["proposal_data"].dump();
proposal.status = payload["status"];
proposal.timestamp = payload["timestamp"];
absl::MutexLock lock(&mutex_);
if (proposal_callback_) {
proposal_callback_(proposal);
}
} else if (type == "proposal_updated") {
Json payload = message["payload"];
std::string proposal_id = payload["proposal_id"];
std::string status = payload["status"];
absl::MutexLock lock(&mutex_);
if (proposal_update_callback_) {
proposal_update_callback_(proposal_id, status);
}
} else if (type == "ai_response") {
Json payload = message["payload"];
AIResponse response;
response.query_id = payload["query_id"];
response.username = payload["username"];
response.query = payload["query"];
response.response = payload["response"];
response.timestamp = payload["timestamp"];
absl::MutexLock lock(&mutex_);
if (ai_response_callback_) {
ai_response_callback_(response);
}
} else if (type == "server_shutdown") {
Json payload = message["payload"];
std::string error = "Server shutdown: " + payload["message"].get<std::string>();
absl::MutexLock lock(&mutex_);
if (error_callback_) {
error_callback_(error);
}
// Disconnect
connected_ = false;
} else if (type == "participant_joined" || type == "participant_left") {
Json payload = message["payload"];
if (payload.contains("participants")) {
@@ -361,7 +573,8 @@ NetworkCollaborationCoordinator::NetworkCollaborationCoordinator(
NetworkCollaborationCoordinator::~NetworkCollaborationCoordinator() = default;
absl::StatusOr<NetworkCollaborationCoordinator::SessionInfo>
NetworkCollaborationCoordinator::HostSession(const std::string&, const std::string&) {
NetworkCollaborationCoordinator::HostSession(const std::string&, const std::string&,
const std::string&, bool) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
@@ -375,6 +588,31 @@ absl::Status NetworkCollaborationCoordinator::LeaveSession() {
}
absl::Status NetworkCollaborationCoordinator::SendMessage(
const std::string&, const std::string&, const std::string&, const std::string&) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
absl::Status NetworkCollaborationCoordinator::SendRomSync(
const std::string&, const std::string&, const std::string&) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
absl::Status NetworkCollaborationCoordinator::SendSnapshot(
const std::string&, const std::string&, const std::string&) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
absl::Status NetworkCollaborationCoordinator::SendProposal(
const std::string&, const std::string&) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
absl::Status NetworkCollaborationCoordinator::UpdateProposal(
const std::string&, const std::string&) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
absl::Status NetworkCollaborationCoordinator::SendAIQuery(
const std::string&, const std::string&) {
return absl::UnimplementedError("Network collaboration requires JSON support");
}
@@ -384,6 +622,11 @@ bool NetworkCollaborationCoordinator::IsConnected() const { return false; }
void NetworkCollaborationCoordinator::SetMessageCallback(MessageCallback) {}
void NetworkCollaborationCoordinator::SetParticipantCallback(ParticipantCallback) {}
void NetworkCollaborationCoordinator::SetErrorCallback(ErrorCallback) {}
void NetworkCollaborationCoordinator::SetRomSyncCallback(RomSyncCallback) {}
void NetworkCollaborationCoordinator::SetSnapshotCallback(SnapshotCallback) {}
void NetworkCollaborationCoordinator::SetProposalCallback(ProposalCallback) {}
void NetworkCollaborationCoordinator::SetProposalUpdateCallback(ProposalUpdateCallback) {}
void NetworkCollaborationCoordinator::SetAIResponseCallback(AIResponseCallback) {}
void NetworkCollaborationCoordinator::ConnectWebSocket() {}
void NetworkCollaborationCoordinator::DisconnectWebSocket() {}
void NetworkCollaborationCoordinator::SendWebSocketMessage(const std::string&, const std::string&) {}


@@ -36,25 +36,87 @@ class NetworkCollaborationCoordinator {
std::string sender;
std::string message;
int64_t timestamp;
std::string message_type; // "chat", "system", "ai"
std::string metadata; // JSON metadata
};
struct RomSync {
std::string sync_id;
std::string sender;
std::string diff_data; // Base64 encoded
std::string rom_hash;
int64_t timestamp;
};
struct Snapshot {
std::string snapshot_id;
std::string sender;
std::string snapshot_data; // Base64 encoded
std::string snapshot_type;
int64_t timestamp;
};
struct Proposal {
std::string proposal_id;
std::string sender;
std::string proposal_data; // JSON data
std::string status; // "pending", "accepted", "rejected"
int64_t timestamp;
};
struct AIResponse {
std::string query_id;
std::string username;
std::string query;
std::string response;
int64_t timestamp;
};
// Callbacks for handling incoming events
using MessageCallback = std::function<void(const ChatMessage&)>;
using ParticipantCallback = std::function<void(const std::vector<std::string>&)>;
using ErrorCallback = std::function<void(const std::string&)>;
using RomSyncCallback = std::function<void(const RomSync&)>;
using SnapshotCallback = std::function<void(const Snapshot&)>;
using ProposalCallback = std::function<void(const Proposal&)>;
using ProposalUpdateCallback = std::function<void(const std::string&, const std::string&)>;
using AIResponseCallback = std::function<void(const AIResponse&)>;
explicit NetworkCollaborationCoordinator(const std::string& server_url);
~NetworkCollaborationCoordinator();
// Session management
absl::StatusOr<SessionInfo> HostSession(const std::string& session_name,
const std::string& username);
const std::string& username,
const std::string& rom_hash = "",
bool ai_enabled = true);
absl::StatusOr<SessionInfo> JoinSession(const std::string& session_code,
const std::string& username);
absl::Status LeaveSession();
// Send chat message to current session
absl::Status SendMessage(const std::string& sender, const std::string& message);
// Communication methods
absl::Status SendMessage(const std::string& sender,
const std::string& message,
const std::string& message_type = "chat",
const std::string& metadata = "");
// Advanced features
absl::Status SendRomSync(const std::string& sender,
const std::string& diff_data,
const std::string& rom_hash);
absl::Status SendSnapshot(const std::string& sender,
const std::string& snapshot_data,
const std::string& snapshot_type);
absl::Status SendProposal(const std::string& sender,
const std::string& proposal_data_json);
absl::Status UpdateProposal(const std::string& proposal_id,
const std::string& status);
absl::Status SendAIQuery(const std::string& username,
const std::string& query);
// Connection status
bool IsConnected() const;
@@ -66,6 +128,11 @@ class NetworkCollaborationCoordinator {
void SetMessageCallback(MessageCallback callback);
void SetParticipantCallback(ParticipantCallback callback);
void SetErrorCallback(ErrorCallback callback);
void SetRomSyncCallback(RomSyncCallback callback);
void SetSnapshotCallback(SnapshotCallback callback);
void SetProposalCallback(ProposalCallback callback);
void SetProposalUpdateCallback(ProposalUpdateCallback callback);
void SetAIResponseCallback(AIResponseCallback callback);
private:
void ConnectWebSocket();
@@ -90,6 +157,11 @@ class NetworkCollaborationCoordinator {
MessageCallback message_callback_ ABSL_GUARDED_BY(mutex_);
ParticipantCallback participant_callback_ ABSL_GUARDED_BY(mutex_);
ErrorCallback error_callback_ ABSL_GUARDED_BY(mutex_);
RomSyncCallback rom_sync_callback_ ABSL_GUARDED_BY(mutex_);
SnapshotCallback snapshot_callback_ ABSL_GUARDED_BY(mutex_);
ProposalCallback proposal_callback_ ABSL_GUARDED_BY(mutex_);
ProposalUpdateCallback proposal_update_callback_ ABSL_GUARDED_BY(mutex_);
AIResponseCallback ai_response_callback_ ABSL_GUARDED_BY(mutex_);
};
} // namespace editor