Enhance z3ed Agent Roadmap and CLI Reference with New Tool Commands and Chat Features

scawful
2025-10-03 13:51:01 -04:00
parent ba9f6533a4
commit c7a7707d25
7 changed files with 268 additions and 49 deletions


@@ -108,13 +108,12 @@ We have made significant progress in laying the foundation for the conversationa
- **GUI Chat Widget Stub**: An `AgentChatWidget` is integrated into the main GUI.
- **Initial Agent "Tools"**: `resource-list` and `dungeon-list-sprites` commands are implemented.
- **Tool Use Foundation**: The `ToolDispatcher` is implemented, and the AI services are aware of the new tool call format.
+- **Tool Loop Improvements**: Conversational flow now handles multi-step tool calls with default JSON output, allowing results to feed back into the chat without recursion.
-### ⚠️ Current Blocker: Build Configuration
-We are currently facing a linker error when building the main `yaze` application with gRPC support. The `ToolDispatcher` is unable to find the definitions for the `HandleResourceListCommand` and `HandleDungeonListSpritesCommand` functions.
-**Root Cause**: These handler functions are only compiled as part of the `z3ed` target, not the `yaze` target. The `ToolDispatcher`, which is now included in the `yaze` build, depends on them.
+### Build Configuration Issue Resolved
+The linker error is fixed. Both the CLI and GUI targets now link against `yaze_agent`, so the shared agent handlers (`HandleResourceListCommand`, `HandleDungeonListSpritesCommand`, etc.) compile once and are available to `ToolDispatcher` everywhere.
### 🚀 Next Steps
-1. **Resolve the Build Issue**: The immediate next step is to fix the linker error. This will likely involve a thoughtful refactoring of our CMake configuration to better share sources between the `yaze` and `z3ed` targets.
-2. **Simplify CMake Structure**: As discussed, the current structure of including `.cmake` files from various subdirectories is becoming difficult to manage. We should consider flattening this into a more centralized source list in the main `src/CMakeLists.txt`.
-3. **Continue with Tool Integration**: Once the build is fixed, we can proceed with integrating the tool execution results back into the conversational loop.
+1. **Share ROM Context with the Agent**: Inject the active GUI ROM into `ConversationalAgentService` so tool calls work even when `--rom` flags are unavailable.
+2. **Surface Tool Output in the UI**: Present JSON/table responses in the chat widgets with formatting instead of raw text dumps.
+3. **Expand Tool Coverage**: Add the next batch of read-only utilities (`dungeon get-info`, `overworld find-tile`) now that the tooling loop is stable.

View File

@@ -207,6 +207,52 @@ Examples:
- `dungeon` - Dungeon editing
- `agent` - Agent commands
#### `agent resource-list` - Enumerate labeled resources for the AI
```bash
z3ed agent resource-list --type <resource> [--format <table|json>]
Options:
--type <resource> Required label family (dungeon, overworld, sprite, palette, etc.)
--format <mode> Output format, defaults to `table`. Use `json` for LLM tooling.
Examples:
# Show dungeon labels in a table
z3ed agent resource-list --type dungeon
# Emit JSON for the conversation agent to consume
z3ed agent resource-list --type overworld --format json
```
**Notes**:
- When the conversation agent invokes this tool, JSON output is requested automatically.
- Labels are loaded from `ResourceContextBuilder`, so the command reflects project-specific metadata.
#### `agent dungeon-list-sprites` - Inspect sprites in a dungeon room
```bash
z3ed agent dungeon-list-sprites --room <hex_id> [--format <table|json>]
Options:
--room <hex_id> Dungeon room ID (hexadecimal). Accepts `0x` prefixes or decimal.
--format <mode> Output format, defaults to `table`.
Examples:
z3ed agent dungeon-list-sprites --room 0x012
z3ed agent dungeon-list-sprites --room 18 --format json
```
**Output**:
- Table view prints sprite id/x/y in hex+decimal for quick inspection.
- JSON view is tailored for the LLM toolchain and is returned automatically during tool calls.
#### `agent chat` - Interactive terminal chat (TUI prototype)
```bash
z3ed agent chat
```
- Opens an FTXUI-based interface with scrolling history and input box.
- Uses the shared `ConversationalAgentService`, so the same backend powers the GUI widget.
- Useful for manual testing of tool dispatching and new prompting strategies.
#### `agent test` - Automated GUI testing (IT-02)
```bash
z3ed agent test --prompt "<test_description>" [--host <hostname>] [--port <port>]
```


@@ -40,11 +40,20 @@ z3ed agent plan --prompt "Place a tree at position 10, 10 on map 0"
# Execute in sandbox with auto-approval
z3ed agent run --prompt "Create a 3x3 water pond at 15, 20" --rom zelda3.sfc --sandbox
# Chat with the agent in the terminal (FTXUI prototype)
z3ed agent chat
# List all proposals
z3ed agent list
# View proposal details
z3ed agent diff --proposal <id>
# Inspect project metadata for the LLM toolchain
z3ed agent resource-list --type dungeon --format json
# Dump sprite placements for a dungeon room
z3ed agent dungeon-list-sprites --room 0x012
```
### GUI Testing Commands
@@ -220,6 +229,11 @@ AI agent features require:
- ✅ Updated README with clear dependency requirements
- ✅ Added Windows compatibility notes
### Conversational Loop
- ✅ Tool dispatcher defaults to JSON when invoked by the agent, keeping outputs machine-readable.
- ✅ Conversational service now replays tool results without recursion, improving chat stability.
- ✅ Mock AI service issues sample tool calls so the loop can be exercised without a live LLM.
## Troubleshooting
### "OpenSSL not found" warning


@@ -12,8 +12,8 @@ namespace agent {
namespace {
constexpr absl::string_view kUsage =
-"Usage: agent <run|plan|diff|accept|test|gui|learn|list|commit|revert|describe> "
-"[options]";
+"Usage: agent <run|plan|diff|accept|test|gui|learn|list|commit|revert|describe|resource-list|dungeon-list-sprites|chat> "
+"[options]";
} // namespace
} // namespace agent


@@ -15,47 +15,72 @@ ConversationalAgentService::ConversationalAgentService() {
absl::StatusOr<ChatMessage> ConversationalAgentService::SendMessage(
const std::string& message) {
-  // 1. Add user message to history.
-  history_.push_back({ChatMessage::Sender::kUser, message, absl::Now()});
-  // 2. Get response from the AI service using the full history.
-  auto response_or = ai_service_->GenerateResponse(history_);
-  if (!response_or.ok()) {
-    return absl::InternalError(absl::StrCat("Failed to get AI response: ",
-                                            response_or.status().message()));
+  if (message.empty() && history_.empty()) {
+    return absl::InvalidArgumentError(
+        "Conversation must start with a non-empty message.");
  }
-  const auto& agent_response = response_or.value();
-  // For now, combine text and commands for display.
-  // In the future, the TUI/GUI will handle these differently.
-  std::string response_text = agent_response.text_response;
-  if (!agent_response.commands.empty()) {
-    response_text += "\n\nCommands:\n" + absl::StrJoin(agent_response.commands, "\n");
+  if (!message.empty()) {
+    history_.push_back({ChatMessage::Sender::kUser, message, absl::Now()});
  }
-  // If the agent requested a tool call, dispatch it.
-  if (!agent_response.tool_calls.empty()) {
-    for (const auto& tool_call : agent_response.tool_calls) {
-      auto tool_result_or = tool_dispatcher_.Dispatch(tool_call);
-      if (tool_result_or.ok()) {
-        // Add the tool result to the history and send back to the AI.
-        history_.push_back({ChatMessage::Sender::kAgent, tool_result_or.value(), absl::Now()});
-        return SendMessage("");  // Re-prompt the AI with the new context.
-      } else {
-        // Handle tool execution error.
-        return absl::InternalError(absl::StrCat("Tool execution failed: ", tool_result_or.status().message()));
-      }
-    }
-  }
+  constexpr int kMaxToolIterations = 4;
+  for (int iteration = 0; iteration < kMaxToolIterations; ++iteration) {
+    auto response_or = ai_service_->GenerateResponse(history_);
+    if (!response_or.ok()) {
+      return absl::InternalError(absl::StrCat(
+          "Failed to get AI response: ", response_or.status().message()));
+    }
+    const auto& agent_response = response_or.value();
+    if (!agent_response.tool_calls.empty()) {
+      bool executed_tool = false;
+      for (const auto& tool_call : agent_response.tool_calls) {
+        auto tool_result_or = tool_dispatcher_.Dispatch(tool_call);
+        if (!tool_result_or.ok()) {
+          return absl::InternalError(absl::StrCat(
+              "Tool execution failed: ", tool_result_or.status().message()));
+        }
+        const std::string& tool_output = tool_result_or.value();
+        if (!tool_output.empty()) {
+          history_.push_back(
+              {ChatMessage::Sender::kAgent, tool_output, absl::Now()});
+        }
+        executed_tool = true;
+      }
+      if (executed_tool) {
+        // Re-query the AI with updated context.
+        continue;
+      }
+    }
+    std::string response_text = agent_response.text_response;
+    if (!agent_response.reasoning.empty()) {
+      if (!response_text.empty()) {
+        response_text.append("\n\n");
+      }
+      response_text.append("Reasoning: ");
+      response_text.append(agent_response.reasoning);
+    }
+    if (!agent_response.commands.empty()) {
+      if (!response_text.empty()) {
+        response_text.append("\n\n");
+      }
+      response_text.append("Commands:\n");
+      response_text.append(absl::StrJoin(agent_response.commands, "\n"));
+    }
+    ChatMessage chat_response = {ChatMessage::Sender::kAgent, response_text,
+                                 absl::Now()};
+    history_.push_back(chat_response);
+    return chat_response;
+  }
-  ChatMessage chat_response = {ChatMessage::Sender::kAgent, response_text,
-                               absl::Now()};
-  // 3. Add agent response to history.
-  history_.push_back(chat_response);
-  return chat_response;
+  return absl::InternalError(
+      "Agent did not produce a response after executing tools.");
}
const std::vector<ChatMessage>& ConversationalAgentService::GetHistory() const {


@@ -3,6 +3,7 @@
#include <iostream>
#include <sstream>
#include "absl/strings/match.h"
#include "absl/strings/str_format.h"
#include "cli/handlers/agent/commands.h"
@@ -13,9 +14,18 @@ namespace agent {
absl::StatusOr<std::string> ToolDispatcher::Dispatch(
const ToolCall& tool_call) {
std::vector<std::string> args;
bool has_format = false;
for (const auto& [key, value] : tool_call.args) {
args.push_back(absl::StrFormat("--%s", key));
args.push_back(value);
if (absl::EqualsIgnoreCase(key, "format")) {
has_format = true;
}
}
if (!has_format) {
args.push_back("--format");
args.push_back("json");
}
// Capture stdout


@@ -1,19 +1,117 @@
#include "cli/service/ai/ai_service.h"
#include "cli/service/agent/conversational_agent_service.h"
#include <algorithm>
#include <cctype>
#include "absl/strings/ascii.h"
#include "absl/strings/match.h"
#include "absl/strings/numbers.h"
#include "absl/strings/str_format.h"
namespace yaze {
namespace cli {
namespace {
std::string ExtractRoomId(const std::string& normalized_prompt) {
size_t hex_pos = normalized_prompt.find("0x");
if (hex_pos != std::string::npos) {
std::string hex_value;
for (size_t i = hex_pos; i < normalized_prompt.size(); ++i) {
char c = normalized_prompt[i];
if (std::isxdigit(static_cast<unsigned char>(c)) || c == 'x') {
hex_value.push_back(c);
} else {
break;
}
}
if (hex_value.size() > 2) {
return hex_value;
}
}
// Fallback: look for decimal digits, then convert to hex string.
std::string digits;
for (char c : normalized_prompt) {
if (std::isdigit(static_cast<unsigned char>(c))) {
digits.push_back(c);
} else if (!digits.empty()) {
break;
}
}
if (!digits.empty()) {
int value = 0;
if (absl::SimpleAtoi(digits, &value)) {
return absl::StrFormat("0x%03X", value);
}
}
return "0x000";
}
} // namespace
absl::StatusOr<AgentResponse> MockAIService::GenerateResponse(
const std::string& prompt) {
AgentResponse response;
-  if (prompt == "Place a tree") {
-    response.text_response = "Sure, I can do that. Here is the command:";
-    response.commands.push_back("overworld set-tile 0 10 20 0x02E");
-    response.reasoning = "The user asked to place a tree, so I generated the appropriate `set-tile` command.";
-  } else {
-    response.text_response = "I'm sorry, I don't understand that prompt. Try 'Place a tree'.";
-  }
-  return response;
+  const std::string normalized = absl::AsciiStrToLower(prompt);
+  if (normalized.empty()) {
+    response.text_response =
+        "Let's start with a prompt about the overworld or dungeons.";
+    return response;
+  }
+  if (absl::StrContains(normalized, "place") &&
+      absl::StrContains(normalized, "tree")) {
+    response.text_response =
+        "Sure, I can do that. Here's the command to place a tree.";
+    response.commands.push_back(
+        "overworld set-tile --map 0 --x 10 --y 20 --tile 0x02E");
+    response.reasoning =
+        "The user asked to place a tree tile16, so I generated the matching "
+        "set-tile command.";
+    return response;
+  }
+  if (absl::StrContains(normalized, "list") &&
+      absl::StrContains(normalized, "resource")) {
+    std::string resource_type = "dungeon";
+    if (absl::StrContains(normalized, "overworld")) {
+      resource_type = "overworld";
+    } else if (absl::StrContains(normalized, "sprite")) {
+      resource_type = "sprite";
+    } else if (absl::StrContains(normalized, "palette")) {
+      resource_type = "palette";
+    }
+    ToolCall call;
+    call.tool_name = "resource-list";
+    call.args.emplace("type", resource_type);
+    response.text_response =
+        absl::StrFormat("Fetching %s labels from the ROM...", resource_type);
+    response.reasoning =
+        "Using the resource-list tool keeps the LLM in sync with project labels.";
+    response.tool_calls.push_back(call);
+    return response;
+  }
+  if (absl::StrContains(normalized, "sprite") &&
+      absl::StrContains(normalized, "room")) {
+    ToolCall call;
+    call.tool_name = "dungeon-list-sprites";
+    call.args.emplace("room", ExtractRoomId(normalized));
+    response.text_response =
+        "Let me inspect the dungeon room sprites for you.";
+    response.reasoning =
+        "Calling the sprite inspection tool provides precise coordinates for "
+        "the agent.";
+    response.tool_calls.push_back(call);
+    return response;
+  }
+  response.text_response =
+      "I'm not sure how to help with that yet. Try asking for resource labels "
+      "or listing dungeon sprites.";
+  return response;
}
@@ -22,7 +120,34 @@ absl::StatusOr<AgentResponse> MockAIService::GenerateResponse(
if (history.empty()) {
return absl::InvalidArgumentError("History cannot be empty.");
}
-  return GenerateResponse(history.back().message);
+  // If the last message in history is a tool output, synthesize a summary.
+  for (auto it = history.rbegin(); it != history.rend(); ++it) {
+    if (it->sender == agent::ChatMessage::Sender::kAgent &&
+        (absl::StrContains(it->message, "=== ") ||
+         absl::StrContains(it->message, "\"id\"") ||
+         absl::StrContains(it->message, "\n{"))) {
+      AgentResponse response;
+      response.text_response =
+          "Here's what I found:\n" + it->message +
+          "\nLet me know if you'd like to make a change.";
+      response.reasoning =
+          "Summarized the latest tool output for the user.";
+      return response;
+    }
+  }
+  auto user_it = std::find_if(history.rbegin(), history.rend(),
+                              [](const agent::ChatMessage& message) {
+                                return message.sender ==
+                                       agent::ChatMessage::Sender::kUser;
+                              });
+  if (user_it == history.rend()) {
+    return absl::InvalidArgumentError(
+        "History does not contain a user message.");
+  }
+  return GenerateResponse(user_it->message);
}
} // namespace cli