Enhance z3ed Agent Roadmap and CLI Reference with New Tool Commands and Chat Features

2025-10-03 13:51:01 -04:00
parent ba9f6533a4
commit c7a7707d25
7 changed files with 268 additions and 49 deletions
--- a/docs/z3ed/AGENT-ROADMAP.md
+++ b/docs/z3ed/AGENT-ROADMAP.md
@@ -108,13 +108,12 @@ We have made significant progress in laying the foundation for the conversationa
 - **GUI Chat Widget Stub**: An `AgentChatWidget` is integrated into the main GUI.
 - **Initial Agent "Tools"**: `resource-list` and `dungeon-list-sprites` commands are implemented.
 - **Tool Use Foundation**: The `ToolDispatcher` is implemented, and the AI services are aware of the new tool call format.
 - **Tool Loop Improvements**: Conversational flow now handles multi-step tool calls with default JSON output, allowing results to feed back into the chat without recursion.
-### ⚠️ Current Blocker: Build Configuration
+### ✅ Build Configuration Issue Resolved
-We are currently facing a linker error when building the main `yaze` application with gRPC support. The `ToolDispatcher` is unable to find the definitions for the `HandleResourceListCommand` and `HandleDungeonListSpritesCommand` functions.
+The linker error is fixed. Both the CLI and GUI targets now link against `yaze_agent`, so the shared agent handlers (`HandleResourceListCommand`, `HandleDungeonListSpritesCommand`, etc.) compile once and are available to `ToolDispatcher` everywhere.
 **Root Cause**: These handler functions were previously compiled only as part of the `z3ed` target, not the `yaze` target, while the `ToolDispatcher` included in the `yaze` build depended on them.
 ### 🚀 Next Steps
-1.  **Resolve the Build Issue**: The immediate next step is to fix the linker error. This will likely involve a thoughtful refactoring of our CMake configuration to better share sources between the `yaze` and `z3ed` targets.
+1.  **Share ROM Context with the Agent**: Inject the active GUI ROM into `ConversationalAgentService` so tool calls work even when `--rom` flags are unavailable.
-2.  **Simplify CMake Structure**: As discussed, the current structure of including `.cmake` files from various subdirectories is becoming difficult to manage. We should consider flattening this into a more centralized source list in the main `src/CMakeLists.txt`.
+2.  **Surface Tool Output in the UI**: Present JSON/table responses in the chat widgets with formatting instead of raw text dumps.
-3.  **Continue with Tool Integration**: Once the build is fixed, we can proceed with integrating the tool execution results back into the conversational loop.
+3.  **Expand Tool Coverage**: Add the next batch of read-only utilities (`dungeon get-info`, `overworld find-tile`) now that the tooling loop is stable.
--- a/docs/z3ed/E6-z3ed-reference.md
+++ b/docs/z3ed/E6-z3ed-reference.md
@@ -207,6 +207,52 @@ Examples:
 - `dungeon` - Dungeon editing
 - `agent` - Agent commands
 #### `agent resource-list` - Enumerate labeled resources for the AI
 ```bash
 z3ed agent resource-list --type <resource> [--format <table|json>]
 Options:
  --type <resource>   Required label family (dungeon, overworld, sprite, palette, etc.)
  --format <mode>     Output format, defaults to `table`. Use `json` for LLM tooling.
 Examples:
  # Show dungeon labels in a table
  z3ed agent resource-list --type dungeon
  # Emit JSON for the conversation agent to consume
  z3ed agent resource-list --type overworld --format json
 ```
 **Notes**:
 - When the conversation agent invokes this tool, JSON output is requested automatically.
 - Labels are loaded from `ResourceContextBuilder`, so the command reflects project-specific metadata.
 #### `agent dungeon-list-sprites` - Inspect sprites in a dungeon room
 ```bash
 z3ed agent dungeon-list-sprites --room <hex_id> [--format <table|json>]
 Options:
  --room <hex_id>   Dungeon room ID. Accepts hexadecimal with a `0x` prefix (e.g. `0x012`) or plain decimal (e.g. `18`).
  --format <mode>   Output format, defaults to `table`.
 Examples:
  z3ed agent dungeon-list-sprites --room 0x012
  z3ed agent dungeon-list-sprites --room 18 --format json
 ```
 **Output**:
 - Table view prints sprite id/x/y in hex+decimal for quick inspection.
 - JSON view is tailored for the LLM toolchain and is returned automatically during tool calls.
 #### `agent chat` - Interactive terminal chat (TUI prototype)
 ```bash
 z3ed agent chat
 ```
 - Opens an FTXUI-based interface with scrolling history and input box.
 - Uses the shared `ConversationalAgentService`, so the same backend powers the GUI widget.
 - Useful for manual testing of tool dispatching and new prompting strategies.
 #### `agent test` - Automated GUI testing (IT-02)
 ```bash
 z3ed agent test --prompt "<test_description>" [--host <hostname>] [--port <port>]
--- a/docs/z3ed/README.md
+++ b/docs/z3ed/README.md
@@ -40,11 +40,20 @@ z3ed agent plan --prompt "Place a tree at position 10, 10 on map 0"
 # Execute in sandbox with auto-approval
 z3ed agent run --prompt "Create a 3x3 water pond at 15, 20" --rom zelda3.sfc --sandbox
 # Chat with the agent in the terminal (FTXUI prototype)
 z3ed agent chat
 # List all proposals
 z3ed agent list
 # View proposal details
 z3ed agent diff --proposal <id>
 # Inspect project metadata for the LLM toolchain
 z3ed agent resource-list --type dungeon --format json
 # Dump sprite placements for a dungeon room
 z3ed agent dungeon-list-sprites --room 0x012
 ```
 ### GUI Testing Commands
@@ -220,6 +229,11 @@ AI agent features require:
 - ✅ Updated README with clear dependency requirements
 - ✅ Added Windows compatibility notes
 ### Conversational Loop
 - ✅ Tool dispatcher defaults to JSON when invoked by the agent, keeping outputs machine-readable.
 - ✅ Conversational service now replays tool results without recursion, improving chat stability.
 - ✅ Mock AI service issues sample tool calls so the loop can be exercised without a live LLM.
 ## Troubleshooting
 ### "OpenSSL not found" warning
--- a/src/cli/handlers/agent.cc
+++ b/src/cli/handlers/agent.cc
@@ -12,7 +12,7 @@ namespace agent {
 namespace {
 constexpr absl::string_view kUsage =
-    "Usage: agent <run|plan|diff|accept|test|gui|learn|list|commit|revert|describe> "
+  "Usage: agent <run|plan|diff|accept|test|gui|learn|list|commit|revert|describe|resource-list|dungeon-list-sprites|chat> "
  "[options]";
 }  // namespace
--- a/src/cli/service/agent/conversational_agent_service.cc
+++ b/src/cli/service/agent/conversational_agent_service.cc
@@ -15,47 +15,72 @@ ConversationalAgentService::ConversationalAgentService() {
 absl::StatusOr<ChatMessage> ConversationalAgentService::SendMessage(
    const std::string& message) {
-  // 1. Add user message to history.
+  if (message.empty() && history_.empty()) {
-  history_.push_back({ChatMessage::Sender::kUser, message, absl::Now()});
+    return absl::InvalidArgumentError(
        "Conversation must start with a non-empty message.");
  }
-  // 2. Get response from the AI service using the full history.
+  if (!message.empty()) {
    history_.push_back({ChatMessage::Sender::kUser, message, absl::Now()});
  }
  constexpr int kMaxToolIterations = 4;
  for (int iteration = 0; iteration < kMaxToolIterations; ++iteration) {
    auto response_or = ai_service_->GenerateResponse(history_);
    if (!response_or.ok()) {
-    return absl::InternalError(absl::StrCat("Failed to get AI response: ",
+      return absl::InternalError(absl::StrCat(
-                                           response_or.status().message()));
+          "Failed to get AI response: ", response_or.status().message()));
    }
    const auto& agent_response = response_or.value();
  // For now, combine text and commands for display.
  // In the future, the TUI/GUI will handle these differently.
  std::string response_text = agent_response.text_response;
  if (!agent_response.commands.empty()) {
    response_text += "\n\nCommands:\n" + absl::StrJoin(agent_response.commands, "\n");
  }
  // If the agent requested a tool call, dispatch it.
    if (!agent_response.tool_calls.empty()) {
      bool executed_tool = false;
      for (const auto& tool_call : agent_response.tool_calls) {
        auto tool_result_or = tool_dispatcher_.Dispatch(tool_call);
-      if (tool_result_or.ok()) {
+        if (!tool_result_or.ok()) {
-        // Add the tool result to the history and send back to the AI.
+          return absl::InternalError(absl::StrCat(
-        history_.push_back({ChatMessage::Sender::kAgent, tool_result_or.value(), absl::Now()});
+              "Tool execution failed: ", tool_result_or.status().message()));
-        return SendMessage(""); // Re-prompt the AI with the new context.
+        }
-      } else {
+
-        // Handle tool execution error.
+        const std::string& tool_output = tool_result_or.value();
-        return absl::InternalError(absl::StrCat("Tool execution failed: ", tool_result_or.status().message()));
+        if (!tool_output.empty()) {
          history_.push_back(
              {ChatMessage::Sender::kAgent, tool_output, absl::Now()});
        }
        executed_tool = true;
      }
      if (executed_tool) {
        // Re-query the AI with updated context.
        continue;
      }
    }
    std::string response_text = agent_response.text_response;
    if (!agent_response.reasoning.empty()) {
      if (!response_text.empty()) {
        response_text.append("\n\n");
      }
      response_text.append("Reasoning: ");
      response_text.append(agent_response.reasoning);
    }
    if (!agent_response.commands.empty()) {
      if (!response_text.empty()) {
        response_text.append("\n\n");
      }
      response_text.append("Commands:\n");
      response_text.append(absl::StrJoin(agent_response.commands, "\n"));
    }
    ChatMessage chat_response = {ChatMessage::Sender::kAgent, response_text,
                                 absl::Now()};
  // 3. Add agent response to history.
    history_.push_back(chat_response);
    return chat_response;
  }
  return absl::InternalError(
      "Agent did not produce a response after executing tools.");
 }
 const std::vector<ChatMessage>& ConversationalAgentService::GetHistory() const {
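The reworked `SendMessage` replaces the recursive `SendMessage("")` re-prompt with a bounded loop. A minimal sketch of that control flow follows, using only the standard library. `Message`, `Reply`, `Model`, `ToolRunner`, and `RunToolLoop` are hypothetical stand-ins for illustration, not the project's `ChatMessage`, `AgentResponse`, or service types, and error handling via `absl::Status` is left out.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real chat message and agent response types.
struct Message {
  bool from_user;
  std::string text;
};

struct Reply {
  std::string text;
  std::vector<std::string> tool_calls;  // Tool names only, for brevity.
};

using Model = std::function<Reply(const std::vector<Message>&)>;
using ToolRunner = std::function<std::string(const std::string&)>;

// Queries the model, runs any requested tools, and re-queries in a loop
// instead of recursing. Returns std::nullopt when the iteration budget is
// used up without a final text reply.
std::optional<std::string> RunToolLoop(std::vector<Message>& history,
                                       const Model& model,
                                       const ToolRunner& run_tool,
                                       int max_iterations = 4) {
  for (int i = 0; i < max_iterations; ++i) {
    Reply reply = model(history);
    if (!reply.tool_calls.empty()) {
      for (const std::string& call : reply.tool_calls) {
        std::string output = run_tool(call);
        if (!output.empty()) {
          // Tool output is recorded as an agent-side message.
          history.push_back({false, output});
        }
      }
      continue;  // Re-query the model with the tool output in context.
    }
    history.push_back({false, reply.text});
    return reply.text;
  }
  return std::nullopt;
}
```

The iteration cap bounds the worst case: a model that keeps requesting tools produces an error after a fixed number of rounds instead of growing the call stack.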
--- a/src/cli/service/agent/tool_dispatcher.cc
+++ b/src/cli/service/agent/tool_dispatcher.cc
@@ -3,6 +3,7 @@
 #include <iostream>
 #include <sstream>
 #include "absl/strings/match.h"
 #include "absl/strings/str_format.h"
 #include "cli/handlers/agent/commands.h"
@@ -13,9 +14,18 @@ namespace agent {
 absl::StatusOr<std::string> ToolDispatcher::Dispatch(
    const ToolCall& tool_call) {
  std::vector<std::string> args;
  bool has_format = false;
  for (const auto& [key, value] : tool_call.args) {
    args.push_back(absl::StrFormat("--%s", key));
    args.push_back(value);
    if (absl::EqualsIgnoreCase(key, "format")) {
      has_format = true;
    }
  }
  if (!has_format) {
    args.push_back("--format");
    args.push_back("json");
  }
  // Capture stdout
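The hunk above makes `Dispatch` append `--format json` whenever the tool call carries no format argument. A rough standalone sketch of that argument-building step is shown below. It assumes the tool arguments behave like a `std::map<std::string, std::string>` (the real container type is not visible in this diff), and `EqualsIgnoreCaseAscii` and `BuildToolArgs` are hypothetical names, with the former standing in for `absl::EqualsIgnoreCase`.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Case-insensitive ASCII comparison; a stand-in for absl::EqualsIgnoreCase.
bool EqualsIgnoreCaseAscii(const std::string& a, const std::string& b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(), [](char x, char y) {
           return std::tolower(static_cast<unsigned char>(x)) ==
                  std::tolower(static_cast<unsigned char>(y));
         });
}

// Converts key/value tool arguments into CLI-style flags and appends
// `--format json` only when the caller did not choose a format.
std::vector<std::string> BuildToolArgs(
    const std::map<std::string, std::string>& tool_args) {
  std::vector<std::string> args;
  bool has_format = false;
  for (const auto& [key, value] : tool_args) {
    args.push_back("--" + key);
    args.push_back(value);
    if (EqualsIgnoreCaseAscii(key, "format")) {
      has_format = true;
    }
  }
  if (!has_format) {
    args.push_back("--format");
    args.push_back("json");
  }
  return args;
}
```

An explicit format from the caller is passed through unchanged, so a tool call that asks for `table` output still gets it.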
--- a/src/cli/service/ai/ai_service.cc
+++ b/src/cli/service/ai/ai_service.cc
@@ -1,19 +1,117 @@
 #include "cli/service/ai/ai_service.h"
 #include "cli/service/agent/conversational_agent_service.h"
 #include <algorithm>
 #include <cctype>
 #include "absl/strings/ascii.h"
 #include "absl/strings/match.h"
 #include "absl/strings/numbers.h"
 #include "absl/strings/str_format.h"
 namespace yaze {
 namespace cli {
 namespace {
 std::string ExtractRoomId(const std::string& normalized_prompt) {
  size_t hex_pos = normalized_prompt.find("0x");
  if (hex_pos != std::string::npos) {
    std::string hex_value;
    for (size_t i = hex_pos; i < normalized_prompt.size(); ++i) {
      char c = normalized_prompt[i];
      if (std::isxdigit(static_cast<unsigned char>(c)) || c == 'x') {
        hex_value.push_back(c);
      } else {
        break;
      }
    }
    if (hex_value.size() > 2) {
      return hex_value;
    }
  }
  // Fallback: look for decimal digits, then convert to hex string.
  std::string digits;
  for (char c : normalized_prompt) {
    if (std::isdigit(static_cast<unsigned char>(c))) {
      digits.push_back(c);
    } else if (!digits.empty()) {
      break;
    }
  }
  if (!digits.empty()) {
    int value = 0;
    if (absl::SimpleAtoi(digits, &value)) {
      return absl::StrFormat("0x%03X", value);
    }
  }
  return "0x000";
 }
 }  // namespace
 absl::StatusOr<AgentResponse> MockAIService::GenerateResponse(
    const std::string& prompt) {
  AgentResponse response;
-  if (prompt == "Place a tree") {
+  const std::string normalized = absl::AsciiStrToLower(prompt);
-    response.text_response = "Sure, I can do that. Here is the command:";
+
-    response.commands.push_back("overworld set-tile 0 10 20 0x02E");
+  if (normalized.empty()) {
-    response.reasoning = "The user asked to place a tree, so I generated the appropriate `set-tile` command.";
+    response.text_response =
-  } else {
+        "Let's start with a prompt about the overworld or dungeons.";
-    response.text_response = "I'm sorry, I don't understand that prompt. Try 'Place a tree'.";
+    return response;
  }
  if (absl::StrContains(normalized, "place") &&
      absl::StrContains(normalized, "tree")) {
    response.text_response =
        "Sure, I can do that. Here's the command to place a tree.";
    response.commands.push_back("overworld set-tile --map 0 --x 10 --y 20 --tile 0x02E");
    response.reasoning =
        "The user asked to place a tree tile16, so I generated the matching set-tile command.";
    return response;
  }
  if (absl::StrContains(normalized, "list") &&
      absl::StrContains(normalized, "resource")) {
    std::string resource_type = "dungeon";
    if (absl::StrContains(normalized, "overworld")) {
      resource_type = "overworld";
    } else if (absl::StrContains(normalized, "sprite")) {
      resource_type = "sprite";
    } else if (absl::StrContains(normalized, "palette")) {
      resource_type = "palette";
    }
    ToolCall call;
    call.tool_name = "resource-list";
    call.args.emplace("type", resource_type);
    response.text_response =
        absl::StrFormat("Fetching %s labels from the ROM...", resource_type);
    response.reasoning =
        "Using the resource-list tool keeps the LLM in sync with project labels.";
    response.tool_calls.push_back(call);
    return response;
  }
  if (absl::StrContains(normalized, "sprite") &&
      absl::StrContains(normalized, "room")) {
    ToolCall call;
    call.tool_name = "dungeon-list-sprites";
    call.args.emplace("room", ExtractRoomId(normalized));
    response.text_response =
        "Let me inspect the dungeon room sprites for you.";
    response.reasoning =
        "Calling the sprite inspection tool provides precise coordinates for the agent.";
    response.tool_calls.push_back(call);
    return response;
  }
  response.text_response =
      "I'm not sure how to help with that yet. Try asking for resource labels "
      "or listing dungeon sprites.";
  return response;
 }
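The mock's `ExtractRoomId` helper prefers a `0x` literal and otherwise converts the first run of decimal digits to a three-digit hex string. The sketch below is a standard-library variant of that idea, not the committed code: `ExtractRoomIdSketch` is a hypothetical name, it uses `std::strtoul` and `std::snprintf` in place of the Abseil helpers, and unlike the original it stops the hex scan at the first non-hex digit instead of also accepting further `x` characters.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Extracts a room id from an already lowercased prompt.
std::string ExtractRoomIdSketch(const std::string& prompt) {
  // Prefer an explicit hex literal such as "0x1f".
  const std::size_t hex_pos = prompt.find("0x");
  if (hex_pos != std::string::npos) {
    std::size_t end = hex_pos + 2;
    while (end < prompt.size() &&
           std::isxdigit(static_cast<unsigned char>(prompt[end]))) {
      ++end;
    }
    if (end > hex_pos + 2) {
      return prompt.substr(hex_pos, end - hex_pos);
    }
  }
  // Otherwise take the first run of decimal digits and format it as hex.
  std::size_t start = 0;
  while (start < prompt.size() &&
         !std::isdigit(static_cast<unsigned char>(prompt[start]))) {
    ++start;
  }
  std::size_t stop = start;
  while (stop < prompt.size() &&
         std::isdigit(static_cast<unsigned char>(prompt[stop]))) {
    ++stop;
  }
  if (stop > start) {
    const unsigned long value =
        std::strtoul(prompt.substr(start, stop - start).c_str(), nullptr, 10);
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "0x%03lX", value);
    return buffer;
  }
  return "0x000";  // Same default as the mock when no number is present.
}
```

With this variant, a prompt such as "list sprites in room 18" yields `0x012`, which matches the decimal and hex forms used in the CLI reference examples.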
@@ -22,7 +120,34 @@ absl::StatusOr<AgentResponse> MockAIService::GenerateResponse(
  if (history.empty()) {
    return absl::InvalidArgumentError("History cannot be empty.");
  }
-  return GenerateResponse(history.back().message);
+
   // If any agent message in the history (searched newest first) looks like
   // tool output, synthesize a summary of it.
  for (auto it = history.rbegin(); it != history.rend(); ++it) {
    if (it->sender == agent::ChatMessage::Sender::kAgent &&
        (absl::StrContains(it->message, "=== ") ||
         absl::StrContains(it->message, "\"id\"") ||
         absl::StrContains(it->message, "\n{"))) {
      AgentResponse response;
      response.text_response =
          "Here's what I found:\n" + it->message +
          "\nLet me know if you'd like to make a change.";
      response.reasoning =
          "Summarized the latest tool output for the user.";
      return response;
    }
  }
  auto user_it = std::find_if(history.rbegin(), history.rend(),
                               [](const agent::ChatMessage& message) {
                                 return message.sender ==
                                        agent::ChatMessage::Sender::kUser;
                               });
  if (user_it == history.rend()) {
    return absl::InvalidArgumentError(
        "History does not contain a user message.");
  }
  return GenerateResponse(user_it->message);
 }
 }  // namespace cli
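The history overload of the mock decides whether an agent message is tool output by looking for three substrings (`"=== "`, `"id"` in quotes, and a newline followed by `{`). The sketch below isolates that heuristic so its edge cases are easier to see. `HistoryEntry`, `LooksLikeToolOutput`, and `LatestToolOutput` are hypothetical names for illustration; only the three marker strings come from the diff.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the real chat message type.
struct HistoryEntry {
  bool from_agent;
  std::string text;
};

// Mirrors the three substring checks used by the mock service.
bool LooksLikeToolOutput(const std::string& text) {
  return text.find("=== ") != std::string::npos ||
         text.find("\"id\"") != std::string::npos ||
         text.find("\n{") != std::string::npos;
}

// Walks the history from newest to oldest and returns the first agent
// message that matches the heuristic, if any.
std::optional<std::string> LatestToolOutput(
    const std::vector<HistoryEntry>& history) {
  for (auto it = history.rbegin(); it != history.rend(); ++it) {
    if (it->from_agent && LooksLikeToolOutput(it->text)) {
      return it->text;
    }
  }
  return std::nullopt;
}
```

Two properties follow from the checks as written. A JSON object that starts at the first character and has no `"id"` key does not match, because the `{` test requires a preceding newline. And since the search covers the whole history, an older tool result keeps matching on later turns until something newer matches first.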