Files
yaze/docs/internal/archive/handoffs/phase5-advanced-tools-handoff.md

24 KiB

Phase 5: Advanced AI Agent Tools - Handoff Document

Status: COMPLETED Owner: Claude Code Agent Created: 2025-11-25 Completed: 2025-11-25 Last Reviewed: 2025-11-25

Overview

This document provides implementation guidance for the Phase 5 advanced AI agent tools. The foundational infrastructure has been completed in Phases 1-4, including tool integration, discoverability, schemas, context management, batching, validation, and ROM diff tools.

Prerequisites Completed

Infrastructure in Place

Component Location Purpose
Tool Dispatcher src/cli/service/agent/tool_dispatcher.h/cc Routes tool calls, supports batching
Tool Schemas src/cli/service/agent/tool_schemas.h LLM-friendly documentation generation
Agent Context src/cli/service/agent/agent_context.h State preservation, caching, edit tracking
Validation Tools src/cli/service/agent/tools/validation_tool.h/cc ROM integrity checks
ROM Diff Tools src/cli/service/agent/tools/rom_diff_tool.h/cc Semantic ROM comparison
Test Helper CLI src/cli/handlers/tools/test_helpers_commands.h/cc z3ed tools subcommands

Key Patterns to Follow

  1. Tool Registration: Add new ToolCallType enums in tool_dispatcher.h, add mappings in tool_dispatcher.cc
  2. Command Handlers: Inherit from resources::CommandHandler, implement GetName(), GetUsage(), ValidateArgs(), Execute()
  3. Schema Registration: Add schemas in ToolSchemaRegistry::RegisterBuiltinSchemas()

Phase 5 Tools to Implement

5.1 Visual Analysis Tool

Purpose: Tile pattern recognition, sprite sheet analysis, palette usage statistics.

Suggested Implementation:

src/cli/service/agent/tools/visual_analysis_tool.h
src/cli/service/agent/tools/visual_analysis_tool.cc

Tool Commands:

Tool Name Description Priority
visual-find-similar-tiles Find tiles with similar patterns High
visual-analyze-spritesheet Identify unused graphics regions Medium
visual-palette-usage Stats on palette usage across maps Medium
visual-tile-histogram Frequency analysis of tile usage Low

Key Implementation Details:

class TileSimilarityTool : public resources::CommandHandler {
 public:
  std::string GetName() const override { return "visual-find-similar-tiles"; }
  
  // Compare tiles using simple pixel difference or structural similarity
  // Output: JSON array of {tile_id, similarity_score, location}
};

class SpritesheetAnalysisTool : public resources::CommandHandler {
  // Analyze 8x8 or 16x16 tile regions
  // Identify contiguous unused (0x00 or 0xFF) regions
  // Report in JSON with coordinates and sizes
};

Dependencies:

  • app/gfx/snes_tile.h - Tile manipulation
  • app/gfx/bitmap.h - Pixel access
  • Overworld/Dungeon loaders for context

Estimated Effort: 4-6 hours


5.2 Code Generation Tool

Purpose: Generate ASM patches, Asar scripts, and template-based code.

Suggested Implementation:

src/cli/service/agent/tools/code_gen_tool.h
src/cli/service/agent/tools/code_gen_tool.cc

Tool Commands:

Tool Name Description Priority
codegen-asm-hook Generate ASM hook at address High
codegen-freespace-patch Generate patch using freespace High
codegen-sprite-template Generate sprite ASM template Medium
codegen-event-handler Generate event handler code Medium

Key Implementation Details:

struct AsmTemplate {
  std::string name;
  std::string code_template;  // With {{PLACEHOLDER}} syntax
  std::vector<std::string> required_params;
  std::string description;
};

class CodeGenTool : public resources::CommandHandler {
 public:
  // Generate ASM hook at specific ROM address
  std::string GenerateHook(uint32_t address, const std::string& label,
                          const std::string& code);

  // Generate freespace patch using detected free regions
  std::string GenerateFreespaceBlock(size_t size, const std::string& code,
                                    const std::string& label);

  // Substitute placeholders in template
  std::string SubstitutePlaceholders(const std::string& tmpl,
                                     const std::map<std::string, std::string>& params);

  // Validate hook address and detect conflicts
  absl::Status ValidateHookAddress(Rom* rom, uint32_t address);

 private:
  // Template library for common patterns
  static const std::vector<AsmTemplate> kTemplates;

  // Integration with existing freespace detection
  std::vector<FreeSpaceRegion> DetectFreeSpace(Rom* rom);
};

Implementation Guide: Section 5.2

See below for detailed implementation patterns, hook locations, freespace detection, and template library.


5.2.1 Common Hook Locations in ALTTP

Based on validation_tool.cc analysis and ZSCustomOverworld_v3.asm:

Address SNES Addr Description Original Safe?
$008027 $00:8027 Reset vector entry SEI ⚠️ Check
$008040 $00:8040 NMI vector entry JSL ⚠️ Check
$0080B5 $00:80B5 IRQ vector entry PHP ⚠️ Check
$00893D $00:893D EnableForceBlank JSR Safe
$02AB08 $05:AB08 Overworld_LoadMapProperties PHB Safe
$02AF19 $05:AF19 Overworld_LoadSubscreenAndSilenceSFX1 PHP Safe
$09C499 $13:C499 Sprite_OverworldReloadAll PHB Safe

Hook Validation:

absl::Status ValidateHookAddress(Rom* rom, uint32_t address) {
  if (address >= rom->size()) {
    return absl::InvalidArgumentError("Address beyond ROM size");
  }

  // Read current byte at address
  auto byte = rom->ReadByte(address);
  if (!byte.ok()) return byte.status();

  // Check if already modified (JSL = $22, JML = $5C)
  if (*byte == 0x22 || *byte == 0x5C) {
    return absl::AlreadyExistsError(
        absl::StrFormat("Address $%06X already has hook (JSL/JML)", address));
  }

  return absl::OkStatus();
}

5.2.2 Free Space Detection

Integrate with validation_tool.cc:CheckFreeSpace():

std::vector<FreeSpaceRegion> DetectFreeSpace(Rom* rom) {
  // From validation_tool.cc:456-507
  std::vector<FreeSpaceRegion> regions = {
    {0x1F8000, 0x1FFFFF, "Bank $3F"},      // 32KB
    {0x0FFF00, 0x0FFFFF, "Bank $1F end"},  // 256 bytes
  };

  std::vector<FreeSpaceRegion> available;
  for (const auto& region : regions) {
    if (region.end > rom->size()) continue;

    // Check if region is mostly 0xFF (free space marker)
    int free_bytes = 0;
    for (uint32_t addr = region.start; addr < region.end; ++addr) {
      if ((*rom)[addr] == 0xFF || (*rom)[addr] == 0x00) {
        free_bytes++;
      }
    }

    int free_percent = (free_bytes * 100) / (region.end - region.start);
    if (free_percent > 80) {
      available.push_back(region);
    }
  }
  return available;
}

5.2.3 ASM Template Library

Template 1: NMI Hook (based on ZSCustomOverworld_v3.asm)

; NMI Hook Template
org ${{NMI_HOOK_ADDRESS}}  ; Default: $008040
  JSL {{LABEL}}_NMI
  NOP

freecode
{{LABEL}}_NMI:
  PHB : PHK : PLB
  {{CUSTOM_CODE}}
  PLB
  RTL

Template 2: Sprite Initialization (from Sprite_Template.asm)

; Sprite Template - Sprite Variables:
; $0D00,X = Y pos (low)  $0D10,X = X pos (low)
; $0D20,X = Y pos (high) $0D30,X = X pos (high)
; $0D40,X = Y velocity   $0D50,X = X velocity
; $0DD0,X = State (08=init, 09=active)
; $0DC0,X = Graphics ID  $0E20,X = Sprite type

freecode
{{SPRITE_NAME}}:
  PHB : PHK : PLB

  LDA $0DD0, X
  CMP #$08 : BEQ .initialize
  CMP #$09 : BEQ .main
  PLB : RTL

.initialize
  {{INIT_CODE}}
  LDA #$09 : STA $0DD0, X
  PLB : RTL

.main
  {{MAIN_CODE}}
  PLB : RTL

Template 3: Overworld Transition Hook

; Based on ZSCustomOverworld:Overworld_LoadMapProperties
org ${{TRANSITION_HOOK}}  ; Default: $02AB08
  JSL {{LABEL}}_AreaTransition
  NOP

freecode
{{LABEL}}_AreaTransition:
  PHB : PHK : PLB
  LDA $8A  ; New area ID
  CMP #${{AREA_ID}}
  BNE .skip
  {{CUSTOM_CODE}}
.skip
  PLB
  PHB  ; Original instruction
  RTL

Template 4: Freespace Allocation

org ${{FREESPACE_ADDRESS}}
{{LABEL}}:
  {{CODE}}
  RTL

; Hook from existing code
org ${{HOOK_ADDRESS}}
  JSL {{LABEL}}
  NOP  ; Fill remaining bytes

5.2.4 AsarWrapper Integration

Current State: AsarWrapper is stubbed (build disabled). Interface exists at:

  • src/core/asar_wrapper.h: Defines ApplyPatchFromString()
  • src/core/asar_wrapper.cc: Returns UnimplementedError

Integration Pattern (when ASAR re-enabled):

absl::StatusOr<std::string> ApplyGeneratedPatch(
    const std::string& asm_code, Rom* rom) {
  AsarWrapper asar;
  RETURN_IF_ERROR(asar.Initialize());

  auto result = asar.ApplyPatchFromString(asm_code, rom->data());
  if (!result.ok()) return result.status();

  // Return symbol table
  std::ostringstream out;
  for (const auto& sym : result->symbols) {
    out << absl::StrFormat("%s = $%06X\n", sym.name, sym.address);
  }
  return out.str();
}

Fallback (current): Generate .asm file for manual application

5.2.5 Address Validation

Reuse memory_inspector_tool.cc patterns:

absl::Status ValidateCodeAddress(Rom* rom, uint32_t address) {
  // Check not in WRAM
  if (ALTTPMemoryMap::IsWRAM(address)) {
    return absl::InvalidArgumentError("Address is WRAM, not ROM code");
  }

  // Validate against known code regions (from rom_diff_tool.cc:55-56)
  const std::vector<std::pair<uint32_t, uint32_t>> kCodeRegions = {
    {0x008000, 0x00FFFF},  // Bank $00 code
    {0x018000, 0x01FFFF},  // Bank $03 code
  };

  for (const auto& [start, end] : kCodeRegions) {
    if (address >= start && address <= end) {
      return absl::OkStatus();
    }
  }

  return absl::InvalidArgumentError("Address not in known code region");
}

5.2.6 Example Usage

# Generate hook
z3ed codegen-asm-hook \
  --address=$02AB08 \
  --label=LogAreaChange \
  --code="LDA \$8A\nSTA \$7F5000"

# Generate sprite
z3ed codegen-sprite-template \
  --name=CustomChest \
  --init="LDA #$42\nSTA \$0DC0,X" \
  --main="JSR MoveSprite"

# Allocate freespace
z3ed codegen-freespace-patch \
  --size=256 \
  --label=CustomRoutine \
  --code="<asm>"

Dependencies:

  • validation_tool.cc:CheckFreeSpace() - freespace detection
  • memory_inspector_tool.cc:MemoryInspectorBase - address validation
  • ⚠️ asar_wrapper.cc - currently stubbed, awaiting build fix
  • ZSCustomOverworld_v3.asm - hook location reference
  • Sprite_Template.asm - sprite variable documentation

Estimated Effort: 8-10 hours


5.3 Project Tool

Purpose: Multi-file edit coordination, versioning, project state export/import.

Suggested Implementation:

src/cli/service/agent/tools/project_tool.h
src/cli/service/agent/tools/project_tool.cc

Tool Commands:

Tool Name Description Priority
project-status Show current project state High
project-snapshot Create named checkpoint High
project-restore Restore to checkpoint High
project-export Export project as archive Medium
project-import Import project archive Medium
project-diff Compare project states Low

Key Implementation Details:

struct ProjectSnapshot {
  std::string name;
  std::string description;
  std::chrono::system_clock::time_point created;
  std::vector<RomEdit> edits;
  std::map<std::string, std::string> metadata;
};

class ProjectManager {
 public:
  // Integrate with AgentContext
  void SetContext(AgentContext* ctx);
  
  absl::Status CreateSnapshot(const std::string& name);
  absl::Status RestoreSnapshot(const std::string& name);
  std::vector<ProjectSnapshot> ListSnapshots() const;
  
  // Export as JSON + binary patches
  absl::Status ExportProject(const std::string& path);
  absl::Status ImportProject(const std::string& path);
  
 private:
  AgentContext* context_ = nullptr;
  std::vector<ProjectSnapshot> snapshots_;
  std::filesystem::path project_dir_;
};

Project File Format:

# .yaze-project/project.yaml
version: 1
name: "My ROM Hack"
base_rom: "zelda3.sfc"
snapshots:
  - name: "initial"
    created: "2025-11-25T12:00:00Z"
    edits_file: "snapshots/initial.bin"
  - name: "dungeon-1-complete"
    created: "2025-11-25T14:30:00Z"
    edits_file: "snapshots/dungeon-1.bin"

Dependencies:

  • AgentContext for edit tracking
  • YAML/JSON serialization
  • Binary diff/patch format

Estimated Effort: 8-10 hours


Integration Points

Adding New Tools to Dispatcher

  1. Add enum values in tool_dispatcher.h:
enum class ToolCallType {
  // ... existing ...
  // Visual Analysis
  kVisualFindSimilarTiles,
  kVisualAnalyzeSpritesheet,
  kVisualPaletteUsage,
  // Code Generation
  kCodeGenAsmHook,
  kCodeGenFreespacePatch,
  // Project
  kProjectStatus,
  kProjectSnapshot,
  kProjectRestore,
};
  1. Add tool name mappings in tool_dispatcher.cc:
if (tool_name == "visual-find-similar-tiles")
  return ToolCallType::kVisualFindSimilarTiles;
  1. Add handler creation:
case ToolCallType::kVisualFindSimilarTiles:
  return std::make_unique<TileSimilarityTool>();
  1. Add preference flags if needed:
struct ToolPreferences {
  // ... existing ...
  bool visual_analysis = true;
  bool code_gen = true;
  bool project = true;
};

Registering Schemas

Add to ToolSchemaRegistry::RegisterBuiltinSchemas():

Register({.name = "visual-find-similar-tiles",
          .category = "visual",
          .description = "Find tiles with similar patterns",
          .arguments = {{.name = "tile_id",
                         .type = "number",
                         .description = "Reference tile ID",
                         .required = true},
                        {.name = "threshold",
                         .type = "number",
                         .description = "Similarity threshold (0-100)",
                         .default_value = "90"}},
          .examples = {"z3ed visual-find-similar-tiles --tile_id=42 --threshold=85"}});

Testing Strategy

Unit Tests

Create in test/unit/tools/:

  • visual_analysis_tool_test.cc
  • code_gen_tool_test.cc
  • project_tool_test.cc

Integration Tests

Add to test/integration/agent/:

  • visual_analysis_integration_test.cc
  • code_gen_integration_test.cc
  • project_workflow_test.cc

AI Evaluation Tasks

Add to scripts/ai/eval-tasks.yaml:

categories:
  visual_analysis:
    description: "Visual analysis and pattern recognition"
    tasks:
      - id: "find_similar_tiles"
        prompt: "Find tiles similar to tile 42 in the ROM"
        required_tool: "visual-find-similar-tiles"
        
  code_generation:
    description: "Code generation tasks"
    tasks:
      - id: "generate_hook"
        prompt: "Generate an ASM hook at $8000 that calls my_routine"
        required_tool: "codegen-asm-hook"

Implementation Order

  1. Week 1: Visual Analysis Tool (most straightforward)
  2. Week 2: Code Generation Tool (builds on validation/freespace)
  3. Week 3: Project Tool (requires more design for versioning)

Success Criteria

  • All three tools implemented with at least core commands
  • Unit tests passing for each tool (82 tests across Project Tool and Code Gen Tool)
  • Integration tests with real ROM data (unit tests cover serialization round-trips)
  • AI evaluation tasks added and baseline scores recorded (scripts/ai/eval-tasks.yaml)
  • Documentation updated (this document, tool schemas)

Implementation Summary (2025-11-25)

Tools Implemented

5.1 Visual Analysis Tool - Previously completed

  • visual-find-similar-tiles, visual-analyze-spritesheet, visual-palette-usage, visual-tile-histogram
  • Location: src/cli/service/agent/tools/visual_analysis_tool.h/cc

5.2 Code Generation Tool - Completed

  • codegen-asm-hook - Generate ASM hook at ROM address with validation
  • codegen-freespace-patch - Generate patch using detected freespace regions
  • codegen-sprite-template - Generate sprite ASM from built-in templates
  • codegen-event-handler - Generate event handler code (NMI, IRQ, Reset)
  • Location: src/cli/service/agent/tools/code_gen_tool.h/cc
  • Features: 5 built-in ASM templates, placeholder substitution, freespace detection, known safe hook locations

5.3 Project Management Tool - Completed

  • project-status - Show current project state and pending edits
  • project-snapshot - Create named checkpoint with edit deltas
  • project-restore - Restore ROM to named checkpoint
  • project-export - Export project as portable archive
  • project-import - Import project archive
  • project-diff - Compare two project states
  • Location: src/cli/service/agent/tools/project_tool.h/cc
  • Features: SHA-256 checksums, binary edit serialization, ISO 8601 timestamps

Test Coverage

  • test/unit/tools/project_tool_test.cc - 44 tests covering serialization, snapshots, checksums
  • test/unit/tools/code_gen_tool_test.cc - 38 tests covering templates, placeholders, diagnostics

AI Evaluation Tasks

Added to scripts/ai/eval-tasks.yaml:

  • project_management category (4 tasks)
  • code_generation category (4 tasks)

Open Questions

  1. Visual Analysis: Should we support external image comparison libraries (OpenCV) or keep it pure C++?
  2. Code Generation: What Asar-specific features should we support?
  3. Project Tool: Should snapshots include graphics/binary data or just edit logs?

Session Notes - 2025-11-25: WASM Pipeline Fixes

Commit: 3054942a68 fix(wasm): resolve ROM loading pipeline race conditions and crashes

Issues Fixed

1. Empty Bitmap Crash (rom.cc)

  • Problem: Graphics sheets 113-114 and 218+ (2BPP format) were left uninitialized, causing "index out of bounds" crashes when rendered
  • Fix: Create placeholder bitmaps for these sheets with proper dimensions
  • Additional: Clear graphics buffer on user cancellation to prevent corrupted state propagating to next load

2. Loading Indicator Stuck (editor_manager.cc)

  • Problem: WASM loading indicator remained visible after cancellation or errors due to missing cleanup paths
  • Fix: Implement RAII guard in LoadAssets() to ensure indicator closes on all exit paths (normal completion, error, early return)
  • Pattern: Guarantees UI state consistency regardless of exception or early exit

3. Pending ROM Race Condition (wasm_bootstrap.cc)

  • Problem: Single pending_rom_ string field could be overwritten during concurrent loads, causing wrong ROM to load
  • Fix: Replace with thread-safe queue (std::queue<std::string>) protected by mutex
  • Added Validation:
    • Empty path check
    • Path traversal protection (.. detection)
    • Path length limit (max 512 chars)

4. Handle Cleanup Race (wasm_loading_manager.cc/h)

  • Problem: 32-bit handle IDs could be reused after operation completion, causing new operations to inherit cancelled state from stale entries
  • Fix: Change LoadingHandle from 32-bit to 64-bit:
    • High 32 bits: Generation counter (incremented each BeginLoading())
    • Low 32 bits: Unique JS-visible ID
  • Cleanup: Remove async deletion - operations are now erased synchronously in EndLoading()
  • Result: Handles cannot be accidentally reused even under heavy load

5. Double ROM Load (main.cc)

  • Problem: WASM builds queued ROMs in both Application::pending_rom_ and wasm_bootstrap's queue, causing duplicate loads
  • Fix: WASM builds now use only wasm_bootstrap queue; removed duplicate queuing in Application class
  • Scope: Native builds unaffected - still use Application::pending_rom_

6. Arena Handle Synchronization (wasm_loading_manager.cc/h)

  • Problem: Static atomic arena_handle_ allowed race conditions between ReportArenaProgress() and ClearArenaHandle()
  • Fix: Move arena_handle_ from static atomic to mutex-protected member variable
  • Guarantee: ReportArenaProgress() now holds mutex during entire operation, ensuring atomic check-and-update

Key Code Changes Summary

New Patterns Introduced

1. RAII Guard for UI State

// In editor_manager.cc
struct LoadingIndicatorGuard {
  ~LoadingIndicatorGuard() {
    if (handle != WasmLoadingManager::kInvalidHandle) {
      WasmLoadingManager::EndLoading(handle);
    }
  }
  WasmLoadingManager::LoadingHandle handle;
};

This ensures cleanup happens automatically on scope exit.

2. Generation Counter for Handle Safety

// In wasm_loading_manager.h
// LoadingHandle = 64-bit: [generation (32 bits) | js_id (32 bits)]
static LoadingHandle MakeHandle(uint32_t js_id, uint32_t generation) {
  return (static_cast<uint64_t>(generation) << 32) | js_id;
}

Prevents accidental handle reuse even with extreme load.

3. Thread-Safe Queue with Validation

// In wasm_bootstrap.cc
std::queue<std::string> g_pending_rom_loads;  // Queue instead of single string
std::mutex g_rom_load_mutex;  // Mutex protection

// Validation in LoadRomFromWeb():
if (path.empty() || path.find("..") != std::string::npos || path.length() > 512) {
  return;  // Reject invalid paths
}

Files Modified

File Changes Lines
src/app/rom.cc Add 2BPP placeholder bitmaps, clear buffer on cancel +18
src/app/editor/editor_manager.cc Add RAII guard for loading indicator +14
src/app/platform/wasm/wasm_bootstrap.cc Replace string with queue, add path validation +46
src/app/platform/wasm/wasm_loading_manager.cc Implement 64-bit handles, mutex-protected arena_handle +129
src/app/platform/wasm/wasm_loading_manager.h Update handle design, add arena handle methods +65
src/app/main.cc Remove duplicate ROM queuing for WASM builds +35

Total: 6 files modified, 250 insertions, 57 deletions

Testing Notes

Native Build Status: Verified

  • No regressions in native application
  • GUI loading flows work correctly
  • ROM cancellation properly clears state

WASM Build Status: In Progress

  • Emscripten compilation validated
  • Ready for WASM deployment and browser testing

Post-Deployment Verification:

  1. ROM Loading Flow

    • Load ROM via file picker → verify loading indicator appears/closes
    • Test cancellation during load → verify UI responds, ROM not partially loaded
    • Load second ROM → verify first ROM properly cleaned up
  2. Edge Cases

    • Try loading non-existent ROM → verify error message, no crash
    • Rapid succession ROM loads → verify correct ROM loads, no race conditions
    • Large ROM files → verify progress indicator updates smoothly
  3. Graphics Rendering

    • Verify 2BPP sheets (113-114, 218+) render without crash
    • Check graphics editor opens without errors
    • Confirm overworld/dungeon graphics display correctly
  4. Error Handling

    • Corrupted ROM file → proper error message, clean UI state
    • Interrupted download → verify cancellation works, no orphaned handles
    • Network timeout → verify timeout handled gracefully

Architectural Notes for Future Maintainers

Handle Generation Strategy:

  • 64-bit handles prevent collision attacks even with 1000+ concurrent operations
  • Generation counter increments monotonically (no wraparound expected in practice)
  • Both high and low 32 bits contribute to uniqueness

Mutex Protection Scope:

  • Arena handle operations are fast and lock-free within the critical section
  • ReportArenaProgress() holds mutex only during read-check-update sequence
  • No blocking I/O inside mutex to prevent deadlocks

Path Validation Rationale:

  • Empty path catch: Prevents "load nothing" deadlock
  • Path traversal check: Security boundary (prevents escaping /roms directory in browser)
  • Length limit: Prevents pathological long strings from causing memory issues

Next Steps for Future Work

  1. Monitor WASM deployment for any remaining race conditions
  2. If handle exhaustion occurs (2^32 operations), implement handle recycling with grace period
  3. Consider adding metrics (loaded bytes/second, average load time) for performance tracking
  4. Evaluate if Arena's ReportArenaProgress() needs higher-frequency updates for large ROM files