z3ed: AI-Powered CLI for YAZE

Status: Active Development | Test Harness Enhancement Phase

Overview

z3ed is a command-line interface for YAZE that enables AI-driven ROM modifications through a proposal-based workflow. It provides both human-accessible commands for developers and machine-readable APIs for LLM integration, forming the backbone of an agentic development ecosystem.

Recent Focus: Evolving the ImGuiTestHarness from basic GUI automation into a comprehensive testing platform that serves dual purposes:

AI-Driven Workflows: Widget discovery, test introspection, and dynamic interaction learning
Traditional GUI Testing: Test recording/replay, CI/CD integration, and regression testing

🤖 Why This Matters: These enhancements are critical for AI agent autonomy. Without them, an agent cannot verify that its changes worked (no test polling), discover UI elements dynamically (widget names are hardcoded), learn from demonstrations (no recording), or debug failures (no screenshots). The evolved test harness enables fully autonomous agents that can execute → verify → self-correct without human intervention.

📋 Implementation Status: Core infrastructure complete (Phases 1-6, AW-01 to AW-04, IT-01 to IT-09). Currently focusing on LLM Integration to enable practical AI-driven workflows. See LLM-INTEGRATION-PLAN.md for the detailed roadmap (Ollama, Gemini, Claude).

This directory contains the primary documentation for the z3ed system.

📋 Documentation Status: Consolidated (Oct 2, 2025) - 10 core files, 6,547 lines

Core Documentation

Start here to understand the architecture, learn how to use the commands, and see the current development status.

E6-z3ed-cli-design.md - Design & Architecture
- The "source of truth" for the system's architecture, design goals, and the agentic workflow framework. Read this first to understand why the system is built the way it is.
E6-z3ed-reference.md - Technical Reference & Guides
- A complete command reference, API documentation, implementation guides, and troubleshooting tips. Use this as your day-to-day manual for working with z3ed.
E6-z3ed-implementation-plan.md - Roadmap & Status
- The project's task backlog, roadmap, progress tracking, and a list of known issues. Check this document for current priorities and to see what's next.

Quick Start

Build z3ed

# Basic build (without GUI automation support)
cmake --build build --target z3ed

# Build with gRPC support (for GUI automation)
cmake -B build-grpc-test -DYAZE_WITH_GRPC=ON
cmake --build build-grpc-test --target z3ed

Common Commands

# Create an agent proposal in a safe sandbox
z3ed agent run --prompt "Make all soldier armor red" --rom=zelda3.sfc --sandbox

# List all active and past proposals
z3ed agent list

# View the changes for the latest proposal
z3ed agent diff

# Run an automated GUI test (requires test harness to be running)
z3ed agent test --prompt "Open the Overworld editor and verify it loads"

# Discover available GUI widgets for AI interaction
z3ed agent gui discover --window "Overworld" --type button

# Record a test session for regression testing
z3ed agent test record start --output tests/overworld_load.json
# ... perform actions ...
z3ed agent test record stop

# Replay recorded test
z3ed agent test replay tests/overworld_load.json

# Query test execution status
z3ed agent test status --test-id grpc_click_12345678 --follow

See the Technical Reference for a full command list.
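The record/replay commands above can also be driven from a script. The following is a minimal sketch, not part of z3ed: the helper names (`record_start_cmd`, `record_stop_cmd`, `replay_cmd`, `run`) are hypothetical, and only the subcommands and the `--output` flag shown above are used. It defaults to a dry run that prints each command instead of executing it.

```python
import shlex
import shutil
import subprocess
from typing import List, Optional


def record_start_cmd(output_path: str) -> List[str]:
    """argv for starting a recording (mirrors the command shown above)."""
    return ["z3ed", "agent", "test", "record", "start", "--output", output_path]


def record_stop_cmd() -> List[str]:
    """argv for stopping the current recording."""
    return ["z3ed", "agent", "test", "record", "stop"]


def replay_cmd(script_path: str) -> List[str]:
    """argv for replaying a previously recorded test."""
    return ["z3ed", "agent", "test", "replay", script_path]


def run(cmd: List[str], dry_run: bool = True) -> Optional[int]:
    """Print the command; execute it only when dry_run is False.

    Returns the process exit code, or None for a dry run.
    """
    print("$ " + shlex.join(cmd))
    if dry_run:
        return None
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Dry run: only prints the three commands in order.
    script = "tests/overworld_load.json"
    for cmd in (record_start_cmd(script), record_stop_cmd(), replay_cmd(script)):
        run(cmd)
```

Passing `dry_run=False` executes the commands, which assumes `z3ed` is on `PATH` and the test harness is running.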

Recent Enhancements

LLM Integration Priority Shift (Oct 3, 2025) 🤖

📋 Deprioritized IT-10 (Collaborative Editing) in favor of practical LLM integration
📄 Created comprehensive implementation plan for Ollama, Gemini, and Claude integration
✅ New documentation: LLM-INTEGRATION-PLAN.md, LLM-IMPLEMENTATION-CHECKLIST.md, LLM-INTEGRATION-SUMMARY.md
🚀 Ready to enable real AI-driven ROM modifications with natural language prompts
Estimated effort: 12-15 hours across 4 phases
Why now: All infrastructure complete (CLI, proposals, sandbox, GUI automation) - only LLM connection missing

Recent Progress (Oct 3, 2025)

✅ IT-09 CLI Test Suite Tooling Complete: run/validate/create commands + JUnit output
- Full suite runner with group/tag filters, parametrization, retries, and CI-friendly exit codes
- The interactive z3ed agent test suite create command scaffolds YAML definitions in tests/
- Default JUnit reports under test-results/junit/ for CI upload
✅ IT-08 Enhanced Error Reporting Complete: Full diagnostic capture on test failures
- IT-08a: Screenshot RPC with SDL capture (BMP format, 1536x864)
- IT-08b: Auto-capture execution context on failures (frame, window, widget)
- IT-08c: Widget state dumps with comprehensive UI snapshot (JSON format)
- Proto schema supports screenshot_path, failure_context, and widget_state
- GetTestResults RPC returns full failure diagnostics for debugging
✅ IT-05 Implementation Complete: Test introspection API fully operational
- GetTestStatus, ListTests, and GetTestResults RPCs implemented and tested
- CLI commands (z3ed agent test {status,list,results}) fully functional
- E2E validation script confirms production readiness
- Thread-safe execution history with bounded memory management
✅ IT-08a Screenshot RPC Complete: Visual debugging now available
- SDL-based screenshot capture implemented (1536x864 BMP format)
- Successfully tested via gRPC (5.3MB output files)
- Foundation for auto-capture on test failures
- AI agents can now capture visual context for debugging
✅ IT-07 Test Recording & Replay Complete: Regression testing workflow operational
✅ Server-side wiring for test lifecycle tracking inside TestManager
✅ gRPC status mapping helper to surface accurate error codes back to clients
✅ CLI integration with YAML/JSON output formats
✅ End-to-end introspection tests with comprehensive validation
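As a sanity check on the screenshot figures above (1536x864 BMP, 5.3MB files), the pixel-data size of an uncompressed BMP can be computed directly. This sketch assumes uncompressed output and ignores the small file header; the 32 bits-per-pixel conclusion is an inference from the reported size, not something the harness documentation states.

```python
def bmp_pixel_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of uncompressed BMP pixel data; each row is padded to 4 bytes."""
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    return row_bytes * height


for bpp in (24, 32):
    size = bmp_pixel_bytes(1536, 864, bpp)
    print(f"{bpp} bpp: {size} bytes ({size / 1e6:.2f} MB)")

# 24 bpp: 3981312 bytes (3.98 MB)
# 32 bpp: 5308416 bytes (5.31 MB)
```

The reported 5.3MB matches the 32 bpp case, so each screenshot appears to store four bytes per pixel.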

Next Priority: LLM integration (Ollama, Gemini, Claude), now that IT-08b (auto-capture on failure) and IT-08c (widget state dumps) have completed enhanced error reporting. See LLM-INTEGRATION-PLAN.md.

Test Harness Evolution (IT-05 to IT-09):

Test Introspection: ✅ Query test status, results, and execution history
Widget Discovery: ✅ AI agents can enumerate available GUI interactions dynamically
Test Recording: ✅ Capture manual workflows as JSON scripts for regression testing
Enhanced Debugging: ✅ Screenshot capture (IT-08a), execution context on failures (IT-08b), widget state dumps (IT-08c)
CI/CD Integration: ✅ Standardized test suite format with JUnit XML output (IT-09)
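For CI pipelines that consume the JUnit XML output, a report can be summarized with the Python standard library. This is a hedged sketch: the sample document below follows the common JUnit layout (`testsuite`, `testcase`, `failure`, `error` elements), and the exact schema z3ed emits is an assumption. The suite and test names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

# Hypothetical report in the common JUnit XML layout (names are illustrative).
SAMPLE = """<testsuites>
  <testsuite name="overworld" tests="2" failures="1">
    <testcase name="load_editor" time="0.8"/>
    <testcase name="click_button" time="1.2">
      <failure message="widget not found"/>
    </testcase>
  </testsuite>
</testsuites>"""


def summarize(xml_text: str) -> Dict[str, Union[int, List[str]]]:
    """Count test cases and collect the names of failed or errored ones."""
    root = ET.fromstring(xml_text)
    # iter() walks all descendants, so this works whether the root
    # element is <testsuites> or a single <testsuite>.
    cases = list(root.iter("testcase"))
    failed = [
        case.get("name", "")
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return {"total": len(cases), "failed": failed}


def exit_code(summary: Dict[str, Union[int, List[str]]]) -> int:
    """CI-style exit code: 0 when nothing failed, 1 otherwise."""
    return 1 if summary["failed"] else 0


if __name__ == "__main__":
    result = summarize(SAMPLE)
    print(result)             # {'total': 2, 'failed': ['click_button']}
    print(exit_code(result))  # 1
```

To apply this to real reports, read each `.xml` file under test-results/junit/ and pass its contents to `summarize`.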

See E6-z3ed-cli-design.md § 9 for detailed architecture and implementation roadmap.

📖 Getting Started:

New to z3ed? Start with this README.md, then read E6-z3ed-cli-design.md
Want to use z3ed? See QUICK_REFERENCE.md for all commands
Setting up AI agents? See LLM-INTEGRATION-PLAN.md for Ollama/Gemini/Claude setup

🔧 Implementation Guides:

LLM-IMPLEMENTATION-CHECKLIST.md - Step-by-step LLM integration tasks ⭐ START HERE
IT-05-IMPLEMENTATION-GUIDE.md - Test Introspection API (complete ✅)
IT-08-IMPLEMENTATION-GUIDE.md - Enhanced Error Reporting (complete ✅)

📚 Reference:

E6-z3ed-reference.md - Technical reference and API docs
E6-z3ed-implementation-plan.md - Task backlog and roadmap
QUICK_REFERENCE.md - Quick command reference
