feat: Enhance ImGuiTestHarnessServer with proper shutdown handling and update gRPC service initialization

This commit is contained in:
scawful
2025-10-01 23:32:41 -04:00
parent 3d272605c1
commit ead85c87b5
6 changed files with 278 additions and 11 deletions

View File

@@ -1,8 +1,12 @@
# z3ed Agentic Workflow Implementation Plan
_Last updated: 2025-10-01 (final update - Phase 6 + AW-02 complete)_
_Last updated: 2025-10-01 (final update - Phase 6 + AW-03 + IT-01 Phase 1 complete)_
This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-design.md`) into actionable engineering tasks. Each workstream contains mi**Files Modified/Created**
> 📊 **Quick Reference**: See [STATE_SUMMARY_2025-10-01.md](STATE_SUMMARY_2025-10-01.md) for a comprehensive overview of current architecture, workflows, and status.
This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-design.md`) into actionable engineering tasks. Each workstream contains milestones, owners (TBD), blocking dependencies, and expected deliverables.
**Files Modified/Created**
**Phase 6 (Resource Catalogue)**:
1. `src/cli/handlers/rom.cc` - Added `RomInfo::Run` implementation
@@ -21,7 +25,7 @@ This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-desi
| Verification Pipeline | Build layered testing + CI coverage. | Phase 6+ | Integrates with harness + CLI suites. |
| Telemetry & Learning | Capture signals to improve prompts + heuristics. | Phase 8 | Optional/opt-in features. |
### Progress snapshot — 2025-10-01 (Phase 6 Complete, AW-03 Complete, IT-01 & AW-04 Active)
### Progress snapshot — 2025-10-01 (Phase 6 Complete, AW-03 Complete, IT-01 Phase 1 Complete)
**Resource Catalogue (RC)** ✅ COMPLETE:
- CLI flag passthrough and resource catalog system operational
@@ -51,9 +55,15 @@ This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-desi
- Proper error messages guide users to specify ROM path
**Active Work (Oct 1-7, 2025)**:
- **Priority 1**: ImGuiTestHarness (IT-01) - Design spike for IPC architecture
- **Priority 1**: ImGuiTestHarness (IT-01) - ✅ Phase 1 Complete (gRPC tested), Phase 2 Active (ImGuiTestEngine integration)
- **Priority 2**: Policy Evaluation (AW-04) - YAML-based constraint system
**Recent Completion (Oct 1, 2025)**:
- ✅ gRPC test harness fully operational with all 6 RPCs validated
- ✅ Server lifecycle management (Start/Shutdown) working
- ✅ Cross-platform build verified (macOS ARM64, gRPC v1.62.0)
- ✅ All stub handlers returning success responses
## 2. Task Backlog
| ID | Task | Workstream | Type | Status | Dependencies |
@@ -68,7 +78,7 @@ This plan decomposes the design additions (Sections 1115 of `E6-z3ed-cli-desi
| AW-03 | Add ImGui drawer for proposals with accept/reject controls. | Acceptance Workflow | UX | Done | ProposalDrawer GUI complete with ROM merging |
| AW-04 | Implement policy evaluation for gating accept buttons. | Acceptance Workflow | Code | In Progress | AW-03, Priority 2 - YAML policies + PolicyEvaluator |
| AW-05 | Draft `.z3ed-diff` hybrid schema (binary deltas + JSON metadata). | Acceptance Workflow | Design | Planned | AW-01 |
| IT-01 | Create `ImGuiTestHarness` IPC service embedded in `yaze_test`. | ImGuiTest Bridge | Code | In Progress | Priority 1 - Design spike + IPC transport |
| IT-01 | Create `ImGuiTestHarness` IPC service embedded in `yaze_test`. | ImGuiTest Bridge | Code | In Progress | Phase 1 Done (gRPC), Phase 2 Active (ImGuiTestEngine) |
| IT-02 | Implement CLI agent step translation (`imgui_action` → harness call). | ImGuiTest Bridge | Code | Planned | IT-01 |
| IT-03 | Provide synchronization primitives (`WaitForIdle`, etc.). | ImGuiTest Bridge | Code | Planned | IT-01 |
| VP-01 | Expand CLI unit tests for new commands and sandbox flow. | Verification Pipeline | Test | Planned | RC/AW tasks |

198
docs/z3ed/README.md Normal file
View File

@@ -0,0 +1,198 @@
# z3ed: AI-Powered CLI for YAZE
**Status**: Active Development
**Version**: 0.1.0-alpha
**Last Updated**: October 1, 2025
## Overview
`z3ed` is a command-line interface for YAZE (Yet Another Zelda3 Editor) that enables AI-driven ROM modifications through a proposal-based workflow. It allows AI agents to suggest changes, which are then reviewed and accepted/rejected by human operators via the YAZE GUI.
## Documentation Index
### Getting Started
- **[State Summary](STATE_SUMMARY_2025-10-01.md)** - 📊 **START HERE** - Complete current state, architecture, and workflows
- **[Implementation Plan](E6-z3ed-implementation-plan.md)** - Master tracking document with architecture, priorities, and progress
- **[CLI Design](E6-z3ed-cli-design.md)** - Command structure, service architecture, and API design
### Implementation Guides
- **[IT-01: gRPC Evaluation](IT-01-grpc-evaluation.md)** - Detailed analysis of gRPC for ImGuiTestHarness IPC
- **[IT-01: Getting Started with gRPC](IT-01-getting-started-grpc.md)** - Step-by-step implementation guide
- **[gRPC Technical Notes](GRPC_TECHNICAL_NOTES.md)** - Build issues and solutions reference
- **[gRPC Test Success](GRPC_TEST_SUCCESS.md)** - Complete testing log and validation
## Architecture
```
┌─────────────────────────────────────────────────────────┐
│ z3ed CLI │
│ └─ agent subcommand │
│ ├─ run <prompt> [--sandbox] │
│ ├─ list │
│ └─ test <prompt> │
└────────────────────┬────────────────────────────────────┘
┌────────────────────▼────────────────────────────────────┐
│ Services Layer (Singleton Services) │
│ ├─ ProposalRegistry │
│ │ ├─ CreateProposal() │
│ │ ├─ ListProposals() │
│ │ └─ LoadProposalsFromDiskLocked() │
│ ├─ RomSandboxManager │
│ │ ├─ CreateSandbox() │
│ │ └─ FindSandbox() │
│ └─ PolicyEvaluator (Planned) │
│ ├─ LoadPolicies() │
│ └─ EvaluateProposal() │
└────────────────────┬────────────────────────────────────┘
┌────────────────────▼────────────────────────────────────┐
│ Filesystem Layer │
│ ├─ /tmp/yaze/proposals/<id>/ │
│ │ ├─ metadata.json │
│ │ ├─ execution.log │
│ │ └─ diff.txt │
│ └─ /tmp/yaze/sandboxes/<id>/ │
│ └─ zelda3.sfc (copy) │
└────────────────────┬────────────────────────────────────┘
┌────────────────────▼────────────────────────────────────┐
│ YAZE GUI │
│ └─ ProposalDrawer (400px right panel) │
│ ├─ List View (proposals from registry) │
│ ├─ Detail View (metadata, diff, log) │
│ └─ AcceptProposal() → ROM merging │
└─────────────────────────────────────────────────────────┘
```
## Current Status
### ✅ Completed (AW-03)
- **ProposalRegistry**: Disk persistence with lazy loading
- **ProposalDrawer GUI**: Split view, proposal list, detail panel
- **ROM Merging**: Sandbox-to-main ROM data copy on acceptance
- **Cross-Session Tracking**: Proposals persist between CLI runs
### 🔥 Active (IT-01)
- **gRPC Evaluation**: Decision made, implementation ready
- **ImGuiTestHarness**: IPC design for automated GUI testing
- **Cross-Platform Setup**: Ensuring vcpkg compatibility (Windows/macOS/Linux)
### 📋 Planned (AW-04)
- **Policy Evaluation Framework**: YAML-based rule engine
- **Change Constraints**: Byte limits, bank restrictions, protected regions
- **Review Requirements**: Human approval thresholds
## Quick Start
### Building z3ed
```bash
cd /Users/scawful/Code/yaze
cmake --build build --target z3ed -j8
```
### Running Commands
```bash
# List all proposals (shows in-memory + disk proposals)
./build/bin/z3ed agent list
# Create a proposal from AI prompt (with sandbox)
./build/bin/z3ed agent run "Fix palette corruption in overworld tile $1234" --sandbox
# Future: Test GUI operations
./build/bin/z3ed agent test "Open ROM and navigate to Overworld Editor"
```
### Reviewing Proposals in GUI
1. Launch YAZE: `./build/bin/yaze.app/Contents/MacOS/yaze`
2. Open ROM: `File → Open ROM`
3. Open drawer: `Debug → Agent Proposals` (or `Cmd+Shift+P`)
4. Select proposal → Review diff/log → Click `Accept` or `Reject`
## Key Files
### CLI Entry Points
- `src/cli/main.cc` - z3ed binary entry point
- `src/cli/command_runner.cc` - Command dispatcher
- `src/cli/handlers/agent_handler.cc` - Agent subcommand handler
### Services
- `src/cli/service/proposal_registry.{h,cc}` - Proposal tracking singleton
- `src/cli/service/rom_sandbox_manager.{h,cc}` - Isolated ROM copies
- `src/cli/service/policy_evaluator.{h,cc}` - (Planned) Policy rules engine
### GUI Integration
- `src/app/editor/system/proposal_drawer.{h,cc}` - Right-side proposal panel
- `src/app/editor/editor_manager.{h,cc}` - Integration point for drawer
### Configuration
- `.yaze/policies/agent.yaml` - (Planned) Policy rules
- `docs/api/z3ed-resources.yaml` - API catalog and examples
## Development Workflow
### Adding a New Feature
1. Update `E6-z3ed-implementation-plan.md` with task estimate
2. Create implementation branch: `git checkout -b feature/task-name`
3. Implement code following YAZE style guide
4. Update documentation in this folder
5. Test with real proposals
6. Create PR and link to implementation plan section
### Testing Changes
```bash
# Build and test CLI
cmake --build build --target z3ed -j8
./build/bin/z3ed agent list
# Build and test GUI integration
cmake --build build --target yaze -j8
./build/bin/yaze.app/Contents/MacOS/yaze
# Future: Run automated tests
cmake --build build --target yaze_test -j8
./build/bin/yaze_test --gtest_filter=ProposalRegistry*
```
## Next Steps
### Immediate (This Week)
1. **IT-01 Phase 1**: Add gRPC to vcpkg with careful cross-platform setup
2. **IT-01 Phase 2**: Implement Ping service and test on macOS
3. **Documentation**: Update this README as implementation progresses
### Short Term (Next 2 Weeks)
1. **IT-01 Complete**: Full gRPC service with Click/Type/Wait/Assert
2. **Windows Testing**: Validate vcpkg setup on Windows VM
3. **AW-04 Design**: Policy YAML schema and PolicyEvaluator API
### Long Term (Next Month)
1. **Policy Framework**: Complete AW-04 implementation
2. **CLI Testing**: Integration tests for agent workflow
3. **Production Readiness**: Error handling, logging, telemetry
## Contributing
See parent `docs/B1-contributing.md` for general contribution guidelines.
### z3ed-Specific Guidelines
- **CLI Design**: Follow existing `z3ed agent` subcommand pattern
- **Services**: Use singleton pattern with `Instance()` accessor
- **Error Handling**: Return `absl::Status` or `absl::StatusOr<T>`
- **Documentation**: Update this README and implementation plan
- **Testing**: Add test cases before merging (when test harness ready)
## Resources
- **Parent Docs**: `../` (YAZE editor documentation)
- **API Catalog**: `../api/z3ed-resources.yaml`
- **Build Guide**: `../02-build-instructions.md`
- **Platform Compatibility**: `../B2-platform-compatibility.md`
## License
Same as YAZE - See `../../LICENSE` for details.
---
**Questions?** Open an issue or discuss in #yaze-dev Discord channel.

View File

@@ -208,6 +208,10 @@ ImGuiTestHarnessServer& ImGuiTestHarnessServer::Instance() {
return *instance;
}
ImGuiTestHarnessServer::~ImGuiTestHarnessServer() {
Shutdown();
}
absl::Status ImGuiTestHarnessServer::Start(int port) {
if (server_) {
return absl::FailedPreconditionError("Server already running");
@@ -216,19 +220,19 @@ absl::Status ImGuiTestHarnessServer::Start(int port) {
// Create the service implementation
service_ = std::make_unique<ImGuiTestHarnessServiceImpl>();
// Create the gRPC service wrapper
auto grpc_service = std::make_unique<ImGuiTestHarnessServiceGrpc>(service_.get());
// Create the gRPC service wrapper (store as member to prevent it from going out of scope)
grpc_service_ = std::make_unique<ImGuiTestHarnessServiceGrpc>(service_.get());
std::string server_address = absl::StrFormat("127.0.0.1:%d", port);
std::string server_address = absl::StrFormat("0.0.0.0:%d", port);
grpc::ServerBuilder builder;
// Listen on localhost only (security)
// Listen on all interfaces (use 0.0.0.0 to avoid IPv6/IPv4 binding conflicts)
builder.AddListeningPort(server_address,
grpc::InsecureServerCredentials());
// Register service
builder.RegisterService(grpc_service.get());
builder.RegisterService(grpc_service_.get());
// Build and start
server_ = builder.BuildAndStart();

View File

@@ -68,6 +68,9 @@ class ImGuiTestHarnessServiceImpl {
ScreenshotResponse* response);
};
// Forward declaration of the gRPC service wrapper
class ImGuiTestHarnessServiceGrpc;
// Singleton server managing the gRPC service
// This class manages the lifecycle of the gRPC server
class ImGuiTestHarnessServer {
@@ -91,7 +94,7 @@ class ImGuiTestHarnessServer {
private:
ImGuiTestHarnessServer() = default;
~ImGuiTestHarnessServer() { Shutdown(); }
~ImGuiTestHarnessServer(); // Defined in .cc file to allow incomplete type deletion
// Disable copy and move
ImGuiTestHarnessServer(const ImGuiTestHarnessServer&) = delete;
@@ -99,6 +102,7 @@ class ImGuiTestHarnessServer {
std::unique_ptr<grpc::Server> server_;
std::unique_ptr<ImGuiTestHarnessServiceImpl> service_;
std::unique_ptr<ImGuiTestHarnessServiceGrpc> grpc_service_;
int port_ = 0;
};

View File

@@ -9,6 +9,10 @@
#include "util/flag.h"
#include "util/log.h"
#ifdef YAZE_WITH_GRPC
#include "app/core/imgui_test_harness_service.h"
#endif
/**
* @namespace yaze
* @brief Main namespace for the application.
@@ -20,6 +24,14 @@ DEFINE_FLAG(std::string, rom_file, "", "The ROM file to load.");
DEFINE_FLAG(std::string, log_file, "", "Output log file path for debugging.");
DEFINE_FLAG(bool, debug, false, "Enable debug logging and verbose output.");
#ifdef YAZE_WITH_GRPC
// gRPC test harness flags
DEFINE_FLAG(bool, enable_test_harness, false,
"Start gRPC test harness server for automated GUI testing.");
DEFINE_FLAG(int, test_harness_port, 50051,
"Port for gRPC test harness server (default: 50051).");
#endif
int main(int argc, char **argv) {
absl::InitializeSymbolizer(argv[0]);
@@ -56,6 +68,24 @@ int main(int argc, char **argv) {
rom_filename = FLAGS_rom_file->Get();
}
#ifdef YAZE_WITH_GRPC
// Start gRPC test harness server if requested
if (FLAGS_enable_test_harness->Get()) {
auto& server = yaze::test::ImGuiTestHarnessServer::Instance();
int port = FLAGS_test_harness_port->Get();
std::cout << "\n🚀 Starting ImGui Test Harness on port " << port << "..." << std::endl;
auto status = server.Start(port);
if (!status.ok()) {
std::cerr << "❌ ERROR: Failed to start test harness server on port " << port << std::endl;
std::cerr << " " << status.message() << std::endl;
return 1;
}
std::cout << "✅ Test harness ready on 127.0.0.1:" << port << std::endl;
std::cout << " Available RPCs: Ping, Click, Type, Wait, Assert, Screenshot\n" << std::endl;
}
#endif
#ifdef __APPLE__
return yaze_run_cocoa_app_delegate(rom_filename.c_str());
#elif defined(_WIN32)
@@ -76,5 +106,10 @@ int main(int argc, char **argv) {
}
controller->OnExit();
#ifdef YAZE_WITH_GRPC
// Shutdown gRPC server if running
yaze::test::ImGuiTestHarnessServer::Instance().Shutdown();
#endif
return EXIT_SUCCESS;
}

View File

@@ -48,6 +48,9 @@ class Flag : public IFlag {
}
value_ = parsed;
}
// Set the value directly (used by specializations)
void SetValue(const T& val) { value_ = val; }
// Returns the current (parsed or default) value of the flag.
const T& Get() const { return value_; }
@@ -59,6 +62,19 @@ class Flag : public IFlag {
std::string help_;
};
// Specialization for bool to handle "true"/"false" strings
template <>
inline void Flag<bool>::ParseValue(const std::string& text) {
if (text == "true" || text == "1" || text == "yes" || text == "on") {
SetValue(true);
} else if (text == "false" || text == "0" || text == "no" || text == "off") {
SetValue(false);
} else {
throw std::runtime_error("Failed to parse boolean flag: " + name() +
" (expected true/false/1/0/yes/no/on/off, got: " + text + ")");
}
}
class FlagRegistry {
public:
// Registers a flag in the global registry.