Enhance ImGuiTestHarness with dynamic test integration and end-to-end validation
- Updated README.md to reflect the completion of IT-01 and the transition to end-to-end validation phase. - Introduced a new end-to-end test script (scripts/test_harness_e2e.sh) for validating all RPC methods of the ImGuiTestHarness gRPC service. - Implemented dynamic test functionality in ImGuiTestHarnessService for Type, Wait, and Assert methods, utilizing ImGuiTestEngine. - Enhanced error handling and response messages for better clarity during test execution. - Updated existing methods to support dynamic test registration and execution, ensuring robust interaction with the GUI elements.
This commit is contained in:
714
docs/z3ed/NEXT_PRIORITIES_OCT2.md
Normal file
714
docs/z3ed/NEXT_PRIORITIES_OCT2.md
Normal file
@@ -0,0 +1,714 @@
|
||||
# z3ed Next Priorities - October 2, 2025
|
||||
|
||||
**Current Status**: IT-01 Complete ✅ | AW-03 Complete ✅ | Ready for E2E Validation
|
||||
|
||||
This document outlines the immediate next steps for the z3ed agent workflow system after completing IT-01 Phase 3 (ImGuiTestEngine integration).
|
||||
|
||||
---
|
||||
|
||||
## Priority 1: End-to-End Workflow Validation (ACTIVE) 🔄
|
||||
|
||||
**Goal**: Validate the complete AI agent workflow from proposal creation through ROM commit
|
||||
**Time Estimate**: 2-3 hours
|
||||
**Status**: Ready to execute
|
||||
**Blocking**: None - all prerequisites complete
|
||||
|
||||
### Why This First?
|
||||
- Validate all systems work together in production
|
||||
- Identify any integration issues before building more features
|
||||
- Establish baseline for acceptable UX and performance
|
||||
- Document real-world usage patterns for future improvements
|
||||
|
||||
### Task Breakdown
|
||||
|
||||
#### 1.1. Automated Test Script Validation (30 min)
|
||||
**Goal**: Verify E2E test script works correctly
|
||||
|
||||
```bash
|
||||
# Run the automated test script
|
||||
./scripts/test_harness_e2e.sh
|
||||
|
||||
# Expected: All 6 tests pass
|
||||
# - Ping (health check)
|
||||
# - Click (button interaction)
|
||||
# - Type (text input)
|
||||
# - Wait (condition polling)
|
||||
# - Assert (state validation)
|
||||
# - Screenshot (stub - not implemented message)
|
||||
```
|
||||
|
||||
**Success Criteria**:
|
||||
- Script runs without errors
|
||||
- All RPCs return success responses
|
||||
- Server starts and stops cleanly
|
||||
- No port conflicts or hanging processes
|
||||
|
||||
**Troubleshooting**:
|
||||
- If port 50052 in use: `killall yaze` or use different port
|
||||
- If grpcurl missing: `brew install grpcurl`
|
||||
- If binary not found: Build with `cmake --build build-grpc-test`
|
||||
|
||||
#### 1.2. Manual Workflow Testing (60 min)
|
||||
**Goal**: Test complete proposal lifecycle with real GUI
|
||||
|
||||
**Steps**:
|
||||
1. **Create Proposal via CLI**:
|
||||
```bash
|
||||
# Build z3ed
|
||||
cmake --build build --target z3ed -j8
|
||||
|
||||
# Create test proposal with sandbox
|
||||
./build/bin/z3ed agent run "Test proposal for validation" --sandbox
|
||||
|
||||
# Verify proposal created
|
||||
./build/bin/z3ed agent list
|
||||
./build/bin/z3ed agent diff --proposal-id <ID>
|
||||
```
|
||||
|
||||
2. **Launch YAZE GUI**:
|
||||
```bash
|
||||
./build/bin/yaze.app/Contents/MacOS/yaze
|
||||
|
||||
# Open ROM: File → Open ROM → assets/zelda3.sfc
|
||||
# Open drawer: Debug → Agent Proposals
|
||||
```
|
||||
|
||||
3. **Test ProposalDrawer UI**:
|
||||
- ✅ Verify proposal appears in list
|
||||
- ✅ Click proposal to select
|
||||
- ✅ Review metadata (ID, timestamp, sandbox_id)
|
||||
- ✅ Review execution log content
|
||||
- ✅ Review diff content (if any)
|
||||
- ✅ Test filtering (All/Pending/Accepted/Rejected)
|
||||
- ✅ Test Refresh button
|
||||
|
||||
4. **Test Accept Workflow**:
|
||||
- ✅ Click "Accept" button
|
||||
- ✅ Confirm dialog appears
|
||||
- ✅ Verify ROM marked dirty (save prompt)
|
||||
- ✅ File → Save ROM
|
||||
- ✅ Verify proposal status changes to "Accepted"
|
||||
|
||||
5. **Test Reject Workflow**:
|
||||
- ✅ Create another test proposal
|
||||
- ✅ Click "Reject" button
|
||||
- ✅ Confirm dialog appears
|
||||
- ✅ Verify status changes to "Rejected"
|
||||
- ✅ Verify sandbox ROM unchanged
|
||||
|
||||
6. **Test Delete Workflow**:
|
||||
- ✅ Create another test proposal
|
||||
- ✅ Click "Delete" button
|
||||
- ✅ Confirm dialog appears
|
||||
- ✅ Verify proposal removed from list
|
||||
- ✅ Verify files cleaned up from disk
|
||||
|
||||
**Success Criteria**:
|
||||
- All workflows complete without crashes
|
||||
- ROM merging works correctly
|
||||
- Status updates persist across sessions
|
||||
- UI responsive and intuitive
|
||||
|
||||
**Known Issues to Document**:
|
||||
- Any UX friction points
|
||||
- Performance concerns with large diffs
|
||||
- Edge cases that need handling
|
||||
|
||||
#### 1.3. Real Widget Testing (60 min)
|
||||
**Goal**: Test GUI automation with actual YAZE widgets
|
||||
|
||||
**Workflow 1: Open Overworld Editor**:
|
||||
```bash
|
||||
# Start YAZE with test harness
|
||||
./build-grpc-test/bin/yaze.app/Contents/MacOS/yaze \
|
||||
--enable_test_harness \
|
||||
--test_harness_port=50052 \
|
||||
--rom_file=assets/zelda3.sfc &
|
||||
|
||||
# Wait for startup
|
||||
sleep 2
|
||||
|
||||
# Test workflow
|
||||
grpcurl -plaintext -import-path src/app/core/proto -proto imgui_test_harness.proto \
|
||||
-d '{"target":"button:Overworld","type":"LEFT"}' \
|
||||
127.0.0.1:50052 yaze.test.ImGuiTestHarness/Click
|
||||
|
||||
grpcurl -plaintext -import-path src/app/core/proto -proto imgui_test_harness.proto \
|
||||
-d '{"condition":"window_visible:Overworld Editor","timeout_ms":5000}' \
|
||||
127.0.0.1:50052 yaze.test.ImGuiTestHarness/Wait
|
||||
|
||||
grpcurl -plaintext -import-path src/app/core/proto -proto imgui_test_harness.proto \
|
||||
-d '{"condition":"visible:Overworld Editor"}' \
|
||||
127.0.0.1:50052 yaze.test.ImGuiTestHarness/Assert
|
||||
```
|
||||
|
||||
**Workflow 2: Open Dungeon Editor**:
|
||||
- Click "button:Dungeon"
|
||||
- Wait "window_visible:Dungeon Editor"
|
||||
- Assert "visible:Dungeon Editor"
|
||||
|
||||
**Workflow 3: Type in Input Field** (if applicable):
|
||||
- Click "input:FieldName"
|
||||
- Type text with clear_first
|
||||
- Assert text_contains (partial implementation)
|
||||
|
||||
**Success Criteria**:
|
||||
- All real widgets respond to automation
|
||||
- Timeouts work correctly (5s default)
|
||||
- Error messages helpful when widgets not found
|
||||
- No crashes or hangs during automation
|
||||
|
||||
**Document**:
|
||||
- Widget naming conventions (button:Name, window:Name, input:Name)
|
||||
- Common timeout values needed
|
||||
- Edge cases (disabled buttons, hidden windows, etc.)
|
||||
|
||||
#### 1.4. Documentation Updates (30 min)
|
||||
**Goal**: Capture learnings and update guides
|
||||
|
||||
**Files to Update**:
|
||||
1. **IT-01-QUICKSTART.md**:
|
||||
- Add real widget examples
|
||||
- Document common workflows
|
||||
- Add troubleshooting for real scenarios
|
||||
|
||||
2. **E6-z3ed-implementation-plan.md**:
|
||||
- Mark Priority 1 as complete
|
||||
- Add lessons learned section
|
||||
- Update known limitations
|
||||
|
||||
3. **STATE_SUMMARY_2025-10-02.md**:
|
||||
- Add E2E validation results
|
||||
- Update status metrics
|
||||
- Document performance characteristics
|
||||
|
||||
**Success Criteria**:
|
||||
- New users can follow guides without getting stuck
|
||||
- Common issues documented with solutions
|
||||
- Real-world examples added
|
||||
|
||||
---
|
||||
|
||||
## Priority 2: CLI Agent Test Command (IT-02) 📋
|
||||
|
||||
**Goal**: Natural language prompt → automated GUI test workflow
|
||||
**Time Estimate**: 4-6 hours
|
||||
**Status**: Ready to start after Priority 1
|
||||
**Blocking Dependency**: Priority 1 completion
|
||||
|
||||
### Why This Next?
|
||||
- Enables AI agents to drive YAZE GUI automatically
|
||||
- Makes GUI automation accessible via simple CLI commands
|
||||
- Provides foundation for complex multi-step workflows
|
||||
- Demonstrates value of IT-01 infrastructure
|
||||
|
||||
### Design Overview
|
||||
|
||||
```
|
||||
User Input:
|
||||
z3ed agent test --prompt "Open Overworld editor and verify it loads"
|
||||
|
||||
Workflow:
|
||||
1. Parse prompt → identify intent (open editor, verify visibility)
|
||||
2. Generate RPC sequence:
|
||||
- Click "button:Overworld"
|
||||
- Wait "window_visible:Overworld Editor" (5s timeout)
|
||||
- Assert "visible:Overworld Editor"
|
||||
3. Execute RPCs via gRPC client
|
||||
4. Capture results and report
|
||||
5. Optional: Screenshot for LLM feedback
|
||||
|
||||
Output:
|
||||
✓ Clicked button:Overworld (85ms)
|
||||
✓ Waited for window:Overworld Editor (1234ms)
|
||||
✓ Asserted visible:Overworld Editor (12ms)
|
||||
|
||||
Test passed in 1.331s
|
||||
```
|
||||
|
||||
### Implementation Tasks
|
||||
|
||||
#### 2.1. Create gRPC Client Library (2 hours)
|
||||
**Files**:
|
||||
- `src/cli/service/gui_automation_client.h`
|
||||
- `src/cli/service/gui_automation_client.cc`
|
||||
|
||||
**Interface**:
|
||||
```cpp
|
||||
class GuiAutomationClient {
|
||||
public:
|
||||
static GuiAutomationClient& Instance();
|
||||
|
||||
absl::Status Connect(const std::string& host, int port);
|
||||
absl::StatusOr<PingResponse> Ping(const std::string& message);
|
||||
absl::StatusOr<ClickResponse> Click(const std::string& target, ClickType type);
|
||||
absl::StatusOr<TypeResponse> Type(const std::string& target,
|
||||
const std::string& text,
|
||||
bool clear_first);
|
||||
absl::StatusOr<WaitResponse> Wait(const std::string& condition,
|
||||
int timeout_ms,
|
||||
int poll_interval_ms);
|
||||
absl::StatusOr<AssertResponse> Assert(const std::string& condition);
|
||||
absl::StatusOr<ScreenshotResponse> Screenshot(const std::string& region,
|
||||
const std::string& format);
|
||||
|
||||
private:
|
||||
std::unique_ptr<yaze::test::ImGuiTestHarness::Stub> stub_;
|
||||
};
|
||||
```
|
||||
|
||||
**Implementation Notes**:
|
||||
- Use gRPC C++ client API
|
||||
- Handle connection errors gracefully
|
||||
- Support timeout configuration
|
||||
- Return structured results (not raw proto messages)
|
||||
|
||||
#### 2.2. Create Test Workflow Generator (1.5 hours)
|
||||
**Files**:
|
||||
- `src/cli/service/test_workflow_generator.h`
|
||||
- `src/cli/service/test_workflow_generator.cc`
|
||||
|
||||
**Interface**:
|
||||
```cpp
|
||||
struct TestStep {
|
||||
enum Type { kClick, kType, kWait, kAssert, kScreenshot };
|
||||
Type type;
|
||||
std::string target;
|
||||
std::string value;
|
||||
int timeout_ms = 5000;
|
||||
};
|
||||
|
||||
struct TestWorkflow {
|
||||
std::string description;
|
||||
std::vector<TestStep> steps;
|
||||
};
|
||||
|
||||
class TestWorkflowGenerator {
|
||||
public:
|
||||
static absl::StatusOr<TestWorkflow> GenerateFromPrompt(
|
||||
const std::string& prompt);
|
||||
|
||||
private:
|
||||
static absl::StatusOr<TestWorkflow> ParseSimplePrompt(
|
||||
const std::string& prompt);
|
||||
static absl::StatusOr<TestWorkflow> ParseComplexPrompt(
|
||||
const std::string& prompt);
|
||||
};
|
||||
```
|
||||
|
||||
**Supported Prompt Patterns**:
|
||||
1. **Simple Open**: "Open Overworld editor"
|
||||
- Click "button:Overworld"
|
||||
- Wait "window_visible:Overworld Editor"
|
||||
|
||||
2. **Open and Verify**: "Open Dungeon editor and verify it loads"
|
||||
- Click "button:Dungeon"
|
||||
- Wait "window_visible:Dungeon Editor"
|
||||
- Assert "visible:Dungeon Editor"
|
||||
|
||||
3. **Type and Validate**: "Type 'zelda3.sfc' in filename input"
|
||||
- Click "input:Filename"
|
||||
- Type "zelda3.sfc" with clear_first
|
||||
- Assert "text_contains:Filename:zelda3.sfc"
|
||||
|
||||
4. **Multi-Step**: "Open Overworld, click tile, verify properties panel"
|
||||
- Click "button:Overworld"
|
||||
- Wait "window_visible:Overworld Editor"
|
||||
- Click "canvas:Overworld" (x, y coordinates)
|
||||
- Wait "window_visible:Properties"
|
||||
|
||||
**Implementation Strategy**:
|
||||
- Start with simple regex/pattern matching
|
||||
- Add more complex patterns iteratively
|
||||
- Return error for unsupported prompts
|
||||
- Suggest valid alternatives
|
||||
|
||||
#### 2.3. Implement `z3ed agent test` Command (1.5 hours)
|
||||
**Files**:
|
||||
- `src/cli/handlers/agent.cc` (add `HandleTestCommand`)
|
||||
- Update `src/cli/modern_cli.cc` routing
|
||||
|
||||
**Command Interface**:
|
||||
```bash
|
||||
z3ed agent test --prompt "..." [--host localhost] [--port 50052] [--timeout 30s]
|
||||
```
|
||||
|
||||
**Implementation**:
|
||||
```cpp
|
||||
absl::Status HandleTestCommand(const AgentOptions& options) {
|
||||
// 1. Parse prompt → workflow
|
||||
auto workflow_result = TestWorkflowGenerator::GenerateFromPrompt(
|
||||
options.prompt);
|
||||
if (!workflow_result.ok()) {
|
||||
return workflow_result.status();
|
||||
}
|
||||
TestWorkflow workflow = std::move(*workflow_result);
|
||||
|
||||
// 2. Connect to test harness
|
||||
auto& client = GuiAutomationClient::Instance();
|
||||
auto status = client.Connect(options.host, options.port);
|
||||
if (!status.ok()) {
|
||||
return status;
|
||||
}
|
||||
|
||||
// 3. Execute workflow steps
|
||||
for (const auto& step : workflow.steps) {
|
||||
auto result = ExecuteStep(client, step);
|
||||
if (!result.ok()) {
|
||||
return result;
|
||||
}
|
||||
PrintStepResult(step, *result);
|
||||
}
|
||||
|
||||
std::cout << "\nTest passed!\n";
|
||||
return absl::OkStatus();
|
||||
}
|
||||
```
|
||||
|
||||
**Output Format**:
|
||||
- Progress indicators for each step
|
||||
- Execution time per step
|
||||
- Success/failure status
|
||||
- Error messages with context
|
||||
- Final summary
|
||||
|
||||
#### 2.4. Testing and Documentation (1 hour)
|
||||
**Test Cases**:
|
||||
1. Simple open editor test
|
||||
2. Multi-step workflow test
|
||||
3. Timeout handling test
|
||||
4. Connection error test
|
||||
5. Invalid widget test
|
||||
|
||||
**Documentation**:
|
||||
- Add IT-02 completion doc
|
||||
- Update implementation plan
|
||||
- Add examples to IT-01-QUICKSTART.md
|
||||
- Update resource catalog with `agent test` command
|
||||
|
||||
**Success Criteria**:
|
||||
- `z3ed agent test` works with 5+ different prompts
|
||||
- Error messages helpful for debugging
|
||||
- Documentation complete with examples
|
||||
- Ready for AI agent integration
|
||||
|
||||
---
|
||||
|
||||
## Priority 3: Policy Evaluation Framework (AW-04) 📋
|
||||
|
||||
**Goal**: YAML-based constraint system for gating proposal acceptance
|
||||
**Time Estimate**: 6-8 hours
|
||||
**Status**: Can work in parallel with Priority 2
|
||||
**Blocking Dependency**: None (UI integration requires AW-03)
|
||||
|
||||
### Why This Matters?
|
||||
- Prevents dangerous/unwanted changes from being accepted
|
||||
- Enforces project-specific constraints (byte limits, bank restrictions)
|
||||
- Requires test coverage before acceptance
|
||||
- Provides audit trail for policy violations
|
||||
|
||||
### Design Overview
|
||||
|
||||
**Policy Configuration** (`.yaze/policies/agent.yaml`):
|
||||
```yaml
|
||||
version: 1.0
|
||||
policies:
|
||||
# Test Requirements
|
||||
- name: require_tests
|
||||
type: test_requirement
|
||||
enabled: true
|
||||
severity: critical # critical | warning | info
|
||||
rules:
|
||||
- test_suite: "overworld_rendering"
|
||||
min_pass_rate: 0.95
|
||||
- test_suite: "palette_integrity"
|
||||
min_pass_rate: 1.0
|
||||
|
||||
# Change Constraints
|
||||
- name: limit_change_scope
|
||||
type: change_constraint
|
||||
enabled: true
|
||||
severity: critical
|
||||
rules:
|
||||
- max_bytes_changed: 10240 # 10KB limit
|
||||
- allowed_banks: [0x00, 0x01, 0x0E] # Graphics banks only
|
||||
- forbidden_ranges:
|
||||
- start: 0xFFB0 # ROM header
|
||||
end: 0xFFFF
|
||||
- start: 0x0000 # System RAM
|
||||
end: 0x1FFF
|
||||
|
||||
# Review Requirements
|
||||
- name: human_review_required
|
||||
type: review_requirement
|
||||
enabled: true
|
||||
severity: warning
|
||||
rules:
|
||||
- if: bytes_changed > 1024
|
||||
then: require_diff_review
|
||||
- if: commands_executed > 10
|
||||
then: require_log_review
|
||||
- if: new_files_created
|
||||
then: require_approval
|
||||
|
||||
# CVE Checks
|
||||
- name: security_validation
|
||||
type: security_check
|
||||
enabled: true
|
||||
severity: critical
|
||||
rules:
|
||||
- check: no_known_cves
|
||||
message: "Dependencies must not have known CVEs"
|
||||
- check: checksum_valid
|
||||
message: "ROM checksum must be valid after changes"
|
||||
```
|
||||
|
||||
### Implementation Tasks
|
||||
|
||||
#### 3.1. Policy Schema and Parser (2 hours)
|
||||
**Files**:
|
||||
- `src/cli/service/policy_evaluator.h`
|
||||
- `src/cli/service/policy_evaluator.cc`
|
||||
- `.yaze/policies/agent.yaml` (example)
|
||||
|
||||
**Data Structures**:
|
||||
```cpp
|
||||
enum class PolicySeverity { kCritical, kWarning, kInfo };
|
||||
enum class PolicyType {
|
||||
kTestRequirement,
|
||||
kChangeConstraint,
|
||||
kReviewRequirement,
|
||||
kSecurityCheck
|
||||
};
|
||||
|
||||
struct PolicyRule {
|
||||
std::string condition;
|
||||
std::string action;
|
||||
std::map<std::string, std::string> parameters;
|
||||
};
|
||||
|
||||
struct Policy {
|
||||
std::string name;
|
||||
PolicyType type;
|
||||
PolicySeverity severity;
|
||||
bool enabled;
|
||||
std::vector<PolicyRule> rules;
|
||||
};
|
||||
|
||||
struct PolicyViolation {
|
||||
std::string policy_name;
|
||||
PolicySeverity severity;
|
||||
std::string message;
|
||||
std::string actual_value;
|
||||
std::string expected_value;
|
||||
};
|
||||
|
||||
struct PolicyResult {
|
||||
bool passed;
|
||||
std::vector<PolicyViolation> violations;
|
||||
|
||||
bool HasCriticalViolations() const;
|
||||
bool HasWarnings() const;
|
||||
};
|
||||
```
|
||||
|
||||
**YAML Parsing**:
|
||||
- Use `yaml-cpp` library (already in vcpkg)
|
||||
- Parse policy file on startup
|
||||
- Validate schema (version, required fields)
|
||||
- Cache parsed policies in memory
|
||||
|
||||
#### 3.2. Policy Evaluation Engine (2.5 hours)
|
||||
**Interface**:
|
||||
```cpp
|
||||
class PolicyEvaluator {
|
||||
public:
|
||||
static PolicyEvaluator& Instance();
|
||||
|
||||
absl::Status LoadPolicies(const std::string& policy_dir = ".yaze/policies");
|
||||
absl::StatusOr<PolicyResult> EvaluateProposal(const std::string& proposal_id);
|
||||
|
||||
private:
|
||||
absl::StatusOr<PolicyResult> EvaluateTestRequirements(
|
||||
const ProposalMetadata& proposal);
|
||||
absl::StatusOr<PolicyResult> EvaluateChangeConstraints(
|
||||
const ProposalMetadata& proposal);
|
||||
absl::StatusOr<PolicyResult> EvaluateReviewRequirements(
|
||||
const ProposalMetadata& proposal);
|
||||
absl::StatusOr<PolicyResult> EvaluateSecurityChecks(
|
||||
const ProposalMetadata& proposal);
|
||||
|
||||
std::vector<Policy> policies_;
|
||||
};
|
||||
```
|
||||
|
||||
**Evaluation Logic**:
|
||||
1. Load proposal metadata (bytes changed, commands executed, etc.)
|
||||
2. Load proposal diff (for bank/range analysis)
|
||||
3. For each enabled policy:
|
||||
- Evaluate all rules
|
||||
- Collect violations
|
||||
- Determine overall pass/fail
|
||||
4. Return structured result
|
||||
|
||||
**Example Evaluations**:
|
||||
- **Test Requirements**: Check if test results exist and meet thresholds
|
||||
- **Change Constraints**: Analyze diff for byte count, bank ranges, forbidden areas
|
||||
- **Review Requirements**: Check metadata (bytes, commands, files)
|
||||
- **Security Checks**: Run ROM validation, checksum verification
|
||||
|
||||
#### 3.3. ProposalDrawer Integration (2 hours)
|
||||
**Files**:
|
||||
- `src/app/editor/system/proposal_drawer.cc` (update)
|
||||
|
||||
**UI Changes**:
|
||||
1. **Add Policy Status Section** (in detail view):
|
||||
```
|
||||
Policy Status: [✓ Passed | ⚠ Warnings | ⛔ Failed]
|
||||
|
||||
Critical Issues:
|
||||
⛔ Test pass rate 85% < 95% (overworld_rendering)
|
||||
⛔ Forbidden range modified: 0xFFB0-0xFFFF (ROM header)
|
||||
|
||||
Warnings:
|
||||
⚠ 2048 bytes changed > 1024 (requires diff review)
|
||||
```
|
||||
|
||||
2. **Gate Accept Button**:
|
||||
- Disable if critical violations exist
|
||||
- Show tooltip: "Accept blocked: 2 critical policy violations"
|
||||
- Enable override button (with confirmation + logging)
|
||||
|
||||
3. **Policy Override Dialog**:
|
||||
```
|
||||
Override Policy Violations?
|
||||
|
||||
This action will be logged for audit purposes.
|
||||
|
||||
Violations:
|
||||
• Test pass rate below threshold
|
||||
• ROM header modified
|
||||
|
||||
Reason (required): [___________________________]
|
||||
|
||||
[Cancel] [Override and Accept]
|
||||
```
|
||||
|
||||
**Integration Points**:
|
||||
```cpp
|
||||
void ProposalDrawer::DrawProposalDetail(const ProposalMetadata& proposal) {
|
||||
// ... existing metadata, diff, log sections ...
|
||||
|
||||
// Add policy section
|
||||
ImGui::Separator();
|
||||
if (ImGui::CollapsingHeader("Policy Status", ImGuiTreeNodeFlags_DefaultOpen)) {
|
||||
DrawPolicyStatus(proposal.id);
|
||||
}
|
||||
}
|
||||
|
||||
void ProposalDrawer::DrawPolicyStatus(const std::string& proposal_id) {
|
||||
auto& evaluator = PolicyEvaluator::Instance();
|
||||
auto result = evaluator.EvaluateProposal(proposal_id);
|
||||
|
||||
if (!result.ok()) {
|
||||
ImGui::TextColored(ImVec4(1, 0, 0, 1), "Error evaluating policies");
|
||||
return;
|
||||
}
|
||||
|
||||
const auto& policy_result = *result;
|
||||
|
||||
// Show overall status
|
||||
if (policy_result.passed) {
|
||||
ImGui::TextColored(ImVec4(0, 1, 0, 1), "✓ All policies passed");
|
||||
} else if (policy_result.HasCriticalViolations()) {
|
||||
ImGui::TextColored(ImVec4(1, 0, 0, 1), "⛔ Critical violations");
|
||||
} else {
|
||||
ImGui::TextColored(ImVec4(1, 1, 0, 1), "⚠ Warnings present");
|
||||
}
|
||||
|
||||
// List violations
|
||||
for (const auto& violation : policy_result.violations) {
|
||||
DrawViolation(violation);
|
||||
}
|
||||
}
|
||||
|
||||
void ProposalDrawer::AcceptProposal(const std::string& proposal_id) {
|
||||
// Evaluate policies before accepting
|
||||
auto& evaluator = PolicyEvaluator::Instance();
|
||||
auto result = evaluator.EvaluateProposal(proposal_id);
|
||||
|
||||
if (result.ok() && result->HasCriticalViolations()) {
|
||||
// Show override dialog instead of accepting directly
|
||||
show_policy_override_dialog_ = true;
|
||||
pending_accept_proposal_id_ = proposal_id;
|
||||
return;
|
||||
}
|
||||
|
||||
// ... existing accept logic ...
|
||||
}
|
||||
```
|
||||
|
||||
#### 3.4. Testing and Documentation (1.5 hours)
|
||||
**Test Cases**:
|
||||
1. Valid proposal (all policies pass)
|
||||
2. Test requirement violation
|
||||
3. Change constraint violation
|
||||
4. Multiple violations
|
||||
5. Policy override workflow
|
||||
|
||||
**Documentation**:
|
||||
- Create AW-04-POLICY-FRAMEWORK.md with:
|
||||
- Policy schema reference
|
||||
- Built-in policy examples
|
||||
- How to write custom policies
|
||||
- Override audit trail
|
||||
- Update implementation plan
|
||||
- Update ProposalDrawer documentation
|
||||
|
||||
**Success Criteria**:
|
||||
- Policies loaded and evaluated correctly
|
||||
- UI clearly shows policy status
|
||||
- Accept button gated on critical violations
|
||||
- Override workflow functional with logging
|
||||
- Documentation complete
|
||||
|
||||
---
|
||||
|
||||
## Timeline Summary
|
||||
|
||||
**Week of Oct 2-8, 2025**:
|
||||
- Days 1-2: Priority 1 (E2E Validation)
|
||||
- Days 3-4: Priority 2 (CLI Agent Test)
|
||||
- Days 5-7: Priority 3 (Policy Framework)
|
||||
|
||||
**Expected Completion**: October 8, 2025
|
||||
|
||||
**Next After This**:
|
||||
- Windows cross-platform testing
|
||||
- Screenshot implementation
|
||||
- Production telemetry (opt-in)
|
||||
- Advanced policy features
|
||||
|
||||
---
|
||||
|
||||
## Success Metrics
|
||||
|
||||
**By End of Week**:
|
||||
- ✅ Complete proposal workflow validated end-to-end
|
||||
- ✅ `z3ed agent test` command operational with 5+ prompt patterns
|
||||
- ✅ Policy framework implemented and integrated
|
||||
- ✅ Documentation updated for all new features
|
||||
- ✅ Zero known blockers for production use
|
||||
|
||||
**Quality Bar**:
|
||||
- All code builds cleanly on macOS ARM64
|
||||
- No crashes or hangs in normal workflows
|
||||
- Error messages helpful and actionable
|
||||
- Documentation sufficient for new contributors
|
||||
- Ready for Windows testing phase
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: October 2, 2025
|
||||
**Contributors**: @scawful, GitHub Copilot
|
||||
**License**: Same as YAZE (see ../../LICENSE)
|
||||
Reference in New Issue
Block a user