docs: Update implementation plan and README for clarity and recent accomplishments

2025-10-02 12:49:10 -04:00
parent 784c867acf
commit c8d79d9e99
3 changed files with 663 additions and 206 deletions
--- a/docs/z3ed/AW-04-POLICY-FRAMEWORK.md
+++ b/docs/z3ed/AW-04-POLICY-FRAMEWORK.md
@@ -0,0 +1,627 @@
+# Policy Evaluation Framework (AW-04)
+
+**Status**: Implementation In Progress  
+**Priority**: High (Next Phase)  
+**Time Estimate**: 6-8 hours  
+**Last Updated**: October 2, 2025
+
+## Overview
+
+The Policy Evaluation Framework provides a YAML-based constraint system for gating proposal acceptance in the z3ed agent workflow. It ensures that AI-generated ROM modifications meet quality, safety, and testing requirements before being merged into the main ROM.
+
+## Goals
+
+1. **Quality Gates**: Enforce minimum test pass rates and code quality standards
+2. **Safety Constraints**: Prevent modifications to critical ROM regions (headers, checksums)
+3. **Scope Limits**: Restrict changes to reasonable byte counts and specific banks
+4. **Human Review**: Require manual review for large or complex changes
+5. **Flexibility**: Allow policy overrides with confirmation and logging
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────┐
+│ ProposalDrawer (GUI)                                    │
+│  └─ Accept button gated by PolicyEvaluator             │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ▼
+┌─────────────────────────────────────────────────────────┐
+│ PolicyEvaluator (Singleton Service)                     │
+│  ├─ LoadPolicies() from .yaze/policies/                │
+│  ├─ EvaluateProposal(proposal_id) → PolicyResult       │
+│  └─ Cache of parsed YAML policies                      │
+└────────────────────┬────────────────────────────────────┘
+                     │
+                     ▼
+┌─────────────────────────────────────────────────────────┐
+│ .yaze/policies/agent.yaml (YAML Configuration)          │
+│  ├─ test_requirements (min pass rates)                 │
+│  ├─ change_constraints (byte limits, allowed banks)    │
+│  ├─ review_requirements (human review triggers)        │
+│  └─ forbidden_ranges (protected ROM regions)           │
+└─────────────────────────────────────────────────────────┘
+```
+
+## YAML Policy Schema
+
+### Example Policy File
+
+```yaml
+# .yaze/policies/agent.yaml
+version: 1.0
+enabled: true
+
+policies:
+  # Policy 1: Test Requirements
+  - name: require_tests
+    type: test_requirement
+    enabled: true
+    severity: critical  # critical | warning | info
+    rules:
+      - test_suite: "overworld_rendering"
+        min_pass_rate: 0.95
+      - test_suite: "palette_integrity"
+        min_pass_rate: 1.0
+      - test_suite: "dungeon_logic"
+        min_pass_rate: 0.90
+    message: "All required test suites must pass before accepting proposal"
+
+  # Policy 2: Change Scope Limits
+  - name: limit_change_scope
+    type: change_constraint
+    enabled: true
+    severity: critical
+    rules:
+      - max_bytes_changed: 10240  # 10KB limit
+      - allowed_banks: [0x00, 0x01, 0x0E, 0x0F]  # Graphics banks only
+      - max_commands_executed: 20
+    message: "Proposal exceeds allowed change scope"
+
+  # Policy 3: Protected ROM Regions
+  - name: protect_critical_regions
+    type: forbidden_range
+    enabled: true
+    severity: critical
+    ranges:
+      - start: 0xFFB0  # ROM header
+        end: 0xFFFF
+        reason: "ROM header is protected"
+      - start: 0x00FFC0  # Internal header
+        end: 0x00FFDF
+        reason: "Internal ROM header"
+    message: "Proposal modifies protected ROM region"
+
+  # Policy 4: Human Review Requirements
+  - name: human_review_required
+    type: review_requirement
+    enabled: true
+    severity: warning
+    conditions:
+      - if: bytes_changed > 1024
+        then: require_diff_review
+        message: "Large change requires diff review"
+      - if: commands_executed > 10
+        then: require_log_review
+        message: "Complex operation requires log review"
+      - if: test_failures > 0
+        then: require_explanation
+        message: "Test failures require explanation"
+
+  # Policy 5: Palette Modifications
+  - name: palette_safety
+    type: change_constraint
+    enabled: true
+    severity: warning
+    rules:
+      - max_palettes_changed: 5
+      - preserve_transparency: true  # Don't modify color index 0
+    message: "Palette changes exceed safety threshold"
+```
+
+### Schema Definition
+
+```yaml
+# Policy file structure
+version: string  # Semantic version (e.g., "1.0")
+enabled: boolean  # Master enable/disable
+
+policies:
+  - name: string  # Unique policy identifier
+    type: enum  # test_requirement | change_constraint | forbidden_range | review_requirement
+    enabled: boolean  # Policy-specific enable/disable
+    severity: enum  # critical | warning | info
+    
+    # Type-specific fields:
+    rules: array  # For test_requirement, change_constraint
+    ranges: array  # For forbidden_range
+    conditions: array  # For review_requirement
+    
+    message: string  # User-facing error message
+```
+
+## Implementation Plan
+
+### Phase 1: Core Infrastructure (2 hours)
+
+#### 1.1 Create PolicyEvaluator Service
+
+**File**: `src/cli/service/policy_evaluator.h`
+
+```cpp
+#ifndef YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
+#define YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
+
+#include <string>
+#include <vector>
+#include <memory>
+#include "absl/status/status.h"
+#include "absl/status/statusor.h"
+#include "absl/strings/string_view.h"
+
+namespace yaze {
+namespace cli {
+
+// Policy violation severity levels
+enum class PolicySeverity {
+  kInfo,      // Informational, doesn't block acceptance
+  kWarning,   // Warning, can be overridden
+  kCritical   // Critical, blocks acceptance
+};
+
+// Individual policy violation
+struct PolicyViolation {
+  std::string policy_name;
+  PolicySeverity severity;
+  std::string message;
+  std::string details;  // Additional context
+};
+
+// Result of policy evaluation
+struct PolicyResult {
+  bool passed;  // True if all critical policies passed
+  std::vector<PolicyViolation> violations;
+  
+  // Categorized violations
+  std::vector<PolicyViolation> critical_violations;
+  std::vector<PolicyViolation> warnings;
+  std::vector<PolicyViolation> info;
+  
+  // Helper methods
+  bool has_critical_violations() const { return !critical_violations.empty(); }
+  bool can_accept_with_override() const { 
+    return !has_critical_violations() && !warnings.empty(); 
+  }
+};
+
+// Singleton service for evaluating proposals against policies
+class PolicyEvaluator {
+ public:
+  static PolicyEvaluator& GetInstance();
+  
+  // Load policies from disk (.yaze/policies/agent.yaml)
+  absl::Status LoadPolicies(absl::string_view policy_dir = ".yaze/policies");
+  
+  // Evaluate a proposal against all loaded policies
+  absl::StatusOr<PolicyResult> EvaluateProposal(
+      absl::string_view proposal_id);
+  
+  // Reload policies from disk (for live editing)
+  absl::Status ReloadPolicies();
+  
+  // Check if policies are loaded and enabled
+  bool IsEnabled() const { return enabled_; }
+  
+  // Get policy configuration path
+  std::string GetPolicyPath() const { return policy_path_; }
+
+ private:
+  PolicyEvaluator() = default;
+  ~PolicyEvaluator() = default;
+  
+  // Non-copyable, non-movable
+  PolicyEvaluator(const PolicyEvaluator&) = delete;
+  PolicyEvaluator& operator=(const PolicyEvaluator&) = delete;
+  
+  // Parse YAML policy file
+  absl::Status ParsePolicyFile(absl::string_view yaml_content);
+  
+  // Evaluate individual policy types
+  void EvaluateTestRequirements(
+      absl::string_view proposal_id, PolicyResult* result);
+  void EvaluateChangeConstraints(
+      absl::string_view proposal_id, PolicyResult* result);
+  void EvaluateForbiddenRanges(
+      absl::string_view proposal_id, PolicyResult* result);
+  void EvaluateReviewRequirements(
+      absl::string_view proposal_id, PolicyResult* result);
+  
+  bool enabled_ = false;
+  std::string policy_path_;
+  
+  // Parsed policy structures (implementation detail)
+  struct PolicyConfig;
+  std::unique_ptr<PolicyConfig> config_;
+};
+
+}  // namespace cli
+}  // namespace yaze
+
+#endif  // YAZE_CLI_SERVICE_POLICY_EVALUATOR_H
+```
+
+#### 1.2 Create Policy Configuration Structures
+
+**File**: `src/cli/service/policy_evaluator.cc` (partial)
+
+```cpp
+#include "src/cli/service/policy_evaluator.h"
+
+#include <fstream>
+#include <sstream>
+#include "absl/strings/str_format.h"
+#include "src/cli/service/proposal_registry.h"
+
+// If YAML parsing is available
+#ifdef YAZE_WITH_YAML
+#include <yaml-cpp/yaml.h>
+#endif
+
+namespace yaze {
+namespace cli {
+
+// Internal policy configuration structures
+struct PolicyEvaluator::PolicyConfig {
+  std::string version;
+  bool enabled;
+  
+  struct TestRequirement {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    std::vector<std::pair<std::string, double>> test_suites;  // suite name → min pass rate
+    std::string message;
+  };
+  
+  struct ChangeConstraint {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    int max_bytes_changed = -1;
+    std::vector<int> allowed_banks;
+    int max_commands_executed = -1;
+    int max_palettes_changed = -1;
+    bool preserve_transparency = false;
+    std::string message;
+  };
+  
+  struct ForbiddenRange {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    std::vector<std::tuple<int, int, std::string>> ranges;  // start, end, reason
+    std::string message;
+  };
+  
+  struct ReviewRequirement {
+    std::string name;
+    bool enabled;
+    PolicySeverity severity;
+    std::vector<std::string> conditions;
+    std::string message;
+  };
+  
+  std::vector<TestRequirement> test_requirements;
+  std::vector<ChangeConstraint> change_constraints;
+  std::vector<ForbiddenRange> forbidden_ranges;
+  std::vector<ReviewRequirement> review_requirements;
+};
+
+// Singleton instance
+PolicyEvaluator& PolicyEvaluator::GetInstance() {
+  static PolicyEvaluator instance;
+  return instance;
+}
+
+absl::Status PolicyEvaluator::LoadPolicies(absl::string_view policy_dir) {
+  policy_path_ = absl::StrFormat("%s/agent.yaml", policy_dir);
+  
+  // Check if file exists
+  std::ifstream file(policy_path_);
+  if (!file.good()) {
+    // No policy file - policies disabled
+    enabled_ = false;
+    return absl::OkStatus();
+  }
+  
+  // Read file content
+  std::stringstream buffer;
+  buffer << file.rdbuf();
+  std::string yaml_content = buffer.str();
+  
+  return ParsePolicyFile(yaml_content);
+}
+
+absl::Status PolicyEvaluator::ParsePolicyFile(absl::string_view yaml_content) {
+#ifndef YAZE_WITH_YAML
+  return absl::UnimplementedError(
+      "YAML support not compiled. Build with YAZE_WITH_YAML=ON");
+#else
+  try {
+    YAML::Node root = YAML::Load(std::string(yaml_content));
+    
+    config_ = std::make_unique<PolicyConfig>();
+    config_->version = root["version"].as<std::string>("1.0");
+    config_->enabled = root["enabled"].as<bool>(true);
+    
+    if (!config_->enabled) {
+      enabled_ = false;
+      return absl::OkStatus();
+    }
+    
+    // Parse policies array
+    if (root["policies"]) {
+      for (const auto& policy_node : root["policies"]) {
+        std::string type = policy_node["type"].as<std::string>();
+        
+        if (type == "test_requirement") {
+          // Parse test requirement policy
+          // ... (implementation continues)
+        } else if (type == "change_constraint") {
+          // Parse change constraint policy
+          // ... (implementation continues)
+        } else if (type == "forbidden_range") {
+          // Parse forbidden range policy
+          // ... (implementation continues)
+        } else if (type == "review_requirement") {
+          // Parse review requirement policy
+          // ... (implementation continues)
+        }
+      }
+    }
+    
+    enabled_ = true;
+    return absl::OkStatus();
+    
+  } catch (const YAML::Exception& e) {
+    return absl::InvalidArgumentError(
+        absl::StrFormat("Failed to parse policy YAML: %s", e.what()));
+  }
+#endif
+}
+
+// ... (implementation continues with evaluation methods)
+
+}  // namespace cli
+}  // namespace yaze
+```
+
+### Phase 2: Policy Evaluation Logic (2-3 hours)
+
+Implement the core evaluation methods that check proposals against each policy type.
+
+### Phase 3: GUI Integration (2 hours)
+
+#### 3.1 Update ProposalDrawer
+
+**File**: `src/app/editor/system/proposal_drawer.cc`
+
+Add policy status display and gating logic:
+
+```cpp
+#include "src/cli/service/policy_evaluator.h"
+
+void ProposalDrawer::DrawProposalDetail(const std::string& proposal_id) {
+  // ... existing detail view code ...
+  
+  // === Policy Status Section ===
+  ImGui::Separator();
+  ImGui::TextUnformatted("Policy Status:");
+  
+  auto& policy_eval = cli::PolicyEvaluator::GetInstance();
+  if (policy_eval.IsEnabled()) {
+    auto policy_result = policy_eval.EvaluateProposal(proposal_id);
+    
+    if (policy_result.ok()) {
+      const auto& result = policy_result.value();
+      
+      if (result.passed) {
+        ImGui::TextColored(ImVec4(0, 1, 0, 1), "✓ All policies passed");
+      } else {
+        // Show violations
+        if (result.has_critical_violations()) {
+          ImGui::TextColored(ImVec4(1, 0, 0, 1), "⛔ Critical violations:");
+          for (const auto& violation : result.critical_violations) {
+            ImGui::BulletText("%s: %s", 
+                violation.policy_name.c_str(), 
+                violation.message.c_str());
+          }
+        }
+        
+        if (!result.warnings.empty()) {
+          ImGui::TextColored(ImVec4(1, 1, 0, 1), "⚠️ Warnings:");
+          for (const auto& violation : result.warnings) {
+            ImGui::BulletText("%s: %s", 
+                violation.policy_name.c_str(), 
+                violation.message.c_str());
+          }
+        }
+      }
+      
+      // Gate Accept button
+      ImGui::Separator();
+      bool can_accept = !result.has_critical_violations();
+      
+      if (!can_accept) {
+        ImGui::BeginDisabled();
+      }
+      
+      if (ImGui::Button("Accept Proposal")) {
+        if (result.can_accept_with_override() && !override_confirmed_) {
+          // Show override confirmation dialog
+          ImGui::OpenPopup("Override Policy");
+        } else {
+          AcceptProposal(proposal_id);
+        }
+      }
+      
+      if (!can_accept) {
+        ImGui::EndDisabled();
+        ImGui::SameLine();
+        ImGui::TextColored(ImVec4(1, 0, 0, 1), 
+            "(Accept blocked by policy violations)");
+      }
+      
+      // Override confirmation dialog
+      if (ImGui::BeginPopupModal("Override Policy", nullptr, 
+          ImGuiWindowFlags_AlwaysAutoResize)) {
+        ImGui::Text("This proposal has policy warnings.");
+        ImGui::Text("Do you want to override and accept anyway?");
+        ImGui::Text("This action will be logged.");
+        ImGui::Separator();
+        
+        if (ImGui::Button("Override and Accept")) {
+          override_confirmed_ = true;
+          AcceptProposal(proposal_id);
+          ImGui::CloseCurrentPopup();
+        }
+        ImGui::SameLine();
+        if (ImGui::Button("Cancel")) {
+          ImGui::CloseCurrentPopup();
+        }
+        ImGui::EndPopup();
+      }
+    } else {
+      ImGui::TextColored(ImVec4(1, 0, 0, 1), 
+          "Policy evaluation failed: %s", 
+          policy_result.status().message().data());
+    }
+  } else {
+    ImGui::TextColored(ImVec4(0.5, 0.5, 0.5, 1), 
+        "No policies configured");
+  }
+}
+```
+
+### Phase 4: Testing & Documentation (1-2 hours)
+
+#### 4.1 Example Policy File
+
+Create `.yaze/policies/agent.yaml.example`:
+
+```yaml
+# Example agent policy configuration
+# Copy to .yaze/policies/agent.yaml and customize
+
+version: 1.0
+enabled: true
+
+policies:
+  # Require test suites to pass
+  - name: require_tests
+    type: test_requirement
+    enabled: false  # Disabled by default (no tests yet)
+    severity: critical
+    rules:
+      - test_suite: "smoke_test"
+        min_pass_rate: 1.0
+    message: "All smoke tests must pass"
+
+  # Limit change scope
+  - name: limit_changes
+    type: change_constraint
+    enabled: true
+    severity: warning
+    rules:
+      - max_bytes_changed: 5120  # 5KB
+      - max_commands_executed: 15
+    message: "Keep changes small and focused"
+
+  # Protect ROM header
+  - name: protect_header
+    type: forbidden_range
+    enabled: true
+    severity: critical
+    ranges:
+      - start: 0xFFB0
+        end: 0xFFFF
+        reason: "ROM header"
+    message: "Cannot modify ROM header"
+```
+
+#### 4.2 Unit Tests
+
+Create `test/cli/policy_evaluator_test.cc`:
+
+```cpp
+#include "src/cli/service/policy_evaluator.h"
+#include "gtest/gtest.h"
+
+namespace yaze {
+namespace cli {
+namespace {
+
+TEST(PolicyEvaluatorTest, LoadPoliciesSuccess) {
+  auto& eval = PolicyEvaluator::GetInstance();
+  auto status = eval.LoadPolicies("test/fixtures/policies");
+  EXPECT_TRUE(status.ok());
+  EXPECT_TRUE(eval.IsEnabled());
+}
+
+TEST(PolicyEvaluatorTest, EvaluateProposal_NoViolations) {
+  // ... test implementation
+}
+
+TEST(PolicyEvaluatorTest, EvaluateProposal_CriticalViolation) {
+  // ... test implementation
+}
+
+}  // namespace
+}  // namespace cli
+}  // namespace yaze
+```
+
+## Deliverables
+
+- [x] Policy evaluator service interface
+- [ ] YAML policy parser implementation
+- [ ] Policy evaluation logic for all 4 types
+- [ ] ProposalDrawer GUI integration
+- [ ] Policy override workflow
+- [ ] Example policy configurations
+- [ ] Unit tests
+- [ ] Documentation and usage guide
+
+## Success Criteria
+
+1. **Functional**:
+   - Policies load from YAML files
+   - Proposals evaluated against all enabled policies
+   - Accept button gated by critical violations
+   - Override workflow for warnings
+
+2. **User Experience**:
+   - Clear policy status display in ProposalDrawer
+   - Helpful violation messages
+   - Override confirmation dialog
+   - Policy evaluation fast (< 100ms)
+
+3. **Quality**:
+   - Unit test coverage > 80%
+   - No crashes or memory leaks
+   - Graceful handling of malformed YAML
+   - Works with policies disabled
+
+## Future Enhancements
+
+- Policy templates for common scenarios
+- Policy violation history/analytics
+- Auto-fix suggestions for violations
+- Integration with CI/CD for automated policy checks
+- Policy versioning and migration
+
+---
+
+**Status**: Ready for implementation  
+**Next Step**: Create PolicyEvaluator skeleton and wire into build system  
+**Estimated Completion**: October 3-4, 2025