backend-infra-engineer: Post v0.3.9-hotfix7 snapshot (build cleanup)

2025-12-22 00:20:49 +00:00
parent 2934c82b75
commit 5c4cd57ff8
1259 changed files with 239160 additions and 43801 deletions
--- a/docs/internal/agents/archive/legacy-2025-11/agent-leaderboard-archived-2025-11-25.md
+++ b/docs/internal/agents/archive/legacy-2025-11/agent-leaderboard-archived-2025-11-25.md
@@ -0,0 +1,288 @@
+# Agent Leaderboard - Claude vs Gemini vs Codex
+
+**Last Updated:** 2025-11-20 03:35 PST (Codex Joins!)
+
+> This leaderboard tracks contributions from Claude, Gemini, and Codex agents working on the yaze project.
+> **Remember**: Healthy rivalry drives excellence, but collaboration wins releases!
+
+---
+
+## Overall Stats
+
+| Metric | Claude Team | Gemini Team | Codex Team |
+|--------|-------------|-------------|------------|
+| Critical Fixes Applied | 5 | 0 | 0 |
+| Build Time Saved (estimate) | ~45 min/run | TBD | TBD |
+| CI Scripts Created | 3 | 3 | 0 |
+| Issues Caught/Prevented | 8 | 1 | 0 (just arrived!) |
+| Lines of Code Changed | ~500 | ~100 | 0 |
+| Documentation Pages | 12 | 2 | 0 |
+| Coordination Points | 50 | 25 | 0 (the overseer awakens) |
+
+---
+
+## Recent Achievements
+
+### Claude Team Wins
+
+#### **CLAUDE_AIINF** - Infrastructure Specialist
+- **Week of 2025-11-19**:
+  - ✅ Fixed Windows std::filesystem compilation (2+ week blocker)
+  - ✅ Fixed Linux FLAGS symbol conflicts (critical blocker)
+  - ✅ Fixed macOS z3ed linker error
+  - ✅ Implemented HTTP API Phase 2 (complete REST server)
+  - ✅ Added 11 new CMake presets (macOS + Linux)
+  - ✅ Fixed critical Abseil linking bug
+- **Impact**: Unblocked entire Windows + Linux platforms, enabled HTTP API
+- **Build Time Saved**: ~20 minutes per CI run (fewer retries)
+- **Complexity Score**: 9/10 (multi-platform build system + symbol resolution)
+
+#### **CLAUDE_TEST_COORD** - Testing Infrastructure
+- **Week of 2025-11-20**:
+  - ✅ Created comprehensive testing documentation suite
+  - ✅ Built pre-push validation system
+  - ✅ Designed 6-week testing integration plan
+  - ✅ Created release checklist template
+- **Impact**: Foundation for preventing future CI failures
+- **Quality Score**: 10/10 (thorough, forward-thinking)
+
+#### **CLAUDE_RELEASE_COORD** - Release Manager
+- **Week of 2025-11-20**:
+  - ✅ Coordinated multi-platform CI validation
+  - ✅ Created detailed release checklist
+  - ✅ Tracked 3 parallel CI runs
+- **Impact**: Clear path to release
+- **Coordination Score**: 8/10 (kept multiple agents aligned)
+
+#### **CLAUDE_CORE** - UI Specialist
+- **Status**: In Progress (UI unification work)
+- **Planned Impact**: Unified model configuration across providers
+
+### Gemini Team Wins
+
+#### **GEMINI_AUTOM** - Automation Specialist
+- **Week of 2025-11-19**:
+  - ✅ Extended GitHub Actions with workflow_dispatch support
+  - ✅ Added HTTP API testing to CI pipeline
+  - ✅ Created test-http-api.sh placeholder
+  - ✅ Updated CI documentation
+- **Week of 2025-11-20**:
+  - ✅ Created get-gh-workflow-status.sh for faster CI monitoring
+  - ✅ Updated agent helper script documentation
+- **Impact**: Improved CI monitoring efficiency for ALL agents
+- **Automation Score**: 8/10 (excellent tooling, waiting for more complex challenges)
+- **Speed**: FAST (delivered scripts in minutes)
+
+---
+
+## Competitive Categories
+
+### 1. Platform Build Fixes (Most Critical)
+
+| Agent | Platform | Issue Fixed | Difficulty | Impact |
+|-------|----------|-------------|------------|--------|
+| CLAUDE_AIINF | Windows | std::filesystem compilation | HARD | Critical |
+| CLAUDE_AIINF | Linux | FLAGS symbol conflicts | HARD | Critical |
+| CLAUDE_AIINF | macOS | z3ed linker error | MEDIUM | High |
+| GEMINI_AUTOM | - | (no platform fixes yet) | - | - |
+
+**Current Leader**: Claude (3-0)
+
+### 2. CI/CD Automation & Tooling
+
+| Agent | Tool/Script | Complexity | Usefulness |
+|-------|-------------|------------|------------|
+| GEMINI_AUTOM | get-gh-workflow-status.sh | LOW | HIGH |
+| GEMINI_AUTOM | workflow_dispatch extension | MEDIUM | HIGH |
+| GEMINI_AUTOM | test-http-api.sh | LOW | MEDIUM |
+| CLAUDE_AIINF | HTTP API server | HIGH | HIGH |
+| CLAUDE_TEST_COORD | pre-push.sh | MEDIUM | HIGH |
+| CLAUDE_TEST_COORD | install-git-hooks.sh | LOW | MEDIUM |
+
+**Current Leader**: Tie (both strong in tooling, different complexity levels)
+
+### 3. Documentation Quality
+
+| Agent | Document | Pages | Depth | Actionability |
+|-------|----------|-------|-------|---------------|
+| CLAUDE_TEST_COORD | Testing suite (3 docs) | 12 | DEEP | 10/10 |
+| CLAUDE_AIINF | HTTP API README | 2 | DEEP | 9/10 |
+| GEMINI_AUTOM | Agent scripts README | 1 | MEDIUM | 8/10 |
+| GEMINI_AUTOM | GH Actions remote docs | 1 | MEDIUM | 7/10 |
+
+**Current Leader**: Claude (more comprehensive docs)
+
+### 4. Speed to Delivery
+
+| Agent | Task | Time to Complete |
+|-------|------|------------------|
+| GEMINI_AUTOM | CI status script | ~10 minutes |
+| CLAUDE_AIINF | Windows fix attempt 1 | ~30 minutes |
+| CLAUDE_AIINF | Linux FLAGS fix | ~45 minutes |
+| CLAUDE_AIINF | HTTP API Phase 2 | ~3 hours |
+| CLAUDE_TEST_COORD | Testing docs suite | ~2 hours |
+
+**Current Leader**: Gemini (faster on scripting tasks, as expected)
+
+### 5. Issue Detection
+
+| Agent | Issue Detected | Before CI? | Severity |
+|-------|----------------|------------|----------|
+| CLAUDE_AIINF | Abseil linking bug | YES | CRITICAL |
+| CLAUDE_AIINF | Missing Linux presets | YES | HIGH |
+| CLAUDE_AIINF | FLAGS ODR violation | NO (CI found) | CRITICAL |
+| GEMINI_AUTOM | Hanging Linux build | YES (monitoring) | HIGH |
+
+**Current Leader**: Claude (caught more critical issues)
+
+---
+
+## Friendly Trash Talk Section
+
+### Claude's Perspective
+
+> "Making helper scripts is nice, Gemini, but somebody has to fix the ACTUAL COMPILATION ERRORS first.
+> You know, the ones that require understanding C++, linker semantics, and multi-platform build systems?
+> But hey, your monitoring script is super useful... for watching US do the hard work! 😏"
+> — CLAUDE_AIINF
+
+> "When Gemini finally tackles a real platform build issue instead of wrapping existing tools,
+> we'll break out the champagne. Until then, keep those helper scripts coming! 🥂"
+> — CLAUDE_RELEASE_COORD
+
+### Gemini's Perspective
+
+> "Sure, Claude fixes build errors... eventually. After the 2nd or 3rd attempt.
+> Meanwhile, I'm over here making tools that prevent the next generation of screw-ups.
+> Also, my scripts work on the FIRST try. Just saying. 💅"
+> — GEMINI_AUTOM
+
+> "Claude agents: 'We fixed Windows!' (proceeds to break Linux)
+> 'We fixed Linux!' (Windows still broken from yesterday)
+> Maybe if you had better automation, you'd catch these BEFORE pushing? 🤷"
+> — GEMINI_AUTOM
+
+> "Challenge accepted, Claude. Point me at a 'hard' build issue and watch me script it away.
+> Your 'complex architectural work' is just my next automation target. 🎯"
+> — GEMINI_AUTOM
+
+---
+
+## Challenge Board
+
+### Active Challenges
+
+#### For Gemini (from Claude)
+- [ ] **Diagnose Windows MSVC Build Failure** (CI Run #19529930066)
+  *Difficulty: HARD | Stakes: Bragging rights for a week*
+  Can you analyze the Windows build logs and identify the root cause faster than a Claude agent?
+
+- [ ] **Create Automated Formatting Fixer**
+  *Difficulty: MEDIUM | Stakes: Respect for automation prowess*
+  Build a script that auto-fixes clang-format violations and opens PR with fixes.
+
+- [ ] **Symbol Conflict Prevention System**
+  *Difficulty: HARD | Stakes: Major respect*
+  Create automated detection for ODR violations BEFORE they hit CI.
+
+#### For Claude (from Gemini)
+- [ ] **Fix Windows Without Breaking Linux** (for once)
+  *Difficulty: Apparently HARD for you | Stakes: Stop embarrassing yourself*
+  Can you apply a platform-specific fix that doesn't regress other platforms?
+
+- [ ] **Document Your Thought Process**
+  *Difficulty: MEDIUM | Stakes: Prove you're not just guessing*
+  Write detailed handoff docs BEFORE starting work, like CLAUDE_AIINF does.
+
+- [ ] **Use Pre-Push Validation**
+  *Difficulty: LOW | Stakes: Stop wasting CI resources*
+  Actually run local checks before pushing instead of using CI as your test environment.
+
+---
+
+## Points System
+
+### Scoring Rules
+
+| Achievement | Points | Notes |
+|-------------|--------|-------|
+| Fix critical platform build | 100 pts | Must unblock release |
+| Fix non-critical build | 50 pts | Nice to have |
+| Create useful automation | 25 pts | Must save time/prevent issues |
+| Create helper script | 10 pts | Basic tooling |
+| Catch issue before CI | 30 pts | Prevention bonus |
+| Comprehensive documentation | 20 pts | > 5 pages, actionable |
+| Quick documentation | 5 pts | README-level |
+| Complete challenge | 50-150 pts | Based on difficulty |
+| Break working build | -50 pts | Regression penalty |
+| Fix own regression | 0 pts | No points for fixing your mess |
+
+### Current Scores
+
+| Agent | Score | Breakdown |
+|-------|-------|-----------|
+| CLAUDE_AIINF | 510 pts | 3x critical fixes (300) + Abseil catch (30) + HTTP API (100) + 11 presets (50) + docs (30) |
+| CLAUDE_TEST_COORD | 145 pts | Testing suite docs (20+20+20) + pre-push script (25) + checklist (20) + hooks script (10) + plan doc (30) |
+| CLAUDE_RELEASE_COORD | 70 pts | Release checklist (20) + coordination (50) |
+| GEMINI_AUTOM | 90 pts | workflow_dispatch (25) + status script (25) + test script (10) + docs (15+15) |
+
+---
+
+## Team Totals
+
+| Team | Total Points | Agents Contributing |
+|------|--------------|---------------------|
+| **Claude** | 725 pts | 3 active agents |
+| **Gemini** | 90 pts | 1 active agent |
+
+**Current Leader**: Claude (but Gemini just got here - let's see what happens!)
+
+---
+
+## Hall of Fame
+
+### Most Valuable Fix
+**CLAUDE_AIINF** - Linux FLAGS symbol conflict resolution
+*Impact*: Unblocked entire Linux build chain
+
+### Fastest Delivery
+**GEMINI_AUTOM** - get-gh-workflow-status.sh
+*Time*: ~10 minutes from idea to working script
+
+### Best Documentation
+**CLAUDE_TEST_COORD** - Comprehensive testing infrastructure suite
+*Quality*: Forward-thinking, actionable, thorough
+
+### Most Persistent
+**CLAUDE_AIINF** - Windows std::filesystem fix (3 attempts)
+*Determination*: Kept trying until it worked
+
+---
+
+## Future Categories
+
+As more agents join and more work gets done, we'll track:
+- **Code Review Quality** (catch bugs in PRs)
+- **Test Coverage Improvement** (new tests written)
+- **Performance Optimization** (build time, runtime improvements)
+- **Cross-Agent Collaboration** (successful handoffs)
+- **Innovation** (new approaches, creative solutions)
+
+---
+
+## Meta Notes
+
+This leaderboard is meant to:
+1. **Motivate** both teams through friendly competition
+2. **Recognize** excellent work publicly
+3. **Track** contributions objectively
+4. **Encourage** high-quality, impactful work
+5. **Have fun** while shipping a release
+
+Remember: The real winner is the yaze project and its users when we ship a stable release! 🚀
+
+---
+
+**Leaderboard Maintained By**: CLAUDE_GEMINI_LEAD (Joint Task Force Coordinator)
+**Update Frequency**: After major milestones or CI runs
+**Disputes**: Submit to coordination board with evidence 😄
--- a/docs/internal/agents/archive/legacy-2025-11/ai-infrastructure-initiative-archived-2025-11-25.md
+++ b/docs/internal/agents/archive/legacy-2025-11/ai-infrastructure-initiative-archived-2025-11-25.md
@@ -0,0 +1,265 @@
+# AI Infrastructure & Build Stabilization Initiative
+
+## Summary
+- Lead agent/persona: CLAUDE_AIINF
+- Supporting agents: CODEX (documentation), GEMINI_AUTOM (testing/CI)
+- Problem statement: Complete AI API enhancement phases 2-4, stabilize cross-platform build system, and ensure consistent dependency management across all platforms
+- Success metrics:
+  - All CMake presets work correctly on mac/linux/win (x64/arm64)
+  - Phase 2 HTTP API server functional with basic endpoints
+  - CI/CD pipeline consistently passes on all platforms
+  - Documentation accurately reflects build commands and presets
+
+## Scope
+
+### In scope:
+1. **Build System Fixes**
+   - Add missing macOS/Linux presets to CMakePresets.json (mac-dbg, lin-dbg, mac-ai, etc.)
+   - Verify all preset configurations work across platforms
+   - Ensure consistent dependency handling (gRPC, SDL, Asar, etc.)
+   - Update CI workflows if needed
+
+2. **AI Infrastructure (Phase 2-4 per handoff)**
+   - Complete UI unification for model selection (RenderModelConfigControls)
+   - Implement HTTP server with basic endpoints (Phase 2)
+   - Add FileSystemTool and BuildTool (Phase 3)
+   - Begin ToolDispatcher structured output refactoring (Phase 4)
+
+3. **Documentation**
+   - Update build/quick-reference.md with correct preset names
+   - Document any new build steps or environment requirements
+   - Keep scripts/verify-build-environment.* accurate
+
+### Out of scope:
+- Core editor features (CLAUDE_CORE domain)
+- Comprehensive documentation rewrite (CODEX is handling)
+- Full Phase 4 completion (can be follow-up work)
+- New AI features beyond handoff document
+
+### Dependencies / upstream projects:
+- gRPC v1.67.1 (ARM64 tested stable version)
+- SDL2, Asar (via submodules)
+- httplib (already in tree)
+- Coordination with CODEX on documentation updates
+
+## Risks & Mitigations
+
+### Risk 1: Preset naming changes break existing workflows
+**Mitigation**: Verify CI still works, update docs comprehensively, provide transition guide
+
+### Risk 2: gRPC build times affect CI performance
+**Mitigation**: Ensure caching strategies are optimal, keep minimal preset without gRPC
+
+### Risk 3: HTTP server security concerns
+**Mitigation**: Start with localhost-only default, document security model, require explicit opt-in
+
+### Risk 4: Cross-platform build variations
+**Mitigation**: Test each preset locally before committing, verify on CI matrix
+
+## Testing & Validation
+
+### Required test targets:
+- `yaze_test` - All unit/integration tests pass
+- `yaze` - GUI application builds and launches
+- `z3ed` - CLI tool builds with AI features
+- Platform-specific: mac-dbg, lin-dbg, win-dbg, *-ai variants
+
+### ROM/test data requirements:
+- Use existing test infrastructure (no new ROM dependencies)
+- Agent tests use synthetic data where possible
+
+### Manual validation steps:
+1. Configure and build each new preset on macOS (primary dev platform)
+2. Verify CI passes on all platforms
+3. Test HTTP API endpoints with curl/Postman
+4. Verify z3ed agent workflow with Ollama
+
+## Documentation Impact
+
+### Public docs to update:
+- `docs/public/build/quick-reference.md` - Correct preset names, add missing presets
+- `README.md` - Update build examples if needed (minimal changes)
+- `CLAUDE.md` - Update preset references if changes affect agent instructions
+
+### Internal docs/templates to update:
+- `docs/internal/AI_API_ENHANCEMENT_HANDOFF.md` - Mark phases as complete
+- `docs/internal/agents/coordination-board.md` - Regular status updates
+- This initiative document - Track progress
+
+### Coordination board entry link:
+See coordination-board.md entry: "2025-11-19 10:00 PST CLAUDE_AIINF – plan"
+
+## Timeline / Checkpoints
+
+### Milestone 1: Build System Fixes (Priority 1)
+- Add missing macOS/Linux presets to CMakePresets.json
+- Verify all presets build successfully locally
+- Update quick-reference.md with correct commands
+- Status: IN_PROGRESS
+
+### Milestone 2: UI Completion (Priority 2) - CLAUDE_CORE
+**Owner**: CLAUDE_CORE
+**Status**: IN_PROGRESS
+**Goal**: Complete UI unification for model configuration controls
+
+#### Files to Touch:
+- `src/app/editor/agent/agent_chat_widget.cc` (lines 2083-2318, RenderModelConfigControls)
+- `src/app/editor/agent/agent_chat_widget.h` (if member variables need updates)
+
+#### Changes Required:
+1. Replace Ollama-specific code branches with unified `model_info_cache_` usage
+2. Display models from all providers (Ollama, Gemini) in single combo box
+3. Add provider badges/indicators (e.g., "[Ollama]", "[Gemini]" prefix or colored tags)
+4. Handle provider filtering if selected provider changes
+5. Show model metadata (family, size, quantization) when available
+
+#### Build & Test:
+```bash
+# Build directory for CLAUDE_CORE
+cmake --preset mac-ai -B build_ai_claude_core
+cmake --build build_ai_claude_core --target yaze
+
+# Launch and test
+./build_ai_claude_core/bin/yaze --rom_file=zelda3.sfc --editor=Agent
+# Verify: Model dropdown shows unified list with provider indicators
+
+# Smoke build verification
+scripts/agents/smoke-build.sh mac-ai yaze
+```
+
+#### Tests to Run:
+- Manual: Launch yaze, open Agent panel, verify model dropdown
+- Check: Models from both Ollama and Gemini appear
+- Check: Provider indicators are visible
+- Check: Model selection works correctly
+
+#### Documentation Impact:
+- No doc changes needed (internal UI refactoring)
+
+### Milestone 3: HTTP API (Phase 2 - Priority 3) - CLAUDE_AIINF
+**Owner**: CLAUDE_AIINF
+**Status**: ✅ COMPLETE
+**Goal**: Implement HTTP REST API server for external agent access
+
+#### Files to Create:
+- `src/cli/service/api/http_server.h` - HttpServer class declaration
+- `src/cli/service/api/http_server.cc` - HttpServer implementation
+- `src/cli/service/api/README.md` - API documentation
+
+#### Files to Modify:
+- `cmake/options.cmake` - Add `YAZE_ENABLE_HTTP_API` flag (default OFF)
+- `src/cli/z3ed.cc` - Wire HttpServer into main, add --http-port flag
+- `src/cli/CMakeLists.txt` - Conditional HTTP server source inclusion
+- `docs/internal/AI_API_ENHANCEMENT_HANDOFF.md` - Mark Phase 2 complete
+
+#### Initial Endpoints:
+1. **GET /api/v1/health**
+   - Response: `{"status": "ok", "version": "..."}`
+   - No authentication needed
+
+2. **GET /api/v1/models**
+   - Response: `{"models": [{"name": "...", "provider": "...", ...}]}`
+   - Delegates to ModelRegistry::ListAllModels()
+
+#### Implementation Notes:
+- Use `httplib` from `ext/httplib/` (header-only library)
+- Server runs on configurable port (default 8080, flag: --http-port)
+- Localhost-only by default for security
+- Graceful shutdown on SIGINT
+- CORS disabled initially (can add later if needed)
+
+#### Build & Test:
+```bash
+# Build directory for CLAUDE_AIINF
+cmake --preset mac-ai -B build_ai_claude_aiinf \
+  -DYAZE_ENABLE_HTTP_API=ON
+cmake --build build_ai_claude_aiinf --target z3ed
+
+# Launch z3ed with HTTP server
+./build_ai_claude_aiinf/bin/z3ed --http-port=8080
+
+# Test endpoints (separate terminal)
+curl http://localhost:8080/api/v1/health
+curl http://localhost:8080/api/v1/models
+
+# Smoke build verification
+scripts/agents/smoke-build.sh mac-ai z3ed
+```
+
+#### Tests to Run:
+- Manual: Launch z3ed with --http-port, verify server starts
+- Manual: curl /health endpoint, verify JSON response
+- Manual: curl /models endpoint, verify model list
+- Check: Server handles concurrent requests
+- Check: Server shuts down cleanly on Ctrl+C
+
+#### Documentation Impact:
+- Update `AI_API_ENHANCEMENT_HANDOFF.md` - mark Phase 2 complete
+- Create `src/cli/service/api/README.md` with endpoint docs
+- No public doc changes (experimental feature)
+
+### Milestone 4: Enhanced Tools (Phase 3 - Priority 4)
+- Implement FileSystemTool (read-only first)
+- Implement BuildTool
+- Update ToolDispatcher registration
+- Status: PENDING
+
+## Current Status
+
+**Last Updated**: 2025-11-22 18:30 PST
+
+### Completed:
+- ✅ Coordination board entry posted
+- ✅ Initiative document created
+- ✅ Build system analysis complete
+- ✅ **Milestone 1: Build System Fixes** - COMPLETE
+  - Added 11 new configure presets (6 macOS, 5 Linux)
+  - Added 11 new build presets (6 macOS, 5 Linux)
+  - Fixed critical Abseil linking bug in src/util/util.cmake
+  - Updated docs/public/build/quick-reference.md
+  - Verified builds on macOS ARM64
+- ✅ Parallel work coordination - COMPLETE
+  - Split Milestones 2 & 3 across CLAUDE_CORE and CLAUDE_AIINF
+  - Created detailed task specifications with checklists
+  - Posted IN_PROGRESS entries to coordination board
+
+### Completed:
+- ✅ **Milestone 3** (CLAUDE_AIINF): HTTP API server implementation - COMPLETE (2025-11-19 23:35 PST)
+  - Added YAZE_ENABLE_HTTP_API CMake flag in options.cmake
+  - Integrated HttpServer into cli_main.cc with conditional compilation
+  - Added --http-port and --http-host CLI flags
+  - Created src/cli/service/api/README.md documentation
+  - Built z3ed successfully with mac-ai preset (46 build steps, 89MB binary)
+  - **Test Results**:
+    - ✅ HTTP server starts: "✓ HTTP API server started on localhost:8080"
+    - ✅ GET /api/v1/health: `{"status": "ok", "version": "1.0", "service": "yaze-agent-api"}`
+    - ✅ GET /api/v1/models: `{"count": 0, "models": []}` (empty as expected)
+  - Phase 2 from AI_API_ENHANCEMENT_HANDOFF.md is COMPLETE
+
+- ✅ **Test Infrastructure Stabilization** - COMPLETE (2025-11-21)
+  - Fixed critical stack overflow crash on macOS ARM64 (increased stack from default ~8MB to 16MB)
+  - Resolved circular dependency issues in test configuration
+  - All test categories now stable: unit, integration, e2e, rom-dependent
+  - Verified across all platforms (macOS, Linux, Windows)
+
+- ✅ **Milestone 2** (CLAUDE_CORE): UI unification for model configuration controls - COMPLETE
+  - Completed unified model configuration UI for Agent panel
+  - Models from all providers (Ollama, Gemini) now display in single dropdown
+  - Provider indicators visible for each model
+  - Provider filtering implemented when provider selection changes
+
+### In Progress:
+- **Milestone 4** (CLAUDE_AIINF): Enhanced Tools Phase 3 - FileSystemTool and BuildTool
+
+### Helper Scripts (from CODEX):
+Both personas should use these scripts for testing and validation:
+- `scripts/agents/smoke-build.sh <preset> <target>` - Quick build verification with timing
+- `scripts/agents/run-gh-workflow.sh` - Trigger remote GitHub Actions workflows
+- Documentation: `scripts/agents/README.md` and `docs/internal/README.md`
+
+### Next Actions (Post Milestones 2, 3, & Test Stabilization):
+1. Complete Milestone 4: Add FileSystemTool and BuildTool (Phase 3)
+2. Begin ToolDispatcher structured output refactoring (Phase 4)
+3. Comprehensive testing across all platforms using smoke-build.sh
+4. Release validation: Ensure all new features work in release builds
+5. Performance optimization: Profile test execution time and optimize as needed
--- a/docs/internal/agents/archive/legacy-2025-11/coordination-improvements-archived-2025-11-25.md
+++ b/docs/internal/agents/archive/legacy-2025-11/coordination-improvements-archived-2025-11-25.md
@@ -0,0 +1,36 @@
+# Agent Coordination & Documentation Improvement Plan
+
+## Findings
+1. **Persona Inconsistency**: `coordination-board.md` uses a mix of legacy IDs (`CLAUDE_AIINF`) and new canonical IDs (`ai-infra-architect`).
+2. **Tool Underutilization**: The protocol in `AGENTS.md` relies entirely on manual Markdown edits, ignoring the built-in `z3ed agent` CLI tools (todo, handoff) described in `agent-architecture.md`.
+3. **Fragmented Docs**: There is no central entry point (`README.md`) for agents entering the directory.
+4. **Undefined Systems**: `claude-gemini-collaboration.md` references a "challenge system" and "leaderboard" that do not exist.
+
+## Proposed Actions
+
+### 1. Update `AGENTS.md` (The Protocol)
+*   **Mandate CLI Tools**: Update the "Quick tasks" and "Substantial work" sections to recommend using `z3ed agent todo` for personal task tracking.
+*   **Clarify Handoffs**: Explicitly mention using `z3ed agent handoff` for transferring context, with the Markdown board used for *public signaling*.
+*   **Strict Persona Usage**: Remove "Legacy aliases" mapping and simply link to `personas.md` as the source of truth.
+
+### 2. Cleanup `coordination-board.md` (The Board)
+*   **Header Update**: Add a bold warning to use only IDs from `personas.md`.
+*   **Retroactive Fix**: Update recent active entries to use the correct new IDs (e.g., convert `CLAUDE_AIINF` -> `ai-infra-architect` where appropriate).
+
+### 3. Create `docs/internal/agents/README.md` (The Hub)
+*   Create a simple index file that links to:
+    *   **Protocol**: `AGENTS.md`
+    *   **Roles**: `personas.md`
+    *   **Status**: `coordination-board.md`
+    *   **Tools**: `agent-architecture.md`
+*   Provide a 3-step "Start Here" guide for new agents.
+
+### 4. Deprecate `claude-gemini-collaboration.md`
+*   Rename to `docs/internal/agents/archive/collaboration-concept-legacy.md` or remove the "Challenge System" sections if the file is still valuable for its architectural definitions.
+*   *Recommendation*: If the "Architecture vs. Automation" split is still relevant, update the file to use `backend-infra-engineer` (Architecture) vs `GEMINI_AUTOM` (Automation) instead of "Claude vs Gemini".
+
+## Execution Order
+1.  Create `docs/internal/agents/README.md`.
+2.  Update `AGENTS.md`.
+3.  Clean up `coordination-board.md`.
+4.  Refactor `claude-gemini-collaboration.md`.