# Testing Infrastructure Gap Analysis

## Executive Summary

Recent CI failures revealed critical gaps in our testing infrastructure that allowed platform-specific build failures to reach CI. This document analyzes what we currently test, what we missed, and what infrastructure is needed to catch issues earlier.

**Date**: 2025-11-20
**Triggered By**: Multiple CI failures in commits 43a0e5e314, c2bb90a3f1, and related fixes

---

## 1. Issues We Didn't Catch Locally

### 1.1 Windows Abseil Include Path Issues (c2bb90a3f1)
**Problem**: Abseil headers not found during Windows/clang-cl compilation
**Why it wasn't caught**:
- No local pre-push compilation check
- CMake configuration validates successfully, but compilation fails later
- Include path propagation from gRPC/Abseil not validated until full compile

**What would have caught it**:
- ✅ Smoke compilation test (compile subset of files to catch header issues)
- ✅ CMake configuration validator (check include path propagation)
- ✅ Header dependency checker

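
A smoke compilation check of the kind listed above could be sketched as follows. This is a minimal Python sketch, assuming the build exports a `compile_commands.json` (for example via `CMAKE_EXPORT_COMPILE_COMMANDS`) and treating each source directory as a stand-in for one library. The function names are hypothetical and not part of the existing scripts.

```python
import json
import os


def load_compile_commands(path):
    """Read a compile_commands.json file into a list of entries."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def pick_smoke_entries(compile_commands):
    """Keep the first entry seen for each source directory.

    Assumption: one source directory roughly corresponds to one library,
    so compiling one file per directory exercises each library's include paths.
    """
    chosen = {}
    for entry in compile_commands:
        source_dir = os.path.dirname(entry["file"])
        chosen.setdefault(source_dir, entry)
    return list(chosen.values())
```

Running the `command` of each selected entry from its `directory` compiles one file per directory, which should surface missing-header errors without a full build.
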
### 1.2 Linux FLAGS Symbol Conflicts (43a0e5e314, eb77bbeaff)
**Problem**: ODR (One Definition Rule) violation - multiple `FLAGS` symbols across libraries
**Why it wasn't caught**:
- Symbol conflicts only appear at link time
- No cross-library symbol conflict detection
- Static analysis doesn't catch ODR violations
- Unit tests don't link full dependency graph

**What would have caught it**:
- ✅ Symbol conflict scanner (nm/objdump analysis)
- ✅ ODR violation detector
- ✅ Full integration build test (link all libraries together)

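
The symbol conflict scanner could start from something like the sketch below. It assumes plain (non-demangled) `nm` output captured per library, and it only counts strong external definitions (types `T`, `D`, `B`, `R`). The function names and the sample symbols are illustrative, and Windows `dumpbin` output is not handled.

```python
from collections import defaultdict

# nm type letters for strong, externally visible definitions:
# T (text), D (data), B (bss), R (read-only data).
# Undefined (U) and weak (W/V) symbols are skipped on purpose.
STRONG_DEFINED = set("TDBR")


def parse_nm_output(text):
    """Return the set of strongly defined global symbols in nm output."""
    symbols = set()
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols look like "<address> <type> <name>".
        if len(parts) == 3 and parts[1] in STRONG_DEFINED:
            symbols.add(parts[2])
    return symbols


def find_conflicts(nm_by_library):
    """Map each symbol defined in more than one library to those libraries."""
    owners = defaultdict(list)
    for library, text in nm_by_library.items():
        for symbol in parse_nm_output(text):
            owners[symbol].append(library)
    return {s: sorted(libs) for s, libs in owners.items() if len(libs) > 1}


# Hypothetical nm output for two libraries that both define FLAGS_rom.
lib_a = "0000000000000000 T FLAGS_rom\n                 U printf\n"
lib_b = "0000000000000000 D FLAGS_rom\n0000000000000004 T only_in_b\n"
print(find_conflicts({"liba.a": lib_a, "libb.a": lib_b}))
```

A non-empty result would fail the check. Whether a given duplicate is a real ODR violation still needs a human look, since the scan only compares names.
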
### 1.3 Platform-Specific Configuration Issues
**Problem**: Preprocessor flags, compiler detection, and platform-specific code paths
**Why it wasn't caught**:
- No local cross-platform validation
- CMake configuration differences between platforms not tested
- Compiler detection logic (clang-cl vs MSVC) not validated

**What would have caught it**:
- ✅ CMake configuration dry-run on multiple platforms
- ✅ Preprocessor flag validation
- ✅ Compiler detection smoke test

---

## 2. Current Testing Coverage

### 2.1 What We Test Well

#### Unit Tests (test/unit/)
- **Coverage**: Core algorithms, data structures, parsers
- **Speed**: Fast (<1s for most tests)
- **Isolation**: Mocked dependencies, no ROM required
- **CI**: ✅ Runs on every PR
- **Example**: `hex_test.cc`, `asar_wrapper_test.cc`, `snes_palette_test.cc`

**Strengths**:
- Catches logic errors quickly
- Good for TDD
- Platform-independent

**Gaps**:
- Doesn't catch build system issues
- Doesn't catch linking problems
- Doesn't validate dependencies

#### Integration Tests (test/integration/)
- **Coverage**: Multi-component interactions, ROM operations
- **Speed**: Slower (1-10s per test)
- **Dependencies**: May require ROM files
- **CI**: ❌ Not currently run in CI (see Sections 3.1 and 3.2)
- **Example**: `asar_integration_test.cc`, `dungeon_editor_v2_test.cc`

**Strengths**:
- Tests component interactions
- Validates ROM operations

**Gaps**:
- Still doesn't catch platform-specific issues
- Doesn't validate symbol conflicts
- Doesn't test cross-library linking

#### E2E Tests (test/e2e/)
- **Coverage**: Full UI workflows, user interactions
- **Speed**: Very slow (10-60s per test)
- **Dependencies**: GUI, ImGuiTestEngine
- **CI**: ⚠️ Limited (only on macOS z3ed-agent-test)
- **Example**: `dungeon_editor_smoke_test.cc`, `canvas_selection_test.cc`

**Strengths**:
- Validates real user workflows
- Tests UI responsiveness

**Gaps**:
- Not run consistently across platforms
- Slow feedback loop
- Requires display/window system

### 2.2 What We DON'T Test

#### Build System Validation
- ❌ CMake configuration correctness per preset
- ❌ Include path propagation from dependencies
- ❌ Compiler flag compatibility
- ❌ Linker flag validation
- ❌ Cross-preset compatibility

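
Per-preset configuration checks could be driven directly from `CMakePresets.json`. The sketch below is one possible approach, assuming the standard `configurePresets` array. It ignores preset `condition` fields, and the helper names and sample preset names are hypothetical.

```python
import json


def visible_configure_presets(presets_json):
    """Return names of non-hidden configure presets from CMakePresets.json text."""
    data = json.loads(presets_json)
    return [
        preset["name"]
        for preset in data.get("configurePresets", [])
        if not preset.get("hidden", False)
    ]


def configure_commands(preset_names):
    """Build one `cmake --preset <name>` invocation per preset."""
    return [["cmake", "--preset", name] for name in preset_names]


# Hypothetical presets file with one hidden base preset.
sample = json.dumps({
    "version": 3,
    "configurePresets": [
        {"name": "base", "hidden": True},
        {"name": "mac-dbg", "inherits": "base"},
        {"name": "win-dbg", "inherits": "base"},
    ],
})
for command in configure_commands(visible_configure_presets(sample)):
    print(" ".join(command))
```

Each generated command only configures the project. A preset that configures cleanly can still fail to compile, as Section 1.1 shows, so this complements rather than replaces a smoke build.
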
#### Symbol-Level Issues
- ❌ ODR (One Definition Rule) violations
- ❌ Duplicate symbol detection across libraries
- ❌ Symbol visibility (public/private)
- ❌ ABI compatibility between libraries

#### Platform-Specific Compilation
- ❌ Header-only compilation checks
- ❌ Preprocessor branch coverage
- ❌ Platform macro validation
- ❌ Compiler-specific feature detection

#### Dependency Health
- ❌ Include path conflicts
- ❌ Library version mismatches
- ❌ Transitive dependency validation
- ❌ Static vs shared library conflicts

---

## 3. CI/CD Coverage Analysis

### 3.1 Current CI Matrix (.github/workflows/ci.yml)

| Platform | Build | Test (stable) | Test (unit) | Test (integration) | Test (AI) |
|----------|-------|---------------|-------------|--------------------|-----------|
| Ubuntu 22.04 (GCC-12) | ✅ | ✅ | ✅ | ❌ | ❌ |
| macOS 14 (Clang) | ✅ | ✅ | ✅ | ❌ | ✅ |
| Windows 2022 (Core) | ✅ | ✅ | ✅ | ❌ | ❌ |
| Windows 2022 (AI) | ✅ | ✅ | ✅ | ❌ | ❌ |

**CI Job Flow**:
1. **build**: Configure + compile full project
2. **test**: Run stable + unit tests
3. **windows-agent**: Full AI stack (gRPC + AI runtime)
4. **code-quality**: clang-format, cppcheck, clang-tidy
5. **memory-sanitizer**: AddressSanitizer (Linux only)
6. **z3ed-agent-test**: Full agent test suite (macOS only)

### 3.2 CI Gaps

#### Missing Early Feedback
- ❌ No compilation-only job (fails after 15-20 min build)
- ❌ No CMake configuration validation job (would catch in <1 min)
- ❌ No symbol conflict checking job

#### Limited Platform Coverage
- ⚠️ Only Linux gets AddressSanitizer
- ⚠️ Only macOS gets full z3ed agent tests
- ⚠️ Windows AI stack not tested on PRs (only post-merge)

#### Incomplete Testing
- ❌ Integration tests not run in CI
- ❌ E2E tests not run on Linux/Windows
- ❌ No ROM-dependent testing
- ❌ No performance regression detection

---

## 4. Developer Workflow Gaps

### 4.1 Pre-Commit Hooks
**Current State**: None
**Gap**: No automatic checks before local commits

**Should Include**:
- clang-format check
- Build system sanity check
- Copyright header validation

### 4.2 Pre-Push Validation
**Current State**: Manual testing only
**Gap**: Easy to push broken code to CI

**Should Include**:
- Smoke build test (quick compilation check)
- Unit test run
- Symbol conflict detection

### 4.3 Local Cross-Platform Testing
**Current State**: Developer-dependent
**Gap**: No easy way to test across platforms locally

**Should Include**:
- Docker-based Linux testing
- VM-based Windows testing (for macOS/Linux devs)
- Preset validation tool

---

## 5. Root Cause Analysis by Issue Type

### 5.1 Windows Abseil Include Paths

**Timeline**:
- ✅ Local macOS build succeeds
- ✅ CMake configuration succeeds on all platforms
- ❌ Windows compilation fails 15 minutes into CI
- ❌ Fix attempt 1 fails (14d1f5de4c)
- ❌ Fix attempt 2 fails (c2bb90a3f1)
- ✅ Final fix succeeds

**Why Multiple Attempts**:
1. No local Windows testing environment
2. CMake configuration doesn't validate actual compilation
3. No header-only compilation check
4. 15-20 minute feedback cycle from CI

**Prevention**:
- Header compilation smoke test
- CMake include path validator
- Local Windows testing (Docker/VM)

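
A header compilation smoke test can be approximated by compiling a one-line translation unit per header in syntax-only mode. The sketch below only builds the stub source and the command line. The helper names, header paths, and include directories are illustrative assumptions, and a real check would also pass the project's language-standard and define flags.

```python
def header_check_source(header):
    """Source text of a translation unit that includes only `header`."""
    return f'#include "{header}"\n'


def header_check_command(compiler, stub_path, include_dirs, msvc_style=False):
    """Command line that parses `stub_path` without producing an object file."""
    if msvc_style:
        # cl / clang-cl spelling: /Zs = syntax check only, /I = include directory
        command = [compiler, "/Zs"] + [f"/I{d}" for d in include_dirs]
    else:
        # GCC / Clang spelling
        command = [compiler, "-fsyntax-only"] + [f"-I{d}" for d in include_dirs]
    return command + [stub_path]


# Hypothetical header and include directories.
print(header_check_source("app/rom.h"), end="")
print(header_check_command("clang-cl", "stub.cc", ["src", "ext/abseil-cpp"], msvc_style=True))
```

Because no object code is generated, a loop over headers like this tends to be much cheaper than a full build while still exercising the include paths that failed in this incident.
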
### 5.2 Linux FLAGS Symbol Conflicts

**Timeline**:
- ✅ Local macOS build succeeds
- ✅ Unit tests pass
- ❌ Linux full build fails at link time
- ❌ ODR violation: multiple `FLAGS` definitions
- ✅ Fix: move FLAGS definition, rename conflicts

**Why It Happened**:
1. gflags creates `FLAGS_*` symbols in headers
2. Multiple translation units define same symbols
3. macOS linker more permissive than Linux ld
4. No symbol conflict detection

**Prevention**:
- Symbol conflict scanner
- ODR violation checker
- Cross-platform link test

---

## 6. Recommended Testing Levels

We propose a **5-level testing pyramid**:

### Level 0: Static Analysis (< 1s)
- clang-format
- clang-tidy on changed files
- Copyright headers
- CMakeLists.txt syntax

### Level 1: Configuration Validation (< 10s)
- CMake configure dry-run
- Include path validation
- Compiler detection check
- Preprocessor flag validation

### Level 2: Smoke Compilation (< 2 min)
- Compile subset of files (1 file per library)
- Header-only compilation
- Template instantiation check
- Platform-specific branch validation

### Level 3: Symbol Validation (< 5 min)
- Full project compilation
- Symbol conflict detection (nm/dumpbin)
- ODR violation check
- Library dependency graph

### Level 4: Test Execution (5-30 min)
- Unit tests (fast)
- Integration tests (medium)
- E2E tests (slow)
- ROM-dependent tests (optional)

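
The levels above lend themselves to a fail-fast runner that executes the cheapest checks first and stops at the first failure. The following is a minimal sketch, not the planned `scripts/pre-push-test.sh`; the `run_levels` name and the commands passed to it are assumptions. A caller would pass pairs such as `("Level 1", ["cmake", "--preset", "<name>"])`, ordered from Level 0 upward.

```python
import subprocess


def run_levels(levels, runner=subprocess.call):
    """Run (name, command) pairs in order and stop at the first failure.

    `runner` takes a command list and returns its exit code; it defaults to
    subprocess.call and can be swapped for a stub in tests.
    Returns the name of the failing level, or None if every level passed.
    """
    for name, command in levels:
        print(f"[{name}] {' '.join(command)}")
        if runner(command) != 0:
            return name
    return None
```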
---

## 7. Actionable Recommendations

### 7.1 Immediate Actions (This Initiative)

1. **Create pre-push scripts** (`scripts/pre-push-test.sh`, `scripts/pre-push-test.ps1`)
   - Run Level 0-2 checks locally
   - Estimated time: <2 minutes
   - Goal: block about 90% of CI build failures before they are pushed (see Section 8.1)

2. **Create symbol conflict detector** (`scripts/verify-symbols.sh`)
   - Scan built libraries for duplicate symbols
   - Run as part of pre-push
   - Catches ODR violations

3. **Document testing strategy** (`docs/internal/testing/testing-strategy.md`)
   - Clear explanation of each test level
   - When to run which tests
   - CI vs local testing

4. **Create pre-push checklist** (`docs/internal/testing/pre-push-checklist.md`)
   - Interactive checklist for developers
   - Links to tools and scripts

### 7.2 Short-Term Improvements (Next Sprint)

1. **Add CI compile-only job**
   - Runs in <5 minutes
   - Catches compilation issues before full build
   - Fails fast

2. **Add CI symbol checking job**
   - Runs after compile-only
   - Detects ODR violations
   - Platform-specific

3. **Add CMake configuration validation job**
   - Tests all presets
   - Validates include paths
   - <2 minutes

4. **Enable integration tests in CI**
   - Run on develop/master only (not PRs)
   - Requires ROM file handling

### 7.3 Long-Term Improvements (Future)

1. **Docker-based local testing**
   - Linux environment for macOS/Windows devs
   - Matches CI exactly
   - Fast feedback

2. **Cross-platform test matrix locally**
   - Run tests across multiple platforms
   - Automated VM/container management

3. **Performance regression detection**
   - Benchmark suite
   - Historical tracking
   - Automatic alerts

4. **Coverage tracking**
   - Line coverage per PR
   - Coverage trends over time
   - Uncovered code reports

---

## 8. Success Metrics

### 8.1 Developer Experience
- **Target**: <2 minutes pre-push validation time
- **Target**: 90% reduction in CI build failures
- **Target**: <3 attempts to fix CI issues (down from 5-10)

### 8.2 CI Efficiency
- **Target**: <5 minutes to first failure signal
- **Target**: 50% reduction in wasted CI time
- **Target**: 95% PR pass rate (up from ~70%)

### 8.3 Code Quality
- **Target**: Zero ODR violations
- **Target**: Zero platform-specific include issues
- **Target**: 100% symbol conflict detection

---

## 9. Reference

### Similar Issues in Recent History
- Windows std::filesystem support (19196ca87c, b556b155a5)
- Linux circular dependency (0812a84a22, e36d81f357)
- macOS z3ed linker error (9c562df277)
- Windows clang-cl detection (84cdb09a5b, cbdc6670a1)

### Related Documentation
- `docs/public/build/quick-reference.md` - Build commands
- `docs/public/build/troubleshooting.md` - Platform-specific fixes
- `CLAUDE.md` - Build system guidelines
- `.github/workflows/ci.yml` - CI configuration

### Tools Used
- `nm` (Unix) / `dumpbin` (Windows) - Symbol inspection
- `clang-tidy` - Static analysis
- `cppcheck` - Code quality
- `cmake --list-presets` (list available presets) / `cmake --preset <name>` (configure one) - Preset validation