backend-infra-engineer: Release v0.3.3 snapshot
docs/internal/testing/symbol-conflict-detection.md
@@ -0,0 +1,440 @@
# Symbol Conflict Detection System

## Overview

The Symbol Conflict Detection System catches **One Definition Rule (ODR) violations** and other symbol conflicts **before linking fails**. This avoids time lost debugging linker errors and improves development velocity.

**The Problem:**
- Developers accidentally define the same symbol in multiple translation units
- Errors only appear at link time (after 10-15+ minutes of compilation on some platforms)
- The error message is often cryptic: `symbol already defined in object`
- No early warning during development

**The Solution:**
- Extract symbols from compiled object files immediately after compilation
- Build a symbol database with conflict detection
- Pre-commit hook warns about conflicts before committing
- CI/CD job fails early if conflicts are detected
- Fast analysis: <5 seconds for typical builds

## Quick Start

### Generate Symbol Database

```bash
# Extract all symbols and create database
./scripts/extract-symbols.sh

# Output: build/symbol_database.json
```

### Check for Conflicts

```bash
# Analyze database for conflicts
./scripts/check-duplicate-symbols.sh

# Output: List of conflicting symbols with file locations
```

### Combined Usage

```bash
# Extract and check in one command
./scripts/extract-symbols.sh && ./scripts/check-duplicate-symbols.sh
```

## Components

### 1. Symbol Extraction Tool (`scripts/extract-symbols.sh`)

Scans all compiled object files and extracts symbol definitions.

**Features:**
- Cross-platform support (macOS/Linux/Windows)
- Uses `nm` on Unix/macOS, `dumpbin` on Windows
- Generates JSON database with symbol metadata
- Skips undefined symbols (references only)
- Tracks symbol type (text, data, read-only)

**Usage:**
```bash
# Default: scan ./build directory, output to build/symbol_database.json
./scripts/extract-symbols.sh

# Custom build directory
./scripts/extract-symbols.sh /path/to/custom/build

# Custom output file
./scripts/extract-symbols.sh build symbols.json
```
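
For orientation, the scan step could be sketched as below. This is not the contents of `scripts/extract-symbols.sh`: the helper names `list_object_files` and `dump_symbols` are hypothetical, and the sketch assumes an `nm` that supports the POSIX `-A` and `-P` options.

```bash
#!/bin/sh
# Hypothetical helpers, not the real extract-symbols.sh.

# List object files (.o on Unix/macOS, .obj on Windows) under a build directory.
# Defaults to ./build, matching the script's documented default.
list_object_files() {
  build_dir=${1:-build}
  find "$build_dir" -type f \( -name '*.o' -o -name '*.obj' \) | sort
}

# Print one line per symbol in `nm -A -P` layout:
#   "<object>: <symbol> <type> ..."
# Objects that nm cannot read are skipped.
dump_symbols() {
  list_object_files "$1" | while IFS= read -r obj; do
    nm -A -P "$obj" 2>/dev/null || true
  done
}
```

Searching recursively matters because CMake nests object files several levels below the build root (for example `build/CMakeFiles/z3ed.dir/src/cli/flags.cc.o`).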

**Output Format:**
```json
{
  "metadata": {
    "platform": "Darwin",
    "build_dir": "build",
    "timestamp": "2025-11-20T10:30:45.123456Z",
    "object_files_scanned": 145,
    "total_symbols": 8923,
    "total_conflicts": 2
  },
  "conflicts": [
    {
      "symbol": "FLAGS_rom",
      "count": 2,
      "definitions": [
        {
          "object_file": "flags.cc.o",
          "type": "D"
        },
        {
          "object_file": "emu_test.cc.o",
          "type": "D"
        }
      ]
    }
  ],
  "symbols": {
    "FLAGS_rom": [...]
  }
}
```
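
To make the layout of a `conflicts` entry concrete, one entry could be produced from shell roughly as follows. `conflict_json` is a hypothetical helper, not part of the scripts. As a simplification it records the same type letter for every definition and assumes symbol and file names contain no characters that need JSON escaping.

```bash
#!/bin/sh
# Hypothetical helper: print one "conflicts" entry on a single line.
# Usage: conflict_json SYMBOL TYPE OBJECT_FILE...
conflict_json() {
  symbol=$1
  type=$2
  shift 2
  # "$#" is now the number of object files, i.e. the definition count.
  printf '{"symbol":"%s","count":%d,"definitions":[' "$symbol" "$#"
  sep=''
  for obj in "$@"; do
    printf '%s{"object_file":"%s","type":"%s"}' "$sep" "$obj" "$type"
    sep=','
  done
  printf ']}\n'
}
```

For example, `conflict_json FLAGS_rom D flags.cc.o emu_test.cc.o` prints a compact form of the `FLAGS_rom` entry shown above.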

**Symbol Types:**
- `T` = Text/Code (function in `.text` section)
- `D` = Data (initialized global variable in `.data` section)
- `R` = Read-only (constant in `.rodata` section)
- `B` = BSS (uninitialized global in `.bss` section)
- `U` = Undefined (external reference, not a definition)

### 2. Duplicate Symbol Checker (`scripts/check-duplicate-symbols.sh`)

Analyzes the symbol database and reports conflicts in a developer-friendly format.

**Usage:**
```bash
# Check default database (build/symbol_database.json)
./scripts/check-duplicate-symbols.sh

# Specify custom database
./scripts/check-duplicate-symbols.sh /path/to/symbol_database.json

# Verbose output (show all symbols)
./scripts/check-duplicate-symbols.sh --verbose

# Include fix suggestions
./scripts/check-duplicate-symbols.sh --fix-suggestions
```

**Output Example:**
```
=== Duplicate Symbol Checker ===
Database: build/symbol_database.json
Platform: Darwin
Build directory: build
Timestamp: 2025-11-20T10:30:45.123456Z
Object files scanned: 145
Total symbols: 8923
Total conflicts: 2

CONFLICTS FOUND:

[1/2] FLAGS_rom (x2)
  1. flags.cc.o (type: D)
  2. emu_test.cc.o (type: D)

[2/2] g_global_counter (x2)
  1. utils.cc.o (type: D)
  2. utils_test.cc.o (type: D)

=== Summary ===
Total conflicts: 2
Fix these before linking!
```

**Exit Codes:**
- `0` = No conflicts found
- `1` = Conflicts detected
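
Because the checker reports its result through the exit code, a wrapper script can branch on it. The following is a minimal sketch; `run_symbol_check` is a hypothetical name, and any command passed as arguments stands in for the checker (the default is the documented script path).

```bash
#!/bin/sh
# Hypothetical wrapper: run the checker (or the command given as arguments)
# and turn its exit code into a message. Returns the checker's exit code.
run_symbol_check() {
  if [ "$#" -eq 0 ]; then
    set -- ./scripts/check-duplicate-symbols.sh
  fi
  if "$@"; then
    echo "No symbol conflicts"
  else
    status=$?
    echo "Symbol conflicts detected (exit code $status)" >&2
    return "$status"
  fi
}
```

A build script could call `run_symbol_check || exit 1` to stop before the link step.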

### 3. Pre-Commit Hook (`.githooks/pre-commit`)

Runs automatically before committing code (can be bypassed with `--no-verify`).

**Features:**
- Only checks changed `.cc` and `.h` files
- Fast analysis: ~2-3 seconds
- Warns about conflicts in affected object files
- Suggests common fixes
- Non-blocking (just a warning, doesn't fail the commit)

**Usage:**
```bash
# Automatically runs on git commit
git commit -m "Your message"

# Skip hook if needed
git commit --no-verify -m "Your message"
```

**Setup (first time):**
```bash
# Configure Git to use .githooks directory
git config core.hooksPath .githooks

# Make hook executable
chmod +x .githooks/pre-commit
```

**Hook Output:**
```
[Pre-Commit] Checking for symbol conflicts...
Changed files:
  src/cli/flags.cc
  test/emu_test.cc

Affected object files:
  build/CMakeFiles/z3ed.dir/src/cli/flags.cc.o
  build/CMakeFiles/z3ed_test.dir/test/emu_test.cc.o

Analyzing symbols...

WARNING: Symbol conflicts detected!

Duplicate symbols in affected files:
  FLAGS_rom
    - flags.cc.o
    - emu_test.cc.o

You can:
  1. Fix the conflicts before committing
  2. Skip this check: git commit --no-verify
  3. Run full analysis: ./scripts/extract-symbols.sh && ./scripts/check-duplicate-symbols.sh

Common fixes:
  - Add 'static' keyword to make it internal linkage
  - Use anonymous namespace in .cc files
  - Use 'inline' keyword for function/variable definitions
```
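
The mapping from changed files to affected object files seen in the hook output could be implemented along these lines. This is a sketch, not the contents of `.githooks/pre-commit`: the function names are hypothetical, and it assumes the CMake naming seen above, where the object path ends in the source path plus `.o`.

```bash
#!/bin/sh
# Hypothetical sketch of the hook's file-to-object mapping.

# Staged C++ sources and headers (added, copied, or modified).
staged_sources() {
  git diff --cached --name-only --diff-filter=ACM -- '*.cc' '*.h'
}

# Object files under BUILD_DIR whose path ends in "<source>.o",
# e.g. src/cli/flags.cc -> build/CMakeFiles/z3ed.dir/src/cli/flags.cc.o
objects_for_source() {
  build_dir=$1
  source_file=$2
  find "$build_dir" -type f -path "*/${source_file}.o" 2>/dev/null | sort
}

# All object files affected by the staged changes. Headers have no object
# file of their own, so they simply produce no output here.
affected_objects() {
  staged_sources | while IFS= read -r src; do
    objects_for_source "${1:-build}" "$src"
  done
}
```

When `affected_objects` prints nothing, the sources have not been built yet, which corresponds to the "No compiled objects found for changed files" message.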

## Common Fixes for ODR Violations

### Problem: Global Variable Defined in Multiple Files

**Bad:**
```cpp
// flags.cc
ABSL_FLAG(std::string, rom, "", "Path to ROM");

// test.cc
ABSL_FLAG(std::string, rom, "", "Path to ROM");  // ERROR: Duplicate definition
```

**Fix 1: Use `static` (internal linkage)**

This works for ordinary globals. It does not work for `ABSL_FLAG`, which expands to several declarations and registers the flag by name, so a flag name must stay unique within the binary. For flags, use Fix 3.
```cpp
// utils_test.cc
static int g_global_counter = 0;  // Now local to this file
```

**Fix 2: Use Anonymous Namespace**

As with Fix 1, this applies to ordinary globals, not to flags.
```cpp
// utils_test.cc
namespace {
int g_global_counter = 0;
}  // Now has internal linkage
```

**Fix 3: Declare in Header, Define in One .cc**
```cpp
// flags.h
#include "absl/flags/declare.h"
ABSL_DECLARE_FLAG(std::string, rom);

// flags.cc
#include "absl/flags/flag.h"
ABSL_FLAG(std::string, rom, "", "Path to ROM");

// test.cc
// Include flags.h and read the flag with absl::GetFlag(FLAGS_rom); don't redefine
```

### Problem: Duplicate Function Definitions

**Bad:**
```cpp
// util.cc
void ProcessData() { /* ... */ }

// util_test.cc
void ProcessData() { /* ... */ }  // ERROR: Already defined
```

**Fix 1: Make `inline`**
```cpp
// util.h
inline void ProcessData() { /* ... */ }

// util.cc and util_test.cc can include and use it
```

**Fix 2: Use `static`**
```cpp
// util.cc
static void ProcessData() { /* ... */ }  // Internal linkage
```

**Fix 3: Use Anonymous Namespace**
```cpp
// util.cc
namespace {
void ProcessData() { /* ... */ }
}  // Internal linkage
```

### Problem: Class Static Member Initialization

**Bad:**
```cpp
// widget.h
class Widget {
  static int instance_count;  // Declaration only
};

// widget.cc
int Widget::instance_count = 0;

// widget_test.cc (accidentally includes impl)
int Widget::instance_count = 0;  // ERROR: Multiple definitions
```

**Fix: Define in Only One .cc**
```cpp
// widget.h
class Widget {
  static int instance_count;
};

// widget.cc (ONLY definition)
int Widget::instance_count = 0;

// widget_test.cc (only uses, doesn't redefine)
```

## Integration with CI/CD

### GitHub Actions Example

Add to `.github/workflows/ci.yml`:

```yaml
- name: Extract symbols
  if: success()
  run: |
    ./scripts/extract-symbols.sh build
    ./scripts/check-duplicate-symbols.sh

- name: Upload symbol report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: symbol-database
    path: build/symbol_database.json
```

### Workflow:
1. **Build completes** (generates .o/.obj files)
2. **Extract symbols** runs immediately
3. **Check for conflicts** analyzes database
4. **Fail job** if duplicates found
5. **Upload report** for inspection

## Performance Notes

### Typical Build Timings

| Operation | Time | Notes |
|-----------|------|-------|
| Extract symbols (145 obj files) | ~2-3s | macOS/Linux with `nm` |
| Extract symbols (145 obj files) | ~5-7s | Windows with `dumpbin` |
| Check duplicates | <100ms | JSON parsing and analysis |
| Pre-commit hook (5 changed files) | ~1-2s | Only checks affected objects |

### Optimization Tips

1. **Run only affected files in pre-commit hook** - Don't scan entire build
2. **Cache symbol database** - Reuse between checks if no new objects
3. **Parallel extraction** - Future enhancement for large builds
4. **Filter by symbol type** - Focus on data/text symbols, skip weak symbols
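
The caching tip amounts to a staleness check: regenerate the database only when some object file is newer than it. Here is a minimal sketch using file modification times; `database_is_stale` is a hypothetical helper, and the scripts may use a different criterion.

```bash
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the database is missing or when
# any object file under the build directory is newer than the database.
# Usage: database_is_stale DATABASE [BUILD_DIR]
database_is_stale() {
  db=$1
  build_dir=${2:-build}
  [ -f "$db" ] || return 0
  newer=$(find "$build_dir" -type f \( -name '*.o' -o -name '*.obj' \) \
            -newer "$db" | head -n 1)
  [ -n "$newer" ]
}
```

A caller could then run `database_is_stale build/symbol_database.json build && ./scripts/extract-symbols.sh` to skip extraction when nothing was rebuilt.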

## Troubleshooting

### "Symbol database not found"

**Issue:** Script says database doesn't exist
```
Error: Symbol database not found: build/symbol_database.json
```

**Solution:** Generate it first
```bash
./scripts/extract-symbols.sh
```

### "No object files found"

**Issue:** Extraction found 0 object files
```
Warning: No object files found in build
```

**Solution:** Rebuild the project first
```bash
cmake --build build  # or appropriate build command
./scripts/extract-symbols.sh
```

### "No compiled objects found for changed files"

**Issue:** Pre-commit hook can't find object files for changes
```
[Pre-Commit] No compiled objects found for changed files (might not be built yet)
```

**Solution:** This is normal if you haven't built yet. Just commit normally:
```bash
git commit -m "Your message"
```

### Symbol not appearing in conflicts

**Issue:** Manual review found a duplicate, but the tool doesn't report it

**Cause:** The symbol might be weak (for example an `inline` function or a template instantiation from header-only code), which the linker merges instead of rejecting

**Solution:** Check with `nm` directly:
```bash
# Object files sit several levels below build/, so search recursively (-A prints the file name)
find build -name '*.o' -exec nm -A {} + | grep symbol_name
```

## Future Enhancements

1. **Incremental checking** - Only re-scan changed object files
2. **HTML reports** - Generate visual conflict reports with source references
3. **Automatic fixes** - Suggest patches for common ODR patterns
4. **Integration with IDE** - Clangd/LSP warnings for duplicate definitions
5. **Symbol lifecycle tracking** - Track which symbols were added/removed per build
6. **Statistics dashboard** - Monitor symbol health over time

## References

- [C++ One Definition Rule (cppreference)](https://en.cppreference.com/w/cpp/language/definition)
- [Linker Errors (isocpp.org)](https://isocpp.org/wiki/faq/linker-errors)
- [GNU nm Manual](https://sourceware.org/binutils/docs/binutils/nm.html)
- [Windows dumpbin Documentation](https://learn.microsoft.com/en-us/cpp/build/reference/dumpbin-reference)

## Support

For issues or suggestions:
1. Check `.githooks/pre-commit` is executable: `chmod +x .githooks/pre-commit`
2. Verify git hooks path is configured: `git config core.hooksPath`
3. Run full analysis for detailed debugging: `./scripts/check-duplicate-symbols.sh --verbose`
4. Open an issue with the `symbol-detection` label