Add detailed root cause analysis for Windows and Linux CI build failures. Key findings: - gRPC version 1.75.1 has MSVC compilation errors on Windows - CPM configuration missing Windows version override logic - Long gRPC build times (30-40 min) causing potential timeouts - CI preset forces gRPC build even though it's optional Recommended solution: Use platform-specific CI presets with gRPC disabled to reduce build times from 40 min to 5-10 min and eliminate errors. Document includes: - Detailed root cause analysis - Multiple solution options with code examples - 3-phase implementation plan - Testing procedures and risk assessment
11 KiB
CI/CD Build Failure Analysis
Date: 2025-11-03 Branch: develop Analyzed By: Claude Code
Executive Summary
The Windows and Linux CI/CD builds are failing due to gRPC build configuration issues. The root cause is a conflict between two different gRPC configuration systems and version mismatches that cause compilation errors on Windows and timeouts on Linux.
Root Causes
1. gRPC Version Mismatch (Critical - Windows)
Problem: Windows MSVC builds are using the wrong gRPC version.
Details:
cmake/dependencies.lockspecifies gRPC version1.75.1cmake/grpc.cmakehas conditional logic to usev1.67.1for Windows MSVC (MSVC-compatible)- However, the build system uses
cmake/dependencies/grpc.cmake(CPM-based) which reads fromdependencies.lock - Result: Windows builds get gRPC
1.75.1which has MSVC compilation errors in UPB (micro protobuf) code
Error Expected:
error C2099: initializer is not a constant
File Location: cmake/dependencies.lock:10
2. Duplicate gRPC Configuration Systems (Critical)
Problem: Two competing gRPC configuration files exist, causing confusion.
Details:
cmake/grpc.cmake- FetchContent-based, has platform-specific version logiccmake/dependencies/grpc.cmake- CPM-based, uses dependencies.lock version- The newer CPM-based approach (via
cmake/dependencies.cmake:40) is being used - But it doesn't have the Windows version override logic from
cmake/grpc.cmake:74-80
Impact: Windows builds fail because they don't get the MSVC-compatible gRPC version.
3. CI Preset Forces gRPC Build (Major - All Platforms)
Problem: The "ci" preset mandates building gRPC from source.
Details:
- CI workflow (
.github/workflows/ci.yml:56,64) uses preset "ci" - Preset "ci" (
CMakePresets.json:88-101) setsYAZE_ENABLE_GRPC: "ON" - Building gRPC from source is slow and error-prone:
- Windows: ~30-40 min (if it works) + MSVC compatibility issues
- Linux: ~30-40 min + potential timeout
- macOS: Has ARM64 Abseil issues (documented in troubleshooting)
Alternative: The "minimal" preset exists but isn't used by CI.
4. Long Build Times May Cause CI Timeouts (Major - Linux)
Problem: gRPC builds take 30-40+ minutes from source.
Details:
- GitHub Actions runners have execution time limits
- First-time gRPC builds compile hundreds of targets (protobuf, abseil, grpc++)
- Even with ccache/sccache, initial builds are very slow
- No pre-built packages are used on Linux CI
5. Windows Preset Configuration Issues (Minor)
Problem: Windows CI uses the wrong preset architecture.
Details:
- The CI workflow uses preset "ci" for all platforms
- But "ci" inherits from "base" which doesn't set Windows-specific options
- Windows-specific presets (win-dbg, win-rel) exist but aren't used by CI
- Missing proper MSVC runtime library settings in the base CI preset
Affected Files
Configuration Files:
cmake/dependencies.lock- Has incorrect gRPC version for Windowscmake/dependencies/grpc.cmake- Missing Windows version overrideCMakePresets.json- CI preset forces gRPC build.github/workflows/ci.yml- Uses problematic "ci" preset
CMake Modules:
cmake/grpc.cmake- Has correct version logic but isn't being usedcmake/grpc_windows.cmake- vcpkg fallback exists but not enabled in CI
Solutions
Immediate Fixes (Recommended)
Fix 1: Update gRPC Version in dependencies.lock for Windows Compatibility
Change: Update cmake/dependencies.lock:10 to use MSVC-compatible version
# Before:
set(GRPC_VERSION "1.75.1" CACHE STRING "gRPC version")
# After:
set(GRPC_VERSION "1.67.1" CACHE STRING "gRPC version - MSVC compatible")
Impact:
- ✅ Fixes Windows MSVC compilation errors
- ⚠️ Linux/macOS will use older gRPC version (but stable)
Risk: Low - v1.67.1 is tested and stable on all platforms per troubleshooting docs
Fix 2: Add Platform-Specific gRPC Version Logic to CPM Configuration
Change: Update cmake/dependencies/grpc.cmake to use conditional versioning
# Add before line 60 in cmake/dependencies/grpc.cmake:
# Use MSVC-compatible gRPC version on Windows, latest elsewhere
if(WIN32 AND MSVC)
set(GRPC_VERSION "1.67.1" CACHE STRING "gRPC version - MSVC compatible" FORCE)
set(_GRPC_VERSION_REASON "MSVC-compatible, avoids UPB linker regressions")
else()
# Use version from dependencies.lock for other platforms
set(_GRPC_VERSION_REASON "ARM64 macOS + modern Clang compatibility")
endif()
message(STATUS "Using gRPC version: ${GRPC_VERSION} (${_GRPC_VERSION_REASON})")
Impact:
- ✅ Windows gets v1.67.1 (MSVC-compatible)
- ✅ Linux/macOS can use v1.75.1 (newer features)
Risk: Low - mirrors existing logic from cmake/grpc.cmake
Fix 3: Create Separate CI Presets for Each Platform
Change: Add platform-specific CI presets to CMakePresets.json
{
"name": "ci-windows",
"inherits": "windows-base",
"displayName": "CI Build - Windows",
"description": "Windows CI build without gRPC (fast, reliable)",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "RelWithDebInfo",
"YAZE_BUILD_TESTS": "ON",
"YAZE_ENABLE_GRPC": "OFF",
"YAZE_ENABLE_JSON": "ON",
"YAZE_ENABLE_AI": "OFF",
"YAZE_MINIMAL_BUILD": "ON"
}
},
{
"name": "ci-linux",
"inherits": "base",
"displayName": "CI Build - Linux",
"description": "Linux CI build without gRPC (fast, reliable)",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "RelWithDebInfo",
"YAZE_BUILD_TESTS": "ON",
"YAZE_ENABLE_GRPC": "OFF",
"YAZE_ENABLE_JSON": "ON",
"YAZE_ENABLE_AI": "OFF",
"YAZE_MINIMAL_BUILD": "ON"
}
},
{
"name": "ci-macos",
"inherits": "base",
"displayName": "CI Build - macOS",
"description": "macOS CI build without gRPC (fast, reliable)",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "RelWithDebInfo",
"YAZE_BUILD_TESTS": "ON",
"YAZE_ENABLE_GRPC": "OFF",
"YAZE_ENABLE_JSON": "ON",
"YAZE_ENABLE_AI": "OFF",
"YAZE_MINIMAL_BUILD": "ON"
}
}
Update CI workflow (.github/workflows/ci.yml):
# Line 52-64 - Update matrix to use platform-specific presets
matrix:
include:
- name: "Ubuntu 22.04 (GCC-12)"
os: ubuntu-22.04
platform: linux
preset: ci-linux # Changed from "ci"
- name: "macOS 14 (Clang)"
os: macos-14
platform: macos
preset: ci-macos # Changed from "ci"
- name: "Windows 2022 (MSVC)"
os: windows-2022
platform: windows
preset: ci-windows # Changed from "ci"
Impact:
- ✅ Disables gRPC in CI builds (faster, more reliable)
- ✅ Reduces CI build time from ~40 min to ~5-10 min
- ✅ Eliminates Windows MSVC gRPC errors
- ✅ Eliminates Linux timeout issues
Risk: Low - "minimal" preset already exists and works
Long-Term Solutions (Optional)
Option 1: Use System gRPC Packages on Linux CI
Change: Install gRPC from apt on Ubuntu CI runners
# In .github/actions/setup-build/action.yml, add to Linux section:
- name: Setup build environment (Linux)
if: inputs.platform == 'linux'
shell: bash
run: |
sudo apt-get update
sudo apt-get install -y $(tr '\n' ' ' < .github/workflows/scripts/linux-ci-packages.txt) gcc-12 g++-12
# Add gRPC system packages
sudo apt-get install -y libgrpc++-dev libprotobuf-dev protobuf-compiler-grpc
sudo apt-get clean
Then enable YAZE_USE_SYSTEM_DEPS in Linux CI preset.
Impact:
- ✅ Very fast (pre-compiled)
- ⚠️ Version depends on Ubuntu repository (might be older)
Risk: Medium - system packages may have different versions
Option 2: Use vcpkg for Windows gRPC
Change: Enable vcpkg in Windows CI
This is already documented in the codebase but not enabled. See cmake/grpc_windows.cmake:14-16.
Impact:
- ✅ Fast Windows builds (~5-10 min with vcpkg cache)
- ⚠️ Requires vcpkg setup in CI
Risk: Medium - adds vcpkg dependency to CI
Option 3: Create Separate gRPC-Enabled CI Workflow
Change: Keep current CI fast, add optional gRPC testing
Create .github/workflows/ci-full.yml with gRPC enabled, running only on:
- Manual workflow_dispatch
- Weekly schedule
- Specific branches (e.g., release branches)
Impact:
- ✅ Main CI stays fast and reliable
- ✅ Still tests gRPC integration periodically
Risk: Low - provides best of both worlds
Recommended Implementation Plan
Phase 1 (Immediate - Fix Broken Builds)
- ✅ Apply Fix #3: Create platform-specific CI presets with gRPC disabled
- ✅ Update CI workflow to use new presets
- ✅ Test on develop branch
Result: CI builds become fast (5-10 min) and reliable
Phase 2 (Short-term - Maintain gRPC Support)
- Apply Fix #1 or Fix #2: Fix gRPC versioning
- Create separate weekly gRPC test workflow (Option 3)
- Document gRPC build requirements for local development
Result: gRPC support maintained without slowing down CI
Phase 3 (Long-term - Optimization)
- Evaluate vcpkg for Windows (Option 2)
- Evaluate system packages for Linux (Option 1)
- Consider pre-built gRPC artifacts
Result: Fast builds even with gRPC enabled
Testing Plan
Test 1: Verify Windows Build
# On Windows with MSVC
cmake --preset ci-windows
cmake --build --preset ci-windows
Expected: Build completes in ~5-10 minutes without errors
Test 2: Verify Linux Build
# On Ubuntu 22.04
cmake --preset ci-linux
cmake --build --preset ci-linux
Expected: Build completes in ~5-10 minutes without errors
Test 3: Verify Tests Run
# After successful build
cd build
ctest --output-on-failure -L stable
Expected: All stable tests pass
Test 4: Verify gRPC Version Fix (if applying Fix #1 or #2)
# On Windows
cmake --preset win-ai # Preset with gRPC enabled
cmake --build --preset win-ai
Expected: Build completes with gRPC 1.67.1, no MSVC errors
Additional Notes
Why gRPC Causes Issues
- Complexity: gRPC bundles protobuf, abseil, c-ares, re2, zlib, ssl
- Build Time: Compiles 200+ targets on first build
- Platform Quirks:
- Windows: MSVC has strict C++ conformance, UPB uses C99 features
- macOS ARM64: Abseil uses x86 SSE intrinsics, needs filtering
- Linux: Usually works but slow
Why Disabling gRPC is Safe for CI
- gRPC is used for:
- AI agent communication (optional feature)
- Remote debugging (developer tool)
- Core editor functionality works without gRPC
- Tests don't require gRPC
- Developers can still test gRPC locally with dev presets
References
- Build Troubleshooting Guide:
docs/BUILD-TROUBLESHOOTING.md - Build Guide:
docs/BUILD-GUIDE.md - CI Workflow:
.github/workflows/ci.yml - CMake Presets:
CMakePresets.json
Conclusion
The root cause of CI build failures is gRPC version incompatibility and excessive build complexity. The recommended solution is to disable gRPC in CI builds using platform-specific presets, which will make builds fast (5-10 min) and reliable. gRPC support can be maintained through local development builds and periodic CI testing.