9.7 KiB
Codebase Investigation: Yaze vs Mesen2 SNES Emulation
Executive Summary
This investigation compares the architecture of yaze (Yet Another Zelda Editor's emulator) with Mesen2 (a high-accuracy multi-system emulator). The goal is to identify areas where yaze can be improved to approach Mesen2's level of accuracy.
Fundamental Difference:
- Yaze is an instruction-level / scanline-based emulator. It executes entire CPU instructions at once and catches up other subsystems (APU, PPU) at specific checkpoints (memory access, scanline end).
- Mesen2 is a bus-level / cycle-based emulator. It advances the system state (timers, DMA, interrupts) on every single CPU bus cycle (read/write/idle), allowing for sub-instruction synchronization.
Detailed Comparison
1. CPU Timing & Bus Arbitration
| Feature | Yaze (Snes::RunOpcode, Cpu::ExecuteInstruction) |
Mesen2 (SnesCpu::Exec, Read/Write) |
|---|---|---|
| Granularity | Executes full instruction, then adds cycles. Batches bus cycles around memory accesses. | Executes micro-ops. Read/Write calls ProcessCpuCycle to advance system state per byte. |
| Timing | Snes::CpuRead runs access_time - 4 cycles, reads, then 4 cycles. |
SnesCpu::Read determines speed (GetCpuSpeed), runs cycles, then reads. |
| Interrupts | Checked at instruction boundaries (RunOpcode). |
Checked on every cycle (ProcessCpuCycle -> DetectNmiSignalEdge). |
Improvement Opportunity:
The current yaze approach of batching cycles in CpuRead (RunCycles(access_time - 4)) is a good approximation but fails for edge cases where an IRQ or DMA might trigger during an instruction's execution (e.g., between operand bytes).
- Recommendation: Refactor
Cpu::ReadByte/Cpu::WriteBytecallbacks to advance the system clock before returning data. This movesyazecloser to a cycle-stepped architecture without rewriting the entire core state machine.
2. PPU Rendering & Raster Effects
| Feature | Yaze (Ppu::RunLine) |
Mesen2 (SnesPpu) |
|---|---|---|
| Rendering | Scanline-based. Renders full line at H=512 (next_horiz_event). |
Dot-based (effectively). Handles cycle-accurate register writes. |
| Mid-Line Changes | Register writes (WriteBBus) update internal state immediately, but rendering only happens later. Raster effects (H-IRQ) will apply to the whole line or be missed. |
Register writes catch up the renderer to the current dot before applying changes. |
Improvement Opportunity:
This is the biggest accuracy gap. Games like Tales of Phantasia or Star Ocean that use raster effects (changing color/brightness/windowing mid-scanline) will not render correctly in yaze.
- Recommendation: Implement a "Just-In-Time" PPU Catch-up.
- Add a
Ppu::CatchUp(uint16_t h_pos)method. - Call
ppu_.CatchUp(memory_.h_pos())insideSnes::WriteBBus(PPU register writes). CatchUpshould render pixels fromlast_rendered_xtocurrent_x, then updatelast_rendered_x.
- Add a
3. APU Synchronization
| Feature | Yaze (Snes::CatchUpApu) |
Mesen2 (Spc::IncCycleCount) |
|---|---|---|
| Sync Method | Catch-up. Runs APU to match CPU master cycles on every port read/write (ReadBBus/WriteBBus). |
Cycle interleaved. |
| Ratio | Fixed-point math (kApuCyclesNumerator...). |
Floating point ratio derived from sample rates. |
Assessment:
yaze's APU synchronization strategy is actually very robust. Calling CatchUpApu on every IO port access ($2140-$2143) ensures the SPC700 sees the correct data timing relative to the CPU. The handshake tracker (ApuHandshakeTracker) confirms this logic is working well for boot sequences.
- Recommendation: No major architectural changes needed here. Focus on
Spc700opcode accuracy and DSP mixing quality.
4. Input & Auto-Joypad Reading
| Feature | Yaze (Snes::HandleInput) |
Mesen2 (InternalRegisters::ProcessAutoJoypad) |
|---|---|---|
| Timing | Runs once at VBlank start. Populates all registers immediately. | Runs continuously over ~4224 master clocks during VBlank. |
| Accuracy | Games reading $4218 too early in VBlank will see finished data (correct values, wrong timing). |
Games reading too early see 0 or partial data. |
Improvement Opportunity: Some games rely on the duration of the auto-joypad read to time their VBlank routines.
- Recommendation: Implement a state machine for auto-joypad reading in
Snes::RunCycle. Instead of fillingport_auto_read_instantly, fill it bit-by-bit over the correct number of cycles.
5. AI & Editor Integration Architecture
To support AI-driven debugging and dynamic editor integration (e.g., "Teleport & Test"), the emulator must evolve from a "black box" to an observable, controllable simulation.
A. Dynamic State Injection (The "Test Sprite" Button)
Currently, testing requires a full reset or loading a binary save state. We need a State Patching API to programmatically set up game scenarios.
- Proposal:
Emulator::InjectState(const GameStatePatch& patch)GameStatePatch: A structure containing target WRAM values (e.g., Room ID, Coordinates, Inventory) and CPU state (PC location).- Workflow:
- Reset & Fast-Boot: Reset emulator and fast-forward past the boot sequence (e.g., until
GameModeRAM indicates "Gameplay"). - Injection: Pause execution and write the
patchvalues directly to WRAM/SRAM. - Resume: Hand control to the user or AI agent.
- Reset & Fast-Boot: Reset emulator and fast-forward past the boot sequence (e.g., until
- Use Case: "Test this sprite in Room 0x12." -> The editor builds a patch setting
ROOM_ID=0x12,LINK_X=StartPos, and injects it.
B. Semantic Inspection Layer (The "AI Eyes")
Multimodal models struggle with raw pixel streams for precise logic debugging. They need a "semantic overlay" that grounds visuals in game data.
- Proposal:
SemanticIntrospectionEngine- Symbol Mapping: Uses
SymbolProviderandMemoryMap(fromyazeproject) to decode raw RAM into meaningful concepts. - Structured Context: Expose a method
GetSemanticState()returning JSON/Struct:{ "mode": "Underworld", "room_id": 24, "link": { "x": 1200, "y": 800, "state": "SwordSlash", "hp": 16 }, "sprites": [ { "id": 0, "type": "Stalfos", "x": 1250, "y": 800, "state": "Active", "hp": 2 } ] } - Visual Grounding: Provide an API to generate "debug frames" where hitboxes and interaction zones are drawn over the game feed. This allows Vision Models to correlate "Link is overlapping Stalfos" visually with
Link.x ~= Stalfos.xlogically.
- Symbol Mapping: Uses
C. Headless & Fast-Forward Control
For automated verification (e.g., "Does entering this room crash?"), rendering overhead is unnecessary.
- Proposal: Decoupled Rendering Pipeline
- Allow
Emulatorto run in "Headless Mode":- PPU renders to a simplified RAM buffer (or skips rendering if only logic is being tested).
- Audio backend is disabled or set to
NullBackend. - Execution speed is uncapped (limited only by CPU).
RunUntil(Condition)API: Allow the agent to execute complex commands like:RunUntil(PC == 0x8000)(Breakpoint match)RunUntil(Memory[0x10] == 0x01)(Game mode change)RunUntil(FrameCount == Target + 60)(Time duration)
- Allow
Recent Improvements
SDL3 Audio Backend (2025-11-23)
A new SDL3 audio backend has been implemented to modernize the emulator's audio subsystem:
Implementation Details:
- Stream-based architecture: Replaces SDL2's queue-based approach with SDL3's
SDL_AudioStreamAPI - Files added:
src/app/emu/audio/sdl3_audio_backend.h/cc- Complete SDL3 backend implementationsrc/app/platform/sdl_compat.h- Cross-version compatibility layer
- Factory integration:
AudioBackendFactorynow supportsBackendType::SDL3 - Resampling support: Native handling of SPC700's 32kHz output to device rate
- Volume control: Optimized fast-path for unity gain (common case)
Benefits:
- Lower audio latency potential with stream-based processing
- Better synchronization between audio and video subsystems
- Native resampling reduces CPU overhead for rate conversion
- Future-proof architecture aligned with SDL3's design philosophy
Testing:
- Unit tests added in
test/unit/sdl3_audio_backend_test.cc - Conditional compilation via
YAZE_USE_SDL3flag ensures backward compatibility - Seamless fallback to SDL2 when SDL3 unavailable
Action Plan
To upgrade yaze for both accuracy and AI integration, follow this implementation order:
-
PPU Catch-up (Accuracy - High Impact)
- Modify
Pputo tracklast_rendered_x. - Split
RunLineintoRenderRange(start_x, end_x). - Inject
ppu_.CatchUp()calls inSnes::WriteBBus.
- Modify
-
Semantic Inspection API (AI - High Impact)
- Create
SemanticIntrospectionEngineclass. - Connect it to
MemoryandSymbolProvider. - Implement basic
GetPlayerState()andGetSpriteState()using known ALTTP RAM offsets.
- Create
-
State Injection API (Integration - Medium Impact)
- Implement
Emulator::InjectState. - Add specific "presets" for common ALTTP testing scenarios (e.g., "Dungeon Test", "Overworld Test").
- Implement
-
Refined CPU Timing (Accuracy - Low Impact, High Effort)
- Audit
Cpu::ExecuteInstructionfor missingcallbacks_.idle()calls. - Ensure "dummy read" cycles in RMW instructions trigger side effects.
- Audit
-
Auto-Joypad Progressive Read (Accuracy - Low Impact)
- Change
auto_joy_timer_to drive bit-shifting inport_auto_read_registers.
- Change