Gemini Workflow Instructions for the yaze Project
This document provides a summary of the yaze project to guide an AI assistant in understanding the codebase, architecture, and development workflows.
Coordination Requirement
Gemini-based agents must read and update the shared coordination board (docs/internal/agents/coordination-board.md) before making changes. Follow the protocol in AGENTS.md, use the appropriate persona ID (e.g., GEMINI_AUTOM), and respond to any pending entries targeting you.
User Profile
- User: A Google programmer working on ROM hacking projects on macOS.
- IDE: Visual Studio Code with the CMake Tools extension.
- Build System: CMake with a preference for the "Unix Makefiles" generator.
- Workflow: Uses CMake presets and a separate build_test directory for test builds.
- AI Assistant Build Policy: When the AI assistant needs to build the project, it must use a dedicated build directory (e.g., build_ai or build_agent) to avoid interrupting the user's active builds. Never use the build or build_test directories.
Project Overview
- yaze: A cross-platform GUI editor for "The Legend of Zelda: A Link to the Past" (ALTTP) ROMs. It is designed for compatibility with ZScream projects.
- z3ed: A command-line interface (CLI) for yaze. It features a resource-oriented design (z3ed <resource> <action>) and serves as the primary API for an AI-driven conversational agent.
- yaze.org: The file docs/yaze.org is an Emacs Org-Mode file used as a development tracker for active issues and features.
Build Instructions
- Use the presets in CMakePresets.json (debug, AI, release, dev, CI, etc.). Always run the verifier script before the first build on a machine.
- Gemini agents must configure/build in dedicated directories (build_ai, build_agent, …) to avoid touching the user's build or build_test folders.
- Consult docs/public/build/quick-reference.md for the canonical command list, preset overview, and testing guidance.
Testing
- Framework: GoogleTest.
- Test Categories:
  - STABLE: Fast, reliable, run in CI.
  - ROM_DEPENDENT: Require a ROM file, skipped in CI unless a ROM is provided.
  - EXPERIMENTAL: May be unstable, allowed to fail.
- Running Tests:

      # Run stable tests using ctest and presets
      ctest --preset dev

      # Run comprehensive overworld tests (requires a ROM)
      ./scripts/run_overworld_tests.sh /path/to/zelda3.sfc

- E2E GUI Testing: The project includes an end-to-end testing framework using ImGuiTestEngine, accessible via a gRPC service. The z3ed agent test command can execute natural language prompts as GUI tests.
Core Architecture & Features
- Overworld Editor: Full support for vanilla and ZSCustomOverworld v2/v3 ROMs, ensuring compatibility with ZScream projects.
- Dungeon Editor: A modular, component-based system for editing rooms, objects, sprites, and more. See docs/E2-dungeon-editor-guide.md and docs/E3-dungeon-editor-design.md.
- Graphics System: A performant system featuring:
  - Arena-based resource management.
  - Bitmap class for SNES-specific graphics formats.
  - Tilemap with an LRU cache.
  - AtlasRenderer for batched drawing.
- Asar Integration: Built-in support for the Asar 65816 assembler to apply assembly patches to the ROM.
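To make the "Tilemap with an LRU cache" item concrete, here is a minimal sketch of least-recently-used eviction. It is not yaze's TileCache: the names TileLruCache and CachedTile, the integer tile key, and the method names are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a cached tile; not yaze's actual type.
struct CachedTile {
  int texture_id = 0;
};

// Minimal LRU cache keyed by tile id. All names are illustrative.
class TileLruCache {
 public:
  explicit TileLruCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Returns the tile and marks it most recently used, or nullopt on a miss.
  std::optional<CachedTile> Get(int tile_id) {
    auto it = index_.find(tile_id);
    if (it == index_.end()) return std::nullopt;
    order_.splice(order_.begin(), order_, it->second);  // move to front
    return it->second->second;
  }

  // Inserts or refreshes a tile; evicts the least recently used on overflow.
  void Put(int tile_id, CachedTile tile) {
    auto it = index_.find(tile_id);
    if (it != index_.end()) {
      it->second->second = tile;
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    if (order_.size() == capacity_) {
      index_.erase(order_.back().first);
      order_.pop_back();
    }
    order_.emplace_front(tile_id, tile);
    index_[tile_id] = order_.begin();
  }

  std::size_t size() const { return order_.size(); }

 private:
  using Entry = std::pair<int, CachedTile>;
  std::size_t capacity_;
  std::list<Entry> order_;  // front = most recently used
  std::unordered_map<int, std::list<Entry>::iterator> index_;
};
```

With a capacity of 2, inserting tiles 1 and 2, reading tile 1, and then inserting tile 3 evicts tile 2, because tile 1 was touched more recently. The list-plus-map layout keeps both lookup and eviction at constant time.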
Editor System Architecture
The editor system is designed around a central EditorManager that orchestrates multiple editors and UI components.
- EditorManager: The top-level class that manages multiple RomSessions, the main menu, and all editor windows. It handles the application's main update loop.
- RomSession: Each session encapsulates a Rom instance and a corresponding EditorSet, allowing multiple ROMs to be open simultaneously.
- EditorSet: A container for all individual editor instances (Overworld, Dungeon, etc.) associated with a single ROM session.
- Editor (Base Class): A virtual base class (src/app/editor/editor.h) that defines a common interface for all editors, including methods like Initialize, Load, Update, and Save.
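The relationship between the Editor base class and an EditorSet can be sketched as follows. Only the four method names come from the description above; the return types, the LoggingEditor class, and the EditorSet methods are assumptions, and the real declarations in src/app/editor/editor.h may differ.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of the common interface; signatures are assumed.
class Editor {
 public:
  virtual ~Editor() = default;
  virtual void Initialize() = 0;
  virtual bool Load() = 0;
  virtual bool Update() = 0;
  virtual bool Save() = 0;
};

// Illustrative editor that records which lifecycle methods were called.
class LoggingEditor : public Editor {
 public:
  void Initialize() override { log_.push_back("Initialize"); }
  bool Load() override { log_.push_back("Load"); return true; }
  bool Update() override { log_.push_back("Update"); return true; }
  bool Save() override { log_.push_back("Save"); return true; }
  const std::vector<std::string>& log() const { return log_; }

 private:
  std::vector<std::string> log_;
};

// Stand-in for an EditorSet: owns the editors of one ROM session.
class EditorSet {
 public:
  void Add(std::unique_ptr<Editor> editor) {
    editors_.push_back(std::move(editor));
  }

  // Initializes and loads every editor; stops at the first failure.
  bool LoadAll() {
    for (auto& editor : editors_) {
      editor->Initialize();
      if (!editor->Load()) return false;
    }
    return true;
  }

  // One frame of the update loop, forwarded to each editor.
  bool UpdateAll() {
    for (auto& editor : editors_) {
      if (!editor->Update()) return false;
    }
    return true;
  }

  std::size_t size() const { return editors_.size(); }

 private:
  std::vector<std::unique_ptr<Editor>> editors_;
};
```

Because the set only sees the base interface, a manager can drive every editor of a session through the same Initialize, Load, Update sequence without knowing the concrete types.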
Component-Based Editors
The project is moving towards a component-based architecture to improve modularity and maintainability. This is most evident in the Dungeon Editor.
- Dungeon Editor: Refactored into a collection of single-responsibility components orchestrated by DungeonEditor:
  - DungeonRoomSelector: Manages the UI for selecting rooms and entrances.
  - DungeonCanvasViewer: Handles the rendering of the dungeon room on the main canvas.
  - DungeonObjectSelector: Provides the UI for browsing and selecting objects, sprites, and other editable elements.
  - DungeonObjectInteraction: Manages mouse input, selection, and drag-and-drop on the canvas.
  - DungeonToolset: The main toolbar for the editor.
  - DungeonRenderer: A dedicated rendering engine for dungeon objects, featuring a cache for performance.
  - DungeonRoomLoader: Handles the logic for loading all room data from the ROM, now optimized with parallel processing.
  - DungeonUsageTracker: Analyzes and displays statistics on resource usage (blocksets, palettes, etc.).
- Overworld Editor: Also employs a component-based approach with helpers like OverworldEditorManager for ZSCustomOverworld v3 features and MapPropertiesSystem for UI panels.
Specialized Editors
- Code Editors: AssemblyEditor (a full-featured text editor) and MemoryEditor (a hex viewer).
- Graphics Editors: GraphicsEditor, PaletteEditor, GfxGroupEditor, ScreenEditor, and the highly detailed Tile16Editor provide tools for all visual assets.
- Content Editors: SpriteEditor, MessageEditor, and MusicEditor manage specific game content.
System Components
Located in src/app/editor/system/, these components provide the core application framework:
- SettingsEditor: Manages global and project-specific feature flags.
- PopupManager, ToastManager: Handle all UI dialogs and notifications.
- ShortcutManager, CommandManager: Manage keyboard shortcuts and command palette functionality.
- ProposalDrawer, AgentChatWidget: Key UI components for the AI Agent Workflow, allowing for proposal review and conversational interaction.
Game Data Models (zelda3 Namespace)
The logic and data structures for ALTTP are primarily located in src/zelda3/.
- Rom Class (app/rom.h): This is the most critical data class. It holds the entire ROM content in a std::vector<uint8_t> and provides the central API for all data access.
  - Responsibilities: Handles loading/saving ROM files, stripping SMC headers, and providing low-level read/write primitives (e.g., ReadByte, WriteWord).
  - Game-Specific Loading: The LoadZelda3 method populates game-specific data structures like palettes (palette_groups_) and graphics groups.
  - State: Manages a dirty_ flag to track unsaved changes.
- Overworld Model (zelda3/overworld/):
  - Overworld: The main container class that orchestrates the loading of all overworld data, including maps, tiles, and entities. It handles logic for both vanilla and ZSCustomOverworld ROMs.
  - OverworldMap: Represents a single overworld screen, loading its own properties (palettes, graphics, music) based on the ROM version.
  - GameEntity: A base class in zelda3/common.h for all interactive overworld elements like OverworldEntrance, OverworldExit, OverworldItem, and Sprite.
- Dungeon Model (zelda3/dungeon/):
  - DungeonEditorSystem: A high-level API that serves as the backend for the UI, managing all dungeon editing logic (adding/removing sprites, items, doors, etc.).
  - Room: Represents a single dungeon room, containing its objects, sprites, layout, and header information.
  - RoomObject & RoomLayout: Define the structural elements of a room.
  - ObjectParser & ObjectRenderer: High-performance components for directly parsing object data from the ROM and rendering them, avoiding the need for full SNES emulation.
- Sprite Model (zelda3/sprite/):
  - Sprite: Represents an individual sprite (enemy, NPC).
  - SpriteBuilder: A fluent API for programmatically constructing custom sprites.
  - zsprite.h: Data structures for compatibility with Zarby's ZSpriteMaker format.
- Other Data Models:
  - MessageData (message/): Handles the game's text and dialogue system.
  - Inventory, TitleScreen, DungeonMap (screen/): Represent specific non-gameplay screens.
  - music::Tracker (music/): Contains legacy code from Hyrule Magic for handling SNES music data.
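The Rom responsibilities listed above (header stripping, low-level reads and writes, a dirty flag) can be illustrated with a small sketch. MiniRom is a hypothetical class, not the API in app/rom.h: the method signatures, the std::optional return types, and the header heuristic (a file size that leaves a 512-byte remainder over a 1 KiB multiple) are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical miniature of a ROM container; not the real app/rom.h API.
class MiniRom {
 public:
  // Takes raw file bytes and drops a 512-byte copier (SMC) header if the
  // size heuristic suggests one is present.
  explicit MiniRom(std::vector<uint8_t> bytes) : data_(std::move(bytes)) {
    if (data_.size() % 0x400 == 0x200) {
      data_.erase(data_.begin(), data_.begin() + 0x200);
    }
  }

  std::optional<uint8_t> ReadByte(std::size_t offset) const {
    if (offset >= data_.size()) return std::nullopt;
    return data_[offset];
  }

  // Reads a 16-bit little-endian word, the byte order used by the 65816.
  std::optional<uint16_t> ReadWord(std::size_t offset) const {
    if (data_.size() < 2 || offset > data_.size() - 2) return std::nullopt;
    return static_cast<uint16_t>(data_[offset] | (data_[offset + 1] << 8));
  }

  // Writes a little-endian word and marks the ROM as having unsaved changes.
  bool WriteWord(std::size_t offset, uint16_t value) {
    if (data_.size() < 2 || offset > data_.size() - 2) return false;
    data_[offset] = static_cast<uint8_t>(value & 0xFF);
    data_[offset + 1] = static_cast<uint8_t>(value >> 8);
    dirty_ = true;
    return true;
  }

  bool dirty() const { return dirty_; }
  std::size_t size() const { return data_.size(); }

 private:
  std::vector<uint8_t> data_;
  bool dirty_ = false;
};
```

Bounds-checked accessors that report failure instead of throwing keep a malformed offset from corrupting the buffer, and setting the flag only on a successful write keeps it an accurate signal of unsaved changes.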
Graphics System (gfx Namespace)
The gfx namespace contains a highly optimized graphics engine tailored for SNES ROM hacking.
- Core Concepts:
  - Bitmap: The fundamental class for image data. It supports SNES pixel formats, palette management, and is optimized with features like dirty-region tracking and a hash-map-based palette lookup cache for O(1) performance.
  - SNES Formats: snes_color, snes_palette, and snes_tile provide structures and conversion functions for handling SNES-specific data (15-bit color, 4BPP/8BPP tiles, etc.).
  - Tilemap: Represents a collection of tiles, using a texture atlas and a TileCache (with LRU eviction) for efficient rendering.
- Resource Management & Performance:
  - Arena: A singleton that manages all graphics resources. It pools SDL_Texture and SDL_Surface objects to reduce allocation overhead and uses custom deleters for automatic cleanup. It also manages the 223 global graphics sheets for the game.
  - MemoryPool: A low-level, high-performance memory allocator that provides pre-allocated blocks for common graphics sizes, reducing malloc overhead and memory fragmentation.
  - AtlasRenderer: A key performance component that batches draw calls by combining multiple smaller graphics onto a single large texture atlas.
  - Batching: The Arena supports batching texture updates via QueueTextureUpdate, which minimizes expensive, blocking calls to the SDL rendering API.
- Format Handling & Optimization:
  - compression.h: Contains SNES-specific compression algorithms (LC-LZ2, Hyrule Magic) for handling graphics data from the ROM.
  - BppFormatManager: A system for analyzing and converting between different bits-per-pixel (BPP) formats.
  - GraphicsOptimizer: A high-level tool that uses the BppFormatManager to analyze graphics sheets and recommend memory and performance optimizations.
  - scad_format.h: Provides compatibility with legacy Nintendo CAD file formats (CGX, SCR, COL) from the "gigaleak".
- Performance Monitoring:
  - PerformanceProfiler: A comprehensive system for timing operations using a ScopedTimer RAII class.
  - PerformanceDashboard: An ImGui-based UI for visualizing real-time performance metrics collected by the profiler.
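The RAII timing pattern named in the Performance Monitoring item can be sketched as below. MiniProfiler and this ScopedTimer are illustrative stand-ins; the constructor arguments, the Record method, and the per-name accumulation are assumptions rather than the project's actual profiler interface.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical profiler that accumulates elapsed time per operation name.
class MiniProfiler {
 public:
  void Record(const std::string& name, std::chrono::nanoseconds elapsed) {
    auto& sample = samples_[name];
    sample.total += elapsed;
    ++sample.count;
  }

  int CallCount(const std::string& name) const {
    auto it = samples_.find(name);
    return it == samples_.end() ? 0 : it->second.count;
  }

  std::chrono::nanoseconds Total(const std::string& name) const {
    auto it = samples_.find(name);
    return it == samples_.end() ? std::chrono::nanoseconds{0}
                                : it->second.total;
  }

 private:
  struct Sample {
    std::chrono::nanoseconds total{0};
    int count = 0;
  };
  std::map<std::string, Sample> samples_;
};

// RAII timer: starts on construction, reports to the profiler on destruction.
class ScopedTimer {
 public:
  ScopedTimer(MiniProfiler& profiler, std::string name)
      : profiler_(profiler),
        name_(std::move(name)),
        start_(std::chrono::steady_clock::now()) {}

  ~ScopedTimer() {
    const auto elapsed = std::chrono::steady_clock::now() - start_;
    profiler_.Record(
        name_, std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
  }

  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

 private:
  MiniProfiler& profiler_;
  std::string name_;
  std::chrono::steady_clock::time_point start_;
};
```

Declaring a timer at the top of a scope is enough to time it: the destructor records the sample on every exit path, including early returns, so call sites cannot forget to stop the clock.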
GUI System (gui Namespace)
The yaze user interface is built with ImGui and is located in src/app/gui. It features a modern, component-based architecture designed for modularity, performance, and testability.
Canvas System (gui::Canvas)
The canvas is the core of all visual editors. The main Canvas class (src/app/gui/canvas.h) has been refactored from a monolithic class into a coordinator that leverages a set of single-responsibility components found in src/app/gui/canvas/.
- Canvas: The main canvas widget. It provides a modern, ImGui-style interface (Begin/End) and coordinates the various sub-components for drawing, interaction, and configuration.
- CanvasInteractionHandler: Manages all direct user input on the canvas, such as mouse clicks, drags, and selections for tile painting and object manipulation.
- CanvasContextMenu: A data-driven context menu system. It is aware of the current CanvasUsage mode (e.g., TilePainting, PaletteEditing) and displays relevant menu items dynamically.
- CanvasModals: Handles all modal dialogs related to the canvas, such as "Advanced Properties," "Scaling Controls," and "BPP Conversion," ensuring a consistent UX.
- CanvasUsageTracker & CanvasPerformanceIntegration: These components provide analytics and performance monitoring. They track user interactions, operation timings, and memory usage, integrating with the global PerformanceDashboard to identify bottlenecks.
- CanvasUtils: A collection of stateless helper functions for common canvas tasks like grid drawing, coordinate alignment, and size calculations, promoting code reuse.
Theming and Styling
The visual appearance of the editor is highly customizable through a robust theming system.
- ThemeManager: A singleton that manages EnhancedTheme objects. It can load custom themes from .theme files, allowing users to personalize the editor's look and feel. It includes a built-in theme editor and selector UI.
- BackgroundRenderer: Renders the animated, futuristic grid background for the main docking space, providing a polished, modern aesthetic.
- style.cc & color.cc: Contain custom ImGui styling functions (ColorsYaze), SnesColor conversion utilities, and other helpers to maintain a consistent visual identity.
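The SnesColor conversion utilities mentioned above deal with the SNES 15-bit color word, which stores red in bits 0-4, green in bits 5-9, and blue in bits 10-14. The sketch below shows one way to convert to and from 8-bit RGB. The Rgb struct and function names are hypothetical, and the bit-replication used to widen 5 bits to 8 is one common choice; the project's own utilities may scale differently.

```cpp
#include <cstdint>

// Hypothetical 8-bit-per-channel color; not yaze's SnesColor type.
struct Rgb {
  uint8_t r;
  uint8_t g;
  uint8_t b;
};

// Widens a 5-bit channel to 8 bits by replicating its top bits,
// so 0 maps to 0 and 31 maps to 255.
constexpr uint8_t Expand5To8(uint8_t c) {
  return static_cast<uint8_t>((c << 3) | (c >> 2));
}

// Unpacks a 15-bit SNES color (0bbbbbgg gggrrrrr) into 8-bit RGB.
constexpr Rgb SnesToRgb(uint16_t snes) {
  return Rgb{Expand5To8(static_cast<uint8_t>(snes & 0x1F)),
             Expand5To8(static_cast<uint8_t>((snes >> 5) & 0x1F)),
             Expand5To8(static_cast<uint8_t>((snes >> 10) & 0x1F))};
}

// Packs 8-bit RGB back into a 15-bit SNES color, dropping the low 3 bits
// of each channel.
constexpr uint16_t RgbToSnes(Rgb c) {
  return static_cast<uint16_t>((c.r >> 3) | ((c.g >> 3) << 5) |
                               ((c.b >> 3) << 10));
}
```

Because the widening step keeps the original 5 bits as the top bits of each channel, converting a SNES color to RGB and back returns the same 15-bit value.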
Specialized UI Components
The gui namespace includes several powerful, self-contained widgets for specific ROM hacking tasks.
- BppFormatUI: A comprehensive UI for managing SNES bits-per-pixel (BPP) graphics formats. It provides a format selector, a detailed analysis panel, a side-by-side conversion preview, and a batch comparison tool.
- EnhancedPaletteEditor: An advanced tool for editing SnesPalette objects. It features a grid-based editor, a ROM palette manager for loading palettes directly from the game, and a color analysis view to inspect pixel distribution.
- TextEditor: A full-featured text editor widget with syntax highlighting for 65816 assembly, undo/redo functionality, and standard text manipulation features.
- AssetBrowser: A flexible, icon-based browser for viewing and managing game assets, such as graphics sheets.
Widget Registry for Automation
A key feature for test automation and AI agent integration is the discoverability of UI elements.
- WidgetIdRegistry: A singleton that catalogs all registered GUI widgets. It assigns a stable, hierarchical path (e.g., Overworld/Canvas/Map) to each widget, making the UI programmatically discoverable. This is the backbone of the z3ed agent test command.
- WidgetIdScope: An RAII helper that simplifies the creation of hierarchical widget IDs by managing an ID stack, ensuring that widget paths are consistent and predictable.
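The ID-stack idea behind WidgetIdScope can be sketched in a few lines. IdStack and ScopedId are hypothetical names, and the "/" separator is inferred from the Overworld/Canvas/Map example; the real helper's interface and its integration with the registry are not shown here.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stack of path segments used to build hierarchical widget IDs.
class IdStack {
 public:
  void Push(std::string segment) { segments_.push_back(std::move(segment)); }

  void Pop() {
    if (!segments_.empty()) segments_.pop_back();
  }

  // Joins the current segments and the leaf name with '/'.
  std::string PathFor(const std::string& leaf) const {
    std::string path;
    for (const auto& segment : segments_) {
      path += segment;
      path += '/';
    }
    return path + leaf;
  }

  std::size_t depth() const { return segments_.size(); }

 private:
  std::vector<std::string> segments_;
};

// RAII helper: pushes a segment on construction and pops it on destruction,
// so nested C++ scopes map directly onto nested widget paths.
class ScopedId {
 public:
  ScopedId(IdStack& stack, std::string segment) : stack_(stack) {
    stack_.Push(std::move(segment));
  }
  ~ScopedId() { stack_.Pop(); }

  ScopedId(const ScopedId&) = delete;
  ScopedId& operator=(const ScopedId&) = delete;

 private:
  IdStack& stack_;
};
```

Opening a ScopedId for "Overworld" and a nested one for "Canvas" makes PathFor("Map") return "Overworld/Canvas/Map". Once the inner scope closes, the same stack yields "Overworld/..." paths again without any manual cleanup.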
ROM Hacking Context
- Game: The Legend of Zelda: A Link to the Past (US/JP).
- ZSCustomOverworld: A popular system for expanding overworld editing capabilities. yaze is designed to be fully compatible with ZScream's implementation of v2 and v3.
- Assembly: Uses asar for 65816 assembly. A style guide is available at docs/E1-asm-style-guide.md.
- usdasm Disassembly: The user has a local copy of the usdasm ALTTP disassembly at /Users/scawful/Code/usdasm which can be used for reference.
Git Workflow
The project follows a simplified Git workflow for pre-1.0 development, with a more formal process documented for the future. For details, see docs/B4-git-workflow.md.
- Current (Pre-1.0): A relaxed model is in use. Direct commits to develop or master are acceptable for documentation and small fixes. Feature branches are used for larger, potentially breaking changes.
- Future (Post-1.0): The project will adopt a formal Git Flow model with master, develop, feature, release, and hotfix branches.
AI Agent Workflow (z3ed agent)
A primary focus of the yaze project is its AI-driven agentic workflow, orchestrated by the z3ed CLI.
- Vision: To create a conversational ROM hacking assistant that can inspect the ROM and perform edits based on natural language.
- Core Loop (MCP):
  - Model (Plan): The user provides a prompt. The agent uses an LLM (Ollama or Gemini) to create a plan, which is a sequence of z3ed commands.
  - Code (Generate): The LLM generates the commands based on a machine-readable catalog of the CLI's capabilities.
  - Program (Execute): The z3ed agent executes the commands.
- Proposal System: To ensure safety, agent-driven changes are not applied directly. They are executed in a sandboxed ROM copy and saved as a proposal.
- Review & Acceptance: The user can review the proposed changes via z3ed agent diff or a dedicated ProposalDrawer in the yaze GUI. The user must explicitly accept a proposal to merge the changes into the main ROM.
- Tool Use: The agent can use read-only z3ed commands (e.g., overworld-find-tile, dungeon-list-sprites) as "tools" to inspect the ROM and gather context to answer questions or formulate a plan.
- API Discovery: The agent learns the available commands and their schemas by calling z3ed agent describe, which exports the entire CLI surface area in a machine-readable format.
- Function Schemas: The Gemini AI service uses function calling schemas defined in assets/agent/function_schemas.json. These schemas are automatically copied to the build directory and loaded at runtime. To modify the available functions, edit this JSON file rather than hardcoding them in the C++ source.
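The proposal flow described above (edit a sandboxed copy, review a diff, merge only on explicit acceptance) can be pictured with the following sketch. It is a conceptual model only: the Proposal class, ByteChange struct, and byte-level diff are assumptions and do not reflect how z3ed actually stores or compares proposals.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using RomBytes = std::vector<uint8_t>;

// One differing byte between the main ROM and the sandbox (hypothetical).
struct ByteChange {
  std::size_t offset;
  uint8_t before;
  uint8_t after;
};

// Hypothetical proposal: agent edits touch only a private copy of the ROM.
class Proposal {
 public:
  explicit Proposal(const RomBytes& main_rom) : sandbox_(main_rom) {}

  // Agent-driven write; the main ROM is never modified here.
  bool Write(std::size_t offset, uint8_t value) {
    if (offset >= sandbox_.size()) return false;
    sandbox_[offset] = value;
    return true;
  }

  // Lists the bytes that differ from the main ROM, for user review.
  std::vector<ByteChange> Diff(const RomBytes& main_rom) const {
    std::vector<ByteChange> changes;
    const std::size_t n = std::min(main_rom.size(), sandbox_.size());
    for (std::size_t i = 0; i < n; ++i) {
      if (main_rom[i] != sandbox_[i]) {
        changes.push_back({i, main_rom[i], sandbox_[i]});
      }
    }
    return changes;
  }

  // Explicit acceptance step: only now do the changes reach the main ROM.
  void Accept(RomBytes& main_rom) const { main_rom = sandbox_; }

 private:
  RomBytes sandbox_;
};
```

Keeping the write path and the accept path separate means a rejected proposal can simply be discarded, with the main ROM guaranteed untouched.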