AGENTS.md - ast-merge Development Guide
Project Overview
ast-merge is a shared infrastructure library for the *-merge gem family. It provides base classes, modules, and RSpec shared examples for building intelligent file mergers using AST analysis. It powers prism-merge, psych-merge, json-merge, markly-merge, and other format-specific merge gems.
Core Philosophy: Write once, run anywhere. Define the merge protocol once in ast-merge; implement it in each *-merge gem for a specific file format.
Repository: https://github.com/kettle-rb/ast-merge
Current Version: 4.0.5
Required Ruby: >= 3.2.0 (currently developed against Ruby 4.0.1)
Architecture: The Base Library Pattern
What ast-merge Provides
- Ast::Merge::SmartMergerBase - Abstract base class for all format-specific SmartMerger implementations
- Ast::Merge::FileAnalyzable - Mixin for file analysis classes; provides freeze block detection, signature generation, and line access
- Ast::Merge::AstNode - Base class for synthetic AST nodes (backed by TreeHaver::Base::Node)
- Ast::Merge::MergeResultBase - Base class for merge result objects
- Ast::Merge::MergerConfig - Configuration object encapsulating merge options
- Ast::Merge::FreezeNodeBase / Ast::Merge::Freezable - Freeze block support
- Ast::Merge::PartialTemplateMergerBase - Base for section-level partial merges
- Ast::Merge::SectionTyping - AST-aware section classification
- Ast::Merge::NodeTyping - Per-node-type preference overrides
- Ast::Merge::Navigable - Injection point finding for partial merges
- Ast::Merge::Recipe - YAML-driven merge recipe runner (ast-merge-recipe executable)
- Ast::Merge::Text::SmartMerger - Concrete line-based text merger (included in this gem)
- Ast::Merge::Comment::* - Generic, language-agnostic comment classes
- Ast::Merge::Detector::* - Content detectors (fenced code blocks, YAML/TOML frontmatter, etc.)
- Ast::Merge::RSpec - Full RSpec support infrastructure for all *-merge gems
Key Dependencies
| Gem | Role |
|---|---|
| tree_haver (~> 5.0) | Unified AST parsing adapter; provides TreeHaver::Base::Node and backend tags |
| version_gem (~> 1.1) | Version management |
Vendor Directory
IMPORTANT: Nothing in vendor/ is part of this project. The vendor/ directory is used for local development only; it is not committed to the repository and does not exist in CI. All vendor gems must be loaded via their published gem versions or git sources in Gemfile.lock.
Project Structure
lib/ast/merge/
├── ast_node.rb # Base class for synthetic nodes (TreeHaver::Base::Node subclass)
├── comment/ # Generic comment classes (line, block, empty, style, parser)
├── conflict_resolver_base.rb # Abstract conflict resolver
├── content_match_refiner.rb # Fuzzy match refiner
├── debug_logger.rb # Logging mixin
├── detector/ # Content detectors (fenced_code_block, frontmatter, etc.)
├── diff_mapper_base.rb # Diff/alignment base
├── emitter_base.rb # Source emitter base
├── file_analyzable.rb # Mixin for FileAnalysis classes
├── freezable.rb # Freeze node mixin
├── freeze_node_base.rb # Base for freeze block nodes
├── match_refiner_base.rb # Abstract match refiner
├── match_score_base.rb # Match scoring base
├── merge_result_base.rb # Base for merge result objects
├── merger_config.rb # Merge options configuration
├── navigable/ # Injection point, statement, finder
├── node_typing/ # Per-node-type preferences (wrapper, frozen_wrapper, normalizer)
├── partial_template_merger_base.rb # Base for partial/section merges
├── recipe/ # YAML recipe runner (config, preset, runner, script_loader)
├── rspec/ # Full RSpec support infrastructure (see below)
├── section_typing.rb # AST-aware section classification
├── smart_merger_base.rb # Abstract SmartMerger base class
├── text/ # Concrete line-based text merger
│   ├── smart_merger.rb
│   ├── file_analysis.rb
│   ├── conflict_resolver.rb
│   ├── merge_result.rb
│   ├── section.rb
│   ├── section_splitter.rb
│   ├── line_node.rb
│   └── word_node.rb
└── version.rb
exe/
├── ast-merge-recipe # Executable for running YAML merge recipes
└── ast-merge-diff # Executable for merge diffs
Development Workflows
Running Tests
# Full suite (required for coverage thresholds)
bundle exec rspec
# Single file (disable coverage threshold check)
K_SOUP_COV_MIN_HARD=false bundle exec rspec spec/ast/merge/text/smart_merger_spec.rb
Note: Always run commands in the project root (/home/pboling/src/kettle-rb/ast-merge). Allow direnv to load environment variables first by doing a plain cd before running commands.
Example (two separate commands):
cd /home/pboling/src/kettle-rb/ast-merge
bundle exec rspec spec/ast/merge/smart_merger_base_spec.rb
Coverage Reports
cd /home/pboling/src/kettle-rb/ast-merge
bin/rake coverage && bin/kettle-soup-cover -d
This runs tests with coverage instrumentation and generates reports in the coverage/ directory.
Key ENV variables (set in .envrc, loaded via direnv allow):
- K_SOUP_COV_DO=true - Enable coverage (default in .envrc)
- K_SOUP_COV_MIN_LINE=91 - Line coverage threshold
- K_SOUP_COV_MIN_BRANCH=81 - Branch coverage threshold
- K_SOUP_COV_MIN_HARD=true - Fail if thresholds not met
- K_SOUP_COV_FORMATTERS="html,xml,rcov,lcov,json,tty" - Output formats
Never review HTML reports; use JSON (preferred), XML, LCOV, or the kettle-soup-cover -d TTY output.
Code Quality
bundle exec rake reek
bundle exec rake rubocop_gradual
Prepare and Release
kettle-changelog && kettle-release
Project Conventions
API Conventions
SmartMergerBase API
- merge - Returns a String (the merged content)
- merge_result - Returns a MergeResult object
- to_s on MergeResult returns the merged content as a string
- content_string is legacy; use to_s instead
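The contract above can be sketched with toy classes. ToyMerger and ToyMergeResult are hypothetical stand-ins written for this illustration; they are not the real SmartMergerBase or MergeResultBase, and the merge strategy is deliberately trivial.

```ruby
# Hypothetical result object: only to_s is modeled here.
class ToyMergeResult
  def initialize(lines)
    @lines = lines
  end

  # to_s returns the merged content as a String.
  def to_s
    @lines.join("\n")
  end
end

# Hypothetical merger that mirrors the documented method names.
class ToyMerger
  def initialize(template, destination, **options)
    @template = template
    @destination = destination
  end

  # merge_result returns the result object.
  def merge_result
    # Toy strategy: keep destination lines, then append template-only lines.
    dest_lines = @destination.lines(chomp: true)
    extra = @template.lines(chomp: true) - dest_lines
    ToyMergeResult.new(dest_lines + extra)
  end

  # merge returns a String directly.
  def merge
    merge_result.to_s
  end
end

merger = ToyMerger.new("a\nb\n", "b\nc\n")
merged = merger.merge
```

Callers that only need the text use merge; callers that need the object use merge_result and call to_s on it when they want the text.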
Forward Compatibility with **options
CRITICAL DESIGN PRINCIPLE: All constructors and public API methods that accept keyword arguments MUST include **options as the final parameter.
CORRECT:
def initialize(source, freeze_token: DEFAULT, signature_generator: nil, **options)
  @source = source
  @freeze_token = freeze_token
  @signature_generator = signature_generator
  # **options captures future parameters for forward compatibility
end
WRONG:
def initialize(source, freeze_token: DEFAULT, signature_generator: nil)
  # Breaks when new parameters are added to SmartMergerBase
end
Applies to: FileAnalysis#initialize, SmartMerger#initialize, and any method accepting a variable set of options.
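A minimal runnable sketch of why the trailing **options matters. The class names and the node_typing: keyword are hypothetical, chosen only to stand in for a parameter added by a later release.

```ruby
# Hypothetical base class for illustration; not the real ast-merge class.
class BaseAnalysis
  attr_reader :source, :freeze_token

  def initialize(source, freeze_token: "toy-merge", **options)
    @source = source
    @freeze_token = freeze_token
  end
end

# Accepts **options, so keywords it does not know about are tolerated.
class TolerantAnalysis < BaseAnalysis
  def initialize(source, freeze_token: "toy-merge", signature_generator: nil, **options)
    super(source, freeze_token: freeze_token, **options)
    @signature_generator = signature_generator
  end
end

# Omits **options, so any unknown keyword raises ArgumentError.
class StrictAnalysis < BaseAnalysis
  def initialize(source, freeze_token: "toy-merge", signature_generator: nil)
    super(source, freeze_token: freeze_token)
    @signature_generator = signature_generator
  end
end

# node_typing: stands in for a keyword introduced after these classes were written.
tolerant = TolerantAnalysis.new("x = 1", node_typing: {})

strict_error =
  begin
    StrictAnalysis.new("x = 1", node_typing: {})
    nil
  rescue ArgumentError => e
    e
  end
```

The tolerant class keeps working when a caller passes the new keyword, while the strict one fails at construction time with an unknown keyword error.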
Comment Classes
- Ast::Merge::Comment::* - Generic, language-agnostic comment classes
- Format-specific comment classes belong in their respective *-merge gem (e.g., Prism::Merge::Comment::* for Ruby magic comments)
Naming Conventions
- File paths must match class namespace paths (Ruby convention)
- Example: Ast::Merge::Comment::Line -> lib/ast/merge/comment/line.rb
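As a rough illustration of the mapping, the helper below derives the expected path from a constant name. expected_path is a hypothetical helper, not part of ast-merge, and it handles plain CamelCase only; acronym-style segments such as RSpec (which lives under rspec/ in this gem) would need a custom inflection.

```ruby
# Convert a constant name to the file path the naming convention expects.
# Hypothetical helper for illustration; simple CamelCase segments only.
def expected_path(constant_name, root: "lib")
  segments = constant_name.split("::").map do |segment|
    segment
      .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split an uppercase run from a following word
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split lowercase/digit from a following capital
      .downcase
  end
  File.join(root, *segments) + ".rb"
end

line_path = expected_path("Ast::Merge::Comment::Line")
base_path = expected_path("Ast::Merge::SmartMergerBase")
```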
kettle-dev Tooling
This project uses kettle-dev for gem maintenance automation:
- Rakefile: Sourced from kettle-dev template (# kettle-dev Rakefile v1.1.60)
- CI Workflows: GitHub Actions and GitLab CI are managed via kettle-dev templates
- Templating: Lines between kettle-dev:freeze / kettle-dev:unfreeze comments are preserved during template updates
- Releases: Use kettle-release for the automated release process
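To make the freeze/unfreeze idea concrete, here is a small sketch that collects the lines sitting between the two markers. It assumes each marker appears on its own comment line (for example "# kettle-dev:freeze"); the exact marker syntax and how kettle-dev reapplies the preserved lines are not described in this guide, and frozen_blocks is a hypothetical helper.

```ruby
# Collect the text between freeze and unfreeze marker lines.
# Assumption: each marker sits on its own comment line.
def frozen_blocks(text)
  blocks = []
  current = nil
  text.each_line do |line|
    if line.include?("kettle-dev:unfreeze")
      blocks << current.join if current # close the open block, if any
      current = nil
    elsif line.include?("kettle-dev:freeze")
      current = []                      # open a new block
    elsif current
      current << line                   # inside a block: keep the line
    end
  end
  blocks
end

rakefile = <<~RUBY
  require "bundler/gem_tasks"
  # kettle-dev:freeze
  task :custom do
    puts "local customization"
  end
  # kettle-dev:unfreeze
  task default: :spec
RUBY

preserved = frozen_blocks(rakefile)
```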
Version Requirements
- Ruby >= 3.2.0 (gemspec), developed against Ruby 4.0.1 (.tool-versions)
- tree_haver >= 5.0.3 required
Testing Patterns
kettle-test RSpec Helpers
All spec files require "kettle/test/rspec", which provides RSpec helpers from the kettle-test gem. Do NOT recreate these helpers.
Environment Variable Helpers (from rspec-stubbed_env):
before do
  stub_env("MY_ENV_VAR" => "value")
end

before do
  hide_env("HOME", "USER")
end
Other Helpers:
- block_is_expected - Enhanced block expectations (rspec-block_is_expected)
- capture - Capture output (silent_stream)
- Timecop integration for time manipulation
MergeGemRegistry and Dependency Tags
ast-merge maintains a MergeGemRegistry for all known *-merge gems. Tags are available for conditional spec execution.
Available dependency tags (from lib/ast/merge/rspec/dependency_tags.rb):
| Tag | Gem Required |
|---|---|
| :markly_merge | markly-merge |
| :commonmarker_merge | commonmarker-merge |
| :markdown_merge | markdown-merge |
| :prism_merge | prism-merge |
| :bash_merge | bash-merge |
| :rbs_merge | rbs-merge |
| :json_merge | json-merge |
| :jsonc_merge | jsonc-merge |
| :toml_merge | toml-merge |
| :psych_merge | psych-merge |
| :dotenv_merge | dotenv-merge |
| :any_markdown_merge | any markdown merge gem |
TreeHaver also provides backend tags (:markly, :commonmarker, :prism_backend, etc.); see tree_haver/rspec/dependency_tags.
CORRECT - Use dependency tag on describe/context/it:
RSpec.describe SomeClass, :markly_merge do
  # Entire describe block is skipped if markly-merge unavailable
end

it "does something", :json_merge do
  # Skipped if json-merge unavailable
end
WRONG - Never use require inside spec files:
before do
  require "markly/merge" # DO NOT DO THIS
end
Loading Order in spec_helper.rb (ast-merge's own suite)
ast-merge uses the split loading pattern to preserve SimpleCov coverage instrumentation:
1. Load tree_haver and tree_haver/rspec early (before SimpleCov)
2. Start SimpleCov (kettle-soup-cover)
3. require "ast/merge" (instrumented by SimpleCov)
4. require "ast/merge/rspec/setup" (registry + helpers only)
5. Ast::Merge::RSpec::MergeGemRegistry.register_known_gems(...) (register all known gems)
6. require "ast/merge/rspec/dependency_tags_config" (configure RSpec exclusion filters)
7. require "ast/merge/rspec/shared_examples" (load shared examples)
8. Load merge gems via require in a rescue block (silently skip unavailable ones)
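The final step in the list above, requiring merge gems inside a rescue block, might look roughly like the sketch below. load_available is a hypothetical helper and the require paths are placeholders; the actual spec_helper.rb may structure this differently.

```ruby
# Try to require each path and record which ones loaded.
# Hypothetical helper; the path list passed in below is illustrative only.
def load_available(require_paths)
  require_paths.each_with_object({}) do |path, loaded|
    loaded[path] =
      begin
        require path
        true
      rescue LoadError
        false # Unavailable gems are skipped silently.
      end
  end
end

# "json" ships with Ruby; the second path is deliberately nonexistent.
availability = load_available(["json", "not/a/real/merge_gem"])
```

Rescuing only LoadError keeps genuine errors inside a gem's own code visible instead of hiding them behind a skipped tag.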
For other *-merge gems (not ast-merge itself), use the simple pattern:
require "ast/merge/rspec" # Loads everything: TreeHaver tags + Ast::Merge tags + shared examples
Shared Examples
ast-merge provides shared examples for testing *-merge implementations:
lib/ast/merge/rspec/shared_examples/
├── conflict_resolver_base.rb # "Ast::Merge::ConflictResolverBase"
├── debug_logger.rb # "Ast::Merge::DebugLogger"
├── file_analyzable.rb # "Ast::Merge::FileAnalyzable"
├── freeze_node_base.rb # "Ast::Merge::FreezeNodeBase"
├── merge_result_base.rb # "Ast::Merge::MergeResultBase"
├── merger_config.rb # "Ast::Merge::MergerConfig"
└── reproducible_merge.rb # "a reproducible merge" (idempotency tests)
The "a reproducible merge" shared example requires:
- let(:fixtures_path) - Path to fixtures directory
- let(:merger_class) - The SmartMerger class under test
- Optional: let(:file_extension) - File extension for fixtures (default: "")
Fixture structure:
fixtures_path/
  scenario_name/
    template.{ext}
    destination.{ext}
    result.{ext}
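A sketch of how such a fixture tree could be walked. fixture_scenarios is a hypothetical helper, and it assumes file_extension is given without a leading dot; the real shared example may discover fixtures differently. The throwaway tree is created under tmp/ in line with the project's temporary-file convention.

```ruby
require "fileutils"

# Collect scenario directories that contain all three fixture files.
# Hypothetical helper; assumes the extension is passed without a leading dot.
def fixture_scenarios(fixtures_path, file_extension = "")
  suffix = file_extension.empty? ? "" : ".#{file_extension}"
  Dir.children(fixtures_path).sort.filter_map do |name|
    dir = File.join(fixtures_path, name)
    next unless File.directory?(dir)

    files = %w[template destination result].to_h do |role|
      [role.to_sym, File.join(dir, "#{role}#{suffix}")]
    end
    next unless files.values.all? { |path| File.file?(path) }

    {name: name, **files}
  end
end

# Build a throwaway fixture tree under tmp/ to show the shape.
root = File.join("tmp", "fixtures_demo")
FileUtils.mkdir_p(File.join(root, "basic_merge"))
%w[template destination result].each do |role|
  File.write(File.join(root, "basic_merge", "#{role}.yml"), "key: value\n")
end
FileUtils.mkdir_p(File.join(root, "incomplete")) # no files, so it is skipped

scenarios = fixture_scenarios(root, "yml")
FileUtils.rm_rf(root)
```

Each returned hash names one scenario and the three files a reproducibility check would read: merge template into destination and compare against result.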
MergeGemRegistry: Registering a New Merge Gem
When a new *-merge gem is created, add it to KNOWN_GEMS in lib/ast/merge/rspec/merge_gem_registry.rb and to register_known_gems(...) in spec/spec_helper.rb.
External gems can also self-register when loaded:
# In your-gem/lib/your/merge.rb
if defined?(Ast::Merge::RSpec::MergeGemRegistry)
  Ast::Merge::RSpec::MergeGemRegistry.register(
    :your_merge,
    require_path: "your/merge",
    merger_class: "Your::Merge::SmartMerger",
    test_source: "example source",
    category: :data # :markdown, :data, :code, :config, :other
  )
end
Critical Files
| File | Purpose |
|---|---|
| lib/ast/merge/smart_merger_base.rb | Abstract base for all SmartMerger implementations (501 lines) |
| lib/ast/merge/file_analyzable.rb | Mixin for FileAnalysis classes; freeze detection, signatures (312 lines) |
| lib/ast/merge/ast_node.rb | Base for synthetic AST nodes; implements TreeHaver::Node protocol (284 lines) |
| lib/ast/merge/merger_config.rb | Merge options configuration object (261 lines) |
| lib/ast/merge/rspec/merge_gem_registry.rb | Registry for merge gem dependency tag availability (455 lines) |
| lib/ast/merge/partial_template_merger_base.rb | Base for partial/section merges (349 lines) |
| lib/ast/merge/section_typing.rb | AST-aware section classification (306 lines) |
| lib/ast/merge/rspec/dependency_tags_config.rb | RSpec exclusion filter configuration |
| lib/ast/merge/rspec/dependency_tags_helpers.rb | DependencyTags helper module |
| lib/ast/merge/rspec/setup.rb | Registry-only loader (no RSpec config); used by ast-merge's own spec suite |
| lib/ast/merge/rspec.rb | Full RSpec entry point (TreeHaver tags + Ast::Merge tags + shared examples) |
| exe/ast-merge-recipe | YAML-driven merge recipe CLI executable |
| spec/spec_helper.rb | Test suite entry point; demonstrates split loading pattern |
| .envrc | Coverage thresholds, tree-sitter paths, and dev environment variables |
Common Tasks
# Run all specs with coverage
bundle exec rake spec
# Generate coverage report
bundle exec rake coverage
# Check code quality
bundle exec rake reek
bundle exec rake rubocop_gradual
# Run benchmarks (skipped on CI)
bundle exec rake bench
# Prepare changelog for release, build and release
kettle-changelog && kettle-release
Integration Points
- tree_haver: All backends (MRI, Rust, FFI, Java, Prism, Psych, Citrus, Parslet, Commonmarker, Markly) via the unified TreeHaver adapter. AstNode inherits from TreeHaver::Base::Node.
- *-merge gems: Use ast-merge base classes to implement format-specific merging. Each gem registers itself with MergeGemRegistry.
- RSpec: Deep integration via lib/ast/merge/rspec.rb for dependency tagging and shared examples.
- SimpleCov: Coverage tracked for lib/**/*.rb and lib/**/*.rake; spec, vendor, examples, and lib/ast/merge/rspec/ directories are excluded from coverage.
- Recipe system: ast-merge-recipe CLI + Ast::Merge::Recipe::* classes for YAML-driven merge automation.
Key Insights
- No backward compatibility: The maintainer explicitly prohibits backward compatibility shims, aliases, or deprecation layers. Make clean breaks.
- vendor/ is not part of the project: It is used for local development only and does not exist in CI.
- Split loading for SimpleCov: ast-merge's own spec suite MUST use the split loading pattern to ensure SimpleCov instruments the library code before it's required.
- merge returns a String: SmartMergerBase#merge returns a String directly. merge_result returns the result object.
- content_string is legacy: Use to_s on the result object instead.
- merged_source doesn't exist: Use merge or merge_result.to_s.
- Magic comments are Ruby-specific: They belong in prism-merge, not in ast-merge.
- FrozenWrapper vs FreezeNodeBase: FreezeNodeBase uses freeze_signature (content-based matching). FrozenWrapper is unwrapped by FileAnalyzable#generate_signature to use the underlying node's structural signature; this prevents duplication when frozen node content differs between template and destination.
- text_node.text strips markdown formatting: When matching Markdown nodes by .text, backticks, bold, italic, and link text are stripped. Match plain text only.
Markdown Text Matching Behavior
CRITICAL: When matching Markdown nodes by text content (e.g., anchor patterns in merge recipes or PartialTemplateMerger), the .text method returns plain text without markdown formatting.
Example:
- Markdown source: ### The `*-merge` Gem Family
- .text returns: "The *-merge Gem Family\n"
Stripped formatting includes: bold, italic, code spans, links, images.
Pattern examples:
# WRONG - backticks won't be found
anchor: { type: :heading, text: /`\*-merge` Gem Family/ }
# CORRECT - match plain text
anchor: { type: :heading, text: /\*-merge.*Gem Family/ }
In YAML recipes (double escaping needed):
anchor:
  type: heading
  text: "/^The \\*-merge Gem Family/"
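The matching behavior and the YAML escaping can be checked in plain Ruby. The sketch below uses the .text value quoted in the example above; the step that turns the YAML string into a Regexp (stripping the surrounding slashes and calling Regexp.new) is an assumption made for illustration, since this guide does not show how the recipe runner performs that conversion.

```ruby
require "yaml"

# The plain text that .text is documented to return for the heading example.
plain_text = "The *-merge Gem Family\n"

wrong   = /`\*-merge` Gem Family/ # expects backticks that .text has stripped
correct = /\*-merge.*Gem Family/  # matches the plain text

wrong_matches   = wrong.match?(plain_text)
correct_matches = correct.match?(plain_text)

# In a double-quoted YAML string, "\\*" decodes to a backslash followed by "*".
recipe = YAML.safe_load(<<~'YAML')
  anchor:
    type: heading
    text: "/^The \\*-merge Gem Family/"
YAML
pattern_string = recipe.dig("anchor", "text")

# Assumption: the pattern is slash-delimited and compiled with Regexp.new.
from_yaml = Regexp.new(pattern_string.delete_prefix("/").delete_suffix("/"))
yaml_matches = from_yaml.match?(plain_text)
```

After YAML decoding, the string holds a single backslash before the asterisk, which is what the regular expression needs in order to treat it as a literal character.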
Common Pitfalls
- NEVER add backward compatibility: No shims, aliases, or deprecation layers.
- NEVER use require inside spec files: Use dependency tags instead.
- NEVER pipe test commands through head/tail: Run tests without output truncation.
- Do NOT load vendor gems: They are not part of this project; they do not exist in CI.
- Use tmp/ for temporary files: Never use /tmp or other system directories.
- Do NOT chain cd with &&: Run cd as a separate command so direnv loads ENV.