metric-engine — Agent & Contributor Guide
Navigation aid for src/metric-engine. Keep it short and point to code. Paths
are relative to the repo root.
Repo-wide rules that apply here: .agents/architecture-invariants.md.
What this crate does
The Metric Engine is optimized for Prometheus-style workloads with a huge number
of small tables. Many logical regions (one per metric table) share a single
physical pair of Mito2 regions: a data region and a metadata region. Rows
are multiplexed with metric-engine internal identity: dense primary-key mode
injects __table_id and __tsid, while the default sparse mode encodes them
into __primary_key. Reads still add a logical-table filter before forwarding
to the physical data region. It implements RegionEngine and delegates all real
storage to mito2.
The architecture is documented at the top of src/metric-engine/src/lib.rs.
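As a rough illustration of the two primary-key modes described above, the sketch below rewrites one logical row into a physical row. Everything here is hypothetical: LogicalRow, PhysicalRow, Encoding, and to_physical are illustrative names, and the sparse byte layout is a toy stand-in for the real codec in mito-codec.

```rust
/// Hypothetical logical row as a user would write it to one metric table.
#[derive(Debug, Clone, PartialEq)]
pub struct LogicalRow {
    pub tags: Vec<(String, String)>,
    pub timestamp_ms: i64,
    pub value: f64,
}

/// Illustrative stand-in for the two primary-key modes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Encoding {
    Dense,
    Sparse,
}

/// Hypothetical shape of the row stored in the shared physical data region.
#[derive(Debug, Clone, PartialEq)]
pub enum PhysicalRow {
    /// Dense mode: __table_id and __tsid become extra columns.
    Dense {
        table_id: u32,
        tsid: u64,
        tags: Vec<(String, String)>,
        timestamp_ms: i64,
        value: f64,
    },
    /// Sparse mode: identity and tags are folded into one __primary_key blob.
    Sparse {
        primary_key: Vec<u8>,
        timestamp_ms: i64,
        value: f64,
    },
}

/// Attach the internal identity to a logical row.
pub fn to_physical(row: LogicalRow, table_id: u32, tsid: u64, encoding: Encoding) -> PhysicalRow {
    match encoding {
        Encoding::Dense => PhysicalRow::Dense {
            table_id,
            tsid,
            tags: row.tags,
            timestamp_ms: row.timestamp_ms,
            value: row.value,
        },
        Encoding::Sparse => {
            // Toy layout (an assumption, not the real codec):
            // table_id (4 bytes BE) | tsid (8 bytes BE) | length-prefixed tag names and values.
            let mut key = Vec::new();
            key.extend_from_slice(&table_id.to_be_bytes());
            key.extend_from_slice(&tsid.to_be_bytes());
            for (name, val) in &row.tags {
                for part in [name, val] {
                    key.extend_from_slice(&(part.len() as u32).to_be_bytes());
                    key.extend_from_slice(part.as_bytes());
                }
            }
            PhysicalRow::Sparse {
                primary_key: key,
                timestamp_ms: row.timestamp_ms,
                value: row.value,
            }
        }
    }
}
```

Either way the logical table id travels with the row, which is what lets many logical regions share one data region.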
Module map
| Module | Path | Purpose |
|---|---|---|
| engine | src/metric-engine/src/engine.rs, src/metric-engine/src/engine/ | MetricEngine (RegionEngine impl) and per-op handlers (create.rs, put.rs, read.rs, alter.rs, drop.rs, ...) |
| metadata_region | src/metric-engine/src/metadata_region.rs | K-V over a Mito2 region storing logical table/column metadata, with an LRU cache |
| data_region | src/metric-engine/src/data_region.rs | Wraps the Mito2 data region; forwards writes and manages physical columns |
| row_modifier | src/metric-engine/src/row_modifier.rs | Rewrites incoming rows for dense or sparse primary-key encoding |
| batch_modifier | src/metric-engine/src/batch_modifier.rs | RecordBatch-level TSID computation and sparse primary-key encoding |
| state | src/metric-engine/src/engine/state.rs | In-memory cache of physical columns and logical column metadata |
| repeated_task | src/metric-engine/src/repeated_task.rs | Periodic metadata-region flush task |
| utils | src/metric-engine/src/utils.rs | RegionId conversions (data vs metadata group), manifest encoding |
| config | src/metric-engine/src/config.rs | EngineConfig (metadata flush interval, sparse PK) |
| test_util | src/metric-engine/src/test_util.rs | TestEnv building the Mito2 + Metric stack |
Write path
MetricEngine::handle_request(Put) (engine/put.rs) rejects direct writes to a
physical region, resolves the physical region for the logical id, loads logical
columns from metadata_region, then row_modifier rewrites the rows according
to the data region's primary-key encoding and forwards the request to the Mito2
data region.
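The routing part of the write path can be sketched as below. This is a minimal model under stated assumptions, not the code in engine/put.rs: EngineState, PutError, and route_put are hypothetical names, region ids are plain u64 values, and the column check is an assumed use of the loaded logical columns.

```rust
use std::collections::HashMap;

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum PutError {
    PhysicalRegionWrite(u64),
    LogicalRegionNotFound(u64),
    UnknownColumn(String),
}

/// Hypothetical in-memory view of the engine state.
pub struct EngineState {
    /// Logical region id -> physical data region id.
    pub logical_to_physical: HashMap<u64, u64>,
    /// Logical region id -> column names known for that logical table.
    pub logical_columns: HashMap<u64, Vec<String>>,
}

impl EngineState {
    fn is_physical(&self, region_id: u64) -> bool {
        self.logical_to_physical.values().any(|p| *p == region_id)
    }
}

/// Decide where a put goes; returns the physical region id to forward to.
pub fn route_put(state: &EngineState, region_id: u64, columns: &[&str]) -> Result<u64, PutError> {
    // 1. Direct writes to a physical region are rejected.
    if state.is_physical(region_id) {
        return Err(PutError::PhysicalRegionWrite(region_id));
    }
    // 2. Resolve the physical region for the logical id.
    let physical = *state
        .logical_to_physical
        .get(&region_id)
        .ok_or(PutError::LogicalRegionNotFound(region_id))?;
    // 3. Load the logical columns (assumption: they are used to validate the request).
    let known = state
        .logical_columns
        .get(&region_id)
        .ok_or(PutError::LogicalRegionNotFound(region_id))?;
    for c in columns {
        if !known.iter().any(|k| k.as_str() == *c) {
            return Err(PutError::UnknownColumn((*c).to_string()));
        }
    }
    // 4. Row rewriting and forwarding to the Mito2 data region would follow here.
    Ok(physical)
}
```

The order matters in this model: the physical-region check comes first so that a physical id never reaches the lookup that expects a logical id.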
Read path
MetricEngine::handle_query (engine/read.rs): a query against a logical region
is rewritten to add a __table_id == <logical_id> filter and forwarded to the
Mito2 data region. Queries against a physical region pass straight through.
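The filter rewrite on the read path can be pictured as follows. The types are hypothetical simplifications (the real request carries DataFusion expressions, not this Filter enum); only the __table_id column name comes from the text above.

```rust
/// Hypothetical, simplified predicate type.
#[derive(Debug, Clone, PartialEq)]
pub enum Filter {
    /// `column == value` on an integer column.
    EqU32 { column: String, value: u32 },
    /// Any other predicate, kept opaque in this sketch.
    Other(String),
}

/// Hypothetical description of what the query is aimed at.
#[derive(Debug, Clone, PartialEq)]
pub enum Target {
    Logical { table_id: u32 },
    Physical,
}

/// Return the filters that would be sent to the physical data region.
pub fn rewrite_filters(target: &Target, mut filters: Vec<Filter>) -> Vec<Filter> {
    if let Target::Logical { table_id } = target {
        // Restrict the scan to rows that belong to this logical table.
        filters.push(Filter::EqU32 {
            column: "__table_id".to_string(),
            value: *table_id,
        });
    }
    // Physical targets pass through unchanged.
    filters
}
```

Appending the predicate rather than replacing the list keeps the caller's own filters in force alongside the logical-table restriction.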
Public surface
- Entry: MetricEngine in src/metric-engine/src/engine.rs, built via try_new(mito, config).
- Trait: impl RegionEngine for MetricEngine (name = "metric").
- Depends on mito2, store-api, and mito-codec (sparse primary key codec).
When you change X, also touch Y
- Reserved column ids / names (__tsid, __table_id, __primary_key): see store-api's metric engine consts; keep them in sync with engine.rs.
- Metadata K-V encoding (metadata_region.rs): changes the on-disk metadata layout.
- RegionId group mapping (utils.rs): data vs metadata region derivation.
- Physical column rules (engine/alter/): a physical region allows only one field column.
Testing
cargo nextest run -p metric-engine
TestEnv in test_util.rs gives you mito() and metric() handles.
Gotchas
- Physical vs logical region confusion: physical regions reject direct user writes; operate on logical region ids.
- TSID must be stable for the same tag set: it is a hash over sorted tag names and values, and may be stored in __tsid or encoded into __primary_key.
- Metadata is cached (LRU with a TTL); after an alter, stale reads are possible until invalidation/expiry.
- Always convert ids via utils::to_data_region_id / to_metadata_region_id.
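The TSID stability requirement can be illustrated with a small order-independent hash. This is only a sketch: the function tsid below is hypothetical, and std's DefaultHasher is a stand-in chosen to keep the example dependency-free; it is not the hash the engine uses, and its output is not guaranteed stable across Rust releases, so it would be unsuitable for persisted ids.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Compute an illustrative series id from a tag set.
/// Sorting first makes the result independent of the order in which tags arrive.
pub fn tsid(tags: &[(&str, &str)]) -> u64 {
    let mut sorted: Vec<(&str, &str)> = tags.to_vec();
    // Tuples sort by tag name first, then by value.
    sorted.sort_unstable();

    let mut hasher = DefaultHasher::new();
    for (name, value) in sorted {
        name.hash(&mut hasher);
        value.hash(&mut hasher);
    }
    hasher.finish()
}
```

For example, tsid(&[("host", "a"), ("dc", "x")]) and tsid(&[("dc", "x"), ("host", "a")]) return the same value in this sketch, while changing any tag value changes the input to the hash.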
Maintenance contract
Update this file when you change the logical/physical region model, the injected columns, the metadata encoding, or the public engine entry points.