Files
greptimedb/src/metric-engine/AGENTS.md
dennis zhuang 3b8f55e490 docs(agents): add per-crate guides, architecture invariants, and generated-files list (#8346)
* docs(agents): add per-crate guides, architecture invariants, and generated-files list

Add agent/contributor navigation docs modeled on the AGENTS.md convention:

- Per-crate AGENTS.md for hot crates (mito2, metric-engine, flow, frontend,
  meta-srv): module map, read/write paths, change-coupling points, test
  commands, and gotchas.
- .agents/architecture-invariants.md: repo-wide rules that clippy and the
  style guide do not cover (format compatibility, crate layering, async
  runtimes, error handling, experimental gating, the DataFusion fork).
- .agents/generated-files.md: tool-generated artifacts that must not be
  hand-edited (sqlness .result, config.md, dashboards, build.rs output, proto).
- Anchor the .gitignore CLAUDE.md/AGENTS.md rules to the repo root so per-crate
  AGENTS.md files are tracked while root-level personal config stays ignored.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: update crate AGENTS.md and fix config.md path

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* docs(agents): fix DataFusion patch layout and SQL query lifecycle order

Address review feedback on #8346:

- architecture-invariants: the DataFusion sub-crates pin an exact crates.io
  version in [workspace.dependencies] and are redirected to the fork rev in
  [patch.crates-io]; the two sections hold different forms, not the same rev.
- frontend: the SQL query lifecycle runs the pre_parsing/post_parsing
  interceptors around parsing, before the per-statement permission check.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-06-24 09:14:09 +00:00

4.3 KiB

metric-engine — Agent & Contributor Guide

Navigation aid for src/metric-engine. Keep it short and point to code. Paths are relative to the repo root.

Repo-wide rules that apply here: .agents/architecture-invariants.md.

What this crate does

The Metric Engine is optimized for Prometheus-style workloads with a huge number of small tables. Many logical regions (one per metric table) share a single physical pair of Mito2 regions: a data region and a metadata region. Rows are multiplexed with metric-engine internal identity: dense primary-key mode injects __table_id and __tsid, while the default sparse mode encodes them into __primary_key. Reads still add a logical-table filter before forwarding to the physical data region. It implements RegionEngine and delegates all real storage to mito2.

The architecture is documented at the top of src/metric-engine/src/lib.rs.

Module map

Module Path Purpose
engine src/metric-engine/src/engine.rs, src/metric-engine/src/engine/ MetricEngine (RegionEngine impl) and per-op handlers (create.rs, put.rs, read.rs, alter.rs, drop.rs, ...)
metadata_region src/metric-engine/src/metadata_region.rs K-V over a Mito2 region storing logical table/column metadata, with an LRU cache
data_region src/metric-engine/src/data_region.rs Wraps the Mito2 data region; forwards writes and manages physical columns
row_modifier src/metric-engine/src/row_modifier.rs Rewrites incoming rows for dense or sparse primary-key encoding
batch_modifier src/metric-engine/src/batch_modifier.rs RecordBatch-level TSID computation and sparse primary-key encoding
state src/metric-engine/src/engine/state.rs In-memory cache of physical columns and logical column metadata
repeated_task src/metric-engine/src/repeated_task.rs Periodic metadata-region flush task
utils src/metric-engine/src/utils.rs RegionId conversions (data vs metadata group), manifest encoding
config src/metric-engine/src/config.rs EngineConfig (metadata flush interval, sparse PK)
test_util src/metric-engine/src/test_util.rs TestEnv building the Mito2 + Metric stack

Write path

MetricEngine::handle_request(Put) (engine/put.rs) rejects direct writes to a physical region, resolves the physical region for the logical id, loads logical columns from metadata_region, then row_modifier rewrites the rows according to the data region's primary-key encoding and forwards the request to the Mito2 data region.

Read path

MetricEngine::handle_query (engine/read.rs): a query against a logical region is rewritten to add a __table_id == <logical_id> filter and forwarded to the Mito2 data region. Queries against a physical region pass straight through.

Public surface

  • Entry: MetricEngine in src/metric-engine/src/engine.rs, built via try_new(mito, config).
  • Trait: impl RegionEngine for MetricEngine (name = "metric").
  • Depends on mito2, store-api, and mito-codec (sparse primary key codec).

When you change X, also touch Y

  • Reserved column ids / names (__tsid, __table_id, __primary_key): see store-api's metric engine consts; keep them in sync with engine.rs.
  • Metadata K-V encoding (metadata_region.rs): changes the on-disk metadata layout.
  • RegionId group mapping (utils.rs): data vs metadata region derivation.
  • Physical column rules (engine/alter/): a physical region allows only one field column.

Testing

cargo nextest run -p metric-engine

TestEnv in test_util.rs gives you mito() and metric() handles.

Gotchas

  • Physical vs logical region confusion: physical regions reject direct user writes; operate on logical region ids.
  • TSID must be stable for the same tag set — it is a hash over sorted tag names
    • values and may be stored in __tsid or encoded into __primary_key.
  • Metadata is cached (LRU with a TTL); after an alter, stale reads are possible until invalidation/expiry.
  • Always convert ids via utils::to_data_region_id / to_metadata_region_id.

Maintenance contract

Update this file when you change the logical/physical region model, the injected columns, the metadata encoding, or the public engine entry points.