mirror of
https://github.com/GreptimeTeam/greptimedb.git
synced 2026-07-04 04:50:37 +00:00
* docs(agents): add per-crate guides, architecture invariants, and generated-files list Add agent/contributor navigation docs modeled on the AGENTS.md convention: - Per-crate AGENTS.md for hot crates (mito2, metric-engine, flow, frontend, meta-srv): module map, read/write paths, change-coupling points, test commands, and gotchas. - .agents/architecture-invariants.md: repo-wide rules that clippy and the style guide do not cover (format compatibility, crate layering, async runtimes, error handling, experimental gating, the DataFusion fork). - .agents/generated-files.md: tool-generated artifacts that must not be hand-edited (sqlness .result, config.md, dashboards, build.rs output, proto). - Anchor the .gitignore CLAUDE.md/AGENTS.md rules to the repo root so per-crate AGENTS.md files are tracked while root-level personal config stays ignored. Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * chore: update crate AGENTS.md and fix config.md path Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * docs(agents): fix DataFusion patch layout and SQL query lifecycle order Address review feedback on #8346: - architecture-invariants: the DataFusion sub-crates pin an exact crates.io version in [workspace.dependencies] and are redirected to the fork rev in [patch.crates-io]; the two sections hold different forms, not the same rev. - frontend: the SQL query lifecycle runs the pre_parsing/post_parsing interceptors around parsing, before the per-statement permission check. Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
86 lines
4.3 KiB
Markdown
# metric-engine — Agent & Contributor Guide

Navigation aid for `src/metric-engine`. Keep it short and point to code. Paths
are relative to the repo root.

Repo-wide rules that apply here: [`.agents/architecture-invariants.md`](../../.agents/architecture-invariants.md).

## What this crate does

The Metric Engine is optimized for Prometheus-style workloads with a huge number
of small tables. Many **logical** regions (one per metric table) share a single
**physical** pair of Mito2 regions: a data region and a metadata region. Rows
are multiplexed with metric-engine internal identity: dense primary-key mode
injects `__table_id` and `__tsid`, while the default sparse mode encodes them
into `__primary_key`. Reads still add a logical-table filter before forwarding
to the physical data region. It implements `RegionEngine` and delegates all real
storage to `mito2`.

The architecture is documented at the top of `src/metric-engine/src/lib.rs`.

## Module map

| Module | Path | Purpose |
| --- | --- | --- |
| `engine` | `src/metric-engine/src/engine.rs`, `src/metric-engine/src/engine/` | `MetricEngine` (`RegionEngine` impl) and per-op handlers (`create.rs`, `put.rs`, `read.rs`, `alter.rs`, `drop.rs`, ...) |
| `metadata_region` | `src/metric-engine/src/metadata_region.rs` | K-V over a Mito2 region storing logical table/column metadata, with an LRU cache |
| `data_region` | `src/metric-engine/src/data_region.rs` | Wraps the Mito2 data region; forwards writes and manages physical columns |
| `row_modifier` | `src/metric-engine/src/row_modifier.rs` | Rewrites incoming rows for dense or sparse primary-key encoding |
| `batch_modifier` | `src/metric-engine/src/batch_modifier.rs` | RecordBatch-level TSID computation and sparse primary-key encoding |
| `state` | `src/metric-engine/src/engine/state.rs` | In-memory cache of physical columns and logical column metadata |
| `repeated_task` | `src/metric-engine/src/repeated_task.rs` | Periodic metadata-region flush task |
| `utils` | `src/metric-engine/src/utils.rs` | `RegionId` conversions (data vs metadata group), manifest encoding |
| `config` | `src/metric-engine/src/config.rs` | `EngineConfig` (metadata flush interval, sparse PK) |
| `test_util` | `src/metric-engine/src/test_util.rs` | `TestEnv` building the Mito2 + Metric stack |

## Write path

`MetricEngine::handle_request(Put)` (`engine/put.rs`) rejects direct writes to a
physical region, resolves the physical region for the logical id, loads logical
columns from `metadata_region`, then `row_modifier` rewrites the rows according
to the data region's primary-key encoding and forwards the request to the Mito2
data region.

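
The order of checks in the write path can be sketched with a toy model. This is a minimal sketch, not the crate's code: `ToyEngine`, `PutError`, `Row`, and `PhysicalRow` are hypothetical stand-ins, region ids are plain `u64`s, and only the dense-mode `__table_id` injection is modeled (the metadata lookup and TSID computation are left out).

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real region id type.
type RegionId = u64;

#[derive(Debug, PartialEq)]
enum PutError {
    PhysicalRegionWrite(RegionId),
    UnknownLogicalRegion(RegionId),
}

#[derive(Debug, Clone, PartialEq)]
struct Row {
    tags: Vec<(String, String)>,
    value: f64,
}

/// A row as stored in the shared data region, tagged with its logical table.
#[derive(Debug, PartialEq)]
struct PhysicalRow {
    table_id: u32,
    row: Row,
}

struct ToyEngine {
    /// Logical region id -> physical data region id.
    logical_to_physical: HashMap<RegionId, RegionId>,
    /// Physical data region id -> rows written so far.
    data_regions: HashMap<RegionId, Vec<PhysicalRow>>,
}

impl ToyEngine {
    fn put(&mut self, region_id: RegionId, table_id: u32, rows: Vec<Row>) -> Result<usize, PutError> {
        // 1. Direct writes to a physical region are rejected.
        if self.data_regions.contains_key(&region_id) {
            return Err(PutError::PhysicalRegionWrite(region_id));
        }
        // 2. Resolve the physical region backing this logical region.
        let physical = *self
            .logical_to_physical
            .get(&region_id)
            .ok_or(PutError::UnknownLogicalRegion(region_id))?;
        // 3. Rewrite the rows (the role `row_modifier` plays) and forward them.
        let written = rows.len();
        let target = self.data_regions.entry(physical).or_default();
        target.extend(rows.into_iter().map(|row| PhysicalRow { table_id, row }));
        Ok(written)
    }
}

fn main() {
    let mut engine = ToyEngine {
        logical_to_physical: HashMap::from([(10, 1)]),
        data_regions: HashMap::from([(1, Vec::new())]),
    };
    let row = Row { tags: vec![("host".to_string(), "a".to_string())], value: 1.0 };
    println!("{:?}", engine.put(10, 42, vec![row.clone()]));
    println!("{:?}", engine.put(1, 42, vec![row]));
}
```

The rejection happens before any lookup, which is why writing with a physical region id fails even when that region exists.
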
## Read path

`MetricEngine::handle_query` (`engine/read.rs`): a query against a logical region
is rewritten to add a `__table_id == <logical_id>` filter and forwarded to the
Mito2 data region. Queries against a physical region pass straight through.

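
The effect of the injected filter can be illustrated with an in-memory scan. This is a sketch under assumptions: `PhysicalRow`, `Target`, and `scan` are hypothetical names, and the real engine rewrites a scan request rather than filtering a slice.

```rust
/// Hypothetical row shape in the shared data region (dense mode).
#[derive(Debug, Clone, PartialEq)]
struct PhysicalRow {
    table_id: u32,
    tsid: u64,
    value: f64,
}

/// Which kind of region the query addresses.
enum Target {
    Logical { table_id: u32 },
    Physical,
}

fn scan(rows: &[PhysicalRow], target: &Target) -> Vec<PhysicalRow> {
    match target {
        // Logical region: only rows tagged with this table's id are visible.
        Target::Logical { table_id } => rows
            .iter()
            .filter(|r| r.table_id == *table_id)
            .cloned()
            .collect(),
        // Physical region: the query passes straight through.
        Target::Physical => rows.to_vec(),
    }
}

fn main() {
    let rows = vec![
        PhysicalRow { table_id: 1, tsid: 100, value: 0.5 },
        PhysicalRow { table_id: 2, tsid: 200, value: 1.5 },
        PhysicalRow { table_id: 1, tsid: 101, value: 2.5 },
    ];
    println!("{}", scan(&rows, &Target::Logical { table_id: 1 }).len());
    println!("{}", scan(&rows, &Target::Physical).len());
}
```
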
## Public surface

- Entry: `MetricEngine` in `src/metric-engine/src/engine.rs`, built via `try_new(mito, config)`.
- Trait: `impl RegionEngine for MetricEngine` (name = `"metric"`).
- Depends on `mito2`, `store-api`, and `mito-codec` (sparse primary key codec).

## When you change X, also touch Y

- **Reserved column ids / names** (`__tsid`, `__table_id`, `__primary_key`): see
  `store-api`'s metric engine consts; keep them in sync with `engine.rs`.
- **Metadata K-V encoding** (`metadata_region.rs`): changes the on-disk metadata layout.
- **RegionId group mapping** (`utils.rs`): data vs metadata region derivation.
- **Physical column rules** (`engine/alter/`): a physical region allows only one field column.

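
To make the data-versus-metadata derivation concrete, here is a toy model of a group-based id mapping. The bit layout (32-bit table id, then a region number whose top byte is the group) and the group values `0` and `1` are assumptions made for illustration; `ToyRegionId` and the two helpers are hypothetical re-implementations, so check `utils.rs` and `store-api` for the actual definitions.

```rust
// Assumed group values, for illustration only.
const DATA_GROUP: u8 = 0;
const METADATA_GROUP: u8 = 1;
// Assumed: the low 24 bits of the region number hold the sequence.
const SEQ_MASK: u32 = 0x00FF_FFFF;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToyRegionId(u64);

impl ToyRegionId {
    fn new(table_id: u32, group: u8, seq: u32) -> Self {
        let region_number = ((group as u32) << 24) | (seq & SEQ_MASK);
        ToyRegionId(((table_id as u64) << 32) | region_number as u64)
    }

    fn table_id(self) -> u32 {
        (self.0 >> 32) as u32
    }

    fn group(self) -> u8 {
        ((self.0 as u32) >> 24) as u8
    }

    fn seq(self) -> u32 {
        (self.0 as u32) & SEQ_MASK
    }
}

/// Same table and sequence, data group.
fn to_data_region_id(id: ToyRegionId) -> ToyRegionId {
    ToyRegionId::new(id.table_id(), DATA_GROUP, id.seq())
}

/// Same table and sequence, metadata group.
fn to_metadata_region_id(id: ToyRegionId) -> ToyRegionId {
    ToyRegionId::new(id.table_id(), METADATA_GROUP, id.seq())
}

fn main() {
    let data = ToyRegionId::new(1024, DATA_GROUP, 3);
    let meta = to_metadata_region_id(data);
    println!("data = {:#018x}, metadata = {:#018x}", data.0, meta.0);
    println!("round trip ok: {}", to_data_region_id(meta) == data);
}
```

Because both ids differ only in the group bits, converting in either direction is lossless, which is the property any change to the mapping has to preserve.
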
## Testing

```bash
cargo nextest run -p metric-engine
```

`TestEnv` in `test_util.rs` gives you `mito()` and `metric()` handles.

## Gotchas

- Physical vs logical region confusion: physical regions reject direct user
  writes; operate on logical region ids.
- TSID must be stable for the same tag set: it is a hash over the sorted tag
  names and values, and may be stored in `__tsid` or encoded into `__primary_key`.
- Metadata is cached (LRU with a TTL); after an alter, stale reads are possible
  until invalidation/expiry.
- Always convert ids via `utils::to_data_region_id` / `to_metadata_region_id`.

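
The TSID stability requirement comes down to sorting before hashing. The sketch below shows the idea only: `toy_tsid` is a hypothetical function, and `DefaultHasher` is a stand-in chosen because it ships with the standard library. It is not the hash the engine uses, and its output is not guaranteed to stay the same across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a tag set so that the input order of the tags does not matter.
fn toy_tsid(tags: &[(&str, &str)]) -> u64 {
    // Sort by tag name (then value) so every ordering hashes identically.
    let mut sorted: Vec<(&str, &str)> = tags.to_vec();
    sorted.sort();

    let mut hasher = DefaultHasher::new();
    for (name, value) in &sorted {
        name.hash(&mut hasher);
        value.hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    let a = toy_tsid(&[("host", "a"), ("dc", "x")]);
    let b = toy_tsid(&[("dc", "x"), ("host", "a")]);
    println!("same series regardless of tag order: {}", a == b);
}
```

Without the sort, two writers that emit the same tags in different orders would produce different ids for one series.
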
## Maintenance contract

Update this file when you change the logical/physical region model, the injected
columns, the metadata encoding, or the public engine entry points.