* fix(metric-engine): report query load under physical region id
Propagate the physical region ID through the scanner, record batch
stream, and query engine so that query-load metrics (CPU time, scanned
bytes) are attributed to the correct physical region rather than always
to the logical region.
- `src/store-api/src/region_engine.rs` — add `query_load_region_id` to
`ScannerProperties` and `set_query_load_region_id` to `RegionScanner`
trait
- `src/mito2/src/read/seq_scan.rs`,
`src/mito2/src/read/series_scan.rs`,
`src/mito2/src/read/unordered_scan.rs` — implement the new trait
method on each scanner
- `src/common/recordbatch/src/adapter.rs` — carry region id through
`RecordBatchStreamAdapter` into `RecordBatchMetrics`
- `src/table/src/table/scan.rs` — expose region id on `RegionScanExec`
- `src/query/src/datafusion.rs` — extract region id from the physical
plan and set it on the output stream
- `src/query/src/dist_plan/merge_scan.rs` — use metrics-contained region
id in query-load reporting, falling back to the logical region id
- `src/metric-engine/src/engine/read.rs` — set region id on the metric
engine scanner
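
The fallback described for `merge_scan.rs` can be sketched roughly as below. This is a minimal illustration only: `RegionId`, `ScanStreamMetrics`, and `load_report_region` are hypothetical stand-ins, not the actual types or functions in the repository.

```rust
/// Hypothetical stand-in for a region identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RegionId(pub u64);

/// Hypothetical subset of the metrics carried by a scan stream. The scanner
/// may or may not have tagged the stream with a physical region id.
#[derive(Debug, Default)]
pub struct ScanStreamMetrics {
    pub query_load_region_id: Option<RegionId>,
}

/// Picks the region that query load should be attributed to: the id carried
/// in the metrics when present, otherwise the logical region id.
pub fn load_report_region(metrics: &ScanStreamMetrics, logical: RegionId) -> RegionId {
    metrics.query_load_region_id.unwrap_or(logical)
}
```

With this shape, a stream tagged with `RegionId(42)` is reported under 42, while an untagged stream keeps the logical id passed by the caller.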
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix(query): satisfy clippy for query load region id
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix(query): ignore missing query load region ids
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
---------
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
Co-authored-by: Lei, HUANG <ratuthomm@gmail.com>
* docs(agents): add per-crate guides, architecture invariants, and generated-files list
Add agent/contributor navigation docs modeled on the AGENTS.md convention:
- Per-crate AGENTS.md for hot crates (mito2, metric-engine, flow, frontend,
meta-srv): module map, read/write paths, change-coupling points, test
commands, and gotchas.
- .agents/architecture-invariants.md: repo-wide rules that clippy and the
style guide do not cover (format compatibility, crate layering, async
runtimes, error handling, experimental gating, the DataFusion fork).
- .agents/generated-files.md: tool-generated artifacts that must not be
hand-edited (sqlness .result, config.md, dashboards, build.rs output, proto).
- Anchor the .gitignore CLAUDE.md/AGENTS.md rules to the repo root so per-crate
AGENTS.md files are tracked while root-level personal config stays ignored.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: update crate AGENTS.md and fix config.md path
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* docs(agents): fix DataFusion patch layout and SQL query lifecycle order
Address review feedback on #8346:
- architecture-invariants: the DataFusion sub-crates pin an exact crates.io
version in [workspace.dependencies] and are redirected to the fork rev in
[patch.crates-io]; the two sections hold different forms, not the same rev.
- frontend: the SQL query lifecycle runs the pre_parsing/post_parsing
interceptors around parsing, before the per-statement permission check.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* Initial plan
* fix: guard structured json alignment to fix Clippy CI failure
- Add `is_structured_json_field` function that only returns true for
fields with both JSON extension type AND Struct Arrow data type
- Replace all usages of `is_json_extension_type` / `has_json_extension_field`
with `is_structured_json_field` to prevent legacy JSONB binary columns
from entering structured JSON alignment paths
- Fix logic in `FlatProjectionMapper::new_with_read_columns` to guard
JSON type hint concretization for JSON2 columns only
- Fix `create_column` in show_create_table.rs to only emit JSON structure
settings for JSON2 columns
- Move `mod tests` to end of flat_projection.rs to fix clippy::items_after_test_module
- Add tests for legacy JSON behavior
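
The guard described above could look roughly like the following sketch. `DataType` and `Field` here are simplified, hypothetical models (the real code works on Arrow fields and extension metadata); only the two-part condition is taken from the commit message.

```rust
/// Simplified, hypothetical model of an Arrow data type.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    /// Legacy JSONB columns are stored as binary.
    Binary,
    /// Structured JSON (JSON2) columns are stored as a struct.
    Struct(Vec<Field>),
}

/// Simplified, hypothetical model of an Arrow field.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    /// Assumption: whether the field carries the JSON extension type.
    pub is_json_extension: bool,
}

/// Returns true only when the field has the JSON extension type AND a
/// struct data type, so legacy JSONB binary columns are excluded.
pub fn is_structured_json_field(field: &Field) -> bool {
    field.is_json_extension && matches!(field.data_type, DataType::Struct(_))
}
```

Checking the extension type alone would also match legacy JSONB columns, which is the case the commit guards against.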
* fix: guard narrow_read_columns_by_json_type_hint with is_structured_json_field check
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
* feat: expose region read load through Prometheus metrics and heartbeat
Introduce region-level query load tracking (CPU time and scanned bytes)
collected by `RegionScanExec`, exposed via Prometheus metrics and optionally
reported through heartbeat region stats.
- **Region metrics** (`src/mito2/src/metrics.rs`, `src/store-api/src/metrics.rs`): Add
`greptime_mito_region_query_cpu_time`, `greptime_mito_region_query_scanned_bytes`,
and `greptime_mito_region_written_bytes_since_open` gauge metrics.
- **MitoRegion** (`src/mito2/src/region.rs`, `src/mito2/src/region/opener.rs`,
`src/mito2/src/region_write_ctx.rs`): Replace `AtomicU64` `written_bytes` with
`IntGauge`; add `query_cpu_time`/`query_scanned_bytes` fields with lifecycle
management (init, reset, remove-on-drop).
- **RegionStatistic** (`src/store-api/src/region_engine.rs`,
`src/store-api/src/storage/requests.rs`): Add `query_cpu_time` and
`query_scanned_bytes` fields.
- **Metric-engine** (`src/metric-engine/src/utils.rs`): Aggregate query load from
metadata and data regions.
- **Heartbeat** (`src/datanode/src/heartbeat.rs`,
`src/common/meta/src/datanode.rs`): Relay region query load via heartbeat
`RegionStat`; add test.
- **Query engine** (`src/query/src/options.rs`,
`src/query/src/query_engine/state.rs`, `src/query/src/datafusion.rs`,
`src/query/src/dist_plan/merge_scan.rs`,
`src/query/src/dist_plan/analyzer.rs`,
`src/query/src/dummy_catalog.rs`): Add `enable_region_query_load_report` config;
wire `RegionScanExec` to accumulate CPU time and scanned bytes.
- **Table scan** (`src/table/src/table/scan.rs`,
`src/table/src/table/metrics.rs`): Wire table scan metrics.
- **Config** (`config/standalone.example.toml`, `config/datanode.example.toml`,
`config/frontend.example.toml`, `config/config.md`): Add example config and
documentation for `enable_region_query_load_report`.
- **Tests** (`src/mito2/src/engine/basic_test.rs`,
`src/mito2/src/engine/close_test.rs`,
`src/cmd/tests/load_config_test.rs`,
`src/flow/src/adapter.rs`): Add unit tests for region query load reporting
and metric cleanup on region close; set default config values.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: move region read load report config from query layer to mito engine
Move the `enable_region_query_load_report` setting from query-level config
(`QueryOptions`/`DistPlannerOptions`) into the mito2 storage engine config
(`MitoConfig`), and expose it through the `RegionScanner` trait instead
of `ScanRequest`/`PrepareRequest`.
- Mito config: `src/mito2/src/config.rs`, `src/mito2/src/engine.rs`
- Scan region plumbing: `src/mito2/src/read/scan_region.rs`
- RegionScanner trait: `src/store-api/src/region_engine.rs`
- Scanner impls: `src/mito2/src/read/seq_scan.rs`, `src/mito2/src/read/series_scan.rs`, `src/mito2/src/read/unordered_scan.rs`
- RegionScanExec: `src/table/src/table/scan.rs`
- Removed from query layer: `src/query/src/options.rs`, `src/query/src/dist_plan/analyzer.rs`, `src/query/src/query_engine/state.rs`, `src/query/src/datafusion.rs`, `src/query/src/dummy_catalog.rs`
- Removed from test/config: `src/query/src/dist_plan/analyzer/test.rs`, `src/flow/src/adapter.rs`, `src/cmd/tests/load_config_test.rs`, `src/store-api/src/storage/requests.rs`
- Config docs: `config/config.md`, `config/datanode.example.toml`, `config/frontend.example.toml`, `config/standalone.example.toml`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: move region query load report config from MitoConfig to LoggingOptions
Relocate the `enable_region_query_load_report` setting from
`MitoConfig` to `LoggingOptions` (as `enable_per_region_metrics`),
and thread it into `MitoEngineBuilder` instead of reading from
the engine config directly. This makes the region read-load
reporting a per-node logging/observability concern rather than
a per-engine storage setting.
- `config/config.md`
- `config/datanode.example.toml`
- `config/standalone.example.toml`
- `src/common/telemetry/src/logging.rs`
- `src/datanode/src/datanode.rs`
- `src/mito2/src/config.rs`
- `src/mito2/src/engine.rs`
- `src/mito2/src/region.rs`
Signed-off-by: Lei Huang <lei@huang.to>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: report region query load on stream drop instead of stream end
Move `report_region_query_load()` from `StreamWithMetricWrapper::poll_next()`
to `Drop::drop()` so that region query load is reported even when the
stream is dropped prematurely (not just when fully consumed).
Affected files:
- `src/table/src/table/scan.rs`
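
A minimal sketch of the report-on-drop idea, using a plain iterator in place of the async stream. `ReportOnDrop` and the `AtomicU64` sink are hypothetical; the real wrapper is `StreamWithMetricWrapper` and reports through `report_region_query_load()`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper that counts bytes while batches are pulled and
/// reports the total when it is dropped, not when the input is exhausted.
pub struct ReportOnDrop<I> {
    inner: I,
    scanned_bytes: u64,
    sink: Arc<AtomicU64>,
}

impl<I> ReportOnDrop<I> {
    pub fn new(inner: I, sink: Arc<AtomicU64>) -> Self {
        Self { inner, scanned_bytes: 0, sink }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Iterator for ReportOnDrop<I> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        let batch = self.inner.next()?;
        // Accumulate locally; nothing is reported yet.
        self.scanned_bytes += batch.len() as u64;
        Some(batch)
    }
}

impl<I> Drop for ReportOnDrop<I> {
    fn drop(&mut self) {
        // Runs for fully consumed and prematurely dropped streams alike.
        self.sink.fetch_add(self.scanned_bytes, Ordering::Relaxed);
    }
}
```

If a consumer pulls one batch and drops the wrapper, the bytes of that batch still reach the sink, which is the case the end-of-stream hook missed.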
Signed-off-by: Lei, Huang <huanglei@qiyi.com>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: make region query load reporting configurable
Introduce `enable_region_query_load_report` flag to optionally report
per-region `query_cpu_time` and `query_scanned_bytes` metrics instead
of always creating them. When disabled, the Prometheus gauges are not
created (`None`), avoiding metric churn for workloads that do not
need query-level load tracking.
- `src/common/meta/src/datanode.rs` — Placeholder fields for query load
- `src/mito2/src/region.rs` — Make query metrics `Option<IntGauge>`, conditional create/remove/reset
- `src/mito2/src/region/opener.rs` — Thread flag through `RegionOpener`
- `src/mito2/src/worker.rs` — Thread flag through `WorkerGroup`/`WorkerStarter`/`RegionWorkerLoop`
- `src/mito2/src/worker/handle_catchup.rs` — Pass flag on region open
- `src/mito2/src/worker/handle_create.rs` — Pass flag on region create
- `src/mito2/src/worker/handle_open.rs` — Pass flag on region open
- `src/mito2/src/engine.rs` — Pass flag from `MitoEngineBuilder`
- `src/mito2/src/test_util.rs` — Test helpers for both modes
- `src/mito2/src/engine/basic_test.rs` — Cover disabled and preserve cases
- `src/mito2/src/engine/close_test.rs` — Adapt to optional metrics
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor: remove elapsed_compute metric from scan stream
The elapsed_compute metric conflated poll-wait time with actual CPU
computation, making it misleading. Removed the metric and its
recording path from StreamMetrics and StreamWithMetricWrapper.
Added a test asserting that poll duration is not reported as
elapsed_compute.
- `src/table/src/table/metrics.rs` — removed elapsed_compute field,
builder, and record_elapsed_compute method
- `src/table/src/table/scan.rs` — removed record_elapsed_compute
call; added SlowRecordBatchStream test helper and
wrapper_poll_time_is_not_elapsed_compute test
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: disable region query load report for compaction scans
Compaction scans are internal operations initiated by the engine,
not user queries. Disable region query load reporting when the
scan input is marked as compaction to avoid misleading load metrics.
- `src/mito2/src/read/scan_region.rs` — set `enable_region_query_load_report`
to `false` when compaction is enabled; add unit test
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* test: add `enable_per_region_metrics` config to HTTP integration test
- Enable per-region metrics config in HTTP test setup
`tests-integration/tests/http.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor: remove region query load reporting tests and helpers
Remove the region query load reporting feature from the codebase,
including tests, test utilities, and helper infrastructure that were
part of this now-deprecated functionality.
Specifically:
- Remove region query load reporting tests from
`src/mito2/src/engine/basic_test.rs` and
`src/table/src/table/scan.rs`, and the region close metrics test
from `src/mito2/src/engine/close_test.rs`
- Remove region query load report test utilities and simplify engine
construction helpers in `src/mito2/src/test_util.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf: avoid disabled region query load timing
Summary:
- Avoid per-poll `Instant::now` and elapsed-time accumulation when `enable_region_query_load_report` is disabled.
- Keep region query-load CPU accounting active only when reporting is enabled.
Files:
- `src/table/src/table/scan.rs`
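
One way to skip the clock reads when reporting is off, as a hedged sketch. `PollTimer` is a hypothetical helper and not the structure used in `scan.rs`; it only shows the branch that avoids `Instant::now`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper that measures closures only when enabled.
pub struct PollTimer {
    enabled: bool,
    elapsed: Duration,
}

impl PollTimer {
    pub fn new(enabled: bool) -> Self {
        Self { enabled, elapsed: Duration::ZERO }
    }

    /// Runs `poll`, timing it only when reporting is enabled.
    pub fn time<T>(&mut self, poll: impl FnOnce() -> T) -> T {
        if !self.enabled {
            // Disabled: no clock read and no accumulation.
            return poll();
        }
        let start = Instant::now();
        let out = poll();
        self.elapsed += start.elapsed();
        out
    }

    /// Accumulated time, or `None` when reporting is disabled.
    pub fn elapsed(&self) -> Option<Duration> {
        self.enabled.then_some(self.elapsed)
    }
}
```

Returning `None` in the disabled case keeps callers from mistaking "not measured" for "zero time".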
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: move per-region query load reporting from storage to query engine
Move `enable_per_region_metrics` from datanode to frontend config and
migrate query load tracking (CPU time, scanned bytes) from mito2
storage engine to the query engine's distributed scan planner. The
storage-level metrics plumbing and `enable_region_query_load_report`
flag are removed from mito2, `ScanInput`, `ScanRegion`, and
`RegionScanner`. Query-level metrics are now collected in
`merge_scan.rs` via `scan_region_load`.
- `src/mito2/` -- Remove `query_cpu_time`, `query_scanned_bytes`
metrics, `enable_region_query_load_report` plumbing from engine,
region, opener, scanner types, workers
- `src/store-api/` -- Remove `query_cpu_time`, `query_scanned_bytes`
from `RegionStatistic`
- `src/metric-engine/` -- Remove query load fields from
`get_region_statistic`
- `src/query/` -- Add `enable_per_region_metrics` to `QueryOptions`;
wire through planner, optimizer, merge scan with `scan_region_load`
metrics
- `src/frontend/` -- Pass `enable_per_region_metrics` into
`QueryOptions`
- `src/common/meta/` -- Remove TODO for query load fields
- `config/` -- Move `enable_per_region_metrics` from datanode to
frontend and standalone example configs
- `src/cmd/tests/` -- Add `enable_per_region_metrics` to flownode
config test
- `src/flow/` -- Add `enable_per_region_metrics` default to flownode
options
- `src/table/` -- Remove unused query load fields from scan
- `src/datanode/` -- Remove
`with_enable_region_query_load_report` calls
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor: remove obsolete mito write load metric
Remove obsolete mito-side region written-bytes metric plumbing that is not needed by the frontend read-load reporting path.
Related files:
- `src/mito2/src/metrics.rs`
- `src/mito2/src/region.rs`
- `src/mito2/src/region/opener.rs`
- `src/mito2/src/region_write_ctx.rs`
- `src/mito2/src/engine/basic_test.rs`
- `src/mito2/src/worker.rs`
- `src/mito2/src/config.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: change region query load metrics from gauge to counter
Change `REGION_QUERY_CPU_TIME` and `REGION_QUERY_SCANNED_BYTES` from
`IntGaugeVec` to `IntCounterVec` since these values are monotonically
increasing and do not need gauge semantics. Update corresponding `add`
calls to `inc_by` in merge scan reporting.
Files:
- `src/store-api/src/metrics.rs` — metric type and label changes
- `src/query/src/dist_plan/merge_scan.rs` — caller adaptation
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor: pass ReadItem directly to report_region_query_load
Move `region_scan_load` call to the caller, so `report_region_query_load`
accepts the already-computed `ReadItem` instead of `RecordBatchMetrics`.
- `src/query/src/dist_plan/merge_scan.rs` — update signature, inline call,
remove stale test
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: ensure region query load is reported on MergeScanExec drop
Remove the `enable_per_region_metrics` parameter from `report_region_query_load`
so region load metrics are always emitted. Add a `Drop` impl for
`MergeScanExec` that reports sub-stage metrics when the executor is
dropped, covering edge cases where per-region metric emission was
missed. Add a unit test verifying CPU time and scanned bytes are
recorded on drop.
Affected file: `src/query/src/dist_plan/merge_scan.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: gate region query load reporting
Guard drop-time region query load reporting with the configured per-region metrics flag.
Related files:
- `src/query/src/dist_plan/merge_scan.rs`
Symbols:
- `MergeScanExec::drop`
- `enable_per_region_metrics`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: clean region query load metrics on drop
Remove per-region query load metric labels when a region is dropped so stale label series do not remain in the registry.
Related files:
- `src/mito2/src/region.rs`
Symbols:
- `MitoRegion::drop`
- `REGION_QUERY_CPU_TIME`
- `REGION_QUERY_SCANNED_BYTES`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: Lei Huang <lei@huang.to>
Signed-off-by: Lei, Huang <huanglei@qiyi.com>
* feat: track scan output bytes and use them for read costing
Track the actual byte count of each record batch produced by
`RegionScanExec` and use it in place of the aggregated per-plan
`memory_usage` as the `table_scan` cost input. This avoids double
counting bytes that flow through multiple operators.
A named constant `REGION_SCAN_EXEC_NAME` exposes the plan node name
so downstream metric parsers remain correct if the struct is renamed.
Affected files:
- `src/table/src/table/metrics.rs` -- add `output_bytes` counter
- `src/table/src/table/scan.rs` -- record bytes, export name constant
- `src/query/src/dist_plan/merge_scan.rs` -- consume scan bytes
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor: remove misleading `mem_used` gauge from scan metrics
The `mem_used` gauge was used with `add()` on every output batch, but a
RegionScan does not hold memory across polls — each batch is yielded
immediately to the upstream operator. The gauge semantics were
incorrect (cumulative `add()` on a gauge) and the value duplicated
`output_bytes` anyway.
Affected files:
- `src/table/src/table/metrics.rs` — drop `mem_used` field and methods
- `src/table/src/table/scan.rs` — remove `record_mem_usage` call
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: use stable plan names for scan byte metrics
Expose `plan_name` alongside rendered plan text so scan byte extraction no longer depends on EXPLAIN formatting. Keep old serialized metrics compatible by defaulting missing `plan_name`.
Affected files:
- `src/common/recordbatch/src/adapter.rs` -- add `plan_name` to `PlanMetrics` and populate it from `ExecutionPlan::name`
- `src/query/src/dist_plan/merge_scan.rs` -- match `REGION_SCAN_EXEC_NAME` through `plan_name`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: preserve merge scan CPU read cost
Record read cost whenever stream metrics are available, default scan bytes to zero, and aggregate scan output bytes across region scan nodes.
Files: `src/query/src/dist_plan/merge_scan.rs`
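
A rough sketch of the aggregation this series describes: match region scan nodes by `plan_name`, sum their output bytes, and default to zero when none are present. `PlanNodeMetrics` is a hypothetical flattened view of the plan metrics, and the string value given to `REGION_SCAN_EXEC_NAME` is an assumption.

```rust
/// Assumed value; the real constant is exported from `scan.rs`.
pub const REGION_SCAN_EXEC_NAME: &str = "RegionScanExec";

/// Hypothetical flattened view of one plan node's metrics.
pub struct PlanNodeMetrics {
    pub plan_name: String,
    pub output_bytes: u64,
}

/// Sums output bytes over region scan nodes only, so bytes flowing through
/// downstream operators are not counted again. Returns 0 with no scan nodes.
pub fn scan_output_bytes(nodes: &[PlanNodeMetrics]) -> u64 {
    nodes
        .iter()
        .filter(|node| node.plan_name == REGION_SCAN_EXEC_NAME)
        .map(|node| node.output_bytes)
        .sum()
}
```

Matching on the plan name rather than the rendered plan text means a change in EXPLAIN formatting does not affect the byte count.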
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: correct doc comment in `StreamMetrics`
Fix doc comment on `StreamMetrics::new()` to reference the correct
struct name (`StreamMetrics`) instead of the old `MemoryUsageMetrics`.
- `src/table/src/table/metrics.rs` — fix struct name in doc comment
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: add password hash generation command
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* fix: reject empty password in hash-password command
A blank plaintext password is rejected by the user provider before
verifier comparison, so a verifier generated from an empty password is
unusable. Fail fast on EOF or an empty line from --password-stdin (and on
an empty --password) instead of printing a dead verifier.
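
The fail-fast check for `--password-stdin` might be sketched as follows. `read_password` and its error message are hypothetical; only the rule (reject EOF and an empty line) comes from the commit message.

```rust
use std::io::{self, BufRead};

/// Reads one line as the password, rejecting EOF and empty input.
pub fn read_password<R: BufRead>(mut input: R) -> io::Result<String> {
    let mut line = String::new();
    // `read_line` returns Ok(0) when the input is already at EOF.
    let bytes_read = input.read_line(&mut line)?;
    let password = line.trim_end_matches(&['\r', '\n'][..]);
    if bytes_read == 0 || password.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "password must not be empty",
        ));
    }
    Ok(password.to_string())
}
```

Only the line terminator is stripped, so a password with leading or trailing spaces is kept as typed.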
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: add per-partition timings to merge scan partition metrics
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: stop global timers for partitions with no region
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>