Commit Graph

5608 Commits

Author SHA1 Message Date
shuiyisong
669010abca chore: client_ip error logs skip internal API (#8362)
* chore: client_ip error logs skip internal API

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: fmt

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use const

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use const

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-25 10:57:13 +00:00
discord9
b72465e679 feat: add remote dynamic filter metrics (#8309)
* feat: add remote dyn filter metrics

Signed-off-by: discord9 <discord9@163.com>

* fix: unify remote dyn filter payload budget

Signed-off-by: discord9 <discord9@163.com>

* style: import remote dyn filter metrics

Signed-off-by: discord9 <discord9@163.com>

* fix: separate remote dyn filter unregister metrics

Signed-off-by: discord9 <discord9@163.com>

* fix: rename remote dyn filter update outcome metric

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-25 10:04:33 +00:00
Ritwij Aryan Parmar
a0fcfd2110 fix: stream tables for prometheus label discovery (#8341)
Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
2026-06-25 08:41:12 +00:00
Lei, HUANG
9b1672316c fix(metric-engine): report query load under physical region id (#8355)
* fix(metric-engine): report query load under physical region id

Propagate the physical region ID through the scanner, record batch
stream, and query engine so that query-load metrics (CPU time, scanned
bytes) are attributed to the correct physical region rather than always
to the logical region.

- `src/store-api/src/region_engine.rs` — add `query_load_region_id` to
  `ScannerProperties` and `set_query_load_region_id` to `RegionScanner`
  trait
- `src/mito2/src/read/seq_scan.rs`,
  `src/mito2/src/read/series_scan.rs`,
  `src/mito2/src/read/unordered_scan.rs` — implement the new trait
  method on each scanner
- `src/common/recordbatch/src/adapter.rs` — carry region id through
  `RecordBatchStreamAdapter` into `RecordBatchMetrics`
- `src/table/src/table/scan.rs` — expose region id on `RegionScanExec`
- `src/query/src/datafusion.rs` — extract region id from the physical
  plan and set it on the output stream
- `src/query/src/dist_plan/merge_scan.rs` — use metrics-contained region
  id in query-load reporting, falling back to the logical region id
- `src/metric-engine/src/engine/read.rs` — set region id on the metric
  engine scanner

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

* fix(query): satisfy clippy for query load region id

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

* fix(query): ignore missing query load region ids

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

---------

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
Co-authored-by: Lei, HUANG <ratuthomm@gmail.com>
v1.2.0-8513d8d6b-20260625-1782368704
2026-06-25 06:49:28 +00:00
Yingwen
8513d8d6b6 feat: add flow batching metrics to grafana dashboard (#8353)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-24 09:35:25 +00:00
jeremyhi
619460f742 test(cli): add minio export-import v2 e2e (#8314)
* test(cli): add minio export-import v2 e2e

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): reuse e2e connection helper in minio test

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): avoid stderr in minio e2e cleanup

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): report minio snapshot cleanup failures

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* ci: run import-v2 minio e2e

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): cleanup minio snapshot on failure

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-24 09:25:10 +00:00
dennis zhuang
3b8f55e490 docs(agents): add per-crate guides, architecture invariants, and generated-files list (#8346)
* docs(agents): add per-crate guides, architecture invariants, and generated-files list

Add agent/contributor navigation docs modeled on the AGENTS.md convention:

- Per-crate AGENTS.md for hot crates (mito2, metric-engine, flow, frontend,
  meta-srv): module map, read/write paths, change-coupling points, test
  commands, and gotchas.
- .agents/architecture-invariants.md: repo-wide rules that clippy and the
  style guide do not cover (format compatibility, crate layering, async
  runtimes, error handling, experimental gating, the DataFusion fork).
- .agents/generated-files.md: tool-generated artifacts that must not be
  hand-edited (sqlness .result, config.md, dashboards, build.rs output, proto).
- Anchor the .gitignore CLAUDE.md/AGENTS.md rules to the repo root so per-crate
  AGENTS.md files are tracked while root-level personal config stays ignored.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: update crate AGENTS.md and fix config.md path

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* docs(agents): fix DataFusion patch layout and SQL query lifecycle order

Address review feedback on #8346:

- architecture-invariants: the DataFusion sub-crates pin an exact crates.io
  version in [workspace.dependencies] and are redirected to the fork rev in
  [patch.crates-io]; the two sections hold different forms, not the same rev.
- frontend: the SQL query lifecycle runs the pre_parsing/post_parsing
  interceptors around parsing, before the per-statement permission check.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-06-24 09:14:09 +00:00
discord9
db5a30c49b perf(mito): prune files by manifest time range (#8352)
* perf(mito): prune files by manifest time range

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): address file pruning review

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): remove verbose file pruning log

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): expose file pruning metric

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): shorten file pruning metric

Signed-off-by: discord9 <discord9@163.com>

* test(mito): cover file pruning edge cases

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-24 07:13:23 +00:00
discord9
758a166325 fix: include index files in GC listing (#8327)
* fix: include index files in GC listing

Signed-off-by: discord9 <discord9@163.com>

* chore: filter GC index listing to puffins

Signed-off-by: discord9 <discord9@163.com>

* chore: simplify GC index listing stream

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-24 04:40:10 +00:00
jeremyhi
019a913f7c fix: preserve bulk write grpc error details (#8349)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-23 16:42:41 +00:00
discord9
9cf071808a fix(query): run optimizer rules before MergeScan (#8339)
* fix(query): push down join filters before MergeScan

Signed-off-by: discord9 <discord9@163.com>

* fix(query): run optimizer before MergeScan pushdown

Signed-off-by: discord9 <discord9@163.com>

* fix(query): narrow pre-MergeScan filter pushdown

Signed-off-by: discord9 <discord9@163.com>

* fix(query): refine pre-MergeScan optimizer prepass

Signed-off-by: discord9 <discord9@163.com>

* fix(query): satisfy predicate extractor clippy

Signed-off-by: discord9 <discord9@163.com>

* test(query): cover pre-MergeScan optimizer edges

Signed-off-by: discord9 <discord9@163.com>

* test(query): cover set comparison prepass

Signed-off-by: discord9 <discord9@163.com>

* fix(query): guard remote scan filter pushdown

Signed-off-by: discord9 <discord9@163.com>

* fix(query): preserve subquery planning errors

Signed-off-by: discord9 <discord9@163.com>

* fix(query): preserve usable scan predicates

Signed-off-by: discord9 <discord9@163.com>

* fix(query): simplify scan predicate extraction

Signed-off-by: discord9 <discord9@163.com>

* fix(query): keep scan filter extraction scoped

Signed-off-by: discord9 <discord9@163.com>

* docs(query): explain pre-MergeScan optimizer

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-23 12:15:40 +00:00
jeremyhi
f74dfacfad test(cli): harden incomplete import-v2 snapshot e2e (#8313)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-23 06:55:21 +00:00
LFC
bb15690aa6 ci: tune jsonbench runner (#8345)
Signed-off-by: luofucong <luofc@foxmail.com>
2026-06-23 05:52:21 +00:00
jeremyhi
47581acbd3 test(cli): harden import-v2 schema filter e2e (#8312)
* test(cli): harden import-v2 schema filter e2e

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): assert filtered export includes both schemas

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): assert import-v2 schema filter result

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-22 12:23:28 +00:00
Yingwen
3ed14c9e7e chore: bump version to 1.2.0 (#8344)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-22 07:54:46 +00:00
Lei, HUANG
b571c93d18 fix: redact Kafka SASL password in debug output (#8337)
## Summary
- Mask `KafkaClientSaslConfig` password fields in debug output while keeping usernames visible.
- Cover metasrv WAL debug output with a regression test.

## Files
- `src/common/wal/src/config/kafka/common.rs`
- `src/common/wal/src/config.rs`
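
The masking described in the summary can be sketched with a hand-written `Debug` impl. This is a minimal illustration, not the project's code: `SaslCredentials` and the `"******"` placeholder are hypothetical stand-ins for `KafkaClientSaslConfig`, whose real fields are not shown here.

```rust
use std::fmt;

/// Hypothetical stand-in for a SASL config (not the real `KafkaClientSaslConfig`).
pub struct SaslCredentials {
    pub username: String,
    pub password: String,
}

impl fmt::Debug for SaslCredentials {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("SaslCredentials")
            // The username stays visible to keep logs useful.
            .field("username", &self.username)
            // The password is replaced by a fixed placeholder (assumed mask text).
            .field("password", &"******")
            .finish()
    }
}
```

Formatting a value with `{:?}` then prints the username and the placeholder, never the secret, which is the property a regression test would assert.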

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-06-22 03:43:28 +00:00
Ning Sun
11543b016f refactor: try best to make sure hook is called on manifest update (#8329)
* refactor: try best to make sure hook is called on manifest update

* chore: minimize comments

* chore: make sure hook are merged for same region
2026-06-22 03:34:28 +00:00
Yingwen
b365e0dd95 chore: share agent release skills (#8336)
* chore: merge release-note skill changes

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: share agent skills from repo

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: simplify picked PR ordering

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
v1.1.0-nightly-20260622
2026-06-19 05:15:16 +00:00
dependabot[bot]
0575215b70 chore(deps): bump tar from 0.4.45 to 0.4.46 (#8331)
Bumps [tar](https://github.com/composefs/tar-rs) from 0.4.45 to 0.4.46.
- [Release notes](https://github.com/composefs/tar-rs/releases)
- [Commits](https://github.com/composefs/tar-rs/compare/0.4.45...0.4.46)

---
updated-dependencies:
- dependency-name: tar
  dependency-version: 0.4.46
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 12:09:08 +00:00
dependabot[bot]
c65d1c8a88 chore(deps): bump rustls-webpki from 0.103.10 to 0.103.13 (#8330)
Bumps [rustls-webpki](https://github.com/rustls/webpki) from 0.103.10 to 0.103.13.
- [Release notes](https://github.com/rustls/webpki/releases)
- [Commits](https://github.com/rustls/webpki/compare/v/0.103.10...v/0.103.13)

---
updated-dependencies:
- dependency-name: rustls-webpki
  dependency-version: 0.103.13
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 12:08:50 +00:00
Weny Xu
e5930c5367 refactor(meta): centralize backend retry classification (#8333)
* refactor(meta): centralize backend retry hint classification

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor(meta): refine backend retry classification

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor(meta): address mysql retry review

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor(meta): include mysql lock timeout in txn retry

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-18 11:07:34 +00:00
jeremyhi
a3e01e0018 test(cli): add import-v2 resume e2e coverage (#8311)
* test(cli): harden export import v2 e2e roundtrip

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): add import-v2 resume e2e coverage

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): make import-v2 resume replay check non-idempotent

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): share export import e2e setup helpers

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* test(cli): assert import-v2 resume row identities

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-18 09:02:24 +00:00
Weny Xu
fbdc35a109 ci: add remote WAL logical pruning fuzz (#8307)
* feat: configure remote wal checkpoint intervals

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: add logical pruning fuzz target

Signed-off-by: WenyXu <wenymedia@gmail.com>

* ci: add remote WAL logical pruning fuzz

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-18 08:24:47 +00:00
jeremyhi
3e36778b0e test(cli): harden export/import-v2 e2e roundtrip (#8310)
test(cli): harden export import v2 e2e roundtrip

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-18 07:41:54 +00:00
LFC
23a9e6243e ci: compare jsonbench insert results (#8326)
* collect JSONBench load data result file

Signed-off-by: luofucong <luofc@foxmail.com>

* report JSONBench load data result

Signed-off-by: luofucong <luofc@foxmail.com>

* fix jsonbench insert median label

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-06-18 06:31:31 +00:00
Copilot
58c85dd1ce fix: guard structured JSON alignment paths against legacy JSONB columns (#8323)
* Initial plan

* fix: guard structured json alignment to fix Clippy CI failure

- Add `is_structured_json_field` function that only returns true for
  fields with both JSON extension type AND Struct Arrow data type
- Replace all usages of `is_json_extension_type` / `has_json_extension_field`
  with `is_structured_json_field` to prevent legacy JSONB binary columns
  from entering structured JSON alignment paths
- Fix logic in `FlatProjectionMapper::new_with_read_columns` to guard
  JSON type hint concretization for JSON2 columns only
- Fix `create_column` in show_create_table.rs to only emit JSON structure
  settings for JSON2 columns
- Move `mod tests` to end of flat_projection.rs to fix clippy::items_after_test_module
- Add tests for legacy JSON behavior
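
The guard in the first bullet can be sketched as a two-condition predicate. The `DataType` and `Field` types below are simplified, hypothetical stand-ins for the Arrow types, used only to show the "extension type AND struct data type" check.

```rust
/// Hypothetical, simplified stand-in for the Arrow data type enum.
pub enum DataType {
    Binary,
    Struct,
}

/// Hypothetical, simplified stand-in for an Arrow field.
pub struct Field {
    pub data_type: DataType,
    /// True when the field carries the JSON extension type marker.
    pub has_json_extension: bool,
}

/// Returns true only when both conditions hold, so a legacy JSONB column
/// (JSON extension marker on a binary column) is rejected.
pub fn is_structured_json_field(field: &Field) -> bool {
    field.has_json_extension && matches!(field.data_type, DataType::Struct)
}
```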

* fix: guard narrow_read_columns_by_json_type_hint with is_structured_json_field check

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-06-18 05:27:56 +00:00
Lei, HUANG
a3461caf9d feat: expose region read load metrics (#8316)
* feat: expose region read load through Prometheus metrics and heartbeat

Introduce region-level query load tracking (CPU time and scanned bytes)
collected by `RegionScanExec`, exposed via Prometheus metrics and optionally
reported through heartbeat region stats.

- **Region metrics** (`src/mito2/src/metrics.rs`, `src/store-api/src/metrics.rs`): Add
  `greptime_mito_region_query_cpu_time`, `greptime_mito_region_query_scanned_bytes`,
  and `greptime_mito_region_written_bytes_since_open` gauge metrics.
- **MitoRegion** (`src/mito2/src/region.rs`, `src/mito2/src/region/opener.rs`,
  `src/mito2/src/region_write_ctx.rs`): Replace `AtomicU64` `written_bytes` with
  `IntGauge`; add `query_cpu_time`/`query_scanned_bytes` fields with lifecycle
  management (init, reset, remove-on-drop).
- **RegionStatistic** (`src/store-api/src/region_engine.rs`,
  `src/store-api/src/storage/requests.rs`): Add `query_cpu_time` and
  `query_scanned_bytes` fields.
- **Metric-engine** (`src/metric-engine/src/utils.rs`): Aggregate query load from
  metadata and data regions.
- **Heartbeat** (`src/datanode/src/heartbeat.rs`,
  `src/common/meta/src/datanode.rs`): Relay region query load via heartbeat
  `RegionStat`; add test.
- **Query engine** (`src/query/src/options.rs`,
  `src/query/src/query_engine/state.rs`, `src/query/src/datafusion.rs`,
  `src/query/src/dist_plan/merge_scan.rs`,
  `src/query/src/dist_plan/analyzer.rs`,
  `src/query/src/dummy_catalog.rs`): Add `enable_region_query_load_report` config;
  wire `RegionScanExec` to accumulate CPU time and scanned bytes.
- **Table scan** (`src/table/src/table/scan.rs`,
  `src/table/src/table/metrics.rs`): Wire table scan metrics.
- **Config** (`config/standalone.example.toml`, `config/datanode.example.toml`,
  `config/frontend.example.toml`, `config/config.md`): Add example config and
  documentation for `enable_region_query_load_report`.
- **Tests** (`src/mito2/src/engine/basic_test.rs`,
  `src/mito2/src/engine/close_test.rs`,
  `src/cmd/tests/load_config_test.rs`,
  `src/flow/src/adapter.rs`): Add unit tests for region query load reporting
  and metric cleanup on region close; set default config values.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: move region read load report config from query layer to mito engine

Move the `enable_region_query_load_report` setting from query-level config
(`QueryOptions`/`DistPlannerOptions`) into the mito2 storage engine config
(`MitoConfig`), and expose it through the `RegionScanner` trait instead
of `ScanRequest`/`PrepareRequest`.

- Mito config: `src/mito2/src/config.rs`, `src/mito2/src/engine.rs`
- Scan region plumbing: `src/mito2/src/read/scan_region.rs`
- RegionScanner trait: `src/store-api/src/region_engine.rs`
- Scanner impls: `src/mito2/src/read/seq_scan.rs`, `src/mito2/src/read/series_scan.rs`, `src/mito2/src/read/unordered_scan.rs`
- RegionScanExec: `src/table/src/table/scan.rs`
- Removed from query layer: `src/query/src/options.rs`, `src/query/src/dist_plan/analyzer.rs`, `src/query/src/query_engine/state.rs`, `src/query/src/datafusion.rs`, `src/query/src/dummy_catalog.rs`
- Removed from test/config: `src/query/src/dist_plan/analyzer/test.rs`, `src/flow/src/adapter.rs`, `src/cmd/tests/load_config_test.rs`, `src/store-api/src/storage/requests.rs`
- Config docs: `config/config.md`, `config/datanode.example.toml`, `config/frontend.example.toml`, `config/standalone.example.toml`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: move region query load report config from MitoConfig to LoggingOptions

Relocate the `enable_region_query_load_report` setting from
`MitoConfig` to `LoggingOptions` (as `enable_per_region_metrics`),
and thread it into `MitoEngineBuilder` instead of reading from
the engine config directly. This makes the region read-load
reporting a per-node logging/observability concern rather than
a per-engine storage setting.

- `config/config.md`
- `config/datanode.example.toml`
- `config/standalone.example.toml`
- `src/common/telemetry/src/logging.rs`
- `src/datanode/src/datanode.rs`
- `src/mito2/src/config.rs`
- `src/mito2/src/engine.rs`
- `src/mito2/src/region.rs`

Signed-off-by: Lei Huang <lei@huang.to>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: report region query load on stream drop instead of stream end

Move `report_region_query_load()` from `StreamWithMetricWrapper::poll_next()`
to `Drop::drop()` so that region query load is reported even when the
stream is dropped prematurely (not just when fully consumed).

Affected files:
- `src/table/src/table/scan.rs`
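
A minimal sketch of reporting on drop rather than at end of stream, assuming a synchronous iterator in place of the real async stream. `MeteredStream` and the `AtomicU64` sink are hypothetical names, not the `StreamWithMetricWrapper` implementation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper; an iterator of byte batches stands in for a record batch stream.
pub struct MeteredStream<I: Iterator<Item = Vec<u8>>> {
    inner: I,
    scanned_bytes: u64,
    sink: Arc<AtomicU64>,
}

impl<I: Iterator<Item = Vec<u8>>> MeteredStream<I> {
    pub fn new(inner: I, sink: Arc<AtomicU64>) -> Self {
        Self { inner, scanned_bytes: 0, sink }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Iterator for MeteredStream<I> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        let batch = self.inner.next()?;
        // Accumulate locally; nothing is reported while polling.
        self.scanned_bytes += batch.len() as u64;
        Some(batch)
    }
}

impl<I: Iterator<Item = Vec<u8>>> Drop for MeteredStream<I> {
    fn drop(&mut self) {
        // Runs whether the stream was fully consumed or abandoned early.
        self.sink.fetch_add(self.scanned_bytes, Ordering::Relaxed);
    }
}
```

Because the report happens in `drop`, a consumer that stops after a few batches still contributes the bytes it actually read.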

Signed-off-by: Lei, Huang <huanglei@qiyi.com>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: make region query load reporting configurable

Introduce `enable_region_query_load_report` flag to optionally report
per-region `query_cpu_time` and `query_scanned_bytes` metrics instead
of always creating them. When disabled, the Prometheus gauges are not
created (`None`), avoiding metric churn for workloads that do not
need query-level load tracking.

- `src/common/meta/src/datanode.rs` — Placeholder fields for query load
- `src/mito2/src/region.rs` — Make query metrics `Option<IntGauge>`, conditional create/remove/reset
- `src/mito2/src/region/opener.rs` — Thread flag through `RegionOpener`
- `src/mito2/src/worker.rs` — Thread flag through `WorkerGroup`/`WorkerStarter`/`RegionWorkerLoop`
- `src/mito2/src/worker/handle_catchup.rs` — Pass flag on region open
- `src/mito2/src/worker/handle_create.rs` — Pass flag on region create
- `src/mito2/src/worker/handle_open.rs` — Pass flag on region open
- `src/mito2/src/engine.rs` — Pass flag from `MitoEngineBuilder`
- `src/mito2/src/test_util.rs` — Test helpers for both modes
- `src/mito2/src/engine/basic_test.rs` — Cover disabled and preserve cases
- `src/mito2/src/engine/close_test.rs` — Adapt to optional metrics

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: remove elapsed_compute metric from scan stream

The elapsed_compute metric conflated poll-wait time with actual CPU
computation, making it misleading. Removed the metric and its
recording path from StreamMetrics and StreamWithMetricWrapper.

Added a test asserting that poll duration is not reported as
elapsed_compute.

- `src/table/src/table/metrics.rs` — removed elapsed_compute field,
  builder, and record_elapsed_compute method
- `src/table/src/table/scan.rs` — removed record_elapsed_compute
  call; added SlowRecordBatchStream test helper and
  wrapper_poll_time_is_not_elapsed_compute test

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: disable region query load report for compaction scans

Compaction scans are internal operations initiated by the engine,
not user queries. Disable region query load reporting when the
scan input is marked as compaction to avoid misleading load metrics.

- `src/mito2/src/read/scan_region.rs` — set `enable_region_query_load_report`
  to `false` when compaction is enabled; add unit test

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* test: add `enable_per_region_metrics` config to HTTP integration test

- Enable per-region metrics config in HTTP test setup

`tests-integration/tests/http.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: remove region query load reporting tests and helpers

Remove the region query load reporting feature from the codebase,
including tests, test utilities, and helper infrastructure that were
part of this now-deprecated functionality.

Specifically:

- Remove region query load reporting tests from
  `src/mito2/src/engine/basic_test.rs` and
  `src/table/src/table/scan.rs`, and the region close metrics test
  from `src/mito2/src/engine/close_test.rs`

- Remove region query load report test utilities and simplify engine
  construction helpers in `src/mito2/src/test_util.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf: avoid disabled region query load timing

Summary:
- Avoid per-poll `Instant::now` and elapsed-time accumulation when `enable_region_query_load_report` is disabled.
- Keep region query-load CPU accounting active only when reporting is enabled.

Files:
- `src/table/src/table/scan.rs`
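
One way to skip the clock read when reporting is off is to take the start time only behind the flag. This is a sketch under that assumption; `PollTimer` is a hypothetical helper, not a type from the repository.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper that accumulates elapsed time only when enabled.
pub struct PollTimer {
    enabled: bool,
    elapsed: Duration,
}

impl PollTimer {
    pub fn new(enabled: bool) -> Self {
        Self { enabled, elapsed: Duration::ZERO }
    }

    /// Runs `f`, timing it only when reporting is enabled.
    pub fn time<T>(&mut self, f: impl FnOnce() -> T) -> T {
        // `bool::then` calls `Instant::now` only when the flag is set.
        let start = self.enabled.then(Instant::now);
        let out = f();
        if let Some(start) = start {
            self.elapsed += start.elapsed();
        }
        out
    }

    pub fn elapsed(&self) -> Duration {
        self.elapsed
    }
}
```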

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: move per-region query load reporting from storage to query engine

Move `enable_per_region_metrics` from datanode to frontend config and
migrate query load tracking (CPU time, scanned bytes) from mito2
storage engine to the query engine's distributed scan planner. The
storage-level metrics plumbing and `enable_region_query_load_report`
flag are removed from mito2, `ScanInput`, `ScanRegion`, and
`RegionScanner`. Query-level metrics are now collected in
`merge_scan.rs` via `scan_region_load`.

- `src/mito2/` -- Remove `query_cpu_time`, `query_scanned_bytes`
  metrics, `enable_region_query_load_report` plumbing from engine,
  region, opener, scanner types, workers
- `src/store-api/` -- Remove `query_cpu_time`, `query_scanned_bytes`
  from `RegionStatistic`
- `src/metric-engine/` -- Remove query load fields from
  `get_region_statistic`
- `src/query/` -- Add `enable_per_region_metrics` to `QueryOptions`;
  wire through planner, optimizer, merge scan with `scan_region_load`
  metrics
- `src/frontend/` -- Pass `enable_per_region_metrics` into
  `QueryOptions`
- `src/common/meta/` -- Remove TODO for query load fields
- `config/` -- Move `enable_per_region_metrics` from datanode to
  frontend and standalone example configs
- `src/cmd/tests/` -- Add `enable_per_region_metrics` to flownode
  config test
- `src/flow/` -- Add `enable_per_region_metrics` default to flownode
  options
- `src/table/` -- Remove unused query load fields from scan
- `src/datanode/` -- Remove
  `with_enable_region_query_load_report` calls

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: remove obsolete mito write load metric

Remove obsolete mito-side region written-bytes metric plumbing that is not needed by the frontend read-load reporting path.

Related files:
- `src/mito2/src/metrics.rs`
- `src/mito2/src/region.rs`
- `src/mito2/src/region/opener.rs`
- `src/mito2/src/region_write_ctx.rs`
- `src/mito2/src/engine/basic_test.rs`
- `src/mito2/src/worker.rs`
- `src/mito2/src/config.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: change region query load metrics from gauge to counter

Change `REGION_QUERY_CPU_TIME` and `REGION_QUERY_SCANNED_BYTES` from
`IntGaugeVec` to `IntCounterVec` since these values are monotonically
increasing and do not need gauge semantics. Update corresponding `add`
calls to `inc_by` in merge scan reporting.

Files:
- `src/store-api/src/metrics.rs` — metric type and label changes
- `src/query/src/dist_plan/merge_scan.rs` — caller adaptation

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: pass ReadItem directly to report_region_query_load

Move `region_scan_load` call to the caller, so `report_region_query_load`
accepts the already-computed `ReadItem` instead of `RecordBatchMetrics`.

- `src/query/src/dist_plan/merge_scan.rs` — update signature, inline call,
  remove stale test

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: ensure region query load is reported on MergeScanExec drop

Remove the `enable_per_region_metrics` parameter from `report_region_query_load`
so region load metrics are always emitted. Add a `Drop` impl for
`MergeScanExec` that reports sub-stage metrics when the executor is
dropped, covering edge cases where per-region metric emission was
missed. Add a unit test verifying CPU time and scanned bytes are
recorded on drop.

Affected file: `src/query/src/dist_plan/merge_scan.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: gate region query load reporting

Guard drop-time region query load reporting with the configured per-region metrics flag.

Related files:
- `src/query/src/dist_plan/merge_scan.rs`

Symbols:
- `MergeScanExec::drop`
- `enable_per_region_metrics`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: clean region query load metrics on drop

Remove per-region query load metric labels when a region is dropped so stale label series do not remain in the registry.

Related files:
- `src/mito2/src/region.rs`

Symbols:
- `MitoRegion::drop`
- `REGION_QUERY_CPU_TIME`
- `REGION_QUERY_SCANNED_BYTES`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: Lei Huang <lei@huang.to>
Signed-off-by: Lei, Huang <huanglei@qiyi.com>
2026-06-17 16:03:09 +00:00
Weny Xu
e520ff8300 feat: decouple error retryability from status codes (#8301)
* feat: add retry hint to common error

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: propagate retry hints in core errors

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: propagate retry hint through RPC metadata

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: fallback retry hints to status codes

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: add explicit retry hints for retryable errors

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: remove status code retry fallback

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: classify io retry hints

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: preserve retry hints across error wrappers

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: minor

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: preserve datasource retry hints in query

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: preserve metric engine retry hints

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: preserve flow and frontend retry hints

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: share rskafka retry hint mapping

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-17 08:27:53 +00:00
discord9
a3cbfdb983 fix: remove flow sink table lock (#8317)
Signed-off-by: discord9 <discord9@163.com>
2026-06-17 06:05:49 +00:00
jeremyhi
fd6c9e710d feat(cli): allow overriding import-v2 state path (#8302)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-16 09:53:19 +00:00
Weny Xu
b01ee594f3 feat: add repartition column hint (#8291)
* feat: add repartition column hint option

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: support altering repartition column hint

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: update sqlness result

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: reject time index repartition hint

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: treat rename table as metadata-only alter

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-16 09:04:52 +00:00
ZonaHe
71b0de6fe5 feat: update dashboard to v0.13.1 (#8306)
Co-authored-by: sunchanglong <sunchanglong@users.noreply.github.com>
2026-06-16 07:24:34 +00:00
Yingwen
9ed84cc26f fix: improve Grafana metrics dashboards (#8298)
* chore: initial changes

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: improve troubleshooting dashboard

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: rm troubleshooting-dashboard.md

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: optimize metrics dashboard

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: move troubleshooting-dashboard.md

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: move mito gc duration panel

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: cleanup the dashboard

- Overview trend panels are now aggregate-only:
    - Total Ingestion Rate Trend
    - Total Query Rate Trend

- Protocol breakdowns remain in Ingestion and Queries.
- Mito Backpressure and Failures no longer duplicates scan/GC signals.
- Removed Write Stall per Instance.
- Split Object Store and WAL into collapsed Object Store and collapsed
  WAL.
- Moved WAL/logstore panels out of Storage into WAL.
- Normalized OpenDAL “other request” matchers.
- Normalized trigger elapsed p99/p75/avg aggregation.
- Regenerated standalone JSON and dashboard YAML/Markdown.
- Updated docs/troubleshooting-dashboard.md.

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: rearrange metasrv dashboard panels

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: improve troubleshooting dashboard layout

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: remove obsolete troubleshooting dashboard doc

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct cluster dashboard panel queries (missing _bucket, raw counters, rate normalization)

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct trigger panel datasource, collapse flush/compaction, split request latency panels

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: update grafana metrics dashboard panels

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct Grafana dashboard units

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: regenerate Grafana dashboards

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: use throughput unit for index IO bytes

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-16 07:13:04 +00:00
Yingwen
1adee3cf47 chore: add release runbook skills for maintainers (#8296)
* chore: add release runbook skills for maintainers

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address release-note skill review comments

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fetch release ref before checking Cargo.toml

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: read FETCH_HEAD for release version check

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-16 06:13:30 +00:00
Lei, HUANG
4373c77d35 feat: use scan output bytes for read costing (#8276)
* feat: track scan output bytes and use them for read costing

Track the actual byte count of each record batch produced by
`RegionScanExec` and use it in place of the aggregated per-plan
`memory_usage` as the `table_scan` cost input. This avoids double
counting bytes that flow through multiple operators.

A named constant `REGION_SCAN_EXEC_NAME` exposes the plan node name
so downstream metric parsers remain correct if the struct is renamed.

Affected files:
- `src/table/src/table/metrics.rs` -- add `output_bytes` counter
- `src/table/src/table/scan.rs` -- record bytes, export name constant
- `src/query/src/dist_plan/merge_scan.rs` -- consume scan bytes

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
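
The double-counting problem described above can be illustrated with a minimal sketch. `Batch`, `ScanMetrics`, and `aggregated_plan_bytes` are hypothetical stand-ins, not the types in `src/table/src/table/metrics.rs`; only the `output_bytes` counter name comes from the commit message.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a record batch; only its byte size matters here.
pub struct Batch {
    pub bytes: usize,
}

/// Hypothetical per-scan metrics, loosely modeled on the `output_bytes` counter.
#[derive(Default)]
pub struct ScanMetrics {
    output_bytes: AtomicUsize,
}

impl ScanMetrics {
    /// Add the size of one batch produced by the scan node.
    pub fn record_batch(&self, batch: &Batch) {
        self.output_bytes.fetch_add(batch.bytes, Ordering::Relaxed);
    }

    /// Total bytes produced by the scan node so far.
    pub fn output_bytes(&self) -> usize {
        self.output_bytes.load(Ordering::Relaxed)
    }
}

/// Summing a per-operator byte figure over the whole plan counts the same
/// data once for every operator it passes through.
pub fn aggregated_plan_bytes(per_operator_bytes: &[usize]) -> usize {
    per_operator_bytes.iter().sum()
}
```

If 3072 bytes flow through three operators in this sketch, the aggregated figure is three times the scan counter, which is the inflation that counting at the scan node avoids.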

* refactor: remove misleading `mem_used` gauge from scan metrics

The `mem_used` gauge was used with `add()` on every output batch, but a
RegionScan does not hold memory across polls — each batch is yielded
immediately to the upstream operator. The gauge semantics were
incorrect (cumulative `add()` on a gauge) and the value duplicated
`output_bytes` anyway.

Affected files:
- `src/table/src/table/metrics.rs` — drop `mem_used` field and methods
- `src/table/src/table/scan.rs` — remove `record_mem_usage` call

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: use stable plan names for scan byte metrics

Expose `plan_name` alongside rendered plan text so scan byte extraction no longer depends on EXPLAIN formatting. Keep old serialized metrics compatible by defaulting missing `plan_name`.

Affected files:
- `src/common/recordbatch/src/adapter.rs` -- add `plan_name` to `PlanMetrics` and populate it from `ExecutionPlan::name`
- `src/query/src/dist_plan/merge_scan.rs` -- match `REGION_SCAN_EXEC_NAME` through `plan_name`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
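
A rough sketch of matching on a stable name instead of rendered plan text follows. The struct layout, the `Option` used to model a missing `plan_name`, the helper `scan_output_bytes`, and the constant's value are all assumptions for illustration; the actual definitions are in the files listed above.

```rust
/// Assumed value; the real constant is exported from `src/table/src/table/scan.rs`.
pub const REGION_SCAN_EXEC_NAME: &str = "RegionScanExec";

/// Hypothetical, simplified shape of one plan node's metrics.
pub struct PlanMetrics {
    /// Rendered plan text, as EXPLAIN would print it.
    pub plan: String,
    /// Stable node name; `None` models a payload serialized by an older version.
    pub plan_name: Option<String>,
    /// Bytes produced by this node.
    pub output_bytes: usize,
}

/// Sum output bytes over region scan nodes, matching on `plan_name` only.
/// Nodes without a name never match, so legacy payloads contribute zero.
pub fn scan_output_bytes(nodes: &[PlanMetrics]) -> usize {
    nodes
        .iter()
        .filter(|n| n.plan_name.as_deref() == Some(REGION_SCAN_EXEC_NAME))
        .map(|n| n.output_bytes)
        .sum()
}
```

Because the match ignores `plan`, a change in EXPLAIN formatting does not affect the result in this sketch.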

* fix: preserve merge scan CPU read cost

Record read cost whenever stream metrics are available, default scan bytes to zero, and aggregate scan output bytes across region scan nodes.

Files: `src/query/src/dist_plan/merge_scan.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: correct doc comment in `StreamMetrics`

Fix doc comment on `StreamMetrics::new()` to reference the correct
struct name (`StreamMetrics`) instead of the old `MemoryUsageMetrics`.

- `src/table/src/table/metrics.rs` — fix struct name in doc comment

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-06-16 02:58:13 +00:00
jeremyhi
c22f5d741b feat(cli): add import-v2 task parallelism (#8300)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-16 02:55:52 +00:00
jeremyhi
90606af070 feat(cli): add export-v2 progress reporting (#8294)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-15 09:56:31 +00:00
Weny Xu
15597047c9 feat: pass Kafka pruned entry id when creating regions (#8282)
* refactor: use typed region wal options

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: pass kafka pruned entry id to regions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: remove unused error

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: clarify wal options serialization errors

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: cover legacy region wal options encoding

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: recover legacy create table wal options

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: lock remote wal during table creation

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: fix unit tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: refresh remote wal prune hints

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-15 08:51:50 +00:00
jeremyhi
336e4cdb84 feat(cli): add export-v2 chunk parallelism (#8292)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-15 06:00:53 +00:00
dennis zhuang
c9062fe042 feat: add password hash generation command (#8288)
* feat: add password hash generation command

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: reject empty password in hash-password command

A blank plaintext password is rejected by the user provider before
verifier comparison, so a verifier generated from an empty password is
unusable. Fail fast on EOF or an empty line from --password-stdin (and on
an empty --password) instead of printing a dead verifier.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
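
The fail-fast behavior described above might look roughly like the sketch below. `read_password` and its error strings are hypothetical and do not reflect the command's actual code; the sketch only shows rejecting EOF and an empty line before any hashing happens.

```rust
use std::io::BufRead;

/// Hypothetical helper: read one line as the password, rejecting EOF and
/// empty input instead of returning a value that would produce a dead verifier.
pub fn read_password<R: BufRead>(mut input: R) -> Result<String, String> {
    let mut line = String::new();
    let read = input.read_line(&mut line).map_err(|e| e.to_string())?;
    if read == 0 {
        // Zero bytes read means the stream ended before any input arrived.
        return Err("no password provided (EOF)".to_string());
    }
    // Strip only the trailing line terminator, keeping other whitespace.
    let password = line.trim_end_matches(|c| c == '\n' || c == '\r');
    if password.is_empty() {
        return Err("password must not be empty".to_string());
    }
    Ok(password.to_string())
}
```

Taking any `BufRead` rather than stdin directly keeps the check testable with in-memory input.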

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
v1.1.0 v1.1.0-nightly-20260615
2026-06-12 13:28:38 +00:00
discord9
f6f56e6bc5 fix!: fence scoped flow repair snapshots (#8277)
* fix: fence scoped flow repair snapshots

Signed-off-by: discord9 <discord9@163.com>

* test: trim duplicate flow task tests

Signed-off-by: discord9 <discord9@163.com>

* test: moreless

Signed-off-by: discord9 <discord9@163.com>

* refactor: simplify flow query failure fallback

Signed-off-by: discord9 <discord9@163.com>

* test: update need eval interval

Signed-off-by: discord9 <discord9@163.com>

* docs: for helper fn

Signed-off-by: discord9 <discord9@163.com>

* chore: less test

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

* test: update for eval interval

Signed-off-by: discord9 <discord9@163.com>

* fix: consume dirty windows after successful query

Signed-off-by: discord9 <discord9@163.com>

* test: rm useless tests

Signed-off-by: discord9 <discord9@163.com>

* test: standalone seq&rm dead if

Signed-off-by: discord9 <discord9@163.com>

* chore: move to pending window instead

Signed-off-by: discord9 <discord9@163.com>

* chore: mark full also call abandon_fenced_repair

Signed-off-by: discord9 <discord9@163.com>

* chore: instant for pending fenced repair

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-12 13:27:12 +00:00
Yingwen
4a4fe749d8 feat: add per-partition timings to merge scan partition metrics (#8293)
* feat: add per-partition timings to merge scan partition metrics

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: stop global timers for partitions with no region

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-12 13:22:49 +00:00
Yingwen
d34d4c1aba fix(mito): collect index apply metrics in pruner verbose mode (#8290)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-12 11:29:18 +00:00
discord9
7ec89dcf80 test: redact dynamic filter in order_by sqlness (#8286)
Signed-off-by: discord9 <discord9@163.com>
2026-06-12 11:14:27 +00:00
Weny Xu
1ca29c3481 feat: support create region requirements (#8281)
* feat: support create region requirements

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: require object storage for repartition creates

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: upgrade proto

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-12 11:12:41 +00:00
jeremyhi
34c4cae9bb feat(cli): add import-v2 progress UI (#8289)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-12 09:45:45 +00:00
Weny Xu
61f802cdf2 feat: support repartition with target partition columns (#8278)
* feat(sql): support repartition on columns syntax

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat(meta): support repartition target partition columns

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat(operator): wire repartition target columns

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix(operator): validate repartition target columns

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: cover repartition on columns integration

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix(operator): validate duplicate target partition columns

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: upgrade proto

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: fix unit tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-12 07:36:53 +00:00
jeremyhi
c567a6562a feat(cli): add import-v2 progress mode (#8283)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-12 06:13:27 +00:00
Yingwen
770dca45a4 fix: pin tonic below 0.14.5 to avoid max connection age panic (#8265) (#8287)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-12 05:46:55 +00:00
Ning Sun
ba89ef40f4 feat: add field_id for internal fields (#8284)
* feat: add field_id for internal fields

* test: address file size change after including field ids

* fix: address review comments
2026-06-12 03:10:24 +00:00