greptimedb

mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-07-03 20:40:37 +00:00

Author	SHA1	Message	Date
discord9	55852a05b8	feat: stream explain analyze metrics over http (#8380 ) * feat: stream explain analyze metrics over http Signed-off-by: discord9 <discord9@163.com> * fix: address analyze stream review comments Signed-off-by: discord9 <discord9@163.com> * test: document analyze stream protocol Signed-off-by: discord9 <discord9@163.com> * test: update config api expectation Signed-off-by: discord9 <discord9@163.com> * fix: track slow queries for analyze stream Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-07-02 06:24:53 +00:00
Lei, HUANG	8a11ee9462	fix: record catalog and schema in slow queries (#8387 ) * fix: record catalog and schema in slow queries Add catalog and schema context to slow query records while appending the new columns after existing fields to preserve column order. - `src/common/frontend/src/slow_query_event.rs`: extend `SlowQueryEvent` schema and rows with `catalog_name` and `schema_name`, and cover append-only ordering. - `src/catalog/src/process_manager.rs`: carry catalog and schema through `SlowQueryTimer`. - `src/frontend/src/instance.rs`: capture context for SQL, plan, and PromQL slow query timers. - `tests-integration/tests/sql.rs`: assert MySQL and PostgreSQL slow query records include catalog and schema. Signed-off-by: Lei, HUANG <ratuthomm@gmail.com> * fix: address slow query review comment Use `String::clone` when writing slow query catalog and schema values. Signed-off-by: Lei, HUANG <ratuthomm@gmail.com> * fix: keep slow query schema only Remove the slow query `catalog_name` column and keep `schema_name` as a non-null tag dimension. - `src/common/frontend/src/slow_query_event.rs`: expose only `schema_name` in `SlowQueryEvent` rows and mark it as a tag. - `src/catalog/src/process_manager.rs`: stop carrying catalog context in `SlowQueryTimer`. - `src/frontend/src/instance.rs`: pass only schema context to slow query timers. - `tests-integration/tests/sql.rs`: assert slow query records include `schema_name` without `catalog_name`. Signed-off-by: Lei, HUANG <ratuthomm@gmail.com> * fix: schema name semantic should be field Signed-off-by: Lei, HUANG <ratuthomm@gmail.com> * fix: typo Signed-off-by: Lei, HUANG <ratuthomm@gmail.com> --------- Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>	2026-07-01 09:05:51 +00:00
fys	aa563628fb	feat(json2): type hint (#8247 ) * feat(json2): type hint * test(datatypes): add JsonSettings serde round-trip test * minor refactor This reverts commit 7ff5a5249a09be5396536284fe822b5761ef4e6a. * fix: code review	2026-06-30 07:37:31 +00:00
Yvan Wang	7c1009c00f	feat: accept x-greptime-pipeline-name header on /events/logs (#8371 ) * feat: accept x-greptime-pipeline-name header on /events/logs The /events/logs (and /logs/ingest) endpoint previously only read the pipeline name from the `pipeline_name` query parameter, while the OTLP/Elasticsearch/Splunk log ingestion endpoints already accept it via the `x-greptime-pipeline-name` header. This inconsistency is unfriendly to users. Make `log_ingester` resolve the pipeline name from the `x-greptime-pipeline-name` header (and the deprecated `x-greptime-log-pipeline-name`), falling back to the query parameter. The header takes precedence, consistent with how other pipeline options (e.g. `x-greptime-pipeline-params`) outrank their query-parameter counterparts. Closes #6095 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com> * address review: prefer non-deprecated pipeline-name header When both pipeline-name headers are present, resolve the non-deprecated `x-greptime-pipeline-name` before the deprecated `x-greptime-log-pipeline-name`, and cover the precedence with a test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com> --------- Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com> Co-authored-by: BootstrapperSBL <yvanwww01@gmail.com> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 07:08:44 +00:00
shuiyisong	59d40c39b1	feat: `prw_v2` initial commit with sample ingestion (#8361 ) * chore: add v2 entrance Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: decode request Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: implement remote write v2 for samples Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: remove hand-written proto Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: refactor Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: CR issue Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: CR issue Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: merge tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add source version field Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: fix CR issues Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: test Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: sqlness Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: update proto Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: update proto Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-06-29 03:15:41 +00:00
fys	cce70b9427	fix(mito): failed to compact memtable with json2 (#8297 ) * fix(json2): failed to compact memtable * fix: cargo clippy * refactor: align schema with json2 filed in flush * chore: add unit test for json aligner * chore: add json2 integration test * fix: cr by codex * fix: use parquet schema for encoded JSON2 memtable parts * Use is_structured_json_field to determine whether the field is of JSON2 type. * fix: cargo clippy * fix: only align structured json fields * chore: assert bulk JSON2 aligner input schemas in debug	2026-06-26 08:52:06 +00:00
Yash Agrawal	daed55e836	feat: Add support for splunk HEC endpoints (#8321 ) * feat: add health endpoint for splunk HEC Signed-off-by: yash <yasha4658@gmail.com> * feat: add event endpoint and handler + tests Signed-off-by: yash <yasha4658@gmail.com> * feat: add pipeline override test Signed-off-by: yash <yasha4658@gmail.com> * feat: add error response for auth errors Signed-off-by: yash <yasha4658@gmail.com> * fix: cargo fmt + doc example test Signed-off-by: yash <yasha4658@gmail.com> * fix: clippy warnings Signed-off-by: yash <yasha4658@gmail.com> * fix: address comments Signed-off-by: yash <yasha4658@gmail.com> * fix: address copilot comments and default table name inlining Signed-off-by: yash <yasha4658@gmail.com> * fix: cargo fmt Signed-off-by: yash <yasha4658@gmail.com> * fix: add error code 12, 13 and fix the timestamp override bug Signed-off-by: yash <yasha4658@gmail.com> * fix: add host/source/sourcetype ordering Signed-off-by: yash <yasha4658@gmail.com> * fix: fix typos Signed-off-by: yash <yasha4658@gmail.com> --------- Signed-off-by: yash <yasha4658@gmail.com>	2026-06-26 04:03:41 +00:00
Lei, HUANG	a3461caf9d	feat: expose region read load metrics (#8316 ) * feat: expose region read load through Prometheus metrics and heartbeat Introduce region-level query load tracking (CPU time and scanned bytes) collected by `RegionScanExec`, exposed via Prometheus metrics and optionally reported through heartbeat region stats. - Region metrics (`src/mito2/src/metrics.rs`, `src/store-api/src/metrics.rs`): Add `greptime_mito_region_query_cpu_time`, `greptime_mito_region_query_scanned_bytes`, and `greptime_mito_region_written_bytes_since_open` gauge metrics. - MitoRegion (`src/mito2/src/region.rs`, `src/mito2/src/region/opener.rs`, `src/mito2/src/region_write_ctx.rs`): Replace `AtomicU64` `written_bytes` with `IntGauge`; add `query_cpu_time`/`query_scanned_bytes` fields with lifecycle management (init, reset, remove-on-drop). - RegionStatistic (`src/store-api/src/region_engine.rs`, `src/store-api/src/storage/requests.rs`): Add `query_cpu_time` and `query_scanned_bytes` fields. - Metric-engine (`src/metric-engine/src/utils.rs`): Aggregate query load from metadata and data regions. - Heartbeat (`src/datanode/src/heartbeat.rs`, `src/common/meta/src/datanode.rs`): Relay region query load via heartbeat `RegionStat`; add test. - Query engine (`src/query/src/options.rs`, `src/query/src/query_engine/state.rs`, `src/query/src/datafusion.rs`, `src/query/src/dist_plan/merge_scan.rs`, `src/query/src/dist_plan/analyzer.rs`, `src/query/src/dummy_catalog.rs`): Add `enable_region_query_load_report` config; wire `RegionScanExec` to accumulate CPU time and scanned bytes. - Table scan (`src/table/src/table/scan.rs`, `src/table/src/table/metrics.rs`): Wire table scan metrics. - Config (`config/standalone.example.toml`, `config/datanode.example.toml`, `config/frontend.example.toml`, `config/config.md`): Add example config and documentation for `enable_region_query_load_report`. - Tests (`src/mito2/src/engine/basic_test.rs`, `src/mito2/src/engine/close_test.rs`, `src/cmd/tests/load_config_test.rs`, `src/flow/src/adapter.rs`): Add unit tests for region query load reporting and metric cleanup on region close; set default config values. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: move region read load report config from query layer to mito engine Move the `enable_region_query_load_report` setting from query-level config (`QueryOptions`/`DistPlannerOptions`) into the mito2 storage engine config (`MitoConfig`), and expose it through the `RegionScanner` trait instead of `ScanRequest`/`PrepareRequest`. - Mito config: `src/mito2/src/config.rs`, `src/mito2/src/engine.rs` - Scan region plumbing: `src/mito2/src/read/scan_region.rs` - RegionScanner trait: `src/store-api/src/region_engine.rs` - Scanner impls: `src/mito2/src/read/seq_scan.rs`, `src/mito2/src/read/series_scan.rs`, `src/mito2/src/read/unordered_scan.rs` - RegionScanExec: `src/table/src/table/scan.rs` - Removed from query layer: `src/query/src/options.rs`, `src/query/src/dist_plan/analyzer.rs`, `src/query/src/query_engine/state.rs`, `src/query/src/datafusion.rs`, `src/query/src/dummy_catalog.rs` - Removed from test/config: `src/query/src/dist_plan/analyzer/test.rs`, `src/flow/src/adapter.rs`, `src/cmd/tests/load_config_test.rs`, `src/store-api/src/storage/requests.rs` - Config docs: `config/config.md`, `config/datanode.example.toml`, `config/frontend.example.toml`, `config/standalone.example.toml` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: move region query load report config from MitoConfig to LoggingOptions Relocate the `enable_region_query_load_report` setting from `MitoConfig` to `LoggingOptions` (as `enable_per_region_metrics`), and thread it into `MitoEngineBuilder` instead of reading from the engine config directly. This makes the region read-load reporting a per-node logging/observability concern rather than a per-engine storage setting. - `config/config.md` - `config/datanode.example.toml` - `config/standalone.example.toml` - `src/common/telemetry/src/logging.rs` - `src/datanode/src/datanode.rs` - `src/mito2/src/config.rs` - `src/mito2/src/engine.rs` - `src/mito2/src/region.rs` Signed-off-by: Lei Huang <lei@huang.to> Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: report region query load on stream drop instead of stream end Move `report_region_query_load()` from `StreamWithMetricWrapper::poll_next()` to `Drop::drop()` so that region query load is reported even when the stream is dropped prematurely (not just when fully consumed). Affected files: - `src/table/src/table/scan.rs` Signed-off-by: Lei, Huang <huanglei@qiyi.com> Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: make region query load reporting configurable Introduce `enable_region_query_load_report` flag to optionally report per-region `query_cpu_time` and `query_scanned_bytes` metrics instead of always creating them. When disabled, the Prometheus gauges are not created (`None`), avoiding metric churn for workloads that do not need query-level load tracking. - `src/common/meta/src/datanode.rs` — Placeholder fields for query load - `src/mito2/src/region.rs` — Make query metrics `Option<IntGauge>`, conditional create/remove/reset - `src/mito2/src/region/opener.rs` — Thread flag through `RegionOpener` - `src/mito2/src/worker.rs` — Thread flag through `WorkerGroup`/`WorkerStarter`/`RegionWorkerLoop` - `src/mito2/src/worker/handle_catchup.rs` — Pass flag on region open - `src/mito2/src/worker/handle_create.rs` — Pass flag on region create - `src/mito2/src/worker/handle_open.rs` — Pass flag on region open - `src/mito2/src/engine.rs` — Pass flag from `MitoEngineBuilder` - `src/mito2/src/test_util.rs` — Test helpers for both modes - `src/mito2/src/engine/basic_test.rs` — Cover disabled and preserve cases - `src/mito2/src/engine/close_test.rs` — Adapt to optional metrics Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: remove elapsed_compute metric from scan stream The elapsed_compute metric conflated poll-wait time with actual CPU computation, making it misleading. Removed the metric and its recording path from StreamMetrics and StreamWithMetricWrapper. Added a test asserting that poll duration is not reported as elapsed_compute. - `src/table/src/table/metrics.rs` — removed elapsed_compute field, builder, and record_elapsed_compute method - `src/table/src/table/scan.rs` — removed record_elapsed_compute call; added SlowRecordBatchStream test helper and wrapper_poll_time_is_not_elapsed_compute test Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: disable region query load report for compaction scans Compaction scans are internal operations initiated by the engine, not user queries. Disable region query load reporting when the scan input is marked as compaction to avoid misleading load metrics. - `src/mito2/src/read/scan_region.rs` — set `enable_region_query_load_report` to `false` when compaction is enabled; add unit test Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * test: add `enable_per_region_metrics` config to HTTP integration test - Enable per-region metrics config in HTTP test setup \`tests-integration/tests/http.rs\` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: remove region query load reporting tests and helpers Remove the region query load reporting feature from the codebase, including tests, test utilities, and helper infrastructure that were part of this now-deprecated functionality. Specifically: - Remove region query load reporting tests from `src/mito2/src/engine/basic_test.rs` and `src/table/src/table/scan.rs`, and the region close metrics test from `src/mito2/src/engine/close_test.rs` - Remove region query load report test utilities and simplify engine construction helpers in `src/mito2/src/test_util.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * perf: avoid disabled region query load timing Summary: - Avoid per-poll `Instant::now` and elapsed-time accumulation when `enable_region_query_load_report` is disabled. - Keep region query-load CPU accounting active only when reporting is enabled. Files: - `src/table/src/table/scan.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: move per-region query load reporting from storage to query engine Move `enable_per_region_metrics` from datanode to frontend config and migrate query load tracking (CPU time, scanned bytes) from mito2 storage engine to the query engine's distributed scan planner. The storage-level metrics plumbing and `enable_region_query_load_report` flag are removed from mito2, `ScanInput`, `ScanRegion`, and `RegionScanner`. Query-level metrics are now collected in `merge_scan.rs` via `scan_region_load`. - `src/mito2/` -- Remove `query_cpu_time`, `query_scanned_bytes` metrics, `enable_region_query_load_report` plumbing from engine, region, opener, scanner types, workers - `src/store-api/` -- Remove `query_cpu_time`, `query_scanned_bytes` from `RegionStatistic` - `src/metric-engine/` -- Remove query load fields from `get_region_statistic` - `src/query/` -- Add `enable_per_region_metrics` to `QueryOptions`; wire through planner, optimizer, merge scan with `scan_region_load` metrics - `src/frontend/` -- Pass `enable_per_region_metrics` into `QueryOptions` - `src/common/meta/` -- Remove TODO for query load fields - `config/` -- Move `enable_per_region_metrics` from datanode to frontend and standalone example configs - `src/cmd/tests/` -- Add `enable_per_region_metrics` to flownode config test - `src/flow/` -- Add `enable_per_region_metrics` default to flownode options - `src/table/` -- Remove unused query load fields from scan - `src/datanode/` -- Remove `with_enable_region_query_load_report` calls Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: remove obsolete mito write load metric Remove obsolete mito-side region written-bytes metric plumbing that is not needed by the frontend read-load reporting path. Related files: - \`src/mito2/src/metrics.rs\` - \`src/mito2/src/region.rs\` - \`src/mito2/src/region/opener.rs\` - \`src/mito2/src/region_write_ctx.rs\` - \`src/mito2/src/engine/basic_test.rs\` - \`src/mito2/src/worker.rs\` - \`src/mito2/src/config.rs\` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: change region query load metrics from gauge to counter Change `REGION_QUERY_CPU_TIME` and `REGION_QUERY_SCANNED_BYTES` from `IntGaugeVec` to `IntCounterVec` since these values are monotonically increasing and do not need gauge semantics. Update corresponding `add` calls to `inc_by` in merge scan reporting. Files: - `src/store-api/src/metrics.rs` — metric type and label changes - `src/query/src/dist_plan/merge_scan.rs` — caller adaptation Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: pass ReadItem directly to report_region_query_load Move `region_scan_load` call to the caller, so `report_region_query_load` accepts the already-computed `ReadItem` instead of `RecordBatchMetrics`. - `src/query/src/dist_plan/merge_scan.rs` — update signature, inline call, remove stale test Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: ensure region query load is reported on MergeScanExec drop Remove the `enable_per_region_metrics` parameter from `report_region_query_load` so region load metrics are always emitted. Add a `Drop` impl for `MergeScanExec` that reports sub-stage metrics when the executor is dropped, covering edge cases where per-region metric emission was missed. Add a unit test verifying CPU time and scanned bytes are recorded on drop. Affected file: `src/query/src/dist_plan/merge_scan.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: gate region query load reporting Guard drop-time region query load reporting with the configured per-region metrics flag. Related files: - \`src/query/src/dist_plan/merge_scan.rs\` Symbols: - \`MergeScanExec::drop\` - \`enable_per_region_metrics\` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: clean region query load metrics on drop Remove per-region query load metric labels when a region is dropped so stale label series do not remain in the registry. Related files: - \`src/mito2/src/region.rs\` Symbols: - \`MitoRegion::drop\` - \`REGION_QUERY_CPU_TIME\` - \`REGION_QUERY_SCANNED_BYTES\` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> Signed-off-by: Lei Huang <lei@huang.to> Signed-off-by: Lei, Huang <huanglei@qiyi.com>	2026-06-17 16:03:09 +00:00
Weny Xu	61f802cdf2	feat: support repartition with target partition columns (#8278 ) * feat(sql): support repartition on columns syntax Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(meta): support repartition target partition columns Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(operator): wire repartition target columns Signed-off-by: WenyXu <wenymedia@gmail.com> * fix(operator): validate repartition target columns Signed-off-by: WenyXu <wenymedia@gmail.com> * test: cover repartition on columns integration Signed-off-by: WenyXu <wenymedia@gmail.com> * fix(operator): validate duplicate target partition columns Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: upgrade proto Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: fix unit tests Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-06-12 07:36:53 +00:00
LFC	270dce5ed7	feat: decouple region edit and compaction (#8272 ) * feat: decouple region edit and compaction Signed-off-by: luofucong <luofc@foxmail.com> * make `schedule_compaction_after_edit` default to `true` Signed-off-by: luofucong <luofc@foxmail.com> * resolve PR comments Signed-off-by: luofucong <luofc@foxmail.com> --------- Signed-off-by: luofucong <luofc@foxmail.com>	2026-06-10 09:56:11 +00:00
Ning Sun	fd64ced4da	feat: introduce plugin setup functions with richer context (#8256 ) feat: enrich plugin setup context	2026-06-08 06:53:08 +00:00
Lei, HUANG	e7ce3ac0c7	fix: accept JSONB OTLP logs against existing schema (#8250 ) * fix: accept JSONB OTLP logs against existing schema Handle existing OTLP log JSONB columns that round-trip as Json while direct log rows carry pre-encoded binary JSONB payloads. Add integration coverage for repeated OTLP log batches after table auto-creation. Files: src/servers/src/otlp/logs.rs, tests-integration/tests/http.rs Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * perf: cache OTLP log JSONB schema check Cache whether an existing OTLP log column is legacy JSONB once per column before coercing rows. Files: src/servers/src/otlp/logs.rs Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: narrow down schema coercion Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-06-08 01:57:23 +00:00
shuiyisong	d26a80855b	chore: align OTLP logs with existing table schemas (#8229 ) * chore: otlp logs with schema align Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: CR issue Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-06-04 12:53:09 +00:00
dennis zhuang	31c2c1f6db	feat: table semantic layer per-table enrichment (Phase 2) (#8218 ) * feat: table semantic layer per-table enrichment (Phase 2) Phase 2 of the table semantic layer, plus a vocabulary trim so the layer only records what a machine consumer cannot cheaply recover on its own. Per-table metric enrichment (OTLP), via an internal per-table channel: - A `SemanticIndex` accumulator records, per emitted table, the declared metric keys: type / unit / temporality / metadata_quality=declared / original_name. Conflicting single-valued keys collapse to `mixed`/`unknown`. - Recording happens at the `encode_metrics` level where the base name, metric type, and proto fields are all in scope, so histogram/summary fan-out gets the correct per-subtable type (`_bucket`=histogram, `_sum`/`_count`=counter) without threading state through every encoder. - The index is serialized onto the `greptime.internal.semantic.per_table_index` context extension; `apply_per_table_semantic_options` folds each table's keys into its options at auto-create time. - `trace.conventions` is refined from the request's resource/scope `schema_url`s (concrete when uniform, else `mixed`/`unknown`). Vocabulary trimmed to only meaningful keys. Kept: signal_type, source, pipeline, trace.conventions, metric.{type,unit,temporality,metadata_quality,original_name}. Dropped: metric.monotonic (a function of type), trace.has_events/has_links (constant + derivable from columns), log.severity_scheme/body_format (constant / derivable, and body_format cost an O(rows) scan), resource/scope lineage (restates columns / collector-config concern), source_version (no cheap non-constant value today). Prometheus carries type/unit in the metric name by convention, so it gets identity only — no inferred enrichment. Identity (signal_type + source) extended to the remaining ingest protocols so the discovery view is complete: InfluxDB and OpenTSDB (metric), Loki and Elasticsearch (log). These protocols carry no type/unit metadata, so identity is all that applies. Tests: unit coverage for the accumulator, per-metric-type fan-out, and trace conventions; integration goldens updated for the OTLP metric/trace SHOW CREATE output and the new Loki identity. Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * chore: validate the option value Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-06-04 12:52:00 +00:00
dennis zhuang	f25ab67af1	feat: table semantic layer identity (Phase 1) (#8210 ) * feat: table semantic layer identity (Phase 1) Attach a thin layer of semantic metadata to ingested tables via the existing `table_options` slot, so machine consumers (LLM agents, alert/dashboard builders, MCP servers, ETL) can align a table with the observability concept it stands for without guessing from column names. See docs/rfcs/2026-05-28-table-semantic-layer.md. Phase 1 (identity) only: - New `table::requests::semantic` module: the `greptime.semantic.` vocabulary (signal/source/source_version/pipeline + trace/metric/log/resource-scope keys, defined now, populated by later phases), value constants, the internal `greptime.internal.semantic.per_table_index` transport key (reserved for Phase 2, deliberately outside the public namespace), and `is_semantic_option_key`. - `validate_table_option` accepts the `greptime.semantic.` prefix, so the keys are valid both on the auto-create path and on explicit `CREATE TABLE ... WITH (...)`. - `fill_table_options_for_create` copies every semantic ctx extension into the new table's options (prefix passthrough alongside the fixed allowlist). - Frontend stamps identity on the context at each ingest entry: OTLP metrics (metric/opentelemetry), traces (+pipeline, has_events/has_links/conventions for the v1 model), logs (log/opentelemetry), and Prometheus remote write (metric/prometheus, metadata_quality=inferred). OTLP metric metadata_quality is left for Phase 2 (declared). - Trace identity is stamped only on the main span table; the derived `_services` / `_operations` lookup tables keep the unstamped context and carry no semantic identity (cross-table relationships are out of scope). Semantic options appear in SHOW CREATE TABLE (like table_data_model / otlp_metric_compat) and in information_schema, so an LLM inspecting a table sees its semantics directly. Tests: unit (validation prefix + internal-key rejection, ctx passthrough) and integration assertions that the common keys land for OTLP metrics (metric-engine logical table), traces, logs, and Prometheus remote write; SHOW CREATE goldens updated. Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: prom batcher not cover and white list for semantic keys/values Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: typo Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-06-02 01:56:29 +00:00
discord9	28fd796f4e	fix(flow): harden incremental read correctness (#8196 ) * fix(flow): harden incremental read correctness Signed-off-by: discord9 <discord9@163.com> * fix(flow): propagate dirty window options Signed-off-by: discord9 <discord9@163.com> * test: more Signed-off-by: discord9 <discord9@163.com> * chore: test config api Signed-off-by: discord9 <discord9@163.com> * refactor: split gen Signed-off-by: discord9 <discord9@163.com> * chore: per review Signed-off-by: discord9 <discord9@163.com> * fix: allowlist key Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-06-01 02:48:00 +00:00
dennis zhuang	ed9312f8e3	feat: global switch for creating tables automatically (#8203 ) * feat: global switch for creating table automatically Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * chore: make auto_create_table as comment by default Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * feat: respect gloabl switch for metric engine Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-05-31 23:51:14 +00:00
Weny Xu	91ac84019b	feat(meta-srv): support repartition for unpartitioned tables (#8186 ) * feat(meta-srv): update repartition partition metadata Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(meta-srv): connect repartition source metadata Signed-off-by: WenyXu <wenymedia@gmail.com> * test(meta-srv): cover unpartitioned repartition rollback Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: connect unpartitioned repartition SQL path Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-05-28 03:31:55 +00:00
LFC	bf7e3551fe	test: add jsonbench tests (#8165 ) Signed-off-by: luofucong <luofc@foxmail.com>	2026-05-27 08:34:06 +00:00
Yingwen	e1e75b3ffe	feat: implement a cache for the prefilter (#8102 ) * feat: cache parquet prefilter results Signed-off-by: evenyag <realevenyag@gmail.com> * chore: set result cache size Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: rename is_stable to is_immutable and reject ScalarVariable Signed-off-by: evenyag <realevenyag@gmail.com> * chore: typo Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: use capacity() for prefilter key memory accounting Signed-off-by: evenyag <realevenyag@gmail.com> * feat: per filter cache Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: support other variants in MaybeFilter Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: split compute_projection_mask Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: build_prefilter_masks takes PrefilterEntry Signed-off-by: evenyag <realevenyag@gmail.com> --------- Signed-off-by: evenyag <realevenyag@gmail.com>	2026-05-25 03:10:12 +00:00
Lei, HUANG	f8df016623	feat: add InfluxDB default merge mode config (#8134 ) * feat/influxdb-default-merge-mode: add InfluxDB merge mode config - `influxdb` config: add `default_merge_mode` parsing and defaults in `src/frontend/src/service_config/influxdb.rs` and `src/frontend/src/service_config.rs` - auto-create behavior: apply configured `merge_mode` for InfluxDB ingestion in `src/frontend/src/instance.rs`, `src/frontend/src/instance/builder.rs`, `src/frontend/src/instance/influxdb.rs`, and `src/operator/src/insert.rs` - config docs: document `influxdb.default_merge_mode` in `config/frontend.example.toml`, `config/standalone.example.toml`, and `config/config.md` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/influxdb-default-merge-mode: derive merge mode default - `influxdb` config: derive `Default` for `InfluxdbMergeMode` in `src/frontend/src/service_config/influxdb.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/influxdb-default-merge-mode: update config API snapshot - `config API`: include `default_merge_mode` in `tests-integration/tests/http.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/influxdb-default-merge-mode: avoid default context clone - `InfluxDB merge mode`: avoid cloning `QueryContext` for default `last_non_null` in `src/frontend/src/instance/influxdb.rs` - `InfluxDB merge mode`: cover default, configured, and explicit `MERGE_MODE_KEY` paths in `src/frontend/src/instance/influxdb.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-05-19 16:54:36 +00:00
shuiyisong	d4cfbd3400	feat: add otlp to prometheus naming translation options (#8113 ) * chore: extract otlp translator Signed-off-by: shuiyisong <xixing.sys@gmail.com> * feat: add header parsing Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: simplify and merge tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-05-16 04:52:45 +00:00
Yingwen	7840aa1bb4	refactor(mito2)!: remove PartitionTreeMemtable (#8080 ) * feat: switch partition tree to bulk Signed-off-by: evenyag <realevenyag@gmail.com> * chore: keep partition tree memtable for migration test Restore PartitionTreeMemtable construction when memtable.type=partition_tree is explicit, and move the sparse-encoding bulk override into the default (no explicit memtable.type) arm so phase 2's memtable.type=bulk wins on reopen. Rewrite test_reopen_time_series_sparse_memtable_with_bulk to use a metric-engine-shaped schema and sparse-encoded rows with WriteHint::Sparse, so the test actually exercises a PartitionTreeMemtable in phase 1 and verifies WAL replay into the new BulkMemtable on reopen without flushing. Signed-off-by: evenyag <realevenyag@gmail.com> * chore: drop partition tree memtable from runtime Re-apply the unconditional sparse-encoding override in `MemtableBuilderProvider::builder_for_options` and route the `MemtableOptions::PartitionTree` arm to `BulkMemtable` with a deprecation warning. After this change, `PartitionTreeMemtableBuilder` is no longer reachable from the engine runtime; benchmarks still reference the type. Remove `test_reopen_time_series_sparse_memtable_with_bulk` and the `put_sparse_rows` helper added in the previous commit — that test only existed to validate the PartitionTree -> Bulk reopen migration and is unnecessary now that the override is in place. Signed-off-by: evenyag <realevenyag@gmail.com> * refactor(mito2): move timestamp_array_to_i64_slice into read module Relocate the timestamp_array_to_i64_slice helper from memtable/partition_tree/data.rs to the read module so that the read path no longer depends on the partition_tree internals. All call sites (both inside and outside the partition_tree module) now import from crate::read. Signed-off-by: evenyag <realevenyag@gmail.com> * refactor(mito2): use TimeSeriesMemtableBuilder in time_partition tests The time_partition tests use the memtable builder purely as a generic backend for the TimePartitions write/scan paths; nothing in them is specific to the partition-tree memtable. Switch the seven affected tests to TimeSeriesMemtableBuilder so the tests no longer depend on PartitionTreeMemtableBuilder. Signed-off-by: evenyag <realevenyag@gmail.com> * chore(mito2): delete PartitionTreeMemtable implementation The runtime already falls back to BulkMemtable for the PartitionTree variant. Drop the now-unreachable implementation, its metrics, the partition_tree benchmarks, the metric-engine Unsupported fallback in bulk_insert.rs, and the test helpers that only existed for the deleted module. MemtableOptions::PartitionTree, its parsing, the runtime fallback, the store-api MEMTABLE_PARTITION_TREE_* constants, and the SQL fixtures remain so existing region options keep round-tripping. Signed-off-by: evenyag <realevenyag@gmail.com> * refactor(mito-codec): drop skip_partition_column parameter PartitionTreeMemtable was the only caller passing skip_partition_column=true; every other caller passes false. Now that the partition_tree module is gone, the parameter is uniformly false and the guard branch is dead. Drop the parameter from the trait method and both impls, remove the guard and the is_partition_column helper, and update the four remaining call sites in mito2 plus the bench. Signed-off-by: evenyag <realevenyag@gmail.com> * chore(mito2): remove unused MemtableConfig enum Signed-off-by: evenyag <realevenyag@gmail.com> * chore: fmt code Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: remove unused variant Signed-off-by: evenyag <realevenyag@gmail.com> * test: update test_config_api Signed-off-by: evenyag <realevenyag@gmail.com> * fix: remove unused memtable test helpers Signed-off-by: evenyag <realevenyag@gmail.com> * chore: address review comment Signed-off-by: evenyag <realevenyag@gmail.com> * fix: support bulk memtable options Signed-off-by: evenyag <realevenyag@gmail.com> * fix: sanitize config Signed-off-by: evenyag <realevenyag@gmail.com> * feat: remove partition tree options from region options Move primary_key_encoding to the top level Signed-off-by: evenyag <realevenyag@gmail.com> * test: make ssts test datetime replaced text stable Signed-off-by: evenyag <realevenyag@gmail.com> * test: update sqlness result Signed-off-by: evenyag <realevenyag@gmail.com> * chore: validate_enum_options consider bulk memtable Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: pass region id when parsing region options Replace the `TryFrom<&HashMap>` impl for `RegionOptions` with `try_from_options(region_id, options_map)` so the legacy partition_tree fallback can log the affected region. The fallback now also overrides the SST format to flat in addition to clearing the memtable type. Signed-off-by: evenyag <realevenyag@gmail.com> * fix: align sst_format with bulk memtable on parse and open Signed-off-by: evenyag <realevenyag@gmail.com> --------- Signed-off-by: evenyag <realevenyag@gmail.com>	2026-05-15 11:49:27 +00:00
shuiyisong	dd420e33fe	fix: add standalone flag in standalone tests (#8100 ) * fix: test Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: fmt and cargo.lock Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-05-12 10:43:57 +00:00
Han	b5997c6797	test: cover standalone user provider config (#8067 ) * test: cover standalone user provider config Signed-off-by: Detachm <42765252+Detachm@users.noreply.github.com> * test: cover config-driven http auth Signed-off-by: Detachm <42765252+Detachm@users.noreply.github.com> --------- Signed-off-by: Detachm <42765252+Detachm@users.noreply.github.com>	2026-05-08 08:56:22 +00:00
Lei, HUANG	42aa58aa27	feat: support env vars in heartbeat (#8064 ) * feat: support reporting env vars in heartbeat messages to metasrv Add `heartbeat_env_vars` config option for datanode and frontend. When configured, the specified environment variable values are read at startup and sent to metasrv in every heartbeat via the `extensions` map. Metasrv extracts and stores them in `NodeInfo` for use in routing decisions (e.g. AZ-aware region placement). - Add `EnvVars` helper in `common/meta/src/datanode.rs` following the existing `GcStat` extension pattern with `into_extensions`/`from_extensions` - Add `env_vars: HashMap<String, String>` field to `NodeInfo` in `common/meta/src/cluster.rs` with `#[serde(default)]` for backward compat - Add `heartbeat_env_vars: Vec<String>` config field to `DatanodeOptions`, `FrontendOptions`, and `StandaloneOptions` - Inject env vars into heartbeat `extensions` in both datanode and frontend heartbeat tasks (`datanode/src/heartbeat.rs`, `frontend/src/heartbeat.rs`) - Extract env vars from `req.extensions` in all three metasrv `CollectXxxClusterInfoHandler`s - Update `NodeInfo` construction sites in `meta-client`, `discovery/lease.rs`, and `standalone/information_extension.rs` - Update expected TOML output in `tests-integration/tests/http.rs` - Add unit tests for `EnvVars` round-trip and `NodeInfo` backward compat Signed-off-by: Lei, HUANG <leih@nvidia.com> Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: address heartbeat env review feedback Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: log error on deserialization failure Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: send heartbeat env vars once Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: resend heartbeat env vars after reconnect Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * revert: keep env vars in every heartbeat Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <leih@nvidia.com> Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-05-08 07:37:53 +00:00
Lei, HUANG	796aae3d9f	feat(operator): allow last_row merge mode with append mode (#8065 ) * feat(operator): allow last_row merge_mode when append_mode is enabled - Update RegionOptions::validate to allow last_row merge_mode with append_mode. - Update fill_table_options_for_create to automatically set merge_mode to last_row when append_mode is enabled for LastNonNull table type. - Add unit tests in mito2 and operator to verify options validation and table creation. - Add integration test for InfluxDB write with append mode hint. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix(operator): simplify append mode options Group `LastNonNull` auto-create options in a single append-mode branch. Files: - `src/operator/src/insert.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: sqlness Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-05-07 07:21:37 +00:00
QuakeWang	b8cec203ad	feat: allow detailed index config in pipeline (#8036 ) * feat: allow detailed index config in pipeline Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * refactor: simplify transform index option lowering Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * test: cover pipeline index options creation Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> --------- Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>	2026-04-28 08:43:59 +00:00
Weny Xu	0c942fc23a	refactor: unify frontend discovery with active peer discovery (#8031 ) * feat: add MetaClient peer discovery support Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: reuse peer discovery for flow frontend discovery Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: use active frontend peers in process manager Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions from CR Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: add comments Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions from CR Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-04-27 06:40:37 +00:00
Yvan Wang	793545d8e6	fix(server): describe EXPLAIN statements so bind parameters work (#8035 ) * fix(server): describe EXPLAIN statements so bind parameters work `do_describe_inner` only planned `Insert`/`Query`/`Delete`, so `EXPLAIN` and `EXPLAIN ANALYZE` fell through to the non-plan branch and had no parameter-type inference. At Bind time the Postgres handler then reported `unsupported_parameter_type` even though the inner query would have worked on its own. Recurse one level into `Statement::Explain` so that an EXPLAIN wrapping a plannable statement goes through the same describe path. Adds a tokio-postgres integration test that exercises `EXPLAIN`/`EXPLAIN ANALYZE` over the extended query protocol. Fixes #8029 Signed-off-by: BootstrapperSBL <yvanwww@gmail.com> * refactor(server): extract plannable-inner check into closure Reduce duplication between the direct match and the EXPLAIN inner match by factoring out is_inner_plannable. Behaviour unchanged. Signed-off-by: BootstrapperSBL <yvanwww@gmail.com> --------- Signed-off-by: BootstrapperSBL <yvanwww@gmail.com>	2026-04-26 14:01:35 +00:00
Ning Sun	80c395ee23	refactor: update SqlPlan for more cleaner variants (#7966 ) * refactor: update SqlPlan for more cleaner variants * refactor: change how we check readonly plan * fix: don't return schema for non-query statement * chore: reflect review comments * fix: federated statements	2026-04-21 11:50:47 +00:00
LFC	43225a8eee	feat: introducing "JSON2" type (#7965 ) Signed-off-by: luofucong <luofc@foxmail.com>	2026-04-15 03:38:01 +00:00
LFC	a870b53f68	fix: mysql prepare correctly returns error instead of panic (#7963 ) feat: mysql writer support multiple statement execution Signed-off-by: luofucong <luofc@foxmail.com>	2026-04-15 01:59:16 +00:00
Ning Sun	32a2990802	feat: allow customizing trace table partitions (#7944 ) * feat: allow customizing trace table partitions * feat: add hint * feat: return error on invalid partition number * feat: add full failure/partial success/full success * chore: format * fix: address review suggestion	2026-04-13 14:10:55 +00:00
Ning Sun	59021ce83b	fix: using uint64 datatype for postgres prepared statement parameters (#7942 ) * feat: add support for decimal parameter type, remove string replacement fallback * chore: format * fix: add support for using unsigned bigint in postgres * chore: format toml * refactor: cleanup duplicated code * fix: rescale decimal	2026-04-10 07:56:33 +00:00
Yingwen	233e35c0c9	feat!: switch default sst format to flat (#7909 ) * feat: support alter from primary_key to flat Signed-off-by: evenyag <realevenyag@gmail.com> * chore: alter flat to primary_key Signed-off-by: evenyag <realevenyag@gmail.com> * feat: change default_experimental_flat_format to true Signed-off-by: evenyag <realevenyag@gmail.com> * feat: compute channel size from splitted batch size Signed-off-by: evenyag <realevenyag@gmail.com> * test: add tests for split and channel size Signed-off-by: evenyag <realevenyag@gmail.com> * fix: always set sst_format from manifest on region open sanitize_region_options did not set options.sst_format when the default (PrimaryKey) matched the manifest value, leaving it as None after reopen. This caused the alter format change to appear lost. Signed-off-by: evenyag <realevenyag@gmail.com> * test: fix tests Signed-off-by: evenyag <realevenyag@gmail.com> * test: show create table after alteration Signed-off-by: evenyag <realevenyag@gmail.com> * refactor!: rename default_experimental_flat_format to default_flat_format The flat format is no longer experimental. Remove "experimental" from the config field name, doc comments, and all references. Signed-off-by: evenyag <realevenyag@gmail.com> * chore: fix clippy Signed-off-by: evenyag <realevenyag@gmail.com> --------- Signed-off-by: evenyag <realevenyag@gmail.com>	2026-04-03 04:14:02 +00:00
shuiyisong	8d495909d3	feat: auto alter table during trace ingestion from int to float (#7871 ) * feat: impl alter table Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: minor refactor Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: address issues Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: address issues Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-04-02 10:01:18 +00:00
shuiyisong	d9736407f2	fix: return empty when promql gets non-exist label name (#7899 ) * fix: return empty when promql gets non-exist label name Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: fmt Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: minor refactor Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: typo Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-04-02 03:20:45 +00:00
shuiyisong	3f3407fa24	feat: partial success in trace ingestion (#7892 ) * feat: impl partial success Signed-off-by: shuiyisong <xixing.sys@gmail.com> * refactor: grouping by resource and scope Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: remove unused code Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: rebase main & fix clippy Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add trace ingestion failure counter Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: address comments Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: update status list and remove TODO Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: address comments Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: fmt Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add more tests Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: fmt Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-04-01 12:14:53 +00:00
Ning Sun	e14404c677	chore: update rust toolchain to 2026-03-21 (#7849 ) * chore: update rust toolchain to 2026-03-21 * chore: new format * fix: lint * chore: resolve lint issues * chore: remove as_millis_f64 * chore: deps up	2026-03-30 12:13:14 +00:00
shuiyisong	a8fe6b5e44	fix: allow auto type upscale conversion in trace ingestion (#7870 ) * fix: allow auto type upscale conversion in trace ingestion Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: immediate return when parse fails Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: typos Signed-off-by: shuiyisong <xixing.sys@gmail.com> * test: add integration test and fix Signed-off-by: shuiyisong <xixing.sys@gmail.com> * feat: add Int/Float/Bool to String conversion Signed-off-by: shuiyisong <xixing.sys@gmail.com> * refactor: coerce rows together Signed-off-by: shuiyisong <xixing.sys@gmail.com> * refactor: extract coerce Signed-off-by: shuiyisong <xixing.sys@gmail.com> * refactor: save clone Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add comments Signed-off-by: shuiyisong <xixing.sys@gmail.com> * refactor: unify in- and cross-batch check Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add comments Signed-off-by: shuiyisong <xixing.sys@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com>	2026-03-30 08:30:32 +00:00
Lei, HUANG	b57dfc18dc	feat: pending rows batching for metrics (#7831 ) * feat: metric batch 2s PoC Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: max_concurrent_flushes Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: work channel size Signed-off-by: jeremyhi <fengjiachun@gmail.com> * feat(servers): add metrics and logs for pending rows batch flush Add the `FLUSH_ELAPSED` histogram metric to track the duration of pending rows batch flushes in the Prometheus store protocol handler. This provides better observability into the performance and latency of the batcher. Also update telemetry by: - Recording elapsed time for both successful and failed flush operations. - Adding an informational log upon successful flush including row count and duration. - Including elapsed time in error logs when a flush fails. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat(servers): implement columnar batching for pending rows Refactor PendingRowsBatcher to use columnar batching for the metrics store. Incoming RowInsertRequests are now converted to RecordBatches, partitioned, and flushed via BulkInsert requests to datanodes. - Enhance MultiDimPartitionRule to handle scalar boolean predicates. - Add metrics for tracking flush failures and dropped rows. - Update dependencies to support columnar batching in servers. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat(servers): add backpressure for pending rows Implement backpressure in PendingRowsBatcher by limiting in-flight requests with a semaphore and making the submission wait for the flush result. This ensures Prometheus write requests are throttled and only return once the data has been successfully flushed to datanodes. - Add max_inflight_requests to PromStoreOptions. - Use oneshot channels to notify submitters of flush completion. - Limit concurrent requests using a new inflight_semaphore. - Update PendingRowsBatcher::submit to wait for the flush outcome. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat: add stage-level metrics for bulk ingestion Introduce histograms to track the elapsed time of various stages in the metric engine bulk insert path and the server's pending rows batcher. This provides better observability into the performance bottlenecks of the ingestion pipeline. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * - `src/metric-engine/src/engine/bulk_insert.rs`: Removed the fallback mechanism that converted record batches to rows when bulk inserts were unsupported, along with related helper functions and unused imports. - `src/operator/src/insert.rs`: Removed an unused import (`common_time::TimeToLive::Instant`). Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat(servers): columnar Prom remote write Optimize the Prometheus remote write path by allowing direct conversion from decoded Prometheus samples to Arrow RecordBatches. This bypasses intermediate row-based representations when `PendingRowsBatcher` is active and no pipeline is used, improving ingestion efficiency. - Implement `as_record_batch_groups` in `TablesBuilder` and `PromWriteRequest`. - Add `submit_prom_record_batch_groups` to `PendingRowsBatcher`. - Introduce `DecodedPromWriteRequest` in `prom_store`. - Implement row-to-RecordBatch conversion logic in `prom_row_builder`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * Revert "feat(servers): columnar Prom remote write" This reverts commit efbb63c12a3e7fcec03858ea0351efd94fec8242. * refactor(servers): improve row to RecordBatch conversion - Use `snafu::ensure` for row validation in `rows_to_record_batch`. - Add explicit type hint for `MutableVector` to improve clarity. - Reorganize and clean up imports in `pending_rows_batcher.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * perf(servers): use arrow builders for row conversion This commit optimizes the conversion from `api::v1::Rows` to `RecordBatch` by using Arrow builders directly. This avoids the overhead of `MutableVector` and `common_recordbatch`, leading to better performance in the `pending_rows_batcher`. Additionally, the `#[allow(dead_code)]` attribute is removed from `modify_batch_sparse` in the metric engine as it is now utilized. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * perf(metric-engine): optimize batch modification Optimize `modify_batch_sparse` by reusing buffers, using Arrow builders, and employing fast-path encoding methods. This reduces allocations and avoids redundant downcasting and serializer overhead. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/metric-engine-support-bulk: Add Environment Variable for Batch Sync Control - `pending_rows_batcher.rs`: Introduced an environment variable `PENDING_ROWS_BATCH_SYNC` to control the synchronization behavior of batch processing. If set to true, the function will wait for the flush result; otherwise, it will return immediatel with the total rows count. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * wip Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: update and fix clippy Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: failing test Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * picking-pending-rows-batcher: ### Commit Message Remove Unused Code and Simplify Error Handling - `src/error.rs`: Removed the `BatcherQueueFull` error variant and its associated logic, simplifying the error handling by removing unused code. - `src/http/prom_store.rs`: Eliminated the `try_decompress` function, streamlining the decompression logic by directly using `snappy_decompress` in `decode_remote_read_request`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: parse PENDING_ROWS_BATCH_SYNC once Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: revert unrelated changes Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * Refactor Prometheus Write Handling - `prom_store.rs`: Introduced `pre_write` method in `PromStoreProtocolHandler` to handle pre-write checks for Prometheus remote write requests. Updated `write` method to utilize `pre_write`. - `server.rs`: Modified `PendingRowsBatcher` initialization to conditionally create a batcher based on `with_metric_engine` flag. - `http/prom_store.rs`: Integrated `pre_write` checks before submitting requests to `PendingRowsBatcher`. - `query_handler.rs`: Added `pre_write` method to `PromStoreProtocolHandler` trait for pre-write operations. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * picking-pending-rows-batcher: - Fix Label Typo: Corrected a typo in the label value from `"flush_wn ite_region"` to `"flush_write_region"` in `pending_rows_batcher.rs`. - Refactor Array Building Logic: Introduced a macro `build_array!` to streamline the construction of `ArrayRef` for different data types, reducing code duplication in `pending_rows_batcher.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * format toml Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * picking-pending-rows-batcher: ### Update PromStore and PendingRowsBatcher Configuration - `prom_store.rs`: Set `pending_rows_flush_interval` to `Duration::ZERO` to disable automatic flushing. - `pending_rows_batcher.rs`: Enhance validation to disable the batcher when `flush_interval` is zero or configuration values like `max_batch_rows`, `max_concurrent_flushes`, `worker_channel_capacity`, or `max_inflight_requests` are zero, preventing potential panics or deadlocks. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * picking-pending-rows-batcher: ### Update `pending_rows_flush_interval` to Zero - Files Modified: - `src/frontend/src/service_config/prom_store.rs` - `tests-integration/tests/http.rs` - Key Changes: - Updated `pending_rows_flush_interval` from `Duration::from_secs(2)` to `Duration::ZERO` in `prom_store.rs`. - Changed `pending_rows_flush_interval` configuration from `"2s"` to `"0s"` in `http.rs`. These changes set the flush interval to zero, potentially affecting how frequently pending rows are flushed. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * picking-pending-rows-batcher: Add Worker Management Enhancements - `metrics.rs`: Introduced `PENDING_WORKERS` gauge to track active pending rows batch workers. - `pending_rows_batcher.rs`: - Added worker idle timeout logic with `WORKER_IDLE_TIMEOUT_MULTIPLIER`. - Implemented worker management functions: `spawn_worker`, `remove_worker_if_same_channel`, and `should_close_worker_on_idle_timeout`. - Enhanced worker lifecycle management to handle idle workers and ensure proper cleanup. - Tests: Added unit tests for worker removal and idle timeout logic. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: clippy Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: jeremyhi <fengjiachun@gmail.com> Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> Co-authored-by: jeremyhi <fengjiachun@gmail.com>	2026-03-27 02:19:00 +00:00
jeremyhi	8058ce7cf2	refactor: simplify scan memory tracking (#7827 ) * refactor: simplify scan memory tracking Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: make confg-docs Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: by codex review comment Signed-off-by: jeremyhi <fengjiachun@gmail.com> * feat: track_with_policy Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: minor change Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: mem granularity mb to kb Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: by review comment Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: by scan_memory_on_exhausted comment Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix: by review comment Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: typo Signed-off-by: jeremyhi <fengjiachun@gmail.com> --------- Signed-off-by: jeremyhi <fengjiachun@gmail.com>	2026-03-26 03:25:50 +00:00
Yingwen	0e22d6a72b	feat: implement partition range cache stream (#7842 ) * feat: add cache stream helpers, key construction, config wiring, and metrics for partition range cache Add range result cache size config field and wire it through cache builder chains. Implement cache key building (build_range_cache_key), stream replay/store helpers (cached_flat_range_stream, cache_flat_range_stream), dictionary compaction (compact_pk_dictionary), and partition range row group collection. Add range cache metrics (size, hit, miss) to ScanMetricsSet and PartitionMetrics. Move fingerprint tests from scan_region to range_cache module. These functions are not yet wired into scan execution. Signed-off-by: evenyag <realevenyag@gmail.com> * feat: add benchmark for cache stream Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: move bench_util to test_util Signed-off-by: evenyag <realevenyag@gmail.com> * feat: share dict Signed-off-by: evenyag <realevenyag@gmail.com> * test: test ptr_eq Signed-off-by: evenyag <realevenyag@gmail.com> * chore: fmt code Signed-off-by: evenyag <realevenyag@gmail.com> * refactor: simplify value array handling Signed-off-by: evenyag <realevenyag@gmail.com> * chore: add todo for estimate size Signed-off-by: evenyag <realevenyag@gmail.com> * feat: simplify size calculation Signed-off-by: evenyag <realevenyag@gmail.com> * chore: remove one test Signed-off-by: evenyag <realevenyag@gmail.com> * test: update config test Signed-off-by: evenyag <realevenyag@gmail.com> * chore: address review comment Only ignore exprs that can extract time ranges Signed-off-by: evenyag <realevenyag@gmail.com> * test: fix tests Signed-off-by: evenyag <realevenyag@gmail.com> --------- Signed-off-by: evenyag <realevenyag@gmail.com>	2026-03-24 10:01:13 +00:00
dennis zhuang	223f6cfdf7	feat: supports sst_format for x-greptime-hints and database options (#7843 ) Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-03-24 02:05:16 +00:00
Ning Sun	3cdf03d830	feat: introduce APIs for storing perses dashboard definition (#7791 ) * feat: introduce APIs for storing perses dashboard definition * test: ensure we can update dashboard * refactor: construct dashboard defnition directly * refactor: don't create table on list requests	2026-03-13 03:40:04 +00:00
Ruihang Xia	9e95214fc8	feat: replace shadow-rs with self-maintained version info (#7782 ) * reimplement shadow-rs Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix: remove timestamp build metadata * fix: refresh version build metadata * use git2 Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * warn about git failure Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2026-03-10 21:08:26 +00:00
discord9	56ee8baa3f	feat: admin gc table/regions (#7619 ) * feat: gc table Signed-off-by: discord9 <discord9@163.com> * test: admin gc Signed-off-by: discord9 <discord9@163.com> * chore: after rebase fix Signed-off-by: discord9 <discord9@163.com> * refactor: GcStats Signed-off-by: discord9 <discord9@163.com> * refactor: use gc ticker for admin gc Signed-off-by: discord9 <discord9@163.com> * fix: region routes override Signed-off-by: discord9 <discord9@163.com> * test: non happy path Signed-off-by: discord9 <discord9@163.com> * refactor: gc job report enum Signed-off-by: discord9 <discord9@163.com> * test: process 0 regions Signed-off-by: discord9 <discord9@163.com> * after rebase Signed-off-by: discord9 <discord9@163.com> * feat: allow manual gc to return error Signed-off-by: discord9 <discord9@163.com> * chore: update proto Signed-off-by: discord9 <discord9@163.com> * per review Signed-off-by: discord9 <discord9@163.com> * chore: timeout and update proto Signed-off-by: discord9 <discord9@163.com> * chore: udpate proto Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-03-06 08:25:44 +00:00
Weny Xu	0ed3b83099	refactor: rename partition rule version to partition expr version (#7696 ) * refactor: rename partition rule version to partition expr version Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: update proto Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: clippy Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-02-10 10:12:47 +00:00
Weny Xu	8026b23834	feat: partition rule version validation for writes and staging (#7628 ) * feat: verify partition rule Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: add partition version cache Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: header check Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: fmt toml Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: minor refactor Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: header Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: fix clippy Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: fix unit tests Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: minor refactor Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: nit Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: nit Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-02-06 12:16:34 +00:00

1 2 3 4 5 ...

512 Commits