Commit Graph

5554 Commits

Author SHA1 Message Date
LFC
d76ecf67bd refactor: skip prefilter for expr with functions (#8280)
Signed-off-by: luofucong <luofc@foxmail.com>
2026-06-11 09:48:12 +00:00
Yingwen
340f22f301 chore: load page index in opener (#8269)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-11 09:09:18 +00:00
Lanqing Yang
565899e482 perf(mito): cached-size single-pass WAL entry encoder (#8254)
prost has no size caching, so WalEntry::encode_to_vec recomputes the
encoded_len of every nested message for each length delimiter. In the deep
WAL tree (WalEntry/Mutation/Rows/Row/Value) a leaf Value's length is
recomputed once per ancestor level (~5x).

Add WalEntryEncoder: one size pass caches every message body length into a
flat vector (pre-order), one encode pass writes bytes reading the cached
lengths back via a cursor, so each length is computed exactly once. Leaf
messages (Value, ColumnSchema, WriteHint, BulkWalEntry) delegate their body
encoding to prost but have their single computed length cached.

Output is byte-for-byte identical to encode_to_vec (asserted in tests
covering delete op_type, null values, empty entry, encoder reuse), which is
required for WAL replay compatibility. Wired into WalWriter::add_entry,
reused per write batch.

This compensates for prost lacking a cached_size mechanism and can be
removed if such caching lands upstream.
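
The two-pass scheme described above can be sketched on a toy message tree. This is a minimal illustration, not the actual `WalEntryEncoder`: the `Node` type, the function names, and the tag-free wire format (a varint length followed by the body) are all assumptions made for the example. `encode_naive` recomputes a child's length at every ancestor level, while `encode_cached` computes each length once in a pre-order size pass and reads it back through a cursor.

```rust
// Hypothetical toy tree standing in for WalEntry/Mutation/Rows/Row/Value.
enum Node {
    Leaf(Vec<u8>),
    Branch(Vec<Node>),
}

// Number of bytes a base-128 varint needs for `v`.
fn varint_len(mut v: usize) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

// Append `v` as a base-128 varint.
fn put_varint(mut v: usize, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

// Naive body length: walks the whole subtree on every call.
fn body_len(node: &Node) -> usize {
    match node {
        Node::Leaf(bytes) => bytes.len(),
        Node::Branch(children) => {
            let mut total = 0;
            for child in children {
                let len = body_len(child);
                total += varint_len(len) + len;
            }
            total
        }
    }
}

// Naive encoder: each length delimiter triggers a fresh subtree walk.
fn encode_naive(node: &Node, out: &mut Vec<u8>) {
    match node {
        Node::Leaf(bytes) => out.extend_from_slice(bytes),
        Node::Branch(children) => {
            for child in children {
                put_varint(body_len(child), out);
                encode_naive(child, out);
            }
        }
    }
}

// Size pass: stores every node's body length in pre-order, each computed once.
fn size_pass(node: &Node, sizes: &mut Vec<usize>) -> usize {
    let slot = sizes.len();
    sizes.push(0); // reserve this node's pre-order slot
    let mut total = 0;
    match node {
        Node::Leaf(bytes) => total = bytes.len(),
        Node::Branch(children) => {
            for child in children {
                let len = size_pass(child, sizes);
                total += varint_len(len) + len;
            }
        }
    }
    sizes[slot] = total;
    total
}

// Encode pass: reads cached lengths back in the same pre-order via a cursor.
fn encode_pass(node: &Node, sizes: &[usize], cursor: &mut usize, out: &mut Vec<u8>) {
    *cursor += 1; // skip this node's own slot; the parent already wrote it
    match node {
        Node::Leaf(bytes) => out.extend_from_slice(bytes),
        Node::Branch(children) => {
            for child in children {
                put_varint(sizes[*cursor], out); // the child's slot comes next
                encode_pass(child, sizes, cursor, out);
            }
        }
    }
}

fn encode_cached(node: &Node) -> Vec<u8> {
    let mut sizes = Vec::new();
    let total = size_pass(node, &mut sizes);
    let mut out = Vec::with_capacity(total);
    let mut cursor = 0;
    encode_pass(node, &sizes, &mut cursor, &mut out);
    out
}

fn main() {
    let tree = Node::Branch(vec![
        Node::Leaf(vec![1, 2, 3]),
        Node::Branch(vec![Node::Leaf(vec![]), Node::Leaf(vec![9; 200])]),
    ]);
    let mut naive = Vec::new();
    encode_naive(&tree, &mut naive);
    // Both encoders must agree byte for byte.
    assert_eq!(encode_cached(&tree), naive);
    println!("encoded {} bytes", naive.len());
}
```

In this sketch the size vector is rebuilt on every call; an encoder reused per write batch, as the commit describes, would presumably clear and reuse that buffer instead.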

Signed-off-by: lyang24 <lanqingy93@gmail.com>
2026-06-11 08:50:36 +00:00
LFC
92a915b5a0 feat: json2 field access pushdown to parquet (#8267)
* feat: json2 field access pushdown to parquet

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-06-11 02:10:30 +00:00
dennis zhuang
ca21c9f048 fix!: correct information_schema index metadata (#8275)
fix: correct information_schema index metadata

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-06-11 01:36:23 +00:00
discord9
8fc5a3b1c7 feat: apply remote dynamic filters on datanode scans (#8262)
* feat: apply rdf

Signed-off-by: discord9 <discord9@163.com>

* chore: clippy

Signed-off-by: discord9 <discord9@163.com>

* fix: drop remote dyn filter fallback exec

Signed-off-by: discord9 <discord9@163.com>

* Revert "fix: drop remote dyn filter fallback exec"

This reverts commit bb757a596c.

Signed-off-by: discord9 <discord9@163.com>

* refactor: use rdf receiver logical plan instead

Signed-off-by: discord9 <discord9@163.com>

* test: update sqlness

Signed-off-by: discord9 <discord9@163.com>

* feat: rdf disable option

Signed-off-by: discord9 <discord9@163.com>

* tests: large int tests

Signed-off-by: discord9 <discord9@163.com>

* chore: clippy

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

* test: update prec fix

Signed-off-by: discord9 <discord9@163.com>

* fix: make receiver node work

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

* fix: tql disable rdf

Signed-off-by: discord9 <discord9@163.com>

* chore: rm useless joins

Signed-off-by: discord9 <discord9@163.com>

* fix: also disable in flow tql

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review rm to promql

Signed-off-by: discord9 <discord9@163.com>

* chore: promql ut

Signed-off-by: discord9 <discord9@163.com>

* per review

Signed-off-by: discord9 <discord9@163.com>

* test: rm misleading&add some nested/cleanup

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-10 12:57:23 +00:00
jeremyhi
0a2dc6d3f2 feat(cli): add export import progress abstraction (#8270)
Signed-off-by: fengjiachun <3860496+fengjiachun@users.noreply.github.com>
Co-authored-by: fengjiachun <3860496+fengjiachun@users.noreply.github.com>
2026-06-10 11:35:32 +00:00
Weny Xu
10074f04a6 fix(ci): retry repartition chaos row validation (#8271)
fix: retry repartition chaos row validation

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-10 11:13:37 +00:00
LFC
270dce5ed7 feat: decouple region edit and compaction (#8272)
* feat: decouple region edit and compaction

Signed-off-by: luofucong <luofc@foxmail.com>

* make `schedule_compaction_after_edit` default to `true`

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-06-10 09:56:11 +00:00
LFC
962990009c ci: notify jsonbench result (#8273)
Signed-off-by: luofucong <luofc@foxmail.com>
2026-06-10 08:51:37 +00:00
Weny Xu
05c4588f90 feat: support remote WAL logical pruning (#8259)
* feat: support logical deletion for remote WAL pruning

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: trigger remote WAL flush for lagging prunable regions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: use from_mins

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: fix unit tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-10 04:40:51 +00:00
jeremyhi
6ceae4498f refactor(cli): stream export-v2 verify data listing (#8268)
Signed-off-by: fengjiachun <fengjiachun@gmail.com>
2026-06-10 02:59:27 +00:00
Lei, HUANG
e74a73638d feat: separate datanode query and ingestion runtimes (#8246)
* feat: add datanode runtime options

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: add datanode runtime handles

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: wire datanode runtimes into region server

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: route datanode ingestion to ingestion runtime

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: add datanode query runtime stream bridge

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: route datanode reads to query runtime

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: add datanode global runtimes

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: use common datanode runtimes

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: run mito scan tasks on query runtime

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: split datanode runtime options

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: clippy

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: share global fallback for datanode runtimes

Use the global runtime as the fallback for datanode query and ingestion
runtimes when datanode-specific pools are not initialized. This avoids
creating unused datanode worker pools in non-datanode services.

Files:
- `src/common/runtime/src/global.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: docs

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: forward query runtime stream metrics

Forward inner stream metrics through the datanode query runtime bridge so
`EXPLAIN ANALYZE` can report plan metrics after stream polling moves to the
query runtime.

Files:
- `src/datanode/src/query_stream.rs`
- `src/datanode/src/region_server.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: route metric batch puts to ingest runtime

Run the optimized metric batch put path on the datanode ingest runtime so
metric ingestion does not bypass runtime isolation.

Files:
- `src/datanode/src/region_server.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: abort query producer on stream drop

Abort the datanode query runtime producer when the returned read stream is
dropped so cancelled clients do not leave query work running in the
background.

Files:
- `src/datanode/src/query_stream.rs`
- `src/datanode/src/region_server.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: simplify query stream bridge setup

Create the inner read stream before spawning the datanode query runtime
producer so setup does not use an extra task and initialization channel.

Files:
- `src/datanode/src/region_server.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/runtime-priority:
 ### Update Datanode Runtime Options and Region Server Logic

 - **`global.rs`**: Adjusted `datanode_ingest_rt_size` to utilize all available CPUs for improved performance.
 - **`region_server.rs`**: Simplified the collection of `put_requests` and optimized the `put_regions_batch` call for better efficiency.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/runtime-priority:
 ### Remove Redundant Checks and Simplify Code

 - **`global.rs`**: Removed the assertion check for already initialized global runtimes to streamline the initialization process.
 - **`region_server.rs`**: Simplified the extraction of `Put` requests by removing unnecessary cloning and restructuring the iterator logic.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: remove redundant spawn_datanode_query in RegionServer::handle_read

The outer `spawn_datanode_query` wrapped `handle_read_inner` on the
same runtime, creating a nested spawn that consumed query runtime
threads unnecessarily under concurrent read load. The gRPC handler
already provides runtime isolation, so the inner call is sufficient.

- `src/datanode/src/region_server.rs` — inline `handle_read_inner`
  directly instead of spawning onto the datanode query runtime

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: resolve test mismatch and redundant spawn in handle_remote_read

- `src/common/runtime/src/global.rs` — update test assertion to match
  default `datanode_ingest_rt_size` of `cpus` instead of `1`
- `src/datanode/src/region_server.rs` — inline `handle_remote_read_inner`
  directly instead of spawning onto the datanode query runtime

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: rename datanode runtimes

Summary:
- Rename datanode runtime APIs from `datanode_query` and `datanode_ingest` to `query` and `ingest`.
- Rename runtime config keys from `datanode_query_rt_size` and `datanode_ingest_rt_size` to `query_rt_size` and `ingest_rt_size`.
- Update config docs, example config, and config-loading coverage.

Files:
- `src/common/runtime/src/global.rs`
- `src/common/runtime/src/lib.rs`
- `src/cmd/tests/load_config_test.rs`
- `src/datanode/src/region_server.rs`
- `src/mito2/src/read/pruner.rs`
- `src/mito2/src/read/range_cache.rs`
- `src/mito2/src/read/scan_region.rs`
- `src/mito2/src/read/series_scan.rs`
- `config/datanode.example.toml`
- `config/config.md`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: consolidate runtime options

Summary:
- Embed datanode runtime sizes in shared `RuntimeOptions` and remove the extra `GreptimeOptions` runtime type parameter.
- Use the unified `RuntimeOptions` for datanode global and datanode-specific runtime initialization.
- Update datanode runtime config coverage and ingest runtime default documentation.

Files:
- `src/common/runtime/src/global.rs`
- `src/common/runtime/src/lib.rs`
- `src/cmd/src/options.rs`
- `src/cmd/src/datanode.rs`
- `src/cmd/src/datanode/builder.rs`
- `src/cmd/tests/load_config_test.rs`
- `config/datanode.example.toml`
- `config/config.md`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat: guard against double initialization of datanode runtimes

Add an assertion in `init_datanode_runtimes` to panic when global runtimes
are already initialized, preventing silent overwrites.

- `src/common/runtime/src/global.rs` — assert guard in `init_datanode_runtimes`
  and test `test_set_datanode_runtimes_panics_after_global_runtimes_initialized`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-06-09 08:21:50 +00:00
Yingwen
1451a51ea0 feat: add tools for development (#8260)
* feat(cmd): add dev-tools binary with parquetbench command

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: add feature gate dev-tools

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat(dev-tools): print output column count in parquetbench

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat(dev-tools): print output schema in parquetbench

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(cmd): move parquetbench into datanode CLI, drop dev-tools binary

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix(dev-tools): validate pprof-after-warmup requires >=2 iterations

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix(dev-tools): use absolute path instead of super:: import in parquetbench

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix(dev-tools): base parquetbench throughput on scanned row-group bytes

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-09 08:12:21 +00:00
shuiyisong
fc653c84c8 refactor: loki ingestion (#8266)
* refactor: loki ingestion

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: clippy

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-09 07:53:45 +00:00
jeremyhi
1e53b1a157 fix(config): align scan memory limit default with code (#8228)
* fix(config): align scan memory limit default with code

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: by AI comments

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-08 14:01:48 +00:00
discord9
1c7b4c11fc fix(flow): support complex SQL full-query flows (#8242)
* fix(flow): skip TWE detection for complex plans

Signed-off-by: discord9 <discord9@163.com>

* chore: helper

Signed-off-by: discord9 <discord9@163.com>

* fix(flow): allow SQL CTE flow parsing

Signed-off-by: discord9 <discord9@163.com>

* fix(flow): execute unscoped SQL snapshots as SQL

Signed-off-by: discord9 <discord9@163.com>

* fix: column list

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-08 13:07:43 +00:00
dennis zhuang
e403133eb2 feat: add information_schema statistics table (#8253)
* feat: add information_schema statistics table

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: use index-local sequence in statistics

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: ordinal_position for pk

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: statistics.nullable uses empty string for non-nullable columns

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-06-08 12:52:02 +00:00
Copilot
38909d6ddf chore: Increase GreptimeDB cluster setup timeout for distributed fuzz CI (#8263)
Increase GreptimeDB cluster setup timeout

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-06-08 11:30:14 +00:00
Weny Xu
9a814ca123 chore: add repartition debug logs (#8245)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-08 08:48:23 +00:00
discord9
651cf04525 feat: fan out remote dynamic filter updates from FE (#8241)
* feat: fe send update

Signed-off-by: discord9 <discord9@163.com>

* refactor: after rebase

Signed-off-by: discord9 <discord9@163.com>

* fix: encode at least send bounds

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-08 08:24:57 +00:00
shuiyisong
5fd5b91b29 chore: remove auth in flownode (#8244)
* chore: remove auth in flownode

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: update docs

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: add flow startup check

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: clippy

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-08 08:15:19 +00:00
Ning Sun
0881b7ba32 feat: pgwire 0.40 (#8257)
2026-06-08 07:31:08 +00:00
Ning Sun
fd64ced4da feat: introduce plugin setup functions with richer context (#8256)
feat: enrich plugin setup context
2026-06-08 06:53:08 +00:00
Lei, HUANG
e7ce3ac0c7 fix: accept JSONB OTLP logs against existing schema (#8250)
* fix: accept JSONB OTLP logs against existing schema

Handle existing OTLP log JSONB columns that round-trip as Json while direct log rows carry pre-encoded binary JSONB payloads.

Add integration coverage for repeated OTLP log batches after table auto-creation.

Files: src/servers/src/otlp/logs.rs, tests-integration/tests/http.rs
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf: cache OTLP log JSONB schema check

Cache whether an existing OTLP log column is legacy JSONB once per column before coercing rows.

Files: src/servers/src/otlp/logs.rs
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: narrow down schema coercion

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-06-08 01:57:23 +00:00
Weny Xu
7baa4f6854 fix: reset region detector after migration (#8234)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-05 09:14:13 +00:00
Ruihang Xia
428cd8137d feat: identify noneffective binary modifiers (#8230)
* perf(promql): use tsid for full-label modifier joins

* fix(promql): collect narrow range joins by tsid

* verify on for both sides

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* check if tsid exists

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* correct behavior

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-06-05 08:44:50 +00:00
Yingwen
8e5c34aa75 ci: revert arm64 runner default to 4xlarge (#8243)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-05 08:31:13 +00:00
Yingwen
f580c42a1f perf: read primary key as binary if it overflows the dictionary (#8187)
* perf(mito2): read __primary_key as binary when chunk exceeds dict page limit

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix(mito2): handle binary __primary_key column in prefilter

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix(mito2): align prefilter binary PK test with eval_pk_group_mask

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address review comments

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-05 08:19:35 +00:00
dennis zhuang
d6c37778ae feat: table semantic layer information_schema view (Phase 3) (#8240)
* feat: table semantic layer information_schema view (Phase 3)

Add `information_schema.table_semantics`, a queryable view over the table
semantic layer. One row per table that carries at least one
`greptime.semantic.*` option: the signal-agnostic keys
(signal_type/source/pipeline/metadata_quality) are promoted to columns and
the remaining signal-specific keys are folded into a `semantic_options`
JSON string. Tables with no semantic key are excluded.
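
The row-building rule described above might look roughly like the following. This is a sketch under stated assumptions: the full option key spelling (`greptime.semantic.` followed by the key name), the `SemanticRow` struct, and the `semantic_row` function are hypothetical, and the final JSON serialization of the remaining keys is left out so the example needs only the standard library.

```rust
use std::collections::BTreeMap;

// Assumed key prefix and promoted key names, taken from the description above.
const PREFIX: &str = "greptime.semantic.";
const PROMOTED: [&str; 4] = ["signal_type", "source", "pipeline", "metadata_quality"];

// Hypothetical row shape: four promoted columns plus the leftover keys.
#[derive(Debug, Default, PartialEq)]
struct SemanticRow {
    promoted: [Option<String>; 4],
    // The real view folds these into a JSON string; kept as a map here.
    rest: BTreeMap<String, String>,
}

// Returns None for tables that carry no semantic key, so they are excluded.
fn semantic_row(options: &BTreeMap<String, String>) -> Option<SemanticRow> {
    let mut row = SemanticRow::default();
    let mut seen = false;
    for (key, value) in options {
        let Some(name) = key.strip_prefix(PREFIX) else {
            continue; // not a semantic option
        };
        seen = true;
        match PROMOTED.iter().position(|p| *p == name) {
            Some(i) => row.promoted[i] = Some(value.clone()),
            None => {
                row.rest.insert(name.to_string(), value.clone());
            }
        }
    }
    seen.then_some(row)
}

fn main() {
    let mut options = BTreeMap::new();
    options.insert("greptime.semantic.signal_type".to_string(), "metric".to_string());
    options.insert("greptime.semantic.metric.unit".to_string(), "ms".to_string());
    options.insert("ttl".to_string(), "7d".to_string());
    let row = semantic_row(&options).expect("table has a semantic key");
    println!("{row:?}");
}
```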

Stacked on Phase 2.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: address PR review on table_semantics

- fold JSON serialization failure into None instead of unwrap/panic
- drop per-row Vec allocation in predicate eval; use a fixed array
- align RFC view name with the shipped `table_semantics`

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: update results

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-06-05 08:16:41 +00:00
shuiyisong
d26a80855b chore: align OTLP logs with existing table schemas (#8229)
* chore: otlp logs with schema align

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: add tests

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: CR issue

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-04 12:53:09 +00:00
dennis zhuang
31c2c1f6db feat: table semantic layer per-table enrichment (Phase 2) (#8218)
* feat: table semantic layer per-table enrichment (Phase 2)

Phase 2 of the table semantic layer, plus a vocabulary trim so the layer only
records what a machine consumer cannot cheaply recover on its own.

Per-table metric enrichment (OTLP), via an internal per-table channel:
- A `SemanticIndex` accumulator records, per emitted table, the declared metric
  keys: type / unit / temporality / metadata_quality=declared / original_name.
  Conflicting single-valued keys collapse to `mixed`/`unknown`.
- Recording happens at the `encode_metrics` level where the base name, metric
  type, and proto fields are all in scope, so histogram/summary fan-out gets the
  correct per-subtable type (`_bucket`=histogram, `_sum`/`_count`=counter)
  without threading state through every encoder.
- The index is serialized onto the `greptime.internal.semantic.per_table_index`
  context extension; `apply_per_table_semantic_options` folds each table's keys
  into its options at auto-create time.
- `trace.conventions` is refined from the request's resource/scope `schema_url`s
  (concrete when uniform, else `mixed`/`unknown`).
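
As a rough illustration of the accumulator behavior listed above, the sketch below records keys per table and collapses conflicting values to `mixed`. It is not the actual `SemanticIndex`: the method names, the shortened `metric.type` and `metric.unit` keys, and the omission of the `unknown` case are assumptions made for brevity.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

// Hypothetical accumulator: table name -> (semantic key -> value).
#[derive(Default)]
struct SemanticIndex {
    tables: BTreeMap<String, BTreeMap<String, String>>,
}

impl SemanticIndex {
    // Record one key for one table; a conflicting value collapses to "mixed".
    fn record(&mut self, table: &str, key: &str, value: &str) {
        let keys = self.tables.entry(table.to_string()).or_default();
        match keys.entry(key.to_string()) {
            Entry::Vacant(slot) => {
                slot.insert(value.to_string());
            }
            Entry::Occupied(mut slot) => {
                if slot.get().as_str() != value {
                    slot.insert("mixed".to_string());
                }
            }
        }
    }

    // Histogram fan-out: each emitted subtable gets its own metric type.
    fn record_histogram(&mut self, base: &str) {
        self.record(&format!("{base}_bucket"), "metric.type", "histogram");
        self.record(&format!("{base}_sum"), "metric.type", "counter");
        self.record(&format!("{base}_count"), "metric.type", "counter");
    }

    fn get(&self, table: &str, key: &str) -> Option<&str> {
        self.tables.get(table)?.get(key).map(String::as_str)
    }
}

fn main() {
    let mut index = SemanticIndex::default();
    index.record_histogram("http_latency");
    // Two data points for the same table disagree on the unit.
    index.record("cpu", "metric.unit", "ms");
    index.record("cpu", "metric.unit", "s");
    println!("{:?}", index.get("http_latency_bucket", "metric.type"));
    println!("{:?}", index.get("cpu", "metric.unit"));
}
```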

Vocabulary trimmed to only meaningful keys. Kept: signal_type, source, pipeline,
trace.conventions, metric.{type,unit,temporality,metadata_quality,original_name}.
Dropped: metric.monotonic (a function of type), trace.has_events/has_links
(constant + derivable from columns), log.severity_scheme/body_format (constant /
derivable, and body_format cost an O(rows) scan), resource/scope lineage
(restates columns / collector-config concern), source_version (no cheap
non-constant value today). Prometheus carries type/unit in the metric name by
convention, so it gets identity only — no inferred enrichment.

Identity (signal_type + source) extended to the remaining ingest protocols so
the discovery view is complete: InfluxDB and OpenTSDB (metric), Loki and
Elasticsearch (log). These protocols carry no type/unit metadata, so identity is
all that applies.

Tests: unit coverage for the accumulator, per-metric-type fan-out, and trace
conventions; integration goldens updated for the OTLP metric/trace SHOW CREATE
output and the new Loki identity.

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: validate the option value

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-06-04 12:52:00 +00:00
discord9
6b7772e457 feat: add remote dynamic filter frontend registration (#8148)
* feat: filter id

Signed-off-by: discord9 <discord9@163.com>

* feat: dyn filter registry

Signed-off-by: discord9 <discord9@163.com>

* feat: filter id&refactor to type

Signed-off-by: discord9 <discord9@163.com>

* feat: merge scan register dyn filter(not send yet)

Signed-off-by: discord9 <discord9@163.com>

* feat: init reg dyn filter

Signed-off-by: discord9 <discord9@163.com>

* wip: remote dyn filter task 03

Signed-off-by: discord9 <discord9@163.com>

* fix: resolve remote dyn filter rebase fallout

Signed-off-by: discord9 <discord9@163.com>

* chore: keep remote dyn filter docs local

Signed-off-by: discord9 <discord9@163.com>

* chore: remove stale filter id allow

Signed-off-by: discord9 <discord9@163.com>

* chore: clippy

Signed-off-by: discord9 <discord9@163.com>

* chore: fix remote dyn filter import style

Signed-off-by: discord9 <discord9@163.com>

* chore: fix query metrics test fallout

Signed-off-by: discord9 <discord9@163.com>

* fix: exclude region from remote dyn filter id

Signed-off-by: discord9 <discord9@163.com>

* chore: import

Signed-off-by: discord9 <discord9@163.com>

* refactor: rm some to later

Signed-off-by: discord9 <discord9@163.com>

* feat: add initial dyn filter snapshot

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

* docs: better comment, rm some slop

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-04 12:31:42 +00:00
Weny Xu
42eeeaa514 fix: avoid stale metadata cache after invalidation (#8235)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-04 12:05:17 +00:00
QuakeWang
6ed67167bb feat: support headerless CSV copy from (#8233)
* feat: support headerless CSV copy from

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* fix: update csv copy sqlness result

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* test: cover headerless CSV copy from

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* test: cover headerless CSV column count mismatch

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

---------

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
2026-06-04 12:04:15 +00:00
shuiyisong
52b51c31fc refactor: streamline pipeline ingestion row building (#8236)
* refactor: minor pipeline refactor

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* refactor: use suffix as group key

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-04 09:03:20 +00:00
Ruihang Xia
edd563d2e1 feat: join simplifier for promql binary op (#8211)
* basic impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor, remove shallow methods

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more sqlness cases

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clear ctx

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* prefix leaf aliases

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-06-04 08:13:24 +00:00
Weny Xu
bb1de2a744 test: move CSV skip bad records coverage to integration (#8237)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-04 07:58:58 +00:00
Ruihang Xia
a6cb4d2083 feat: look up cache with range calculation (#8123)
* feat: look up cache with range calculation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* copy fragments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* release dashmap reference

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix bugs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove dashmap

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove keys

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-06-04 07:41:49 +00:00
Weny Xu
3bd915865a fix: improve remote WAL replay checkpoint handling (#8225)
* fix: correct topic_latest_entry_id

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: use pruned wal id for replay checkpoint

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: persist remote wal checkpoints periodically

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: cover remote wal topic latest entry id

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: merge pruned wal id for migration checkpoint

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-04 06:10:38 +00:00
Lei, HUANG
da9d314fa4 fix: align bulk insert schema handling (#8222)
* fix: align bulk insert schema handling

Carry `aligned_schema_version` through bulk insert requests so datanode can recheck stale schemas only when needed. Use the physical data region schema version for metric logical bulk inserts, while external bulk insert callers leave alignment unknown.

Related files:
- `Cargo.toml`
- `Cargo.lock`
- `src/store-api/src/region_request.rs`
- `src/metric-engine/src/engine/bulk_insert.rs`
- `src/mito2/src/request.rs`
- `src/mito2/src/worker/handle_bulk_insert.rs`
- `src/mito2/src/worker/handle_write.rs`
- `src/operator/src/bulk_insert.rs`
- `src/servers/src/pending_rows_batcher.rs`
- `src/mito2/src/engine/edit_region_test.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* test: cover bulk insert missing columns

Add Flight bulk insert coverage for batches that omit nullable columns and verify the database fills the missing values with nulls.

Related files:
- `tests-integration/src/grpc/flight.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: update proto to main branch commit

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: align physical metric bulk insert schema

Pass the physical schema version through metric bulk insert requests so
physical and logical region writes share the same schema alignment behavior.

Files:
- `src/metric-engine/src/engine/bulk_insert.rs`: updates `bulk_insert_physical_region` and reuses `physical_schema_version` for bulk insert schema alignment.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: add some doc

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-06-04 02:55:09 +00:00
discord9
d304df6e75 fix: run eval-interval flow without time window (#8231)
* fix: run eval-interval flow without time window

Signed-off-by: discord9 <discord9@163.com>

* test: cover eval-interval flow join query

Signed-off-by: discord9 <discord9@163.com>

* fix: address eval interval flow review comments

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-03 13:06:22 +00:00
QuakeWang
ca07a53deb feat: support CSV copy skip bad records (#8198)
* feat: support CSV copy headers and skip bad records

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* refactor: reuse CSV skippable error helper

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* fix: address CSV skip bad records review

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* docs: clarify CSV skip bad records performance tradeoff

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

---------

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
2026-06-03 12:11:02 +00:00
Weny Xu
31cdd864d2 fix: format detected IPv6 grpc address (#8221)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-06-03 12:05:50 +00:00
liyang
91d7dc547b ci: update arm64 runner (#8226)
* ci: update arm64 runner

Fixed: https://github.com/GreptimeTeam/greptimedb/actions/runs/26856891871/job/79202060647

* Update dev-build.yml
2026-06-03 06:42:49 +00:00
shuiyisong
11f06709cd chore: remove HttpConfigurator (#8224)
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-03 04:13:32 +00:00
jeremyhi
aaffe1a048 refactor(cli): split export-v2 verify into plan/scan/reconcile (#8206)
* refactor(cli): split export-v2 verify into plan/scan/reconcile

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: by AI comments

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-06-03 03:10:10 +00:00
shuiyisong
f3919a3117 chore: update vrl to 0.33 (#8219)
* chore: update vrl to 0.33

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: reuse thread local runtime

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2026-06-02 10:04:15 +00:00
discord9
7f76ad5439 feat: validate batching flow sink schema on create (#8176)
* feat: check schema on create

Signed-off-by: discord9 <discord9@163.com>

* chore: update sqlness

Signed-off-by: discord9 <discord9@163.com>

* fix(flow): avoid duplicate fields when matching sink schema

Signed-off-by: discord9 <discord9@163.com>

* fix: null handling

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* chore: debug log

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-06-02 08:24:50 +00:00
Yingwen
50b1a07232 feat(datanode): hold query permit for stream and expose limiter timeout (#8215)
* feat(datanode): hold query permit for stream and expose limiter timeout

Signed-off-by: evenyag <realevenyag@gmail.com>

* style(datanode): use as_mut() to poll inner stream in PermitGuardedStream

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-06-02 08:00:01 +00:00