Commit Graph

4346 Commits

Author SHA1 Message Date
Yingwen
5231ee40c8 feat: add parquet pk prefilter helpers (#7850)
* feat: extract parquet pk prefilter helpers

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix warnings

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update todo

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-24 03:57:18 +00:00
dennis zhuang
223f6cfdf7 feat: supports sst_format for x-greptime-hints and database options (#7843)
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-03-24 02:05:16 +00:00
Ruihang Xia
f999d5e70e feat: avoid some vector-array conversions on flat projection (#7804)
* perf(mito2): optimize flat projection conversion

* shrink the diff size

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* apply gemini's sugg

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* nit

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-24 00:11:37 +00:00
Lei, HUANG
7874282089 feat(mito): flat scan for time series memtable (#7814)
* feat/flat-for-time-series:
 ### Commit Message

 Enhance `TimeSeriesMemtable` with Record Batch Support

 - **`time_series.rs`**:
   - Introduced `BatchToRecordBatchContext` to facilitate conversion of batch iterators to record batch iterators.
   - Added `build_record_batch` method in `TimeSeriesIterBuilder` to support record batch creation.
   - Implemented multiple test cases to validate the functionality of record batch creation, including tests for projections,
 deduplication, sequence filtering, and data correctness.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/flat-for-time-series:
 Refactor `TimeSeriesMemtable` and `TimeSeriesIterBuilder`

 - Renamed `adapter_context` to `batch_to_record_batch` in `TimeSeriesMemtable` for clarity.
 - Simplified `MemtableRangeContext` initialization by removing the `batch_to_record_batch` parameter.
 - Added `is_record_batch` method to `TimeSeriesIterBuilder` to indicate record batch status.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/flat-for-time-series:
 ### Add Time Range Filtering and Predicate Group Enhancements

 - **`memtable.rs`**: Updated `IterBuilder` to include `time_range` parameter in `build_record_batch` method, enhancing record batch iteration with time range filtering.
 - **`time_series.rs`**: Modified `TimeSeriesIterBuilder` to use `PredicateGroup` instead of `Predicate`, and integrated `PruneTimeIterator` for time-based filtering.
 - **`memtable_util.rs`**: Removed unused `Predicate` import, reflecting changes in predicate handling.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-23 19:39:57 +00:00
Lei, HUANG
72f289df50 chore: remove GrpcQueryHandler::put_record_batch (#7844)
chore: remove GrpcQueryHandler::put_record_batch, we should use GrpcQueryHandler::handle_put_record_batch_stream instead

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-23 07:12:39 +00:00
jeremyhi
805536aed1 fix: windows file path (#7839)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-03-20 08:19:41 +00:00
Ning Sun
d14817bfa6 fix: resolve optimization issue for extended query (#7824)
* fix: resolve optimization issue for extended query

* fix: type cast from subquery

* chore: update error information in sqlness

* chore: switch to released pgwire

* refactor: remove optimize function completely

* chore: add more tests

* test: attempt to fix the fuzz issue

* fix: try to resolve the test issue
2026-03-20 03:58:39 +00:00
Ruihang Xia
f034255fe6 perf: support group accumulators for state wrapper (#7826)
* perf: support group accumulators for state wrapper

* new tests and avoid clone

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-19 22:40:52 +00:00
jeremyhi
16fcbb2729 feat: export import v2 pr1 (#7785)
* feat: v2 schema handling

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* feat: impl m1.5 ddl export/import and schema tests

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: git ignore update

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: add license header

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: make fmt-check happy

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: Run imported DDL against the intended schema

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: Canonicalize schema names after case-insensitive check

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: escape sql funcs

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: Fixed by carrying explicit execution_schema in DdlStatement instead of parsing schema from SQL

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: Fixed by encoding schema names as safe path segments in shared DDL path helpers

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* refactor(cli): make export/import v2 schema recovery DDL-only

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: by clippy

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: follow our styling

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix(cli): reject remote snapshot URIs with empty root

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix(cli): dedupe schema filters after canonicalization

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix(cli): schema-scoped detection to cover external tables

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-03-19 21:26:41 +00:00
Ruihang Xia
2af3951944 feat: cache decoded region metadata alone with parquet metadata (#7813)
* cache decoded region metadata

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: account for decoded sst metadata cache weight

* take optional pre-exist metadata

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-19 03:09:47 +00:00
ZonaHe
cc441b5642 feat: update dashboard to v0.12.0 (#7823)
Co-authored-by: sunchanglong <sunchanglong@users.noreply.github.com>
2026-03-17 18:25:14 +00:00
Lei, HUANG
dc98e0215b feat(metric-engine): support bulk inserts with put fallback (#7792)
* feat(metric-engine): support bulk inserts

Implement `RegionRequest::BulkInserts` to support efficient columnar data
ingestion in the metric engine.

Key changes:
- Implement `bulk_insert_region` to handle logical-to-physical region mapping
  and dispatch writes.
- Add `batch_modifier` for `RecordBatch` transformations, specifically for
  `__tsid` generation and sparse primary key encoding.
- Integrate `BulkInserts` into the `MetricEngine` request handling logic.
- Provide a row-based fallback mechanism if the underlying storage doesn't
  support bulk writes.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 ### Update `bulk_insert.rs` to Support Partition Expression Version

 - **Enhancements**:
   - Added support for `partition_expr_version` in `RegionBulkInsertsRequest` and `RegionPutRequest`.
   - Modified the handling of `partition_expr_version` to be dynamically set from the `request` object.

 Files affected:
 - `src/metric-engine/src/engine/bulk_insert.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: cargo lock revert

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* add doc for conversions

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: simplify test

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 ### Refactor `bulk_insert.rs` in `metric-engine`

 - **Refactor Functionality**:
   - Replaced `resolve_tag_columns` with `resolve_tag_columns_from_metadata` to streamline tag column resolution.
   - Moved logic for resolving tag columns directly into `resolve_tag_columns_from_metadata`, removing the need for an external function call.
 - **Enhancements**:
   - Improved error handling and context provision for missing physical regions and columns.
   - Optimized tag column sorting and index management within the batch processing logic.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 ### Refactor `record_batch_to_rows` Function in `bulk_insert.rs`

 - Simplified the `record_batch_to_rows` function by removing the `logical_metadata` parameter and directly validating column types within the function.
 - Enhanced error handling for timestamp, value, and tag columns by checking their data types and providing detailed error messages.
 - Replaced the use of `Helper::try_into_vector` with direct downcasting to `TimestampMillisecondArray`, `Float64Array`, and `StringArray` for improved type safety and clarity.
 - Updated the construction of `api::v1::Rows` to directly handle null values and construct `api::v1::Value` objects accordingly.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 ## Commit Message

 Refactor `bulk_insert.rs` to optimize state access

 - Moved the state read operation inside a new block to limit its scope and improve code clarity.
 - Adjusted logic for processing `tag_columns` and `non_tag_indices` to work within the new block structure.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 ### Refactor `compute_tsid_array` Function

 - **Refactored `compute_tsid_array` function**: Modified the function signature to accept `tag_arrays` as a parameter instead of building it internally. This change affects the following files:
   - `src/metric-engine/src/batch_modifier.rs`

 - **Updated test cases**: Adjusted test cases to accommodate the new `compute_tsid_array` function signature by passing `tag_arrays` explicitly.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* docs: add doc for bulk_insert_region

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 ### Commit Message

 Refactor `bulk_insert.rs` in `metric-engine`:

 - Removed error handling for unsupported status codes in `write_data` method.
 - Eliminated `record_batch_to_rows` function, simplifying the data insertion process.
 - Streamlined the `write_data` method by removing fallback logic for unsupported operations.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 - **Optimize Primary Key Construction**: Refactored `modify_batch_sparse` in `batch_modifier.rs` to use `BinaryBuilder` for more efficient primary key construction.
 - **Add Fallback for Unsupported Bulk Inserts**: Updated `bulk_insert.rs` to handle unsupported bulk inserts by converting record batches to rows and using `RegionPutRequest`.
 - **Implement Record Batch to Rows Conversion**: Added `record_batch_to_rows` function in `bulk_insert.rs` to convert `RecordBatch` to `api::v1::Rows` for fallback operations.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 Add test for handling null values in `record_batch_to_rows`

 - Added a new test `test_record_batch_to_rows_with_null_values` in `bulk_insert.rs` to verify the handling of null values in the `record_batch_to_rows` function.
 - The test checks the conversion of a `RecordBatch` with null values in various fields to ensure correct row creation and schema handling.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/metric-engine-bulk-insert:
 Add fallback path for unsupported status and improve error context handling

 - **`bulk_insert.rs`**:
   - Added a fallback path for `PartitionTreeMemtable` in case of unsupported status code.
   - Enhanced error handling by using `with_context` for better error messages when timestamp and value columns are not found in `RecordBatch`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-17 11:28:06 +00:00
Yingwen
e0aadffb91 feat: add flat last row reader to the final stream (#7818)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-17 07:55:48 +00:00
Yingwen
5a37e58b4f feat(mito2): add partition range cache infrastructure (#7798)
* feat: add partition range cache infra

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: optimize scan request fingerprint cloning

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: merge loops

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: more docs

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update estimated size method and comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: only cache when we scan files

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: address PR review comments for partition range cache

- Remove TimeSeriesDistribution from fingerprint as it only affects yield order
- Disable range cache when dyn filters are present since they change at runtime

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-17 03:53:20 +00:00
Ning Sun
dd82fcac00 chore: update visibility of BatchToRecordBatchAdapter::new (#7817) 2026-03-16 09:56:34 +00:00
Lei, HUANG
be4a7a6d37 refactor: remove Memtable::iter (#7809)
* refactor: remove Memtable::iter

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: review comments

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-16 07:49:31 +00:00
maximk777
b007f85986 feat(http): improve error logging with client IP (#7503)
* feat(http): improve error logging with client IP

- Add logging to ErrorResponse::from_error_message()
- Add middleware to log HTTP errors with client IP

Closes #7328

Signed-off-by: maximk777 <maximkirienkov777@gmail.com>

* fix(http): address review comments for error logging

Restore rich Debug logging in from_error(), add URI/method/matched path
to client IP middleware, and only log when client address is available.

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: maximk777 <maximkirienkov777@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2026-03-16 07:10:33 +00:00
jeremyhi
c6f1ef8aec feat: track unlimited usage in memory manager (#7811)
* feat: track unlimited usage in memory manager

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: by gemini comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: remove unused import

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-03-16 03:52:27 +00:00
Ning Sun
306e8398cf fix: correct unicode representation for jsonb_to_string (#7810)
* fix: correct unicode representation for jsonb_to_string

* refactor: correct function name and behavior

* fix: fix json_to_string and provide tests
2026-03-16 03:01:02 +00:00
Yingwen
e215851c8a refactor: unify flush and compaction to always use FlatSource (#7799)
* feat: support write flat as primary key format

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: migrate flush to always use FlatSource

Add FormatType propagation in SstWriteRequest and use it to choose
Flat vs PrimaryKey write paths (write_all_flat vs
write_all_flat_as_primary_key) in AccessLayer and WriteCache. Make
compactor and flush derive the sst_write_format from region options or
engine config. Simplify flush logic and remove the old memtable_source
helper. Update tests to set default sst_write_format.

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: compaction use flat source

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: read parquet sequentially as flat batches

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove new_batch_with_binary in favor of new_record_batch_with_binary

Replace PrimaryKeyWriteFormat with FlatWriteFormat in test_read_large_binary
test and use new_record_batch_with_binary directly, removing the now-unused
new_batch_with_binary function and its BinaryArray import.

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add tests for PrimaryKeyWriteFormat::convert_flat_batch

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove Either from SstWriteRequest

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: handle index build mode

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: consider sparse encoding and last non null in flush

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add unit tests for field_column_start edge cases

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-13 09:44:13 +00:00
LFC
74ff5c37ea refactor: customize standalone instance build (#7807)
* refactor: customize standalone instance build

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-03-13 09:25:21 +00:00
dennis zhuang
0572a680af fix: allow empty string for env values (#7803)
* fix: allow empty string for env values

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: strip suffix

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2026-03-13 03:57:08 +00:00
Ning Sun
3cdf03d830 feat: introduce APIs for storing perses dashboard definition (#7791)
* feat: introduce APIs for storing perses dashboard definition

* test: ensure we can update dashboard

* refactor: construct dashboard defnition directly

* refactor: don't create table on list requests
2026-03-13 03:40:04 +00:00
discord9
3beb538aa8 fix: rm useless analyzer (#7797)
* fix: rm useless analyzer

Signed-off-by: discord9 <discord9@163.com>

* test: rm related test

Signed-off-by: discord9 <discord9@163.com>

* test: flow tql avg

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-03-12 10:53:47 +00:00
Chengjie Jin
7866132920 feat(procedure): detect potential deadlock when parent/child procedures share lock keys (#7752)
* feat(procedure): detect potential deadlock when parent/child share lock keys

Add a deadlock detection mechanism in submit_subprocedure() to warn
when a child procedure's lock_key overlaps with its parent's lock_key.

When this happens, the parent holds the lock while waiting for the child
to complete (at child_notify.notified().await), but the child blocks
forever trying to acquire the same lock. This is a classic Hold-and-Wait
deadlock.

The detection:
- Emits a warn! log in all builds (visible in production)
- Triggers debug_assert!(false) in debug/test builds for early CI detection

This partially addresses the TODO at line 121-122 and is a follow-up
to the discussion in: https://github.com/GreptimeTeam/greptimedb/issues/7692

Signed-off-by: YZL0v3ZZ <2055877225@qq.com>

* style: fix trailing whitespace

Signed-off-by: YZL0v3ZZ <2055877225@qq.com>

* refactor(procedure): extract deadlock detection into a testable pure function

Signed-off-by: YZL0v3ZZ <2055877225@qq.com>

* fix(procedure): preserve lock mode when detecting parent/child deadlock

Signed-off-by: YZL0v3ZZ <2055877225@qq.com>

* re-run ci check

Signed-off-by: YZL0v3ZZ <2055877225@qq.com>

---------

Signed-off-by: YZL0v3ZZ <2055877225@qq.com>
2026-03-12 09:33:55 +00:00
Ning Sun
28f97191a0 fix: make pipeline table ttl forever (#7795)
* fix: make pipeline table ttl forever

* chore: use constants when possible
2026-03-12 02:15:25 +00:00
discord9
922f9cb3d6 feat: use dyn filter (#7545)
* parent b2074e3863
author discord9 <discord9@163.com> 1767869295 +0800
committer discord9 <discord9@163.com> 1772529023 +0800

feat: use dyn filter

Signed-off-by: discord9 <discord9@163.com>

not supported

Signed-off-by: discord9 <discord9@163.com>

refactor: use make_mut instead

Signed-off-by: discord9 <discord9@163.com>

refactor: rm need to clone stream ctx

Signed-off-by: discord9 <discord9@163.com>

r

Signed-off-by: discord9 <discord9@163.com>

pcr

Signed-off-by: discord9 <discord9@163.com>

test: wait for datafusion update

Signed-off-by: discord9 <discord9@163.com>

refactor: use arc swap for dyn filters

Signed-off-by: discord9 <discord9@163.com>

* test: update sqlness

Signed-off-by: discord9 <discord9@163.com>

* chore: comment out sqlness

Signed-off-by: discord9 <discord9@163.com>

* test: update sqlness

Signed-off-by: discord9 <discord9@163.com>

* test: sqlness fix

Signed-off-by: discord9 <discord9@163.com>

* refactor: predicate without option

Signed-off-by: discord9 <discord9@163.com>

* feat: print dyn filters& more tests

Signed-off-by: discord9 <discord9@163.com>

* test: sqlness vector result update

Signed-off-by: discord9 <discord9@163.com>

* chore: log

Signed-off-by: discord9 <discord9@163.com>

* test: properly redact

Signed-off-by: discord9 <discord9@163.com>

* test: better data dist for non empty dyn filter

Signed-off-by: discord9 <discord9@163.com>

* test: properly redacted

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* properly redact

Signed-off-by: discord9 <discord9@163.com>

* docs:  explain why not  do it

Signed-off-by: discord9 <discord9@163.com>

* chore: rename update to add as its more proper

Signed-off-by: discord9 <discord9@163.com>

* chore: rm no need clone

Signed-off-by: discord9 <discord9@163.com>

* docs: per review

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-03-11 03:06:24 +00:00
Ruihang Xia
9e95214fc8 feat: replace shadow-rs with self-maintained version info (#7782)
* reimplement shadow-rs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: remove timestamp build metadata

* fix: refresh version build metadata

* use git2

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* warn about git failure

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-10 21:08:26 +00:00
Weny Xu
b1b91e88f4 fix(mito2): ensure enter staging waits for compaction (#7776)
* fix: do not schedule compaction

Signed-off-by: WenyXu <wenymedia@gmail.com>

* mito2: add pending ddl queue primitives to compaction scheduler

Signed-off-by: WenyXu <wenymedia@gmail.com>

* mito2: hand off pending ddls on compaction finish

Signed-off-by: WenyXu <wenymedia@gmail.com>

* mito2: defer enter staging via compaction pending ddl queue

Signed-off-by: WenyXu <wenymedia@gmail.com>

* mito2: cover pending-ddl failure paths on region lifecycle events

Signed-off-by: WenyXu <wenymedia@gmail.com>

* mito2: replay pending ddls directly in compaction handler

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: styling

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: styling

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add unit test

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: add unit tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-03-10 13:49:40 +00:00
Yingwen
04cd2c8a05 feat: flat read path support primary_key format memtables (#7759)
* feat: add adapter for batch to flat recordbatch

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: support batch to flat record batch in MemtableRange

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: address review issues for BatchToRecordBatchAdapter

- Extract duplicated read_column_ids computation into a shared
  `read_column_ids_from_projection` helper function
- Cache `FormatProjection` in `BatchToRecordBatchContext::new()` instead
  of recomputing it on every `adapt_iter()` call
- Remove unnecessary `Arc` wrapping of `read_column_ids` in
  `SimpleBulkMemtable::ranges()`
- Fix clippy `filter_map_bool_then` warning in `batch_adapter.rs`

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: simplify comments

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(mito2): use read column ids in batch adapter

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: test build_record_batch_iter

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: test build_record_batch_iter for all old memtables

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: prune time range before adapter

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: share BatchToRecordBatchContext in simple_bulk_memtable.rs

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: use ScalarValue::to_array_of_size to build repeated value array

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-10 12:46:39 +00:00
Ruihang Xia
58528d1334 feat: fast path for empty selection files (#7780)
* perf(mito2): skip reader context for empty selections

* refactor(mito2): make parquet reader input optional
2026-03-10 12:22:03 +00:00
Lei, HUANG
9c98a3d5f6 refactor: remote write related code (#7775)
* refactor/prom-related-code:
 ### Commit Message

 Refactor Byte Handling and Improve Decoding Logic

 - **`prom_decode.rs`**: Removed `Bytes` usage in favor of `Vec<u8>` for handling raw data, improving memory management and simplifying the decoding process.
 - **`prom_store.rs`**: Updated `try_decompress` function to return `Vec<u8>` instead of `Bytes`, aligning with the new data handling approach.
 - **`prom_row_builder.rs`**: Modified `TablesBuilder` to use `Vec<u8>` for `raw_data`, enhancing data manipulation capabilities.
 - **`proto.rs`**: Refactored `PromWriteRequest` decoding logic to use `Vec<u8>`, optimizing the buffer management and decoding flow.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor: mod structure

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/prom-related-code:
 - **Refactor `prom_store.rs` and `prom_remote_write/mod.rs`:** Moved `decode_remote_write_request` and `try_decompress` functions from `prom_store.rs` to `prom_remote_write/mod.rs`. This change centralizes the logic related to remote write request
 decoding and decompression.
 - **Update `PromValidationMode` in `validation.rs`:** Implemented `Default` trait using the `#[derive(Default)]` attribute for `PromValidationMode` and updated related methods to use `Result` instead of `std::result::Result`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/prom-related-code:
 ### Remove `proto.rs` and Update References

 - **Removed**: Deleted the `proto.rs` file, which contained re-exports for Prometheus remote write decode types.
 - **Updated References**: Adjusted references to `PromSeriesProcessor` and `PromWriteRequest` in `prom_decode.rs` and `prom_store.rs` to import directly from `prom_remote_write`.
 - **Modified Modules**: Removed the `proto` module from `lib.rs`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: lint

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: remove assert_eq

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/prom-related-code:
 ### Refactor Prometheus Remote Write Module

 - **Modularization of `prom_remote_write`:**
   - Split `PromValidationMode` and `validate_label_name` into a new `validation` module.
   - Moved `PromSeriesProcessor` and `PromWriteRequest` to a `decode` module.
   - Separated `PromLabel` into a `types` module and adjusted visibility.

 - **Visibility Adjustments:**
   - Changed `PromTimeSeries` and `PromLabel` structs to `pub(crate)` for internal use.

 - **File Updates:**
   - Updated references in `prom_decode.rs`, `http.rs`, `prom_store.rs`, `decode.rs`, `mod.rs`, `row_builder.rs`, `types.rs`, `prom_store_test.rs`, and `test_util.rs` to reflect module changes.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-10 12:19:08 +00:00
Weny Xu
4c2f056f93 fix(meta): remap route addresses when reading full table info (#7778)
* refactor(meta): expose crate-level table route address remap helper

* fix(meta): remap table route address in get_full_table_info

* test(meta): cover get_full_table_info route address remap
2026-03-09 18:24:59 +00:00
Weny Xu
3e81345d7f fix(mito2): avoid parquet redownload when write cache already contains file (#7777)
* feat: add idempotent write cache download API

* fix: skip parquet redownload in manifest edit path

* test: cover download_if_absent cache hit and miss

* chore: add comments

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-03-09 12:16:53 +00:00
Yingwen
4232ce8eaa test: fix unstable index meta list test (#7774)
* test: fix unstable index meta list test

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: raise bucket_size threshold to avoid bucketing sizes in [512, 999] to 0

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-09 07:57:43 +00:00
Ning Sun
9b1288f66a feat: add function rewrite rule for json_get with cast (#7631)
* feat: initial function rewriter for json_get

* feat: make sure rewrite rule is applied

* feat: keep analyzer's default rules

* feat: implement rewriter for arrow_cast

* test: add unit test for tht rewriter

* chore: format

* refactor: extract some more functions

* Apply suggestion from @waynexia

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

---------

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-09 02:39:48 +00:00
ZonaHe
44ab4c6857 feat: update dashboard to v0.11.13 (#7763)
Co-authored-by: sunchanglong <sunchanglong@users.noreply.github.com>
2026-03-09 02:33:02 +00:00
Ruihang Xia
a71df9477a perf(mito2): speed up parquet scan via minmax caches (#7708)
* perf(mito2): speed up parquet scan via meta caches

* fix(mito2): fix parquet pruning and metadata cache

* revert config changes

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* enhance cache file enter logic

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* resolve tiny cr comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* only preload from fs or cache

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix vector test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-07 06:32:47 +00:00
Yingwen
93c48a078c feat: implement last row cache reader for flat format (#7757)
* feat: initial implementation

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: handle multiple series

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: reset state in finish()

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: handle duplicated last timestamps across batches

Signed-off-by: evenyag <realevenyag@gmail.com>

* perf: compact primary key array

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix(mito2): simplify flat last timestamp selector state

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(mito2): rebuild flat pk dictionary from selector state

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: reduce tests

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: more logs to debug

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: concat batches in last row reader

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(mito2): simplify flat last row selector output buffer

- Replace VecDeque with BatchBuffer struct for output buffering
- Remove rebuild_pk_dictionary_for_key as batches go directly into buffer
- Remove unused push method and make BatchBuffer pub(crate)
- Remove debug logging in maybe_update_cache

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address comments

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-06 12:08:58 +00:00
Lei, HUANG
2000611ea1 perf(remote-write): optimize decode prom (#7761)
* perf: optimize Prometheus label name decoding with byte-level validation

Add `decode_label_name` and `validate_label_name` to skip redundant
UTF-8 validation for Prometheus label names, which are guaranteed ASCII
(`[a-zA-Z_][a-zA-Z0-9_]*`). Rename `validate_bytes` to `validate_utf8`
for clarity and add benchmarks for label name validation and UTF-8
validation (std vs simdutf8).

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf(servers): optimize validate_label_name with lookup table and loop unrolling

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf/decode-prom-2:
 - **Refactor UTF-8 Validation and Label Decoding**:
   - Removed `validate_utf8` method and integrated label name validation directly in `decode_label_name` in `http.rs`.
   - Updated `decode_label_name` to always enforce Prometheus label name validation across all modes.
   - Adjusted test cases in `http.rs` to reflect the new validation logic.

 - **Enhance Label Validation in `prom_row_builder.rs`**:
   - Replaced UTF-8 validation with direct label name validation using `validate_label_name`.
   - Updated `decode_label_name` usage to return `&str` and adjusted related logic.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf/decode-prom-2:
 **Refactor `TableBuilder` to Use `RawBytes` for Column Indexes**

 - Updated `TableBuilder` in `prom_row_builder.rs` to use `RawBytes` instead of `Vec<u8>` for `col_indexes`.
 - Modified `with_capacity` method to directly insert `RawBytes` for timestamp and value columns.
 - Adjusted schema handling to use `to_owned` for `tag_name` and directly insert `raw_tag_name` into `col_indexes`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf/decode-prom-2:
 ### Commit Message

 Refactor `PromWriteRequest` Method and Enhance Data Handling

 - **Refactor Method**: Renamed the `merge` method to `decode` in `PromWriteRequest` to better reflect its functionality. Updated references in `prom_decode.rs`, `prom_store.rs`, and `prom_row_builder.rs`.
 - **Enhance Data Handling**: Introduced `raw_data` field in `PromWriteRequest` to store a clone of the buffer for potential future use. Updated the `clear` method to reset `raw_data`.

 Files affected: `prom_decode.rs`, `prom_store.rs`, `prom_row_builder.rs`, `proto.rs`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf/decode-prom-2:
 **Commit Summary:**

 - **Enhancement in `prom_row_builder.rs`:**
   - Added a new field `raw_data` of type `Bytes` to `TablesBuilder`.
   - Implemented `set_raw_data` method to update `raw_data`.
   - Modified `clear` method to reset `raw_data`.

 - **Refactor in `proto.rs`:**
   - Removed `raw_data` field from `PromWriteRequest`.
   - Updated `decode_and_process` method to use `set_raw_data` from `TablesBuilder` for handling raw data.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: remove duplicated validation

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf/decode-prom-2:
 ### Commit Message

 Refactor `TablesBuilder` and `TableBuilder` to Use Lifetime Annotations

 - Updated `prom_store.rs`:
   - Modified `PROM_WRITE_REQUEST_POOL` and `decode_remote_write_request` to use lifetime annotations for `PromWriteRequest` and `TablesBuilder`.

 - Updated `prom_row_builder.rs`:
   - Refactored `TablesBuilder` and `TableBuilder` structs to include lifetime annotations.
   - Adjusted methods in `TablesBuilder` and `TableBuilder` to accommodate lifetime changes.

 - Updated `proto.rs`:
   - Added lifetime annotations to `PromWriteRequest` and its methods.
   - Modified `add_to_table_data` to use lifetime annotations for `TablesBuilder`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: fmt

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-06 10:23:11 +00:00
Yingwen
33acbf985d fix: Allow overriding sequence when writing flat SSTs (#7764)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-06 09:43:58 +00:00
discord9
56ee8baa3f feat: admin gc table/regions (#7619)
* feat: gc table

Signed-off-by: discord9 <discord9@163.com>

* test: admin gc

Signed-off-by: discord9 <discord9@163.com>

* chore: after rebase fix

Signed-off-by: discord9 <discord9@163.com>

* refactor: GcStats

Signed-off-by: discord9 <discord9@163.com>

* refactor: use gc ticker for admin gc

Signed-off-by: discord9 <discord9@163.com>

* fix: region routes override

Signed-off-by: discord9 <discord9@163.com>

* test: non happy path

Signed-off-by: discord9 <discord9@163.com>

* refactor: gc job report enum

Signed-off-by: discord9 <discord9@163.com>

* test: process 0 regions

Signed-off-by: discord9 <discord9@163.com>

* after rebase

Signed-off-by: discord9 <discord9@163.com>

* feat: allow manual gc to return error

Signed-off-by: discord9 <discord9@163.com>

* chore: update proto

Signed-off-by: discord9 <discord9@163.com>

* per review

Signed-off-by: discord9 <discord9@163.com>

* chore: timeout and update proto

Signed-off-by: discord9 <discord9@163.com>

* chore: udpate proto

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-03-06 08:25:44 +00:00
discord9
5e6d2b221e feat: gc batch delete files (#7733)
* feat: batch delete

Signed-off-by: discord9 <discord9@163.com>

* chore: explict error msg

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* refactor: not fallback when batch failure

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* pcr

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* refactor: batch delete in access layer

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-03-06 07:17:20 +00:00
discord9
d1151b665b feat: flow tql cte (#7702)
* feat: flow tql cte

Signed-off-by: discord9 <discord9@163.com>

* fix: creating flow TQL CTE source tables lose cte part in query

Signed-off-by: discord9 <discord9@163.com>

* test: update sqlness result

Signed-off-by: discord9 <discord9@163.com>

* chore

Signed-off-by: discord9 <discord9@163.com>

* fix: properly canonicalize ident

Signed-off-by: discord9 <discord9@163.com>

* feat: even stricter check

Signed-off-by: discord9 <discord9@163.com>

* chore: sqlness update

Signed-off-by: discord9 <discord9@163.com>

* chore: after rebase fix

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-03-06 03:36:42 +00:00
discord9
4c30b9efaf fix: null first for part expr as logical expr (#7747)
* fix: null first for part expr as logical expr

Signed-off-by: discord9 <discord9@163.com>

* test: update tests

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* fix: nulll handle&non-null filter

Signed-off-by: discord9 <discord9@163.com>

* chore: doc test

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-03-06 02:53:05 +00:00
Ruihang Xia
b4f0886f80 chore: grouping batch open region logs (#7758)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-05 13:39:29 +00:00
Ruihang Xia
39140058d0 feat: adapt new name of holt winters fn (#7700)
* feat: adapt new name of holt winters fn

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update parser

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* alias old fn name

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-04 23:32:37 +00:00
Weny Xu
b17947b5d8 feat: add COPY support for GCS/Azblob with OSS/GCS/Azblob integration coverage (#7743)
* feat: add gcs and azblob backends to copy datasource

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add copy to/from oss integration coverage

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add copy to/from gcs/azure integration coverage

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: use existing layers

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: hide `credential_path` in cli

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: remove logs

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-03-04 04:21:21 +00:00
Yingwen
5c8ece27e0 feat: improve filter support for scanbench (#7736)
* feat: cast filters type for scanbench

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: pub file_range mod

So we can use the pub struct FileRange in other places

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: add api as dev-dependency to cmd for clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: support profiling after warmup

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-03 09:00:41 +00:00
LFC
b2074e3863 chore: upgrade DataFusion family, again (#7578)
* chore: upgrade DataFusion family

Signed-off-by: luofucong <luofc@foxmail.com>

* chore: switch to released version of datafusion-pg-catalog

---------

Signed-off-by: luofucong <luofc@foxmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
2026-03-03 07:36:39 +00:00