Commit Graph

901 Commits

Author SHA1 Message Date
fys
7d330cc4e6 fix(mito2): schema-safe inverted index pruning (#8089)
* fix(mito2): skip inverted index on per-SST type mismatch to avoid false negatives

* restore INDEX_APPLY_MEMORY_USAGE

* fix: cr

* fix: cr
2026-05-12 09:37:11 +00:00
LFC
abf4623440 refactor: store the schema of flat source (#8091)
* refactor: store the schema of flat source

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

* fix ci

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-05-11 10:22:40 +00:00
LFC
7a285c2890 feat: concretize json type from query (#8081)
* feat: concretize json type from query

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

* add more tests

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-05-09 07:27:40 +00:00
Lei, HUANG
98033b6421 fix(mito): queue writes during region edit (#8079)
* fix(mito): queue writes during region edit

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor(mito): wrap bulk insert worker request

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix(mito): fail queued requests on region close

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* docs: add comment for the design considerations

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: remove redundant check for flushable region

skip ci

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: clippy

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-05-08 13:08:35 +00:00
fys
d0e0c21600 feat: support nested projection in mito2 read path (#7959)
* feat: support nested projection

* fix: cr

* fix: cr

* fix: code review

* fix: cargo clippy

* fix: ci

* fix: unit test

* avoid repeated evaluation of is_schema_matched

* add comment for append time index

* fix: keep nested schema alignment after filling missing projected roots
2026-05-07 09:50:19 +00:00
Lei, HUANG
796aae3d9f feat(operator): allow last_row merge mode with append mode (#8065)
* feat(operator): allow last_row merge_mode when append_mode is enabled

- Update RegionOptions::validate to allow last_row merge_mode with append_mode.
- Update fill_table_options_for_create to automatically set merge_mode to last_row when append_mode is enabled for LastNonNull table type.
- Add unit tests in mito2 and operator to verify options validation and table creation.
- Add integration test for InfluxDB write with append mode hint.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix(operator): simplify append mode options

Group `LastNonNull` auto-create options in a single append-mode branch.

Files:

- `src/operator/src/insert.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: sqlness

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-05-07 07:21:37 +00:00
QuakeWang
45e990b7f3 refactor: propagate flush reasons through FlushRegions path (#8051)
* feat: propagate flush reasons through FlushRegions path

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* refactor: address flush reason review feedback

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

* refactor: keep flush instruction helper name

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>

---------

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
2026-05-01 02:28:55 +00:00
Ning Sun
b8951a3514 feat: persist our column_id to parquet field_id (#8032)
* feat: persist our column_id to parquet field_id

* refactor: avoid clone field when possible

* chore: fmt

* chore: address style suggestions
2026-04-30 15:40:24 +00:00
Yingwen
6a84393e08 feat: support prefiltering any columns in flat format (#7972)
* refactor: prepare parquet prefilter for multi-column execution

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: restore parquet physical filter contexts

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add generalized parquet prefilter projection

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: avoid re-evaluating parquet prefiltered predicates

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: cover generalized parquet prefilter behavior

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove variant

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: only prefilter physical exprs

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove execute_general_prefilter

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: only prefilter cheap exprs

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: context usage

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: categorize filters

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: prefilter plan for bulk memtable

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: move parquet filter plan builders into prefilter

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: simplify tests

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: enable prefilter by threshold

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct pk filter grouping

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: remove unused code

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix warning

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: handle nulls in physical filter result

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt import

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: update comments

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-29 03:35:22 +00:00
fys
8955e2c651 feat: add read column abstraction (#8038)
* feat: add read columns strcut

* fix: cr by ai

* fix: cr by ai

* fix: cr by ai

* fix: cr
2026-04-29 03:29:39 +00:00
Yingwen
9b75d8b734 refactor(mito2): reshape extension range API (#8004)
* refactor(mito2): reshape extension range API

Thread per-range read options into the extension range flat reader so
it can honor PreFilterMode like other range kinds, and drop the unused
legacy Batch reader surface.

- extension.rs: introduce ExtensionRangeReadOptions { pre_filter_mode };
  flat_reader() takes it; remove reader(), ExtensionRangeReader, and
  BoxedExtensionRangeReader (no in-tree or out-of-tree call sites use
  the Batch-format reader path anymore).
- scan_util.rs: plumb options through maybe_scan_flat_other_ranges and
  scan_flat_extension_range.
- seq_scan.rs / unordered_scan.rs: build options at the call site via
  StreamContext::range_pre_filter_mode.

Using an options struct (rather than a bare PreFilterMode argument)
keeps future additions additive for out-of-tree ExtensionRange impls.

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(mito2): hoist ExtensionRangeReadOptions out of row-group loops

Build ext_options once per partition range instead of per row-group index in
build_flat_sources and scan_flat_partition_range. Both inputs (part_range and
range_pre_filter_mode) are loop-invariant.

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix compiler error without enterprise feature

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-27 07:41:54 +00:00
shuiyisong
0effc30778 chore: update the opendal to 0.56 rc2 (#8003)
* chore: update opendal version

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: update opendal version

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: fix test

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: grpc init

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: dep versions

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: remove aws-lc-rs in reqwest

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: rebase main and fix compile

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: remove unused deps

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* Revert "fix: remove aws-lc-rs in reqwest"

This reverts commit 90bfafca06.

* chore: remove aws-lc-sys from blacklist

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: fix sqlness

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: add tls deps

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: idemptent install in rds

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: test

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use aws-lc-sys as possible

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: lint

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: address comments

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: address CR issue

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: sync opendal compat adapter with upstream

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: address compat clippy warnings

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2026-04-26 09:59:48 +00:00
Yingwen
1fda2f5a35 feat: tune range cache (#8006)
* feat: use range cache size in manager

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update compact condition

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update tests

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: validate permits before acquiring

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-23 11:26:23 +00:00
LFC
209880b991 feat: json2 flush (#8011)
Signed-off-by: luofucong <luofc@foxmail.com>
2026-04-23 03:03:37 +00:00
Yingwen
dc2f2cbfae fix: manifest recovery scans after last version if possible (#8009)
* feat: suppport scan with start after

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add start_after test

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: adjust remove dir warning

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: test list_with_start_after

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: update get_paths call with start_after arg in checkpoint test

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: log scan metrics

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fix start_after on manifest dir

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-23 02:42:59 +00:00
fys
1440924955 fix: nested projection missing roots (#7993)
* fix: nested projection missing roots

* add docs and unit test

* fix: cr by ai

* move some computations to new method
2026-04-22 09:35:21 +00:00
Yingwen
449243a175 fix: update manifest state before deleting delta files (#8001)
* fix: update state before deleting deltas

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update log level

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-21 11:14:13 +00:00
Yingwen
555741a9f1 feat: prune bulk memtable parts by first tag (#7911)
* feat: initial implementation for MultiBulkPart minmax pruning

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: add debug logs

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: prune bulk part

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address PR review comments for bulk part pruning

Document sparse encoding format in SparsePrimaryKeyCodec and add
comment explaining why primary_key.first() works for both encodings.
Remove noisy info-level pruning logs from the read path.

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add test for MultiBulkPart batch pruning behavior

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: don't return error when read() returns None

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address review comments

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-21 10:08:16 +00:00
Lei, HUANG
f6851cf8d7 feat(mito2): add PK-range-aware TWCS overlap handling (#7954)
* feat(mito2): extract and cache primary key range for SST files

Extracts primary key ranges from SST files during flush and compaction, and caches them in FileHandle for future use (e.g., overlapping checks).

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/compaction-overlapping-check:
 - **Enhance Primary Key Range Logic**: Updated the `primary_key_ranges_overlap` function in `run.rs` to use `chunk()` for comparing byte ranges, improving accuracy in overlap detection.
 - **Refactor Run Assignment Logic**: Simplified the logic for assigning items to runs in `run.rs` by removing redundant match statements and using `is_empty()` and `iter().any()` for cleaner checks.
 - **Add Test for Transitivity Break**: Introduced a new test `test_find_sorted_runs_handles_2d_transitivity_break` in `run.rs` to ensure correct handling of transitivity breaks in sorted runs.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/compaction-overlapping-check:
 - **Remove PK-disjoint detection logic**: Simplified the compaction logic in `twcs.rs` by removing the `has_time_overlapping_pairs` function and related logic for PK-disjoint detection. This includes the removal of the `append_mode_force_compact` condition and associated tests.
 - **Update compaction trigger settings**: Modified `append_mode_test.rs` to set `compaction.twcs.trigger_file_num` to "2" and adjusted the expected number of files in the scanner assertion from 1 to 2.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: rebase main

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/compaction-overlapping-check:
 ### Remove Unused Import in `compactor.rs`

 - Removed the unused import `compact_request` from `compactor.rs` to clean up the codebase.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: tighten `mito2` file lifecycle handling

Refine compaction, flush, and SST/version bookkeeping across `src/mito2/src/compaction/*`, `src/mito2/src/flush.rs`, `src/mito2/src/region/*`, `src/mito2/src/sst/*`, and related tests/utilities.

* fix: reuse primary key range merge in twcs compaction

Centralize primary key range merging so  can call the shared helper from .

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor(mito2): simplify `FileHandle` initialization and internalize primary key range extraction

- Updated `FileHandle::new` to automatically compute the primary key range directly from `FileMeta`.
- Restricted `FileHandle::new_with_primary_key_range` to be test-only by adding the `#[cfg(test)]` attribute.
- Simplified `SstVersion::add_files` by adopting the updated `FileHandle::new` instead of manually providing the primary key range.

Modified files:
- `src/mito2/src/sst/file.rs`
- `src/mito2/src/sst/version.rs`

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/compaction-overlapping-check:
 ### Improve File Management and Documentation

 - **`twcs.rs`**: Added a comment to clarify the merging of small files when there are no overlapping files.
 - **`version.rs` (in `region` and `sst`)**: Enhanced documentation for `add_files` method, explaining its functionality and panic conditions. Simplified the file handle creation logic in `sst/version.rs`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/compaction-overlapping-check:
 ### Enhance Primary Key Range Handling in `opener.rs`

 - Updated logic in `opener.rs` to set the primary key range for file handles when it is not already defined. This change ensures that the primary key range is extracted and set using `extract_primary_key_range` when necessary.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* docs: add docs for the necessity of checking pk and timesmaps while find overlapping files.

* chore: address review comments

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-04-21 03:23:22 +00:00
Yingwen
f8f2e19dda refactor(mito2): remove PrimaryKey variants (#7982)
* refactor: make scan compat path flat-only

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: use flat projection mapper directly

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove dead compat batch enum

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: drop dead primary-key projection helpers

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: use flat read format directly

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix warnings

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-21 02:43:09 +00:00
LFC
c7427fa6d2 feat: json2 insert (#7981)
* feat: json2 insert

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

* use main greptime-proto

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2026-04-20 09:45:32 +00:00
discord9
8123406fae feat(flow): inc query scan bind seq (#7879)
* feat: add sequence scan boundaries and stale detection

Signed-off-by: discord9 <discord9@163.com>

* fix: only scan bind once

Signed-off-by: discord9 <discord9@163.com>

* refactor: flow snapshot decision

Signed-off-by: discord9 <discord9@163.com>

* refactor: one place modify

Signed-off-by: discord9 <discord9@163.com>

* chore: per review

Signed-off-by: discord9 <discord9@163.com>

* refactor: pre review

Signed-off-by: discord9 <discord9@163.com>

* per review

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-04-20 03:32:31 +00:00
Yingwen
fb3290e555 feat: use PreFilterMode::All if only one source in the partition range (#7973)
* feat: use PrefilterMode::All if only one source in the partition range

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: consider append_mode

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: skip merge if only one source

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix test

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-17 03:24:46 +00:00
Weny Xu
525e88bce4 feat: cancel local compaction for enter staging (#7885)
* feat(mito2): support cancelling active local compaction

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test(mito2): cover compaction cancellation return paths

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: cancel remaining tasks

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-04-16 03:56:37 +00:00
Lei, HUANG
00d67d6fa1 refactor(mito): remove Compactor::compact method (#7968)
refactor/remove-compactor-compact:
 ### Remove Unused Compaction Functionality

 - **Removed `compact` Method**: Eliminated the `compact` method from the `Compactor` trait and its default implementation, which was primarily used for local compaction in testing. This change affects `compactor.rs`.
 - **Code Cleanup**: Removed associated code and comments related to the `compact` method, streamlining the `Compactor` trait interface.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-04-15 02:50:47 +00:00
Yingwen
60fc455149 fix: always skip field pruning when using merge mode (#7957)
* test: add prefilter regressions for last_row null filters

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: skip fields in all merge mode

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: simplify pre-filter skip fields handling

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: update test

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-14 11:00:52 +00:00
fys
c90e4147de refactor: introduce the ProjectInput structure (#7908)
* refactor: introduce the ProjectInput structure

* remove unused import

* fix: cr

* fix: cr

* fix: code review

* add more unit test

* avoid clone of input.projection
2026-04-14 09:29:33 +00:00
Weny Xu
57f1921253 feat: propagate staging leader through lease and heartbeat (#7950)
* feat(mito): expose staging leader role state

* fix(region): clear staging metadata on leader exit

* feat: propagate staging leader role through heartbeat and metasrv

* chore: update comments

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix(region): unify staging exit role transitions

* chore: update proto

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2026-04-13 09:04:02 +00:00
Yingwen
01a73105b8 feat: use partition range cache in scan (#7873)
* feat: use range cache in scan

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: rename dedup to skip_dedup

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: use background concat for buffered batches

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: store permits

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fix potential panic

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: skip range-cache wrapping when cache is disabled

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: avoid potential deadlock

Deadlock Chain

1. Range-level merge tasks: Each concurrent build_flat_partition_range_read (line 494-506) calls
build_flat_reader_from_sources → create_parallel_flat_sources → spawn_flat_scan_task. These
background tasks loop: acquire permit → input.next() → release permit.
2. Final merge tasks: After all range tasks return streams (line 509-511), the distributor calls
build_flat_reader_from_sources again (line 520-527) → create_parallel_flat_sources → more
spawn_flat_scan_task tasks. These also loop: acquire permit → input.next() → release permit.
3. Circular wait: The final merge tasks' input.next() reads from ReceiverStreams backed by
range-level merge tasks. If all num_partitions permits are held by final merge tasks blocked on
input.next(), the range-level merge tasks can't acquire permits to produce data → deadlock.

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add test for small permits

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: use avg batch size for channel size

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix test

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address review comments

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-13 08:27:53 +00:00
Lei, HUANG
9f7ffb4d26 feat(mito2): allow CompactionOutput to succeed independently (#7948)
* refactor(mito2): improve compaction error handling and file removal

Refactor compaction task execution to enhance error handling and robustness.
- Implemented parallel execution of compaction tasks with proper error capture and logging for individual task failures.
- Ensured JoinSnafu is no longer directly used in error propagation, instead handling errors within the task processing loop.
- Adjusted file removal logic to correctly include expired SSTs after compaction merges.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor(mito2): extract SstMerger trait for testability in compaction

Extract SstMerger trait and DefaultSstMerger implementation to improve the testability of DefaultCompactor.

The DefaultCompactor is now generic over SstMerger, allowing mock implementations to be injected for unit testing without relying on the full object storage access layer. This refactoring separates the concerns of SST file merging from the overall compaction orchestration logic.

Additionally:
- Updated CompactionScheduler to use DefaultCompactor::default().
- Added unit tests for DefaultCompactor using a MockMerger.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix(compaction): propagate join error during sst flush

Correctly propagates the error when joining SST flush handles during compaction. Previously, the error was logged but not returned, leading to potential silent failures.
Also reorders some imports for consistency.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* perf(compaction): pre-allocate capacity for compacted_inputs

Pre-allocates capacity for the compacted_inputs vector based on the estimated total size of inputs and expired SSTs. This optimization aims to reduce vector reallocations during the compaction process.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/allow-partial-compaction:
 ### Commit Message

 Enhance `DefaultCompactor` and `MockMerger` for Improved Flexibility

 - **`compactor.rs`**:
   - Added `Clone` trait to `DefaultSstMerger` and `MockMerger` to allow cloning.
   - Removed `Arc` wrapping from `DefaultCompactor`'s `merger` field for direct usage.
   - Updated `merge_ssts` method to require `Clone` trait for `SstMerger`.
   - Modified `MockMerger` to use `Arc<Mutex>` for `results` and `call_idx` to ensure thread safety.
   - Adjusted error handling to use `error::InvalidMetaSnafu` directly.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-04-13 08:12:11 +00:00
fys
76cad696c6 feat: add parquet nested leaf projection (#7900)
* feat: add parquet nested leaf projection

* rename ParquetProjection related struct

* add some apis

* extract common build schema function for test

* remove unsed method

* keep only deduped parquet root projection constructor

* add more unit tests

* fix: typo

* fix: cr

* fast-path parquet root projection without nested fields

* extract a build_projection_mask method

* fix: cargo clippy
2026-04-10 10:41:48 +00:00
Yingwen
fd94f55193 refactor(mito2): remove dead scan code (#7925)
* refactor(mito2): remove dead batch parallel scan helpers

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(mito2): remove dead merge reader path

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor(mito2): remove dead batch dedup reader

Signed-off-by: evenyag <realevenyag@gmail.com>

* test(mito2): remove obsolete batch source helper

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove unused plain batch

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-10 03:12:33 +00:00
Ruihang Xia
09b368c00a feat: tune constants (#7851)
* feat: tune constants

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* cap output batch size

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle empty input

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* one more ut for cr

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-04-08 23:34:13 +00:00
jeremyhi
6e0f5c5042 chore: memory limit comment (#7914)
* chore: memory limit comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: by gemini comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-04-08 00:37:03 +00:00
Yingwen
6c72dc8e57 fix: add overflow check before interleave() (#7921)
* fix: add overflow check before interleave()

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: pass batches and column index to check_interleave_bytes_overflow

Refactor check_interleave_bytes_overflow to accept batches and a column
index directly, avoiding the intermediate Vec collection of arrays.

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-07 21:59:29 +00:00
Yingwen
233e35c0c9 feat!: switch default sst format to flat (#7909)
* feat: support alter from primary_key to flat

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: alter flat to primary_key

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: change default_experimental_flat_format to true

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: compute channel size from splitted batch size

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add tests for split and channel size

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: always set sst_format from manifest on region open

sanitize_region_options did not set options.sst_format when the
default (PrimaryKey) matched the manifest value, leaving it as None
after reopen. This caused the alter format change to appear lost.

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix tests

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: show create table after alteration

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor!: rename default_experimental_flat_format to default_flat_format

The flat format is no longer experimental. Remove "experimental" from
the config field name, doc comments, and all references.

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-03 04:14:02 +00:00
Yingwen
2af59ed386 feat: always use flat scan path for both format (#7901)
* feat: remove primary_key format scan path

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: remove flat format flag

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: remove CompatReader tests

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: show whether the format is flat in explain

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: stable series scan result

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-02 07:53:33 +00:00
Yingwen
b75a112561 feat: implement prefilter for bulk memtable (#7895)
* feat: prefilter in memtable

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: bulk part reader also do prefilter

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: extract pk filters check

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: scanbench support explain verbose

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add metrics for mem prefilter

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address review comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: remove dead code

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-04-01 09:02:54 +00:00
Yingwen
dde1edcdb4 fix: incorrect prefilter check (#7886)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-31 02:45:07 +00:00
Ning Sun
e14404c677 chore: update rust toolchain to 2026-03-21 (#7849)
* chore: update rust toolchain to 2026-03-21

* chore: new format

* fix: lint

* chore: resolve lint issues

* chore: remove as_millis_f64

* chore: deps up
2026-03-30 12:13:14 +00:00
Yingwen
92e2d71f48 feat: implement prefilter framework and primary key prefilter (#7862)
* feat: prefilter basic framework

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: move arguments to RowGroupBuildContext

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: skip prefiltered exprs in FlatPruneReader

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove unused functions

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update comment

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: handle partition columns in prefilter

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: apply prefiltered selection by and_then

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: handle last row cache

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: don't ignore error in PrimaryKeyFilter

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-30 04:24:20 +00:00
jeremyhi
3904df5397 chore: refine track memory metrics semantics (#7874)
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-03-30 03:32:49 +00:00
Yingwen
d3c8df70f5 fix: fix SeriesScan verbose mode mising metrics (#7872)
Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-26 17:52:18 +00:00
jeremyhi
8058ce7cf2 refactor: simplify scan memory tracking (#7827)
* refactor: simplify scan memory tracking

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: make confg-docs

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: by codex review comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* feat: track_with_policy

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: minor change

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: mem granularity mb to kb

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: by review comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: by scan_memory_on_exhausted comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: by review comment

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: typo

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2026-03-26 03:25:50 +00:00
Lei, HUANG
35c5a4adb7 fix(mito2): accept post-truncate flush for skip-wal tables (#7858)
Allow flush edits with equal entry ids when flushed sequence advances, so close-time flush after truncate still succeeds for skip-wal regions while stale pre-truncate flushes are rejected. Add a regression test for create->truncate->write->close timing.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-25 12:26:27 +00:00
Yingwen
04aa84af62 feat: use ArrowReaderBuilder instead of the RowGroups API (#7853)
* feat: use ArrowReaderBuilder instead of the RowGroups API

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: make row_group_idx required

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: remove unsed variant

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: collect total_fetch_elapsed metrics

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-25 03:10:19 +00:00
Yingwen
0e22d6a72b feat: implement partition range cache stream (#7842)
* feat: add cache stream helpers, key construction, config wiring, and metrics for partition range cache

Add range result cache size config field and wire it through cache builder
chains. Implement cache key building (build_range_cache_key), stream
replay/store helpers (cached_flat_range_stream, cache_flat_range_stream),
dictionary compaction (compact_pk_dictionary), and partition range row group
collection. Add range cache metrics (size, hit, miss) to ScanMetricsSet
and PartitionMetrics. Move fingerprint tests from scan_region to
range_cache module. These functions are not yet wired into scan execution.

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add benchmark for cache stream

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: move bench_util to test_util

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: share dict

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: test ptr_eq

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: simplify value array handling

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: add todo for estimate size

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: simplify size calculation

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: remove one test

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: update config test

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address review comment

Only ignore exprs that can extract time ranges

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix tests

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-24 10:01:13 +00:00
Yingwen
5231ee40c8 feat: add parquet pk prefilter helpers (#7850)
* feat: extract parquet pk prefilter helpers

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix warnings

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: update todo

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2026-03-24 03:57:18 +00:00
Ruihang Xia
f999d5e70e feat: avoid some vector-array conversions on flat projection (#7804)
* perf(mito2): optimize flat projection conversion

* shrink the diff size

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* apply gemini's sugg

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* nit

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2026-03-24 00:11:37 +00:00
Lei, HUANG
7874282089 feat(mito): flat scan for time series memtable (#7814)
* feat/flat-for-time-series:
 ### Commit Message

 Enhance `TimeSeriesMemtable` with Record Batch Support

 - **`time_series.rs`**:
   - Introduced `BatchToRecordBatchContext` to facilitate conversion of batch iterators to record batch iterators.
   - Added `build_record_batch` method in `TimeSeriesIterBuilder` to support record batch creation.
   - Implemented multiple test cases to validate the functionality of record batch creation, including tests for projections,
 deduplication, sequence filtering, and data correctness.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/flat-for-time-series:
 Refactor `TimeSeriesMemtable` and `TimeSeriesIterBuilder`

 - Renamed `adapter_context` to `batch_to_record_batch` in `TimeSeriesMemtable` for clarity.
 - Simplified `MemtableRangeContext` initialization by removing the `batch_to_record_batch` parameter.
 - Added `is_record_batch` method to `TimeSeriesIterBuilder` to indicate record batch status.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/flat-for-time-series:
 ### Add Time Range Filtering and Predicate Group Enhancements

 - **`memtable.rs`**: Updated `IterBuilder` to include `time_range` parameter in `build_record_batch` method, enhancing record batch iteration with time range filtering.
 - **`time_series.rs`**: Modified `TimeSeriesIterBuilder` to use `PredicateGroup` instead of `Predicate`, and integrated `PruneTimeIterator` for time-based filtering.
 - **`memtable_util.rs`**: Removed unused `Predicate` import, reflecting changes in predicate handling.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2026-03-23 19:39:57 +00:00