Lei, HUANG
35c5a4adb7
fix(mito2): accept post-truncate flush for skip-wal tables ( #7858 )
...
Allow flush edits with equal entry ids when flushed sequence advances, so close-time flush after truncate still succeeds for skip-wal regions while stale pre-truncate flushes are rejected. Add a regression test for create->truncate->write->close timing.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2026-03-25 12:26:27 +00:00
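The acceptance rule described in the entry above can be sketched as follows. This is a minimal illustration and not the mito2 implementation: `TruncateMark`, `FlushEdit`, and `accept_flush` are hypothetical names, and the exact comparison is an assumption inferred from the commit message.

```rust
/// Hypothetical record of where the last truncate happened.
#[derive(Debug, Clone, Copy)]
struct TruncateMark {
    entry_id: u64,
    sequence: u64,
}

/// Hypothetical summary of a flush result about to be applied.
#[derive(Debug, Clone, Copy)]
struct FlushEdit {
    flushed_entry_id: u64,
    flushed_sequence: u64,
}

/// Decides whether a flush edit is newer than the last truncate.
///
/// Assumption: a skip-wal region does not advance its entry id, so a flush
/// after truncate can carry the same entry id as the truncate itself. In that
/// case the flushed sequence is used as the tie-breaker.
fn accept_flush(mark: TruncateMark, edit: FlushEdit) -> bool {
    if edit.flushed_entry_id > mark.entry_id {
        return true;
    }
    edit.flushed_entry_id == mark.entry_id && edit.flushed_sequence > mark.sequence
}
```

Under this sketch, a close-time flush whose entry id equals the truncate's entry id passes only if it carries rows written after the truncate; a flush that started before the truncate has an older or equal sequence and is rejected.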
Yingwen
04aa84af62
feat: use ArrowReaderBuilder instead of the RowGroups API ( #7853 )
...
* feat: use ArrowReaderBuilder instead of the RowGroups API
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: make row_group_idx required
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove unused variant
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: collect total_fetch_elapsed metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-25 03:10:19 +00:00
Yingwen
0e22d6a72b
feat: implement partition range cache stream ( #7842 )
...
* feat: add cache stream helpers, key construction, config wiring, and metrics for partition range cache
Add range result cache size config field and wire it through cache builder
chains. Implement cache key building (build_range_cache_key), stream
replay/store helpers (cached_flat_range_stream, cache_flat_range_stream),
dictionary compaction (compact_pk_dictionary), and partition range row group
collection. Add range cache metrics (size, hit, miss) to ScanMetricsSet
and PartitionMetrics. Move fingerprint tests from scan_region to
range_cache module. These functions are not yet wired into scan execution.
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add benchmark for cache stream
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: move bench_util to test_util
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: share dict
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: test ptr_eq
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify value array handling
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add todo for estimate size
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: simplify size calculation
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove one test
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: update config test
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address review comment
Only ignore exprs that can extract time ranges
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: fix tests
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-24 10:01:13 +00:00
Yingwen
5231ee40c8
feat: add parquet pk prefilter helpers ( #7850 )
...
* feat: extract parquet pk prefilter helpers
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix warnings
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update todo
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-24 03:57:18 +00:00
Ruihang Xia
f999d5e70e
feat: avoid some vector-array conversions on flat projection ( #7804 )
...
* perf(mito2): optimize flat projection conversion
* shrink the diff size
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* apply gemini's sugg
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* nit
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2026-03-24 00:11:37 +00:00
Lei, HUANG
7874282089
feat(mito): flat scan for time series memtable ( #7814 )
...
* feat/flat-for-time-series:
### Commit Message
Enhance `TimeSeriesMemtable` with Record Batch Support
- **`time_series.rs`**:
- Introduced `BatchToRecordBatchContext` to facilitate conversion of batch iterators to record batch iterators.
- Added `build_record_batch` method in `TimeSeriesIterBuilder` to support record batch creation.
- Implemented multiple test cases to validate the functionality of record batch creation, including tests for projections,
deduplication, sequence filtering, and data correctness.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* feat/flat-for-time-series:
Refactor `TimeSeriesMemtable` and `TimeSeriesIterBuilder`
- Renamed `adapter_context` to `batch_to_record_batch` in `TimeSeriesMemtable` for clarity.
- Simplified `MemtableRangeContext` initialization by removing the `batch_to_record_batch` parameter.
- Added `is_record_batch` method to `TimeSeriesIterBuilder` to indicate record batch status.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* feat/flat-for-time-series:
### Add Time Range Filtering and Predicate Group Enhancements
- **`memtable.rs`**: Updated `IterBuilder` to include `time_range` parameter in `build_record_batch` method, enhancing record batch iteration with time range filtering.
- **`time_series.rs`**: Modified `TimeSeriesIterBuilder` to use `PredicateGroup` instead of `Predicate`, and integrated `PruneTimeIterator` for time-based filtering.
- **`memtable_util.rs`**: Removed unused `Predicate` import, reflecting changes in predicate handling.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2026-03-23 19:39:57 +00:00
Ruihang Xia
2af3951944
feat: cache decoded region metadata along with parquet metadata ( #7813 )
...
* cache decoded region metadata
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix: account for decoded sst metadata cache weight
* take optional pre-exist metadata
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2026-03-19 03:09:47 +00:00
Yingwen
e0aadffb91
feat: add flat last row reader to the final stream ( #7818 )
...
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-17 07:55:48 +00:00
Yingwen
5a37e58b4f
feat(mito2): add partition range cache infrastructure ( #7798 )
...
* feat: add partition range cache infra
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: optimize scan request fingerprint cloning
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: merge loops
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: more docs
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update estimated size method and comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: only cache when we scan files
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: address PR review comments for partition range cache
- Remove TimeSeriesDistribution from fingerprint as it only affects yield order
- Disable range cache when dyn filters are present since they change at runtime
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-17 03:53:20 +00:00
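The review points in the entry above (the fingerprint ignores time series distribution, and the cache is skipped when dyn filters are present or no files are scanned) can be sketched as follows. All names here (`ScanRequest`, `Distribution`, `fingerprint`, `range_cache_enabled`) are hypothetical and the field set is an assumption; the real fingerprint in mito2 is not shown in this log.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the distribution hint of a scan.
#[derive(Clone, Copy, PartialEq, Eq)]
enum Distribution {
    TimeWindowed,
    PerSeries,
}

/// Hypothetical, heavily simplified scan request.
struct ScanRequest {
    projection: Vec<usize>,
    filters: Vec<String>,
    distribution: Option<Distribution>,
    has_dyn_filters: bool,
    num_files: usize,
}

/// Hashes only the inputs that change which rows a range produces.
/// `distribution` is left out on purpose: it only affects yield order.
fn fingerprint(req: &ScanRequest) -> u64 {
    let mut hasher = DefaultHasher::new();
    req.projection.hash(&mut hasher);
    req.filters.hash(&mut hasher);
    hasher.finish()
}

/// The cache is only useful when files are scanned, and it is not safe
/// when dyn filters can change the result at runtime.
fn range_cache_enabled(req: &ScanRequest) -> bool {
    req.num_files > 0 && !req.has_dyn_filters
}
```

Two requests that differ only in `distribution` map to the same fingerprint in this sketch, so they could share cached results.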
Ning Sun
dd82fcac00
chore: update visibility of BatchToRecordBatchAdapter::new ( #7817 )
2026-03-16 09:56:34 +00:00
Lei, HUANG
be4a7a6d37
refactor: remove Memtable::iter ( #7809 )
...
* refactor: remove Memtable::iter
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* fix: review comments
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2026-03-16 07:49:31 +00:00
Yingwen
e215851c8a
refactor: unify flush and compaction to always use FlatSource ( #7799 )
...
* feat: support write flat as primary key format
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: migrate flush to always use FlatSource
Add FormatType propagation in SstWriteRequest and use it to choose
Flat vs PrimaryKey write paths (write_all_flat vs
write_all_flat_as_primary_key) in AccessLayer and WriteCache. Make
compactor and flush derive the sst_write_format from region options or
engine config. Simplify flush logic and remove the old memtable_source
helper. Update tests to set default sst_write_format.
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: compaction use flat source
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: read parquet sequentially as flat batches
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove new_batch_with_binary in favor of new_record_batch_with_binary
Replace PrimaryKeyWriteFormat with FlatWriteFormat in test_read_large_binary
test and use new_record_batch_with_binary directly, removing the now-unused
new_batch_with_binary function and its BinaryArray import.
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add tests for PrimaryKeyWriteFormat::convert_flat_batch
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove Either from SstWriteRequest
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: handle index build mode
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: consider sparse encoding and last non null in flush
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add unit tests for field_column_start edge cases
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-13 09:44:13 +00:00
discord9
922f9cb3d6
feat: use dyn filter ( #7545 )
...
* parent b2074e3863
author discord9 <discord9@163.com > 1767869295 +0800
committer discord9 <discord9@163.com > 1772529023 +0800
feat: use dyn filter
Signed-off-by: discord9 <discord9@163.com >
not supported
Signed-off-by: discord9 <discord9@163.com >
refactor: use make_mut instead
Signed-off-by: discord9 <discord9@163.com >
refactor: rm need to clone stream ctx
Signed-off-by: discord9 <discord9@163.com >
r
Signed-off-by: discord9 <discord9@163.com >
pcr
Signed-off-by: discord9 <discord9@163.com >
test: wait for datafusion update
Signed-off-by: discord9 <discord9@163.com >
refactor: use arc swap for dyn filters
Signed-off-by: discord9 <discord9@163.com >
* test: update sqlness
Signed-off-by: discord9 <discord9@163.com >
* chore: comment out sqlness
Signed-off-by: discord9 <discord9@163.com >
* test: update sqlness
Signed-off-by: discord9 <discord9@163.com >
* test: sqlness fix
Signed-off-by: discord9 <discord9@163.com >
* refactor: predicate without option
Signed-off-by: discord9 <discord9@163.com >
* feat: print dyn filters & more tests
Signed-off-by: discord9 <discord9@163.com >
* test: sqlness vector result update
Signed-off-by: discord9 <discord9@163.com >
* chore: log
Signed-off-by: discord9 <discord9@163.com >
* test: properly redact
Signed-off-by: discord9 <discord9@163.com >
* test: better data dist for non empty dyn filter
Signed-off-by: discord9 <discord9@163.com >
* test: properly redacted
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* properly redact
Signed-off-by: discord9 <discord9@163.com >
* docs: explain why not do it
Signed-off-by: discord9 <discord9@163.com >
* chore: rename update to add as it's more proper
Signed-off-by: discord9 <discord9@163.com >
* chore: rm no need clone
Signed-off-by: discord9 <discord9@163.com >
* docs: per review
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-03-11 03:06:24 +00:00
Weny Xu
b1b91e88f4
fix(mito2): ensure enter staging waits for compaction ( #7776 )
...
* fix: do not schedule compaction
Signed-off-by: WenyXu <wenymedia@gmail.com >
* mito2: add pending ddl queue primitives to compaction scheduler
Signed-off-by: WenyXu <wenymedia@gmail.com >
* mito2: hand off pending ddls on compaction finish
Signed-off-by: WenyXu <wenymedia@gmail.com >
* mito2: defer enter staging via compaction pending ddl queue
Signed-off-by: WenyXu <wenymedia@gmail.com >
* mito2: cover pending-ddl failure paths on region lifecycle events
Signed-off-by: WenyXu <wenymedia@gmail.com >
* mito2: replay pending ddls directly in compaction handler
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: styling
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: styling
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test: add unit test
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-03-10 13:49:40 +00:00
Yingwen
04cd2c8a05
feat: flat read path support primary_key format memtables ( #7759 )
...
* feat: add adapter for batch to flat recordbatch
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support batch to flat record batch in MemtableRange
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: address review issues for BatchToRecordBatchAdapter
- Extract duplicated read_column_ids computation into a shared
`read_column_ids_from_projection` helper function
- Cache `FormatProjection` in `BatchToRecordBatchContext::new()` instead
of recomputing it on every `adapt_iter()` call
- Remove unnecessary `Arc` wrapping of `read_column_ids` in
`SimpleBulkMemtable::ranges()`
- Fix clippy `filter_map_bool_then` warning in `batch_adapter.rs`
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: simplify comments
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor(mito2): use read column ids in batch adapter
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: test build_record_batch_iter
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: test build_record_batch_iter for all old memtables
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: prune time range before adapter
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: share BatchToRecordBatchContext in simple_bulk_memtable.rs
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: use ScalarValue::to_array_of_size to build repeated value array
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-10 12:46:39 +00:00
Ruihang Xia
58528d1334
feat: fast path for empty selection files ( #7780 )
...
* perf(mito2): skip reader context for empty selections
* refactor(mito2): make parquet reader input optional
2026-03-10 12:22:03 +00:00
Weny Xu
3e81345d7f
fix(mito2): avoid parquet redownload when write cache already contains file ( #7777 )
...
* feat: add idempotent write cache download API
* fix: skip parquet redownload in manifest edit path
* test: cover download_if_absent cache hit and miss
* chore: add comments
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-03-09 12:16:53 +00:00
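The idempotent download described in the entry above can be sketched as follows. Only the name `download_if_absent` comes from the commit message; the `WriteCache` type, its in-memory map, and the closure-based signature are assumptions made for illustration and do not reflect the real mito2 API.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a write cache keyed by file id.
#[derive(Default)]
struct WriteCache {
    files: HashMap<String, Vec<u8>>,
}

impl WriteCache {
    /// Fetches the file only when the cache does not hold it yet.
    /// Returns `true` if a download happened and `false` on a cache hit.
    fn download_if_absent<F>(&mut self, file_id: &str, fetch: F) -> bool
    where
        F: FnOnce() -> Vec<u8>,
    {
        if self.files.contains_key(file_id) {
            // Cache hit: the fetch closure is never invoked.
            return false;
        }
        self.files.insert(file_id.to_string(), fetch());
        true
    }
}
```

Calling the method twice with the same file id runs the fetch closure once, which is the property the manifest edit path relies on to avoid a second download.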
Yingwen
4232ce8eaa
test: fix unstable index meta list test ( #7774 )
...
* test: fix unstable index meta list test
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: raise bucket_size threshold to avoid bucketing sizes in [512, 999] to 0
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-09 07:57:43 +00:00
Ruihang Xia
a71df9477a
perf(mito2): speed up parquet scan via minmax caches ( #7708 )
...
* perf(mito2): speed up parquet scan via meta caches
* fix(mito2): fix parquet pruning and metadata cache
* revert config changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* enhance cache file enter logic
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* resolve tiny cr comments
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* only preload from fs or cache
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix vector test
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2026-03-07 06:32:47 +00:00
Yingwen
93c48a078c
feat: implement last row cache reader for flat format ( #7757 )
...
* feat: initial implementation
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: handle multiple series
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: reset state in finish()
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: handle duplicated last timestamps across batches
Signed-off-by: evenyag <realevenyag@gmail.com >
* perf: compact primary key array
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix(mito2): simplify flat last timestamp selector state
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor(mito2): rebuild flat pk dictionary from selector state
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: reduce tests
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: more logs to debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: concat batches in last row reader
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor(mito2): simplify flat last row selector output buffer
- Replace VecDeque with BatchBuffer struct for output buffering
- Remove rebuild_pk_dictionary_for_key as batches go directly into buffer
- Remove unused push method and make BatchBuffer pub(crate)
- Remove debug logging in maybe_update_cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address comments
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-06 12:08:58 +00:00
Yingwen
33acbf985d
fix: Allow overriding sequence when writing flat SSTs ( #7764 )
...
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-06 09:43:58 +00:00
discord9
56ee8baa3f
feat: admin gc table/regions ( #7619 )
...
* feat: gc table
Signed-off-by: discord9 <discord9@163.com >
* test: admin gc
Signed-off-by: discord9 <discord9@163.com >
* chore: after rebase fix
Signed-off-by: discord9 <discord9@163.com >
* refactor: GcStats
Signed-off-by: discord9 <discord9@163.com >
* refactor: use gc ticker for admin gc
Signed-off-by: discord9 <discord9@163.com >
* fix: region routes override
Signed-off-by: discord9 <discord9@163.com >
* test: non happy path
Signed-off-by: discord9 <discord9@163.com >
* refactor: gc job report enum
Signed-off-by: discord9 <discord9@163.com >
* test: process 0 regions
Signed-off-by: discord9 <discord9@163.com >
* after rebase
Signed-off-by: discord9 <discord9@163.com >
* feat: allow manual gc to return error
Signed-off-by: discord9 <discord9@163.com >
* chore: update proto
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* chore: timeout and update proto
Signed-off-by: discord9 <discord9@163.com >
* chore: update proto
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-03-06 08:25:44 +00:00
discord9
5e6d2b221e
feat: gc batch delete files ( #7733 )
...
* feat: batch delete
Signed-off-by: discord9 <discord9@163.com >
* chore: explicit error msg
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* refactor: not fallback when batch failure
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* refactor: batch delete in access layer
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-03-06 07:17:20 +00:00
discord9
4c30b9efaf
fix: null first for part expr as logical expr ( #7747 )
...
* fix: null first for part expr as logical expr
Signed-off-by: discord9 <discord9@163.com >
* test: update tests
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* fix: null handling & non-null filter
Signed-off-by: discord9 <discord9@163.com >
* chore: doc test
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-03-06 02:53:05 +00:00
Yingwen
5c8ece27e0
feat: improve filter support for scanbench ( #7736 )
...
* feat: cast filters type for scanbench
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: pub file_range mod
So we can use the pub struct FileRange in other places
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: add api as dev-dependency to cmd for clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support profiling after warmup
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-03-03 09:00:41 +00:00
LFC
b2074e3863
chore: upgrade DataFusion family, again ( #7578 )
...
* chore: upgrade DataFusion family
Signed-off-by: luofucong <luofc@foxmail.com >
* chore: switch to released version of datafusion-pg-catalog
---------
Signed-off-by: luofucong <luofc@foxmail.com >
Co-authored-by: Ning Sun <sunning@greptime.com >
Co-authored-by: Ning Sun <sunng@protonmail.com >
2026-03-03 07:36:39 +00:00
Yingwen
f4b4d61651
feat: add extension range api for flat format ( #7730 )
...
feat: add flat extension api
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-02-27 09:48:18 +00:00
LFC
5eac4f10aa
chore: remove dependency on "atty" ( #7725 )
...
Signed-off-by: luofucong <luofc@foxmail.com >
2026-02-26 09:58:01 +00:00
Yingwen
0c30bf1a10
feat: add a subcommand to bench scan ( #7722 )
...
* feat: support scan bench
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support projection by name
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support force flat format
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: spawn tasks to poll streams
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support filter config
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: scan bench support wal
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: support not providing provider in wal
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: skip wal replay
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: wrap EngineComponents
Signed-off-by: evenyag <realevenyag@gmail.com >
* docs: add scanbench doc
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: change --skip-wal-replay to --enable-wal
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove limit from config
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-02-26 06:37:40 +00:00
discord9
46683f908a
chore: tracing for gc ( #7723 )
...
* chore: tracing for gc
Signed-off-by: discord9 <discord9@163.com >
* chore: rm some clone
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-02-26 06:37:29 +00:00
Weny Xu
df04267c54
fix(repartition): reject writes on deallocating regions during region merge ( #7694 )
...
* feat(meta): add write route policy to region route with backward compatibility
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix(meta): use partition_expr compatibility accessor in repartition matching
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(meta): introduce staging partition rule enum for repartition instructions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(datanode): plumb staging partition rule enum through heartbeat handlers
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(meta): mark pending-deallocate regions as reject-all during merge staging
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(partition): exclude reject-all regions from write partitioning
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(mito): store staging partition rule enum in region state
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(mito): reject writes in staging when partition rule is reject-all
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat(meta): send enter staging instruction with reject-all
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix(repartition): preserve reject-all on exit, merge enter-staging instructions, and allow staged bulk writes
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: refactor to ignore all writes
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: rename StagingPartitionRule to StagingPartitionDirective across staging flow
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add comments
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: clippy
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: nit
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: rename
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-02-25 07:04:38 +00:00
Yingwen
42ad842434
feat: support changing table's append_mode to true ( #7669 )
...
* feat: support alter append_mode to true
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add sqlness test
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: clear merge mode in mito when setting append mode
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: sanitize open request and options with both append/merge mode
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: clear merge mode when append mode is true
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-02-25 04:11:23 +00:00
discord9
279b009583
chore: more gc metrics ( #7661 )
...
* chore: more gc metrics
Signed-off-by: discord9 <discord9@163.com >
* clippy
Signed-off-by: discord9 <discord9@163.com >
* refactor: simple metrics
Signed-off-by: discord9 <discord9@163.com >
* unused metrics
Signed-off-by: discord9 <discord9@163.com >
* fix(meta-srv): count need-retry regions in GC failure metric
Signed-off-by: discord9 <discord9@163.com >
* chore: better bucketing
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-02-24 08:21:10 +00:00
Weny Xu
0ed3b83099
refactor: rename partition rule version to partition expr version ( #7696 )
...
* refactor: rename partition rule version to partition expr version
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: update proto
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: clippy
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-02-10 10:12:47 +00:00
Weny Xu
45a3e1121d
fix(mito2): introduce PartitionExprChange in staging flow and keep memtables on metadata-only updates ( #7695 )
...
* feat(mito2): add RegionMetaAction::PartitionExprChange
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor(mito2): apply partition-expr action in manifest builder and manager
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor(mito2): add partition-expr action merge rules and conflict guard
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor(mito2): use partition-rule action in enter staging
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix(mito2): validate Change and route to metadata-only update on staging exit
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test(mito2): cover partition-expr action staging flow and conflict cases
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test(mito2): add apply-staging coverage for Change metadata validation
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test(mito2): add apply-staging coverage for Change metadata validation
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: fmt
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: remove unused error
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add warn
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add comments
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test: preserves unflushed memtable
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: fmt
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-02-10 08:26:25 +00:00
fys
c75f6d8ca8
fix(mito2): filter extension ranges in pruner ( #7693 )
...
* fix: ci
* fix: cargo fmt
2026-02-10 02:41:22 +00:00
Weny Xu
8026b23834
feat: partition rule version validation for writes and staging ( #7628 )
...
* feat: verify partition rule
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: add partition version cache
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: header check
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: fmt toml
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: minor refactor
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: header
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: fix clippy
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: minor refactor
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: nit
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: nit
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-02-06 12:16:34 +00:00
Yingwen
581c777dce
feat: Implement a shared pruner for partitions in the same scanner ( #7635 )
...
* feat: implement pruner
feat: initial implementation for pruner
Signed-off-by: evenyag <realevenyag@gmail.com >
refactor: simplify worker
Signed-off-by: evenyag <realevenyag@gmail.com >
refactor: increase remaining counts in each scan partition
Signed-off-by: evenyag <realevenyag@gmail.com >
feat: pre filter files prepared to read
Signed-off-by: evenyag <realevenyag@gmail.com >
feat: add metrics for pruner
Signed-off-by: evenyag <realevenyag@gmail.com >
feat: more logs for worker
Signed-off-by: evenyag <realevenyag@gmail.com >
feat: log files
Signed-off-by: evenyag <realevenyag@gmail.com >
fix: move senders to pruner to avoid cyclic ref
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect ReaderMetrics in pruner worker
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: report build part cost for per file metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: skip files with no ranges in top metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comments for pruner
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: wrap method to get worker idx
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove unused PruneStatus
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-02-05 15:28:28 +00:00
dennis zhuang
8883022742
refactor(vector-index): use protobuf for metadata and align code ( #7648 )
...
* refactor(vector-index): use protobuf for metadata and introduce lifecycle traits
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: minor change
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* refactor: by suggestions
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: format
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: style
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: remove usearch from mito2
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: tweak errors
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* test: update index size in result
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: clippy
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: update proto deps
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
2026-02-05 02:41:48 +00:00
dennis zhuang
c08f3a4472
test: adds sqlness test for vector index ( #7634 )
...
* test: adds sqlness test for vector index
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: CI
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* test: redacted flat map and size
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* test: simplify the replace rules
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: update comments and tests
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
2026-02-04 03:54:47 +00:00
Yingwen
c6ce4485a2
chore: adjust manifest cache log level ( #7655 )
...
* chore: adjust manifest cache log level
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add consts to words
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-02-03 07:08:52 +00:00
Ruihang Xia
d5285e965b
perf(mito2): merge last_non_null within memtable batches ( #7653 )
...
* perf(mito2): merge last_non_null within memtable batches
* fix(mito2): apply sequence filter before memtable merge
* test(mito2): cover merge_last_non_null
* refactor(mito2): remove redundant loop label
2026-02-02 13:43:32 +00:00
Yingwen
a8f1ed7fc9
feat: add recover_sync to ManifestCache::new ( #7652 )
...
feat: add recover_sync to ManifestCache::new
This ensures tests can recover in sync
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-02-02 12:33:41 +00:00
discord9
dd4698002a
fix: send get file ref to all regions ( #7640 )
...
* fix: send get file ref to all regions
Signed-off-by: discord9 <discord9@163.com >
* refactor: return err on fail to get table route
Signed-off-by: discord9 <discord9@163.com >
* refactor: batch get
Signed-off-by: discord9 <discord9@163.com >
* chore: add logging in all places
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-01-30 10:36:59 +00:00
Weny Xu
ac9c830365
fix: clean up staging blob directory on clear ( #7642 )
...
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-01-30 09:39:37 +00:00
Yingwen
7711661618
feat: BulkMemtable compact parts without encoding into Parquet ( #7617 )
...
* feat: implement MultiBulkPart to hold a list of batches in BulkMemtable
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: Only encode parts when there are enough rows
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: merge MultiBulkPartIter and BulkPartBatchIter
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove some enums and structs
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: reuse code in merging bulk/encoded parts
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: collect part groups directly
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add unit tests
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: enlarge merge threshold and configure by env
- GREPTIME_BULK_MERGE_THRESHOLD
- GREPTIME_BULK_ENCODE_ROW_THRESHOLD
- GREPTIME_BULK_ENCODE_BYTES_THRESHOLD
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: change flush strategy
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add BulkMemtableConfig
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: limit max groups and adjust threshold
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add flush file number metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add bulk filter 1 host bench
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: adjust bulk compact threshold
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: flush a file if == min_flush_rows
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: fix test_index_build_type_compact test
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: fix mito tests
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: remove regions from catchup_regions before notify
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2026-01-30 08:03:36 +00:00
Ruihang Xia
d99a946d33
refactor: remove duplications from mito ( #7632 )
...
* parse wal options
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* new memtable from version
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* file path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* use wal entry reader
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* map batch responses
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2026-01-28 09:03:22 +00:00
discord9
00f568ed28
fix: gc update repart map properly ( #7606 )
...
* feat: update repart map
Signed-off-by: discord9 <discord9@163.com >
* fix: table id write lock
Signed-off-by: discord9 <discord9@163.com >
* chore: default value
Signed-off-by: discord9 <discord9@163.com >
* chore: config
Signed-off-by: discord9 <discord9@163.com >
* test: update repartition map
Signed-off-by: discord9 <discord9@163.com >
* fix: empty file ref set
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* chore: properly log error
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2026-01-28 04:31:19 +00:00
Weny Xu
5bfc728d32
fix(repartition): improve physical region allocation and compaction read path correctness ( #7621 )
...
* fix: fix metadata region
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: adjust repartition flow and compaction read compatibility
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: remove logs
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: rename compaction mapper and pk projection
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: rename `CompactionProjectionMapper`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: clarify compaction projection naming
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add comments
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: fmt
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: allow create physical table with internal columns
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test: add tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix template logic
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit test
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: update sqlness result
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2026-01-28 04:04:05 +00:00
dennis zhuang
238bc4fa2c
feat: impl vector index query ( #7564 )
...
* feat: impl vector index query
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* feat: remove VectorSearchRule and merge it into scan hint rule
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* refactor: vector search hint
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* test: join and subquery
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: clippy when feature disabled
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: push hint only when column is non-nullable or an explicit IS NOT NULL filter exists
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: transformed = true
Co-authored-by: Yingwen <realevenyag@gmail.com >
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: remove adapter vector hint
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: revert transformed
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
Co-authored-by: Yingwen <realevenyag@gmail.com >
2026-01-28 03:40:56 +00:00