Yingwen
14550429e9
chore: reduce SeriesScan sender timeout ( #6983 )
...
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-17 07:02:47 +00:00
Lei, HUANG
c92ab4217f
fix: avoid truncating SST statistics during flush ( #6977 )
...
fix/disable-parquet-stats-truncate:
- **Update `memcomparable` Dependency**: Switched from crates.io to a Git repository for `memcomparable` in `Cargo.lock`, `mito-codec/Cargo.toml`, and removed it from `mito2/Cargo.toml`.
- **Enhance Parquet Writer Properties**: Added `set_statistics_truncate_length` and `set_column_index_truncate_length` to `WriterProperties` in `parquet.rs`, `bulk/part.rs`, `partition_tree/data.rs`, and `writer.rs`.
- **Add Test for Corrupt Scan**: Introduced a new test module `scan_corrupt.rs` in `mito2/src/engine` to verify handling of corrupt data.
- **Update Test Data**: Modified test data in `flush.rs` to reflect changes in file sizes and sequences.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2025-09-17 03:02:52 +00:00
Zhenchi
77981a7de5
fix: clean intm ignore notfound ( #6971 )
...
* fix: clean intm ignore notfound
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-09-17 02:58:03 +00:00
Lei, HUANG
9096c5ebbf
chore: bump sequence on region edit ( #6947 )
...
* chore/update-sequence-on-region-edit:
### Commit Message
Refactor `get_last_seq_num` Method Across Engines
- **Change Return Type**: Updated the `get_last_seq_num` method to return `Result<SequenceNumber, BoxedError>` instead of `Result<Option<SequenceNumber>, BoxedError>` in the following files:
- `src/datanode/src/tests.rs`
- `src/file-engine/src/engine.rs`
- `src/metric-engine/src/engine.rs`
- `src/metric-engine/src/engine/read.rs`
- `src/mito2/src/engine.rs`
- `src/query/src/optimizer/test_util.rs`
- `src/store-api/src/region_engine.rs`
- **Enhance Region Edit Handling**: Modified `RegionWorkerLoop` in `src/mito2/src/worker/handle_manifest.rs` to update file sequences during region edits.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* add committed_sequence to RegionEdit
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* chore/update-sequence-on-region-edit:
### Commit Message
Refactor sequence retrieval method
- **Renamed Method**: Changed `get_last_seq_num` to `get_committed_sequence` across multiple files to better reflect its purpose of retrieving the latest committed sequence.
- Affected files: `tests.rs`, `engine.rs` in `file-engine`, `metric-engine`, `mito2`, `test_util.rs`, and `region_engine.rs`.
- **Removed Unused Struct**: Deleted `RegionSequencesRequest` struct from `region_request.rs` as it is no longer needed.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* chore/update-sequence-on-region-edit:
**Add Committed Sequence Handling in Region Engine**
- **`engine.rs`**: Introduced a new test module `bump_committed_sequence_test` to verify committed sequence handling.
- **`bump_committed_sequence_test.rs`**: Added a test to ensure the committed sequence is correctly updated and persisted across region reopenings.
- **`action.rs`**: Updated `RegionManifest` and `RegionManifestBuilder` to include `committed_sequence` for tracking.
- **`manager.rs`**: Adjusted manifest size assertion to accommodate new committed sequence data.
- **`opener.rs`**: Implemented logic to override committed sequence during region opening.
- **`version.rs`**: Added `set_committed_sequence` method to update the committed sequence in `VersionControl`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* chore/update-sequence-on-region-edit:
**Enhance `test_bump_committed_sequence` in `bump_committed_sequence_test.rs`**
- Updated the test to include row operations using `build_rows`, `put_rows`, and `rows_schema` to verify the committed sequence behavior.
- Adjusted assertions to reflect changes in committed sequence after row operations and region edits.
- Added comments to clarify the expected behavior of committed sequence after reopening the region and replaying the WAL.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* chore/update-sequence-on-region-edit:
**Enhance Region Sequence Management**
- **`bump_committed_sequence_test.rs`**: Updated test to handle region reopening and sequence management, ensuring committed sequences are correctly set and verified after edits.
- **`opener.rs`**: Improved committed sequence handling by overriding it only if the manifest's sequence is greater than the replayed sequence. Added logging for mutation sequence replay.
- **`region_write_ctx.rs`**: Modified `push_mutation` and `push_bulk` methods to adopt sequence numbers from parameters, enhancing sequence management during write operations.
- **`handle_write.rs`**: Updated `RegionWorkerLoop` to pass sequence numbers in `push_bulk` and `push_mutation` methods, ensuring consistent sequence handling.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* chore/update-sequence-on-region-edit:
### Remove Debug Logging from `opener.rs`
- Removed debug logging for mutation sequences in `opener.rs` to clean up the output and improve performance.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2025-09-16 16:22:25 +00:00
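The opener rule described in #6947 above (override the committed sequence only if the manifest's sequence is greater than the replayed sequence) can be sketched roughly as follows. This is an illustration under assumptions, not the actual mito2 code: the function name `resolve_committed_sequence` is hypothetical, and `SequenceNumber` is assumed to be a plain `u64`.

```rust
/// Assumed alias for illustration; the real type is defined in the codebase.
type SequenceNumber = u64;

/// Hypothetical helper: picks the committed sequence to use after a region is
/// opened and its WAL has been replayed.
///
/// `replayed` is the sequence reached by WAL replay; `manifest` is the
/// committed sequence persisted in the manifest, if any.
fn resolve_committed_sequence(
    replayed: SequenceNumber,
    manifest: Option<SequenceNumber>,
) -> SequenceNumber {
    match manifest {
        // The manifest is ahead of the replayed WAL (for example after a
        // region edit bumped the sequence), so it wins.
        Some(seq) if seq > replayed => seq,
        // Otherwise keep whatever replay produced.
        _ => replayed,
    }
}
```

With this sketch, `resolve_committed_sequence(10, Some(15))` yields 15, while a missing or older manifest value leaves the replayed sequence unchanged.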
Yingwen
b8e0c49cb4
feat: add a flag to enable the experimental flat format ( #6976 )
...
* feat: add enable_experimental_flat_format flag to enable flat format
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: extract build_scan_input for CompactionSstReaderBuilder
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add compact memtable cost to flush metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: Sets compact dispatcher for bulk memtable
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: Cast dictionary to target type in FlatProjectionMapper
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: add time index to FlatProjectionMapper::batch_schema
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update config toml
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: pass flat_format to ProjectionMapper in CompactionSstReaderBuilder
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-16 09:33:12 +00:00
Zhenchi
db42ad42dc
feat: add visible to sst entry for staging mode ( #6964 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-09-15 09:05:54 +00:00
Yingwen
b3aabb6706
feat: support flush and compact flat format files ( #6949 )
...
* feat: basic functions for flush/compact flat format
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: bridge flush and compaction for flat format
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add write cache support
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: change log level to debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: wrap duplicated code to merge and dedup iter
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: wrap some code into flush_flat_mem_ranges
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: extract logic into do_flush_memtables
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-14 13:36:24 +00:00
Ruihang Xia
d86f489a74
fix: staging mode with proper region edit operations ( #6962 )
...
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-12 04:39:42 +00:00
Yingwen
5e65581f94
feat: support flat format for SeriesScan ( #6938 )
...
* feat: Support flat format for SeriesScan
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: simplify tests
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: only accumulate fetch time to scan_cost in SeriesDistributor of
the SeriesScan
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comment
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-11 06:12:53 +00:00
Ruihang Xia
2f6da3b718
feat: exiting staging mode on success case ( #6913 )
...
* clear staging directories on entering staging mode
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* test and merger
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* storage accessor
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* exit on success
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* use remove_all
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-10 01:52:55 +00:00
Ruihang Xia
83290be3ba
feat: store partition expr per file in region manifest ( #6849 )
...
* feat: store partition expr per file in region manifest
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* use deserialized form in memory
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* test non-existing partition expr
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* try remove partition expr dup
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* Revert "try remove partition expr dup"
This reverts commit a0f2c8ca6a127ef356a5510466e84513cf4b13d8.
* fix merge test
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* format doc
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-09 23:37:29 +00:00
Zhenchi
21ee981b49
feat: add origin_region_id and node_id to sst entry ( #6937 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-09-09 09:45:15 +00:00
Yingwen
9c22092189
feat: Implements compaction for bulk memtable ( #6923 )
...
* feat: initial bulk memtable compaction implementation
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement compact for memtable
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add tests for bulk memtable compaction
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address review comments
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-09 09:15:25 +00:00
Zhenchi
9fe84f6fbd
feat(mito): backfill partition expr on region open ( #6862 )
...
* feat(mito): backfill partition expr on region open
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* tiny polish
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* only writable leader persists
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* catchup needs backfill
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* add log
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* handle distribute mode
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* only when transfer to set gracefully
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix fmt
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-09-09 02:50:01 +00:00
Ruihang Xia
c9377e7c5a
build: bump rust edition to 2024 ( #6920 )
...
* bump edition
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* format
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* gen keyword
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* lifetime and env var
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* one more gen fix
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* lifetime of temporaries in tail expressions
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* format again
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clippy nested if
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clippy let and return
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-08 02:37:18 +00:00
Weny Xu
658d07bfc8
feat: add written_bytes_since_open column to region_statistics table ( #6904 )
...
* feat: add `write_bytes` column to `region_statistics` table
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: update comments
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: rename `write_bytes` to `written_bytes`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: rename `written_bytes` to `written_bytes_since_open`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-09-05 07:27:30 +00:00
discord9
f7c8d86ebb
feat: file ref mgr ( #6844 )
...
* feat: file ref manager
Signed-off-by: discord9 <discord9@163.com >
* refactor: put file ref mgr to sep file
Signed-off-by: discord9 <discord9@163.com >
* chore: rename path
Signed-off-by: discord9 <discord9@163.com >
* test: ref path
Signed-off-by: discord9 <discord9@163.com >
* fix: pass node id
Signed-off-by: discord9 <discord9@163.com >
* chore: after rebase fix
Signed-off-by: discord9 <discord9@163.com >
* feat: exclude files already in manifest
Signed-off-by: discord9 <discord9@163.com >
* docs: explain why it can work
Signed-off-by: discord9 <discord9@163.com >
* feat: also include manifest versions
Signed-off-by: discord9 <discord9@163.com >
* refactor: per review
Signed-off-by: discord9 <discord9@163.com >
* rename func to imply what it's actually doing
Signed-off-by: discord9 <discord9@163.com >
* more docs
Signed-off-by: discord9 <discord9@163.com >
* docs: expect gc worker to do the job right
Signed-off-by: discord9 <discord9@163.com >
* refactor: partially per review
Signed-off-by: discord9 <discord9@163.com >
* refactor: per review
Signed-off-by: discord9 <discord9@163.com >
* chore: unused
Signed-off-by: discord9 <discord9@163.com >
* metrics: change to per datanode instead
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-09-05 03:45:01 +00:00
Yingwen
8bbb396506
feat: Supports flat format in SeqScan and UnorderedScan ( #6905 )
...
* feat: support flat format in SeqScan
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support flat format in unordered scan
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support parallel read for flat format in SeqScan
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename flat DedupReader to FlatDedupReader
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address review comments
It also precomputes the input arrow schema
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-04 13:12:24 +00:00
Ruihang Xia
493a1efe65
feat: skip compaction on large file on append only mode ( #6838 )
...
* feat: skip compaction on large file on append only mode
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* log ignored files
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* format
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* only ignore level 1 files
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* early exit
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix typo
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-04 12:01:09 +00:00
Yingwen
40a49ddc82
feat: implement basic write/read methods for bulk memtable ( #6888 )
...
* feat: implement basic write/read methods for bulk memtable
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: change update stats atomic ordering
We have already acquired the write lock so Relaxed ordering is fine
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-04 08:42:05 +00:00
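The note in #6888 above, that `Relaxed` ordering is fine once the write lock is held, can be illustrated with a minimal sketch. The `Memtable` struct and its fields here are hypothetical stand-ins, not the bulk memtable's real layout. The assumption is that every stats update happens while the write lock is held, so the lock already serializes writers and the atomic only needs to provide an untorn counter.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

/// Hypothetical stand-in for a memtable with a lock-protected body and
/// atomically updated statistics.
struct Memtable {
    rows: RwLock<Vec<u64>>,
    num_rows: AtomicUsize,
}

impl Memtable {
    fn new() -> Self {
        Memtable {
            rows: RwLock::new(Vec::new()),
            num_rows: AtomicUsize::new(0),
        }
    }

    /// Appends rows and updates the row counter while holding the write lock.
    fn write(&self, values: &[u64]) {
        let mut rows = self.rows.write().unwrap();
        rows.extend_from_slice(values);
        // Writers are serialized by the lock, so no stronger ordering is
        // needed to keep the counter consistent between writers.
        self.num_rows.fetch_add(values.len(), Ordering::Relaxed);
    }

    /// Reads the counter without taking the lock; the value may lag slightly
    /// behind a concurrent write but is never torn.
    fn num_rows(&self) -> usize {
        self.num_rows.load(Ordering::Relaxed)
    }
}
```

Readers that need the counter to agree exactly with the row data would have to take the lock as well; the sketch only covers the statistics-style use the commit message describes.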
Zhenchi
deb728c38a
fix: move prune_region_dir to region drop ( #6891 )
...
* fix: move prune_region_dir to region drop
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-09-03 07:11:05 +00:00
Weny Xu
7cf47ccf54
fix: initialize remote WAL regions with correct flushed entry IDs ( #6856 )
...
* fix: initialize remote WAL regions with correct flushed entry IDs
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add logs
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: correct latest offset
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: update sqlness
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: add replay checkpoint to catchup request
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: logs
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-09-03 04:00:02 +00:00
Ruihang Xia
56db1d468a
chore: fix typo ( #6885 )
...
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-03 02:43:25 +00:00
Zhenchi
9b8f04b065
fix: prune intermediate dirs on index finish and region purge ( #6878 )
...
* fix: prune intermediate dirs on index finish and region purge
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-09-02 13:57:16 +00:00
Yingwen
8fc42aeb27
feat: Update parquet writer and indexer to support the flat format ( #6866 )
...
* feat: implements method to write flat batch for ParquetWriter
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add update method for flat RecordBatch in Indexer
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: calls indexer to write flat batch in ParquetWriter
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: handle empty projection for flat format
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: eval array in precise_filter_flat
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: cache column lookup result in inverted indexer
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add test
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support dict type in dense codec
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: remove read part in test as it requires modifying the reader
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support dictionary type in other methods for dense codec
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: fulltext use string array directly
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-02 12:48:34 +00:00
Yingwen
ab96703d8f
chore: enlarge max file limit to 384 ( #6868 )
...
chore: enlarge max concurrent files to 384
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-09-01 09:26:20 +00:00
zyy17
d585c23ba5
refactor: add stop methods for LocalFilePurger and CompactionRegion ( #6848 )
...
* refactor: add `LocalFilePurger::stop(&self)` and `stop_file_purger()` of `CompactionRegion`
Signed-off-by: zyy17 <zyylsxm@gmail.com >
* chore: rename methods
Signed-off-by: zyy17 <zyylsxm@gmail.com >
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com >
2025-08-29 09:23:59 +00:00
discord9
bacd9c7d15
feat: add event ts to region manifest ( #6751 )
...
* feat: add event ts to region manifest
Signed-off-by: discord9 <discord9@163.com >
delete files
Signed-off-by: discord9 <discord9@163.com >
chore: clippy
Signed-off-by: discord9 <discord9@163.com >
lower concurrency
Signed-off-by: discord9 <discord9@163.com >
feat: gc use delta manifest to get expel time
Signed-off-by: discord9 <discord9@163.com >
docs: terminology
Signed-off-by: discord9 <discord9@163.com >
refactor: some advices from review
Signed-off-by: discord9 <discord9@163.com >
feat: manifest add removed files field
Signed-off-by: discord9 <discord9@163.com >
feat(WIP): add remove time in manifest
Signed-off-by: discord9 <discord9@163.com >
wip: more config
Signed-off-by: discord9 <discord9@163.com >
feat: manifest update removed files
Signed-off-by: discord9 <discord9@163.com >
test: add remove file opts field
Signed-off-by: discord9 <discord9@163.com >
test: fix test
Signed-off-by: discord9 <discord9@163.com >
chore: delete gc.rs
Signed-off-by: discord9 <discord9@163.com >
* feat: proper option name
Signed-off-by: discord9 <discord9@163.com >
* refactor: per review
Signed-off-by: discord9 <discord9@163.com >
* test: update manifest size
Signed-off-by: discord9 <discord9@163.com >
* test: fix eq
Signed-off-by: discord9 <discord9@163.com >
* refactor: some per review
Signed-off-by: discord9 <discord9@163.com >
* refactor: per review
Signed-off-by: discord9 <discord9@163.com >
* refactor: more per review
Signed-off-by: discord9 <discord9@163.com >
* test: update manifest size
Signed-off-by: discord9 <discord9@163.com >
* test: update config
Signed-off-by: discord9 <discord9@163.com >
* feat: keep count 0 means only use ttl
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-08-28 09:08:18 +00:00
Yingwen
32e73dad12
fix: use actual buf size as cache page value size ( #6829 )
...
* feat: cache the cloned page bytes
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: cache the whole row group pages
The opendal reader may merge IO requests so the pages of different
columns can share the same Bytes.
When we use a per-column page cache, the page cache may still reference
the whole Bytes after eviction if there are other columns in the cache that
share the same Bytes.
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: check possible max byte range and copy pages if needed
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: always copy pages
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: returns the copied pages
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: compute cache size by MERGE_GAP
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: align to buf size
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: align to 2MB
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove unused code
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix typo
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: fix parquet read with cache test
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-28 03:37:11 +00:00
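The problem described in #6829 above (pages of different columns sharing one merged buffer, so evicting one page does not free the memory) can be sketched with the standard library alone. This is an analogy under assumptions: `Arc<Vec<u8>>` stands in for the reference-counted `Bytes` returned by the reader, and `PageView`, `split_pages`, and `to_owned_page` are hypothetical names.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical page that is only a view into a shared, merged read buffer.
struct PageView {
    buf: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl PageView {
    /// Borrows the bytes of this page out of the shared buffer.
    fn bytes(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// Copies the page so it no longer keeps the merged buffer alive.
    fn to_owned_page(&self) -> Vec<u8> {
        self.bytes().to_vec()
    }
}

/// Splits one merged buffer into per-column page views that all share it.
fn split_pages(merged: Arc<Vec<u8>>, ranges: &[Range<usize>]) -> Vec<PageView> {
    ranges
        .iter()
        .map(|r| PageView {
            buf: Arc::clone(&merged),
            range: r.clone(),
        })
        .collect()
}
```

In this model, dropping one `PageView` releases nothing while another view still holds the `Arc`, so a cache that charges each entry only for its own range would undercount memory. Copying pages, as the "always copy pages" step does, makes each cached value own exactly the bytes it is charged for.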
Weny Xu
566a647ec7
feat: add replay checkpoint to reduce overhead for remote WAL ( #6816 )
...
* feat: introduce `TopicRegionValue`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: persist region replay checkpoint
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: introduce checkpoint
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: update config.md
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: minor refactor
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: send open region instructions with replay checkpoint
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: use usize
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: add topic name pattern
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: enable wal prune by default
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-08-27 07:24:33 +00:00
Yingwen
906e1ca0bf
feat: functions and structs to scan flat format file and mem ranges ( #6817 )
...
* feat: implement function to scan flat memtable ranges
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement function to scan flat file ranges
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: compat batch in scan file range
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: scan other ranges
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-27 06:31:30 +00:00
Arshdeep
8894cb5406
feat: resolve unused dependencies with cargo-udeps ( #6578 ) ( #6619 )
...
* feat: resolve unused dependencies with cargo-udeps (#6578)
Signed-off-by: Arshdeep54 <balarsh535@gmail.com >
* Apply suggestion from @zyy17
Co-authored-by: zyy17 <zyylsxm@gmail.com >
* Apply suggestion from @zyy17
Co-authored-by: zyy17 <zyylsxm@gmail.com >
---------
Signed-off-by: Arshdeep54 <balarsh535@gmail.com >
Co-authored-by: Ning Sun <classicning@gmail.com >
Co-authored-by: zyy17 <zyylsxm@gmail.com >
2025-08-26 10:22:53 +00:00
Yingwen
d5575d3fa4
feat: add FlatConvertFormat to convert record batches in old format to the flat format ( #6786 )
...
* feat: add convert format to FlatReadFormat
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: test convert format
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: only convert string pks to dictionary
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-25 06:47:06 +00:00
Zhenchi
6839b5aef4
feat(mito): list SSTs from manifest and storage ( #6766 )
...
* feat(engine): list SSTs from manifest and storage
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* mito engine only
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* try best to remove allocation
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-08-22 09:22:42 +00:00
Weny Xu
896d72191e
feat: introduce PersistStatsHandler ( #6777 )
...
* feat: add `Inserter` trait and impl
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: import items
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: introduce `PersistStatsHandler`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: disable persisting stats in sqlness
Signed-off-by: WenyXu <wenymedia@gmail.com >
* reset channel manager
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: avoid collecting
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: remove insert options
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: use `write_bytes` instead of `write_bytes_per_sec`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: compute write bytes delta
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* test: add unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* Update src/meta-srv/src/handler/persist_stats_handler.rs
Co-authored-by: Yingwen <realevenyag@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
Co-authored-by: Yingwen <realevenyag@gmail.com >
2025-08-21 09:34:58 +00:00
Yingwen
2995eddca5
feat: Implements FlatCompatBatch to adapt schema in flat format ( #6771 )
...
* feat: compat flat wip
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: rewrite key
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support append key
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: Split rewrite logic
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename CompatFlatBatch to FlatCompatBatch
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: compat primary key
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address fixme
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: new CompatBatch if need convert
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: use different compat
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: plain_projection -> flat_projection
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: test compat
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: test FlatCompatBatch
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: fix warnings and remove unused code
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: support compat tags with dictionary types
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: reuse column_id_values
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: avoid zero pk values len
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-20 12:36:14 +00:00
Yingwen
8fc3a9a9d7
feat: Implements async FlatMergeReader and FlatDedupReader ( #6761 )
...
* refactor: Add Flat prefix to MergeIterator and DedupIterator
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement MergeReader for RecordBatch
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: use GenericNode for IterNode
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: flat merge reader to stream
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement FlatDedupReader
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add a benchmark for FlatMergeIterator
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename plain_projection to flat_projection
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-19 06:30:59 +00:00
Ruihang Xia
b652ea52ee
fix: partition tree's dict size metrics mismatch ( #6746 )
...
fix: partition tree metrics mismatch
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-08-18 08:36:35 +00:00
Weny Xu
f64fc3a57a
feat: add RateMeter for tracking memtable write throughput ( #6744 )
...
* feat: introduce `RateMeter`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-08-18 06:45:31 +00:00
LFC
f9d2a89a0c
chore: update datafusion family ( #6675 )
...
* chore: update datafusion family
Signed-off-by: luofucong <luofc@foxmail.com >
* fix ci
Signed-off-by: luofucong <luofc@foxmail.com >
* use official otel-arrow-rust
Signed-off-by: luofucong <luofc@foxmail.com >
* rebase
Signed-off-by: luofucong <luofc@foxmail.com >
* use the official orc-rust
Signed-off-by: luofucong <luofc@foxmail.com >
* resolve PR comments
Signed-off-by: luofucong <luofc@foxmail.com >
* remove the empty lines
Signed-off-by: luofucong <luofc@foxmail.com >
* try following PR comments
Signed-off-by: luofucong <luofc@foxmail.com >
---------
Signed-off-by: luofucong <luofc@foxmail.com >
2025-08-15 12:41:49 +00:00
Yingwen
60e01c7c3d
feat: Implements last-non-null dedup strategy for flat format ( #6709 )
...
* feat: Implements FlatLastNonNull strategy
Dedup rows and keep last non null fields
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add basic test for FlatLastNonNull
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: port more last non null test
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: do merge rows after delete op
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename num_pk_columns to field_column_start
So we can support different format later
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address comment
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-14 14:38:47 +00:00
Weny Xu
021ad09c21
refactor: simplify WAL pruning procedure and introduce region flush trigger ( #6741 )
...
* chore: add logs
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: update wal config for metasrv
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: introduce region flush trigger
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: debug assert
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: log level
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: simplify wal prune procedure
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: upgrade rskafka
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: always flush inactive regions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: refactor flush trigger
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: remove unused code
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: typo
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: update unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: add metrics
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: rename
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-08-14 14:15:30 +00:00
discord9
2a3e4c7a82
fix: truncate manifest action compat ( #6742 )
...
* fix: truncate manifest action compat
Signed-off-by: discord9 <discord9@163.com >
* refactor: use simpler compat
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-08-14 14:05:17 +00:00
Ruihang Xia
4d97754cb4
feat: persist manifest, SST and index files to staging dir ( #6726 )
...
* make flush and access layer aware of staging mode
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* staging flag in flush notify
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* staging manifest
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* index methods
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* only stage manifest
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* improve comment
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-08-14 06:02:31 +00:00
Zhenchi
fb3b1d4866
feat: Store partition expr in RegionMetadata ( #6699 )
...
* wire partition.expr_json option constant and parsing
test(mito2): manifest roundtrip persists partition_expr JSON
test(mito2): create/open with partition.expr_json persists in manifest
docs: add comments for partition.expr_json option and RegionOptions.partition_expr
serde: include RegionOptions.partition_expr (skip if None)
test(mito2): doc intent and verify runtime backfill + persistence-after-alter for partition expr
add partition expr to create request
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* add create_with_partition_expr_persists_manifest
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* pass partition expr to create request
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix sqlness
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix test
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* remove unused dep
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
Co-authored-by: Ruihang Xia <waynestxia@gmail.com >
2025-08-13 22:05:37 +00:00
Weny Xu
8659412cac
feat: introduce PeriodicTopicStatsReporter ( #6730 )
...
* refactor: introduce `PeriodicTopicStatsReporter`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix typo
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: remote wal tests styling
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit test
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: handling region wal options not found
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: minor
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: upgrade greptime-proto
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-08-13 11:46:50 +00:00
Ruihang Xia
8a44137f37
chore: prefix debug_assertion only variables with underscore ( #6727 )
...
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-08-13 02:50:16 +00:00
Yingwen
1977ae50ee
feat: Projection mapper for flat schema ( #6679 )
...
* feat: plain projection mapper wip
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: change ProjectionMapper to enum
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: convert plain batch
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add tests for the mapper
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: allow dead code
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: format code
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: change PlainProjectionMapper to FlatProjectionMapper
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: fix projection tests
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: Address comments
Removes some unwrap()
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fmt
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address comment
as_plain -> as_flat
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-12 07:36:23 +00:00
Ruihang Xia
25f926ea7d
feat: mito region staging state ( #6664 )
...
* fix: not mark all deleted when partial trunc (#6654 )
* fix: not mark all deleted when partial trunc & not update manifest when partial file range is empty
Signed-off-by: discord9 <discord9@163.com >
* docs: note
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* some tests and DdlRequest
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* stage transit
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* address CR comments
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* correct error type
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: discord9 <discord9@163.com >
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: discord9 <55937128+discord9@users.noreply.github.com >
2025-08-12 03:17:47 +00:00
Yingwen
e4454e0c7d
feat: Implements last row dedup strategy for flat format ( #6695 )
...
* feat: implement FlatLastRow dedup strategy
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: fix dedup metrics
Add tests for iter with FlatLastRow strategy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix delete metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address review comments
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: move BatchLastRow position
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix typos
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-11 07:44:45 +00:00