discord9
aea4e9fa55
fix: RemovedFiles deser compatibility ( #7475 )
...
* fix: compat for RemovedFiles
Signed-off-by: discord9 <discord9@163.com >
* cr
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-12-25 02:50:34 +00:00
AntiTopQuark
cea578244c
fix(compaction): unify behavior of database compaction options with TTL ( #7402 )
...
* fix: fix dynamic compaction option, unify behavior of database compaction options with TTL option
Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com >
* fix unit test
Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com >
* add debug log
Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com >
---------
Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com >
2025-12-25 02:34:42 +00:00
Weny Xu
e1b18614ee
feat(mito2): implement ApplyStagingManifest request handling ( #7456 )
...
* feat(mito2): implement `ApplyStagingManifest` request handling
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: fmt
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix logic
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: update proto
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-12-24 09:05:09 +00:00
Weny Xu
2d9967b981
fix(mito2): pass partition expr explicitly to flush task for region ( #7461 )
...
* fix(mito2): pass partition expr explicitly to flush task for staging mode
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: rename
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-12-24 04:18:06 +00:00
discord9
dec0d522f8
feat: gc versioned index ( #7412 )
...
* feat: add index version to file ref
Signed-off-by: discord9 <discord9@163.com >
* refactor wip
Signed-off-by: discord9 <discord9@163.com >
* wip
Signed-off-by: discord9 <discord9@163.com >
* update gc worker
Signed-off-by: discord9 <discord9@163.com >
* stuff
Signed-off-by: discord9 <discord9@163.com >
* gc report for index files
Signed-off-by: discord9 <discord9@163.com >
* fix: type
Signed-off-by: discord9 <discord9@163.com >
* stuff
Signed-off-by: discord9 <discord9@163.com >
* chore: clippy
Signed-off-by: discord9 <discord9@163.com >
* chore: metrics
Signed-off-by: discord9 <discord9@163.com >
* typo
Signed-off-by: discord9 <discord9@163.com >
* typo
Signed-off-by: discord9 <discord9@163.com >
* chore: naming
Signed-off-by: discord9 <discord9@163.com >
* docs: update explain
Signed-off-by: discord9 <discord9@163.com >
* test: parse file id/type from file path
Signed-off-by: discord9 <discord9@163.com >
* chore: change parse method visibility to crate
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
* chore
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-12-24 03:07:53 +00:00
discord9
b3bc3c76f1
feat: file range dynamic filter ( #7441 )
...
* feat: add dynamic filtering support in file range and predicate handling
Signed-off-by: discord9 <discord9@163.com >
* clippy
Signed-off-by: discord9 <discord9@163.com >
* c
Signed-off-by: discord9 <discord9@163.com >
* c
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
* c
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-12-23 06:15:30 +00:00
Lei, HUANG
a8b512dded
chore: expose symbols ( #7451 )
...
* chore/expose-symbols:
### Commit Message
Enhance `merge_and_dedup` Functionality in `flush.rs`
- **Function Signature Update**: Modified the `merge_and_dedup` function to accept `append_mode` and `merge_mode` as separate parameters instead of using `options`.
- **Function Accessibility**: Changed the visibility of `merge_and_dedup` to `pub` to allow external access.
- **Function Calls Update**: Updated calls to `merge_and_dedup` within `memtable_flat_sources` to align with the new function signature, passing `options.append_mode` and `options.merge_mode()` directly.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* chore/expose-symbols:
### Add Merge and Deduplication Functionality
- **File**: `src/mito2/src/flush.rs`
- Introduced `merge_and_dedup` function to merge multiple record batch iterators and apply deduplication based on specified modes.
- Added detailed documentation for the function, explaining its arguments, behavior, and usage examples.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2025-12-22 05:39:03 +00:00
Yingwen
fed6cb0806
fix: flat format use correct encoding in indexer for tags ( #7440 )
...
* test: add inverted and skipping test
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: Add tests for fulltext index
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: index dictionary type in correct encoding in flat format
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: use encode_data_type() in SortField
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: refine imports
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add tests for sparse encoding
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove logs
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: update list test
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: simplify tests
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-19 07:36:44 +00:00
Lanqing Yang
658332fe68
chore(mito): nit remove extra hashset in gc workers ( #7399 )
...
chore(mito): remove extra hashset in gc workers
Signed-off-by: lyang24 <lanqingy93@gmail.com >
2025-12-18 13:09:32 +00:00
jeremyhi
95eccd6cde
feat: introduce granularity for memory manager ( #7416 )
...
* feat: introduce granularity for memory manager
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: add unit test
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: remove granularity getter for manager
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* Update src/common/memory-manager/src/manager.rs
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com >
* feat: acquire_with_policy for manager
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
---------
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com >
2025-12-17 11:08:51 +00:00
Lei, HUANG
da964880f5
chore: expose symbols ( #7417 )
...
* refactor/expose-symbols:
## Refactor `bulk/part.rs` to Simplify Mutation Handling
- Removed the `mutations_to_record_batch` function and its associated helper functions, including `ArraysSorter`, `timestamp_array_to_iter`, and `binary_array_to_dictionary`, to simplify the mutation handling logic in `bulk/part.rs`.
- Deleted related test functions `check_binary_array_to_dictionary` and `check_mutations_to_record_batches` from the test module, along with their associated test cases.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
* refactor/expose-symbols:
### Commit Message
**Refactor and Enhance Deduplication Logic**
- **`flush.rs`**: Refactored `maybe_dedup_one` function to accept `append_mode` and `merge_mode` as parameters instead of `RegionOptions`. This change enhances flexibility in deduplication logic.
- **`memtable/bulk.rs`**: Made `BulkRangeIterBuilder` struct and its fields public to allow external access and modification, improving extensibility.
- **`sst.rs`**: Corrected a typo in the schema documentation, changing `__prmary_key` to `__primary_key` for clarity and accuracy.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2025-12-17 01:29:36 +00:00
Yingwen
f6afb10e33
feat!: download file to fill the cache on write cache miss ( #7294 )
...
* feat: download inverted index file
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: download for bloom and fulltext
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement maybe_download_background for FileCache
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: load file for parquet
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: reduce channel size
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: use ManifestCache
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: pass cache to ManifestObjectStore::new
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix fmt and clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove manifest cache ttl
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove read cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: clean old read cache path
Signed-off-by: evenyag <realevenyag@gmail.com >
* docs: update config
Signed-off-by: evenyag <realevenyag@gmail.com >
* docs: update config examples
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: update test
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix CI
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: also clean the root directory
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: update manifest test
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: skip file if it exists
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: remove warn in replace
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add a flag to enable/disable background download
set the concurrency to 1 for background download
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename write_cache_enable_background_download to enable_refill_cache_on_read
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: update config test
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: address comments
Signed-off-by: evenyag <realevenyag@gmail.com >
* docs: update config.md
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-16 08:31:26 +00:00
Weny Xu
f7d5c87ac0
feat: introduce copy_region_from for mito engine ( #7389 )
...
* feat: introduce `copy_region_from`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix clippy
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-12-16 06:12:06 +00:00
jeremyhi
32f9cc5286
feat: move memory_manager to common crate ( #7408 )
...
* feat: move memory_manager to common crate
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: add license header
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: by AI comment
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
---------
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
2025-12-15 13:15:33 +00:00
Yingwen
5232a12a8c
feat: per file scan metrics ( #7396 )
...
* feat: collect per file metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: divide build_cost to build_part_cost and build_reader_cost
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: limit the file metrics num to display
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: use sorted iter to get sorted files
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: output metrics in desc order
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-15 12:52:03 +00:00
jeremyhi
baffed8c6a
feat: mem manager on compaction ( #7305 )
...
* feat: mem manager on compaction
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: by copilot review comment
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: experimental_
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: refine estimate_compaction_bytes
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: make them into config example
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: by copilot comment
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* Update src/mito2/src/compaction.rs
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com >
* fix: dedup the regions waiting
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: by comment
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: minor change
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: add AdditionalMemoryGuard for the running compaction task
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* refactor: do OnExhaustedPolicy before running task
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* refactor: use OwnedSemaphorePermit to impl guard
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: add early_release_partial method to release a portion of memory
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: 0 bytes make request_additional unlimited
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: fail-fast on acquire
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
---------
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com >
2025-12-12 06:49:58 +00:00
Lanqing Yang
f5e0e94e3a
chore(mito): nit avoid clone the batch object on inverted index building ( #7388 )
...
fix: avoid clone the batch object on inverted index building
Signed-off-by: lyang24 <lanqingy93@gmail.com >
2025-12-12 04:58:37 +00:00
discord9
f06a64ff90
feat: mark index outdated ( #7383 )
...
* feat: mark index outdated
Signed-off-by: discord9 <discord9@163.com >
* refactor: move IndexVersion to store-api
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* fix: condition for add files
Signed-off-by: discord9 <discord9@163.com >
* cleanup
Signed-off-by: discord9 <discord9@163.com >
* refactor(sst): extract index version check into method
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-12-11 12:08:45 +00:00
discord9
a26dee0ca1
fix: gc listing op first ( #7385 )
...
Signed-off-by: discord9 <discord9@163.com >
2025-12-11 03:25:05 +00:00
Yingwen
a22d08f1b1
feat: collect merge and dedup metrics ( #7375 )
...
* feat: collect FlatMergeReader metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add MergeMetricsReporter, rename Metrics to MergeMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: remove num_input_rows from MergeMetrics
The merge reader won't dedup so there is no need to collect input rows
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: report merge metrics to PartitionMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add dedup cost to DedupMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect dedup metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove metrics from FlatMergeIterator
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: remove num_output_rows from MergeMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement merge() for merge and dedup metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: report metrics after observe metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-10 09:16:20 +00:00
Lei, HUANG
2f9130a2de
chore(mito): expose some symbols ( #7373 )
...
chore/expose-symbols:
### Commit Summary
- **Visibility Changes**: Updated visibility of functions in `bulk/part.rs`:
- Made `record_batch_estimated_size` and `sort_primary_key_record_batch` functions public.
- **Enhancements**: Enhanced functionality in `memtable.rs` by exposing additional components from `bulk::part`:
- `BulkPartEncoder`, `BulkPartMeta`, `UnorderedPart`, `record_batch_estimated_size`, and `sort_primary_key_record_batch`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com >
2025-12-09 14:33:14 +00:00
discord9
9197e818ec
refactor: use versioned index for index file ( #7309 )
...
* refactor: use versioned index for index file
Signed-off-by: discord9 <discord9@163.com >
* fix: sst entry table
Signed-off-by: discord9 <discord9@163.com >
* update sqlness
Signed-off-by: discord9 <discord9@163.com >
* chore: unit type
Signed-off-by: discord9 <discord9@163.com >
* fix: missing version
Signed-off-by: discord9 <discord9@163.com >
* more fix build index
Signed-off-by: discord9 <discord9@163.com >
* fix: use proper index id
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
* test: update
Signed-off-by: discord9 <discord9@163.com >
* clippy
Signed-off-by: discord9 <discord9@163.com >
* test: test_list_ssts fixed
Signed-off-by: discord9 <discord9@163.com >
* test: fix test
Signed-off-by: discord9 <discord9@163.com >
* feat: stuff
Signed-off-by: discord9 <discord9@163.com >
* fix: clean temp index file on abort & delete all index versions when deleting a file
Signed-off-by: discord9 <discord9@163.com >
* docs: explain
Signed-off-by: discord9 <discord9@163.com >
* fix: actually clean up tmp dir
Signed-off-by: discord9 <discord9@163.com >
* clippy
Signed-off-by: discord9 <discord9@163.com >
* clean tmp dir only when write cache enabled
Signed-off-by: discord9 <discord9@163.com >
* refactor: add version to index cache
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* test: update size
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-12-09 07:31:12 +00:00
Ruihang Xia
edb1f6086f
feat: decode pk eagerly ( #7350 )
...
* feat: decode pk eagerly
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* merge primary_key_codec and decode_primary_key_values
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-12-05 09:11:51 +00:00
Yingwen
84e4e42ee7
feat: add more verbose metrics to scanners ( #7336 )
...
* feat: add inverted applier metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add metrics to bloom applier
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add metrics to fulltext index applier
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement BloomFilterReadMetrics for BloomFilterReader
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect read metrics for inverted index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add metrics for range_read and metadata
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename elapsed to fetch_elapsed
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect metadata fetch metrics for inverted index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect cache metrics for inverted and bloom index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect read metrics in appliers
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect fulltext dir metrics for applier
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect parquet row group metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add parquet metadata metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add apply metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect more metrics for memory row group
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add fetch metrics to ReaderMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: init verbose metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: debug print metrics in ScanMetricsSet
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement debug for new metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: update parquet fetch metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect the whole fetch time
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add file_scan_cost
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: parquet fetch add cache_miss counter
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: print index read metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: use actual bytes to increase counter
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove provided implementations for index reader traits
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: change get_parquet_meta_data() method to receive metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename file_scan_cost to sst_scan_cost
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: refine ParquetFetchMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove useless inner method
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: collect page size actual needed
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify InvertedIndexReadMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify InvertedIndexApplyMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify BloomFilterReadMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify BloomFilterIndexApplyMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify FulltextIndexApplyMetrics implementation
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify ParquetFetchMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify MetadataCacheMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: only print verbose metrics when they are not empty.
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: use mutex to protect ParquetFetchMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: use duration for elapsed in ParquetFetchMetricsData
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-04 13:40:18 +00:00
Yingwen
d5c616a9ff
feat: implement a cache for manifest files ( #7326 )
...
* feat: use cache in manifest store
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: use ManifestCache
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: clean empty manifest dir
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: get last checkpoint from cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add hit/miss counter for manifest cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add logs
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: pass cache to ManifestObjectStore::new
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: cache checkpoint
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: cache checkpoint in write
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler warnings
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update config comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: manifest store cache for staging
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: move recover_inner to FileCacheInner
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: remove manifest cache config from MitoConfig
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: reduce clone when cache is enabled
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: do not cache staging manifests
We clean staging manifests with remove_all, so it isn't easy to clean
the cache in the same way
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: fix paths in manifest cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: don't clean dir if it is too new
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: reuse write cache ttl as manifest cache ttl
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: clean all empty subdirectories
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-04 12:51:09 +00:00
Weny Xu
0177f244e9
fix: fix write stall that never recovers due to flush logic issues ( #7322 )
...
* fix: fix write stall that never recovers due to flush logic issues
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit test
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: flush multiple regions when engine is full
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: refine fn name
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: simplify flush scheduler by removing flushing state
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* fix: fix unit tests
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-12-02 12:48:41 +00:00
Weny Xu
8346acb900
feat: introduce EnterStagingRequest for RegionEngine ( #7261 )
...
* feat: introduce `EnterStagingRequest` for region engine
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* refactor: improve error handling in staging mode entry
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-11-28 09:02:32 +00:00
LFC
fdab75ce27
feat: simple read write new json type values ( #7175 )
...
feat: basic json read and write
Signed-off-by: luofucong <luofc@foxmail.com >
2025-11-27 12:40:35 +00:00
Yingwen
afefc0c604
fix: implement bulk write for time partitions and bulk memtable ( #7293 )
...
* feat: implement convert_bulk_part
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: convert bulk part in TimePartitions
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: fill missing columns for bulk parts
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comments
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: cast to dictionary type
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add unit tests
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comment
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: do not convert part if bulk is written by write()
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-27 08:01:45 +00:00
discord9
0aeaf405c7
feat: add batch gc procedure ( #7296 )
...
* feat: add batch gc procedure
Signed-off-by: discord9 <discord9@163.com >
* chore
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
* per even review
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-11-27 03:58:15 +00:00
Yingwen
b5cbc35a0d
fix: partition tree metric should the delta ( #7307 )
...
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-27 03:49:02 +00:00
discord9
6485a26fa3
refactor: load metadata using official impl ( #7302 )
...
* refactor: load metadata using official impl
Signed-off-by: discord9 <discord9@163.com >
* pcr
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-11-26 08:52:04 +00:00
Sicong Hu
2783a5218e
feat: implement manual type for async index build ( #7104 )
...
* feat: prepare for index_build command
Signed-off-by: SNC123 <sinhco@outlook.com >
* feat: impl manual index build
Signed-off-by: SNC123 <sinhco@outlook.com >
* chore: clippy and fmt
Signed-off-by: SNC123 <sinhco@outlook.com >
* test: add idempotency check for manual build
Signed-off-by: SNC123 <sinhco@outlook.com >
* chore: apply suggestions
Signed-off-by: SNC123 <sinhco@outlook.com >
* chore: update proto
Signed-off-by: SNC123 <sinhco@outlook.com >
* chore: apply suggestions
Signed-off-by: SNC123 <sinhco@outlook.com >
* chore: fmt
Signed-off-by: SNC123 <sinhco@outlook.com >
* chore: update proto source to greptimedb
Signed-off-by: SNC123 <sinhco@outlook.com >
* fix: cargo.lock
Signed-off-by: SNC123 <sinhco@outlook.com >
---------
Signed-off-by: SNC123 <sinhco@outlook.com >
2025-11-25 15:21:30 +00:00
Weny Xu
6b6d1ce7c4
feat: introduce remap_manifests for RegionEngine ( #7265 )
...
* refactor: consolidate RegionManifestOptions creation logic
Signed-off-by: WenyXu <wenymedia@gmail.com >
* feat: introduce `remap_manifests` for `RegionEngine`
Signed-off-by: WenyXu <wenymedia@gmail.com >
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com >
---------
Signed-off-by: WenyXu <wenymedia@gmail.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-25 12:09:20 +00:00
yihong
d811c4f060
fix: pre-commit all files failed ( #7290 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com >
2025-11-25 07:27:46 +00:00
Ruihang Xia
b32ca3ad86
perf: parallelize file source region ( #7285 )
...
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-11-24 11:37:48 +00:00
discord9
52a576cf6d
feat: basic gc scheduler ( #7263 )
...
* feat: basic gc scheduler
Signed-off-by: discord9 <discord9@163.com >
* refactor: rm dup code
Signed-off-by: discord9 <discord9@163.com >
* docs: todo for cleaner code
Signed-off-by: discord9 <discord9@163.com >
* chore
Signed-off-by: discord9 <discord9@163.com >
* feat: rm retry path
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* feat: skip first full listing after metasrv start
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-11-24 07:57:18 +00:00
LFC
4a7c16586b
refactor: remove Vectors from RecordBatch completely ( #7184 )
...
* refactor: remove `Vector`s from `RecordBatch` completely
Signed-off-by: luofucong <luofc@foxmail.com >
* resolve PR comments
Signed-off-by: luofucong <luofc@foxmail.com >
* resolve PR comments
Signed-off-by: luofucong <luofc@foxmail.com >
---------
Signed-off-by: luofucong <luofc@foxmail.com >
2025-11-21 08:53:35 +00:00
discord9
0cee4fa115
feat: gc get ref from manifest ( #7260 )
...
feat: get file ref from other manifest
Signed-off-by: discord9 <discord9@163.com >
2025-11-19 12:13:28 +00:00
discord9
e59612043d
feat: gc scheduler ctx&procedure ( #7252 )
...
* feat: gc ctx&procedure
Signed-off-by: discord9 <discord9@163.com >
* fix: handle region not found case
Signed-off-by: discord9 <discord9@163.com >
* docs: more explain&todo
Signed-off-by: discord9 <discord9@163.com >
* per review
Signed-off-by: discord9 <discord9@163.com >
* chore: add time for region gc
Signed-off-by: discord9 <discord9@163.com >
* fix: explain why loader for gc region should fail
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-11-19 08:35:17 +00:00
Yingwen
ee35ec0a39
feat: split batches before merge ( #7225 )
...
* feat: split batches by rule in build_flat_sources()
It checks the num_series and splits batches when the series cardinality
is low
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: panic when no num_series available
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: don't subtract file index if checking mem range
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update comments and control flow
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-18 08:19:39 +00:00
discord9
29bbff3c90
feat: gc worker only local regions&test ( #7203 )
...
* feat: gc worker only on local region
Signed-off-by: discord9 <discord9@163.com >
* more check
Signed-off-by: discord9 <discord9@163.com >
* chore: stuff
Signed-off-by: discord9 <discord9@163.com >
* fix: ignore async index file for now
Signed-off-by: discord9 <discord9@163.com >
* fix: file removal rate calc
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* chore: per review
Signed-off-by: discord9 <discord9@163.com >
* clippy
Signed-off-by: discord9 <discord9@163.com >
---------
Signed-off-by: discord9 <discord9@163.com >
2025-11-18 02:45:09 +00:00
Yingwen
77483ad7d4
fix: allow compacting L1 files under append mode ( #7239 )
...
* fix: allow compacting L1 files under append mode
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: limit the number of compaction input files
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-17 12:46:30 +00:00
Ruihang Xia
1eb8d6b76b
feat: build partition sources in parallel ( #7243 )
...
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-11-17 11:44:48 +00:00
Yingwen
df954b47d5
fix: clone the page before putting into the index cache ( #7229 )
...
* fix: clone the page before putting into the index cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix warnings
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-15 17:52:32 +00:00
Yingwen
7cc0439cc9
feat: load latest index file first ( #7221 )
...
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-13 08:56:44 +00:00
Yingwen
bb6a3a2ff3
feat: support altering sst format for a table ( #7206 )
...
* refactor: remove memtable_builder from MitoRegion
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: add alter format
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support changing the format and memtable
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: support changing sst format via table options
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: set scanner and memtable builder with correct format
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: fix incorrect metadata in version after alter
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: add sqlness test
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: replace region_id in sqlness result
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: create correct memtable when setting sst_format explicitly
Signed-off-by: evenyag <realevenyag@gmail.com >
* test: sqlness alter_format test set sst_format to primary_key
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: remove verbose log
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-11 13:19:00 +00:00
Weny Xu
49c6812e98
fix: deregister failure detectors on rollback and improve timeout handling ( #7212 )
...
Signed-off-by: WenyXu <wenymedia@gmail.com >
2025-11-11 09:44:27 +00:00
Yingwen
24671b60b4
feat: tracks index files in another cache and preloads them ( #7181 )
...
* feat: divide parquet and puffin index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: download index files when we open the region
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: use different label for parquet/puffin
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: control parallelism and cache size by env
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: change gauge to counter
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: correct file type labels in file cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: move env to config and change cache ratio to percent
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: checks capacity before download and refine metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: change open to return MitoRegionRef
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: extract download to FileCache
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: run load cache task in write cache
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: check region state before downloading files
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: update config docs and test
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: use file id from index_file_id to compute puffin key
Signed-off-by: evenyag <realevenyag@gmail.com >
* fix: skip loading cache in some states
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-11-11 08:37:32 +00:00
jeremyhi
c7fded29ee
feat: query mem limiter ( #7078 )
...
* feat: query mem limiter
* feat: config docs
* feat: frontend query limit config
* fix: unused imports
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: add metrics for query memory tracker
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: right position for tracker
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: avoid race condition
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: soft and hard limit
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: docs
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: when soft_limit == 0
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: upgrade limit algorithm
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: remove batch window
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: batch mem size
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: refine limit algorithm
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* fix: get sys mem
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: minor change
* feat: up tracker to the top stream
* feat: estimated_size for batch
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: minor refactor
* feat: scan_memory_limit connect to max_concurrent_queries
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: make callback clearly
* feat: add unlimited enum
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: by review comment
* chore: comment on recursion_limit
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* feat: refactor and put permit into RegionScanExec
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
* chore: multiple lazy static blocks
* chore: minor change
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
---------
Signed-off-by: jeremyhi <fengjiachun@gmail.com >
2025-11-11 07:47:55 +00:00