mirror of
https://github.com/GreptimeTeam/greptimedb.git
synced 2026-05-19 06:20:38 +00:00
* bulk-multiparts-merge-reader:

  **Enhance Memtable Iteration and Flushing Logic**

  - **`flush.rs`**: Updated `RegionFlushTask` to handle multiple ranges using `MergeReaderBuilder` for improved source management during flush operations.
  - **`memtable.rs`**: Introduced `build_prune_iter` and `build_iter` methods in `MemtableRange` for flexible iteration. Added `MemtableRanges` struct to manage multiple contexts.
  - **`simple_bulk_memtable.rs`**: Refactored to use `BatchIterBuilder` and `BatchIterBuilderDeprecated` for iteration, supporting new `read_to_values` method in `Series`.
  - **`time_series.rs`**: Added `read_to_values` and `finish_cloned` methods in `Series` and `ValueBuilder` for efficient data handling.
  - **`scan_util.rs`**: Replaced `build_iter` with `build_prune_iter` for range iteration, enhancing scan utility.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  - **Add Rayon for Parallel Processing**: Introduced `rayon` for parallel processing in `simple_bulk_memtable.rs` and updated `Cargo.toml` and `Cargo.lock` to include `rayon` dependency.
  - **Enhance Benchmarking**: Added new benchmarks in `simple_bulk_memtable.rs` to compare parallel vs sequential processing, projection, sequence filtering, and write performance.
  - **Make Structs and Methods Public**: Changed visibility of several structs and methods to `pub` in `simple_bulk_memtable.rs`, `memtable.rs`, `time_series.rs`, and `test_util.rs` to facilitate testing and benchmarking.
  - **Update Criterion Features**: Modified `Cargo.toml` to include `html_reports` feature for `criterion`.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  ### Commit Summary

  - **Refactor `SimpleBulkMemtable`**:
    - Moved `ranges_sequential` function to a new `test_only` module and made it a method of `SimpleBulkMemtable`.
    - Made several fields in `SimpleBulkMemtable` private and added a `region_metadata` getter.
    - Affected files: `simple_bulk_memtable.rs`, `test_only.rs`.
  - **Benchmark Adjustments**:
    - Updated benchmark functions to use the new `ranges_sequential` method.
    - Affected file: `simple_bulk_memtable.rs`.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  ### Add Test Configuration for `iter` Method in Memtable Implementations

  - **Enhancements**:
    - Added `#[cfg(any(test, feature = "test"))]` attribute to the `iter` method in various `Memtable` implementations to enable conditional compilation for testing purposes.
    - Affected files:
      - `src/mito2/src/memtable.rs`
      - `src/mito2/src/memtable/bulk.rs`
      - `src/mito2/src/memtable/partition_tree.rs`
      - `src/mito2/src/memtable/simple_bulk_memtable.rs`
      - `src/mito2/src/memtable/time_series.rs`
      - `src/mito2/src/test_util/memtable_util.rs`
  - **Benchmark Adjustments**:
    - Removed `black_box` usage in `bench_memtable_write_performance` function to streamline benchmarking.
    - Affected file: `src/mito2/benches/simple_bulk_memtable.rs`

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  **Enhance Async Support and Refactor Iteration in `mito2`**

  - **Add Async Features**: Updated `Cargo.toml` to include `async` and `async_tokio` features for `criterion`.
  - **Async Iteration**: Introduced async functions `flush` and `flush_original` in `simple_bulk_memtable.rs` to handle memtable flushing using async iterators.
  - **Refactor Iteration Logic**: Moved `create_iter` and `BatchIterBuilderDeprecated` to `test_only.rs` for better separation of concerns.
  - **Public API Change**: Made `next_batch` in `read.rs` public to support async batch processing.
  - **Benchmark Updates**: Modified benchmarks in `simple_bulk_memtable.rs` to use async runtime for performance testing.

  Files affected: `Cargo.toml`, `simple_bulk_memtable.rs`, `test_only.rs`, `read.rs`.
  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  **Enhance Benchmarking for Memtable**

  - Refactored `create_large_memtable` to `create_memtable_with_rows` in `simple_bulk_memtable.rs` to allow dynamic row count configuration.
  - Introduced parameterized benchmarking in `bench_ranges_parallel_vs_sequential` to test various row counts, improving the flexibility and coverage of performance tests.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  ### Enhance Memory Management and Public API

  - **`builder.rs`**: Made `next_offset` method public to allow external access to offset calculations.
  - **`simple_bulk_memtable.rs`**: Simplified the `series.extend` method by removing the iterator conversion for `fields`.
  - **`time_series.rs`**:
    - Added `can_accommodate` method to `ValueBuilder` to check if fields can be accommodated without offset overflow.
    - Modified `extend` method to use a `Vec` for `fields` instead of an iterator, improving memory management and error handling.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  Add License and Enhance Testing in `simple_bulk_memtable.rs`

  - Added Apache License header to `simple_bulk_memtable.rs`.
  - Modified test configuration in `simple_bulk_memtable.rs` to include `any(test, feature = "test")`.
  - Introduced a new test `test_write_read_large_string` in `simple_bulk_memtable.rs` to verify handling of large strings.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  Update `Cargo.toml` dependencies

  - Adjust features for `common-meta` and `mito-codec` to include "testing".
  - Maintain `criterion` version and features for async support.
  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  ### Update Predicate Type in Memtable Iterators

  - **Files Modified**:
    - `src/mito2/src/memtable.rs`
    - `src/mito2/src/memtable/bulk.rs`
    - `src/mito2/src/memtable/simple_bulk_memtable.rs`
  - **Key Changes**:
    - Updated the `iter` method in `Memtable` trait and its implementations to use `Option<table::predicate::Predicate>` instead of `Option<Predicate>`.
    - Adjusted return type in `BulkMemtable`'s `iter` method to `Result<crate::memtable::BoxedBatchIterator>`.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  **Enhance Memtable Functionality**

  - **`memtable.rs`**:
    - Added `Clone` trait to `MemtableStats` and made `num_ranges` public.
    - Introduced `num_rows` field in `MemtableRange` and updated its constructor.
    - Added `num_rows` method to `MemtableRange`.
  - **`partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`**:
    - Updated `MemtableRange` instantiation to include `num_rows`.
  - **`range.rs`**:
    - Refactored `MemRangeBuilder` to handle a single `MemtableRange` and `MemtableStats`.
  - **`scan_region.rs`**:
    - Enhanced memtable filtering based on time range and updated `MemRangeBuilder` usage.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  **Enhancements and Bug Fixes**

  - **Deduplication Enhancements**:
    - Introduced `DedupReader` and `LastRow` as public structs in `dedup.rs` to enhance deduplication capabilities.
    - Added `LastNonNull` deduplication strategy in `flush.rs` and `simple_bulk_memtable.rs`.
  - **Memtable Improvements**:
    - Updated `SimpleBulkMemtable` to support batch size configuration and deduplication strategies.
    - Modified `Series` struct in `time_series.rs` to include a configurable capacity.
  - **Testing Enhancements**:
    - Added new test `test_write_dedup` in `simple_bulk_memtable.rs` to verify deduplication functionality.
    - Updated existing tests to include `OpType` parameter for better operation type handling.
  - **Refactoring**:
    - Renamed `BatchIterBuilder` to `BatchRangeBuilder` in `simple_bulk_memtable.rs` for clarity.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* bulk-multiparts-merge-reader:

  - **Refactor `flush.rs`:** Removed `LastNonNullIter` usage and adjusted `DedupReader` instantiation to use `LastRow::new(false)` and `LastNonNull::new(false)`.
  - **Enhance `simple_bulk_memtable.rs`:** Added logic to handle `LastNonNull` merge mode in `IterBuilder`. Introduced new tests: `test_delete_only` and `test_single_range` to verify delete operations and single range handling.

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: tests

  Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
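
The entries above mention two deduplication strategies, `LastRow` and `LastNonNull`, without showing how their results differ. The sketch below illustrates the general idea only and is not the `mito2` implementation: the `Row` struct and the functions `dedup_last_row` and `dedup_last_non_null` are hypothetical names invented for this example, rows are plain in-memory values rather than batches, and delete handling (the boolean passed to `LastRow::new` and `LastNonNull::new`) is left out.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical row model for illustration; not a mito2 type.
/// Rows sharing (key, ts) are duplicates; a larger `seq` is a newer write.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub key: String,
    pub ts: i64,
    pub seq: u64,
    pub fields: Vec<Option<f64>>,
}

/// "Last row" semantics: for each (key, ts), keep only the newest row as a whole.
pub fn dedup_last_row(rows: &[Row]) -> Vec<Row> {
    let mut latest: BTreeMap<(String, i64), Row> = BTreeMap::new();
    for row in rows {
        match latest.entry((row.key.clone(), row.ts)) {
            Entry::Vacant(slot) => {
                slot.insert(row.clone());
            }
            Entry::Occupied(mut slot) => {
                // Replace the stored row only when this one is strictly newer.
                if row.seq > slot.get().seq {
                    slot.insert(row.clone());
                }
            }
        }
    }
    latest.into_values().collect()
}

/// "Last non-null" semantics: for each (key, ts), every field takes the value
/// of the newest row in which that field is not null.
pub fn dedup_last_non_null(rows: &[Row]) -> Vec<Row> {
    // Apply rows from oldest to newest so newer non-null values overwrite older ones.
    let mut ordered: Vec<&Row> = rows.iter().collect();
    ordered.sort_by_key(|r| r.seq);

    let mut merged: BTreeMap<(String, i64), Row> = BTreeMap::new();
    for row in ordered {
        match merged.entry((row.key.clone(), row.ts)) {
            Entry::Vacant(slot) => {
                slot.insert(row.clone());
            }
            Entry::Occupied(mut slot) => {
                let acc = slot.get_mut();
                acc.seq = row.seq;
                if acc.fields.len() < row.fields.len() {
                    acc.fields.resize(row.fields.len(), None);
                }
                for (dst, src) in acc.fields.iter_mut().zip(&row.fields) {
                    // A null in the newer row leaves the older value in place.
                    if src.is_some() {
                        *dst = *src;
                    }
                }
            }
        }
    }
    merged.into_values().collect()
}
```

Given two writes to the same key and timestamp, where the newer one leaves the second field null, `dedup_last_row` returns the newer row unchanged (null included), while `dedup_last_non_null` keeps the older value for that field.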