* feat(operator): allow last_row merge_mode when append_mode is enabled
- Update RegionOptions::validate to allow last_row merge_mode with append_mode.
- Update fill_table_options_for_create to automatically set merge_mode to last_row when append_mode is enabled for LastNonNull table type.
- Add unit tests in mito2 and operator to verify options validation and table creation.
- Add integration test for InfluxDB write with append mode hint.
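A minimal sketch of the rule described above, using simplified stand-in types rather than the real `RegionOptions`. The names `Options` and `fill_merge_mode` are hypothetical, and the sketch assumes that `last_non_null` remains rejected under append mode:

```rust
/// Simplified stand-in for the region merge mode (not the mito2 type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MergeMode {
    LastRow,
    LastNonNull,
}

/// Simplified stand-in for region options.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Options {
    pub append_mode: bool,
    pub merge_mode: Option<MergeMode>,
}

impl Options {
    /// Assumed rule: append mode tolerates an unset merge mode or `LastRow`
    /// and rejects `LastNonNull`.
    pub fn validate(&self) -> Result<(), String> {
        match (self.append_mode, self.merge_mode) {
            (true, Some(MergeMode::LastNonNull)) => {
                Err("merge_mode last_non_null is not allowed with append_mode".to_string())
            }
            _ => Ok(()),
        }
    }
}

/// Assumed auto-create behavior: a table that would otherwise use
/// `LastNonNull` gets `LastRow` when append mode is requested.
pub fn fill_merge_mode(append_mode: bool) -> Options {
    let merge_mode = if append_mode {
        MergeMode::LastRow
    } else {
        MergeMode::LastNonNull
    };
    Options {
        append_mode,
        merge_mode: Some(merge_mode),
    }
}
```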
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix(operator): simplify append mode options
Group `LastNonNull` auto-create options in a single append-mode branch.
Files:
- `src/operator/src/insert.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: sqlness
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor(mito2): reshape extension range API
Thread per-range read options into the extension range flat reader so
it can honor PreFilterMode like other range kinds, and drop the unused
legacy Batch reader surface.
- extension.rs: introduce ExtensionRangeReadOptions { pre_filter_mode };
flat_reader() takes it; remove reader(), ExtensionRangeReader, and
BoxedExtensionRangeReader (no in-tree or out-of-tree call sites use
the Batch-format reader path anymore).
- scan_util.rs: plumb options through maybe_scan_flat_other_ranges and
scan_flat_extension_range.
- seq_scan.rs / unordered_scan.rs: build options at the call site via
StreamContext::range_pre_filter_mode.
Using an options struct (rather than a bare PreFilterMode argument)
keeps future additions additive for out-of-tree ExtensionRange impls.
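To illustrate why an options struct keeps the API additive, here is a rough sketch. The `PreFilterMode` variants, the synchronous `flat_reader` signature, and `DemoRange` are assumptions for illustration only; the real trait differs:

```rust
/// Hypothetical stand-in for the pre-filter mode; the variants are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum PreFilterMode {
    #[default]
    All,
    SkipFields,
}

/// Per-range read options. Adding a field later does not change the
/// `flat_reader` signature, so existing implementations keep compiling.
#[derive(Debug, Clone, Default)]
pub struct ExtensionRangeReadOptions {
    pub pre_filter_mode: PreFilterMode,
}

/// Simplified trait: returns strings instead of a record batch stream.
pub trait ExtensionRange {
    fn flat_reader(&self, options: &ExtensionRangeReadOptions) -> Vec<String>;
}

/// Example implementation that only reports the mode it was asked to honor.
pub struct DemoRange;

impl ExtensionRange for DemoRange {
    fn flat_reader(&self, options: &ExtensionRangeReadOptions) -> Vec<String> {
        vec![format!("mode={:?}", options.pre_filter_mode)]
    }
}
```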
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor(mito2): hoist ExtensionRangeReadOptions out of row-group loops
Build ext_options once per partition range instead of per row-group index in
build_flat_sources and scan_flat_partition_range. Both inputs (part_range and
range_pre_filter_mode) are loop-invariant.
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: fix compiler error without enterprise feature
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat(mito2): extract and cache primary key range for SST files
Extracts primary key ranges from SST files during flush and compaction, and caches them in FileHandle for future use (e.g., overlapping checks).
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/compaction-overlapping-check:
- **Enhance Primary Key Range Logic**: Updated the `primary_key_ranges_overlap` function in `run.rs` to use `chunk()` for comparing byte ranges, improving accuracy in overlap detection.
- **Refactor Run Assignment Logic**: Simplified the logic for assigning items to runs in `run.rs` by removing redundant match statements and using `is_empty()` and `iter().any()` for cleaner checks.
- **Add Test for Transitivity Break**: Introduced a new test `test_find_sorted_runs_handles_2d_transitivity_break` in `run.rs` to ensure correct handling of transitivity breaks in sorted runs.
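As a rough illustration of a two-dimensional overlap check (time range and primary key range), consider the sketch below. `KeyRange`, `FileRange`, and `files_overlap` are hypothetical names, and treating a missing key range as overlapping is an assumption, not necessarily what `run.rs` does:

```rust
/// Closed byte range `[min, max]` of encoded primary keys.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeyRange {
    pub min: Vec<u8>,
    pub max: Vec<u8>,
}

/// Two closed ranges overlap when each starts no later than the other ends.
/// Byte slices compare lexicographically.
pub fn key_ranges_overlap(a: &KeyRange, b: &KeyRange) -> bool {
    a.min.as_slice() <= b.max.as_slice() && b.min.as_slice() <= a.max.as_slice()
}

/// Time range (inclusive) plus an optional primary key range of one file.
pub struct FileRange {
    pub time: (i64, i64),
    pub keys: Option<KeyRange>,
}

/// Files overlap only if they overlap in both dimensions.
pub fn files_overlap(a: &FileRange, b: &FileRange) -> bool {
    let time_overlap = a.time.0 <= b.time.1 && b.time.0 <= a.time.1;
    if !time_overlap {
        return false;
    }
    match (&a.keys, &b.keys) {
        (Some(x), Some(y)) => key_ranges_overlap(x, y),
        // Unknown key range: conservatively assume the files overlap.
        _ => true,
    }
}
```

With two dimensions, a file can overlap two other files that do not overlap each other, so a run cannot be validated by comparing against a single neighbor only.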
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/compaction-overlapping-check:
- **Remove PK-disjoint detection logic**: Simplified the compaction logic in `twcs.rs` by removing the `has_time_overlapping_pairs` function and related logic for PK-disjoint detection. This includes the removal of the `append_mode_force_compact` condition and associated tests.
- **Update compaction trigger settings**: Modified `append_mode_test.rs` to set `compaction.twcs.trigger_file_num` to "2" and adjusted the expected number of files in the scanner assertion from 1 to 2.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore: rebase main
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/compaction-overlapping-check:
### Remove Unused Import in `compactor.rs`
- Removed the unused import `compact_request` from `compactor.rs` to clean up the codebase.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: tighten `mito2` file lifecycle handling
Refine compaction, flush, and SST/version bookkeeping across `src/mito2/src/compaction/*`, `src/mito2/src/flush.rs`, `src/mito2/src/region/*`, `src/mito2/src/sst/*`, and related tests/utilities.
* fix: reuse primary key range merge in twcs compaction
Centralize primary key range merging in a shared helper so twcs compaction can reuse it.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor(mito2): simplify `FileHandle` initialization and internalize primary key range extraction
- Updated `FileHandle::new` to automatically compute the primary key range directly from `FileMeta`.
- Restricted `FileHandle::new_with_primary_key_range` to be test-only by adding the `#[cfg(test)]` attribute.
- Simplified `SstVersion::add_files` by adopting the updated `FileHandle::new` instead of manually providing the primary key range.
Modified files:
- `src/mito2/src/sst/file.rs`
- `src/mito2/src/sst/version.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/compaction-overlapping-check:
### Improve File Management and Documentation
- **`twcs.rs`**: Added a comment to clarify the merging of small files when there are no overlapping files.
- **`version.rs` (in `region` and `sst`)**: Enhanced documentation for `add_files` method, explaining its functionality and panic conditions. Simplified the file handle creation logic in `sst/version.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/compaction-overlapping-check:
### Enhance Primary Key Range Handling in `opener.rs`
- Updated logic in `opener.rs` to set the primary key range for file handles when it is not already defined. This change ensures that the primary key range is extracted and set using `extract_primary_key_range` when necessary.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* docs: document why both primary keys and timestamps must be checked when finding overlapping files
* chore: address review comments
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
refactor/remove-compactor-compact:
### Remove Unused Compaction Functionality
- **Removed `compact` Method**: Eliminated the `compact` method from the `Compactor` trait and its default implementation, which was primarily used for local compaction in testing. This change affects `compactor.rs`.
- **Code Cleanup**: Removed associated code and comments related to the `compact` method, streamlining the `Compactor` trait interface.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor(mito2): improve compaction error handling and file removal
Refactor compaction task execution to enhance error handling and robustness.
- Implemented parallel execution of compaction tasks with proper error capture and logging for individual task failures.
- Ensured JoinSnafu is no longer directly used in error propagation, instead handling errors within the task processing loop.
- Adjusted file removal logic to correctly include expired SSTs after compaction merges.
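The per-task error capture described above might look roughly like this. The sketch uses `std::thread` instead of the async runtime the real code runs on, and `MergeOutcome` and `run_tasks` are hypothetical names:

```rust
use std::thread;

/// Outputs of tasks that succeeded plus messages of tasks that failed.
#[derive(Debug, Default)]
pub struct MergeOutcome {
    pub outputs: Vec<String>,
    pub failures: Vec<String>,
}

/// Runs all tasks in parallel and records each result individually,
/// so one failing task does not discard the work of the others.
pub fn run_tasks<F>(tasks: Vec<F>) -> MergeOutcome
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    // Spawn everything first so the tasks actually run concurrently.
    let handles: Vec<_> = tasks.into_iter().map(thread::spawn).collect();

    let mut outcome = MergeOutcome::default();
    for handle in handles {
        match handle.join() {
            Ok(Ok(output)) => outcome.outputs.push(output),
            Ok(Err(e)) => outcome.failures.push(e),
            // A join error (panic) is handled in the loop, not propagated.
            Err(_) => outcome.failures.push("task panicked".to_string()),
        }
    }
    outcome
}
```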
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor(mito2): extract SstMerger trait for testability in compaction
Extract SstMerger trait and DefaultSstMerger implementation to improve the testability of DefaultCompactor.
The DefaultCompactor is now generic over SstMerger, allowing mock implementations to be injected for unit testing without relying on the full object storage access layer. This refactoring separates the concerns of SST file merging from the overall compaction orchestration logic.
Additionally:
- Updated CompactionScheduler to use DefaultCompactor::default().
- Added unit tests for DefaultCompactor using a MockMerger.
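A simplified sketch of the pattern, assuming a synchronous trait over file names. The method names `merge` and `merge_groups` and the string-based types are illustrative and do not match the real signatures:

```rust
/// Abstraction over "merge these input SSTs into output SSTs".
pub trait SstMerger {
    fn merge(&self, inputs: &[String]) -> Result<Vec<String>, String>;
}

/// Orchestration logic, generic over the merger so tests can inject a mock.
pub struct DefaultCompactor<M: SstMerger> {
    merger: M,
}

impl<M: SstMerger> DefaultCompactor<M> {
    pub fn new(merger: M) -> Self {
        Self { merger }
    }

    /// Merges each group of inputs and collects all output files.
    pub fn merge_groups(&self, groups: &[Vec<String>]) -> Result<Vec<String>, String> {
        let mut outputs = Vec::new();
        for group in groups {
            outputs.extend(self.merger.merge(group)?);
        }
        Ok(outputs)
    }
}

/// Test double that needs no object storage: joins the input names.
pub struct MockMerger;

impl SstMerger for MockMerger {
    fn merge(&self, inputs: &[String]) -> Result<Vec<String>, String> {
        if inputs.is_empty() {
            return Err("empty input group".to_string());
        }
        Ok(vec![inputs.join("+")])
    }
}
```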
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix(compaction): propagate join error during sst flush
Correctly propagates the error when joining SST flush handles during compaction. Previously, the error was logged but not returned, leading to potential silent failures.
Also reorders some imports for consistency.
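The difference between logging and propagating a join error can be sketched as follows, using `std::thread` handles as a stand-in for the async join handles in the real code. `join_flush_handles` is a hypothetical helper:

```rust
use std::thread::JoinHandle;

/// Joins all flush handles and returns the total bytes written.
/// Both a panicked task (join error) and a task-level error abort
/// the call instead of being logged and ignored.
pub fn join_flush_handles(
    handles: Vec<JoinHandle<Result<u64, String>>>,
) -> Result<u64, String> {
    let mut total = 0;
    for handle in handles {
        // First `?` propagates the join error, second `?` the task error.
        let written = handle
            .join()
            .map_err(|_| "flush task panicked".to_string())??;
        total += written;
    }
    Ok(total)
}
```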
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf(compaction): pre-allocate capacity for compacted_inputs
Pre-allocates capacity for the compacted_inputs vector based on the estimated total size of inputs and expired SSTs. This optimization aims to reduce vector reallocations during the compaction process.
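A small sketch of the idea, with file names standing in for SST metadata. `collect_compacted_inputs` is a hypothetical helper, not the function in the codebase:

```rust
/// Gathers every input file and every expired SST into one vector,
/// reserving the full capacity up front to avoid repeated reallocation.
pub fn collect_compacted_inputs(groups: &[Vec<String>], expired: &[String]) -> Vec<String> {
    let capacity = groups.iter().map(Vec::len).sum::<usize>() + expired.len();
    let mut compacted_inputs = Vec::with_capacity(capacity);
    for group in groups {
        compacted_inputs.extend(group.iter().cloned());
    }
    compacted_inputs.extend(expired.iter().cloned());
    compacted_inputs
}
```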
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/allow-partial-compaction:
Enhance `DefaultCompactor` and `MockMerger` for Improved Flexibility
- **`compactor.rs`**:
- Added `Clone` trait to `DefaultSstMerger` and `MockMerger` to allow cloning.
- Removed `Arc` wrapping from `DefaultCompactor`'s `merger` field for direct usage.
- Updated `merge_ssts` method to require `Clone` trait for `SstMerger`.
- Modified `MockMerger` to use `Arc<Mutex>` for `results` and `call_idx` to ensure thread safety.
- Adjusted error handling to use `error::InvalidMetaSnafu` directly.
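A minimal sketch of a cloneable mock whose clones share scripted results and a call counter through `Arc<Mutex<..>>`. The method names and the `String`-based result type are assumptions for illustration:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Cloneable mock: every clone sees the same scripted results and counter,
/// so it can be handed to parallel tasks and inspected afterwards.
#[derive(Clone, Default)]
pub struct MockMerger {
    results: Arc<Mutex<VecDeque<Result<String, String>>>>,
    call_idx: Arc<Mutex<usize>>,
}

impl MockMerger {
    pub fn with_results(results: Vec<Result<String, String>>) -> Self {
        Self {
            results: Arc::new(Mutex::new(results.into())),
            call_idx: Arc::new(Mutex::new(0)),
        }
    }

    /// Returns the next scripted result and counts the call.
    pub fn merge(&self) -> Result<String, String> {
        *self.call_idx.lock().unwrap() += 1;
        self.results
            .lock()
            .unwrap()
            .pop_front()
            .unwrap_or_else(|| Err("no scripted result left".to_string()))
    }

    pub fn calls(&self) -> usize {
        *self.call_idx.lock().unwrap()
    }
}
```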
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: add overflow check before interleave()
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor: pass batches and column index to check_interleave_bytes_overflow
Refactor check_interleave_bytes_overflow to accept batches and a column
index directly, avoiding the intermediate Vec collection of arrays.
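The idea of checking sizes before interleaving can be sketched without Arrow as below. `Batch` and `check_selected_bytes` are hypothetical stand-ins, and the `limit` parameter is an assumption (for arrays with 32-bit offsets it would presumably be `i32::MAX`):

```rust
/// Hypothetical stand-in for a record batch: each column is a list of byte values.
pub struct Batch {
    pub columns: Vec<Vec<Vec<u8>>>,
}

/// Sums the byte lengths of the values that `indices` selects from column
/// `column_index` and fails once the total exceeds `limit`.
/// `indices` holds `(batch index, row index)` pairs.
pub fn check_selected_bytes(
    batches: &[Batch],
    column_index: usize,
    indices: &[(usize, usize)],
    limit: usize,
) -> Result<usize, String> {
    let mut total: usize = 0;
    for &(batch_idx, row_idx) in indices {
        // Read lengths straight from the batches; no intermediate Vec of arrays.
        let len = batches[batch_idx].columns[column_index][row_idx].len();
        total = total
            .checked_add(len)
            .ok_or("byte length overflows usize")?;
        if total > limit {
            return Err(format!(
                "interleaved column needs {total} bytes, limit is {limit}"
            ));
        }
    }
    Ok(total)
}
```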
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: support alter from primary_key to flat
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: alter flat to primary_key
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: change default_experimental_flat_format to true
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: compute channel size from split batch size
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: add tests for split and channel size
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: always set sst_format from manifest on region open
sanitize_region_options did not set options.sst_format when the
default (PrimaryKey) matched the manifest value, leaving it as None
after reopen. This caused the alter format change to appear lost.
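A reconstruction of the bug and the fix under simplified types. `FormatType`, the reduced `RegionOptions`, and both function names are stand-ins; the "before" variant is inferred from the description above, not copied from the code:

```rust
/// Simplified stand-in for the SST format; `PrimaryKey` is the default here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum FormatType {
    #[default]
    PrimaryKey,
    Flat,
}

/// Reduced region options with only the field relevant to this fix.
#[derive(Debug, Default)]
pub struct RegionOptions {
    pub sst_format: Option<FormatType>,
}

/// Inferred old behavior: skips the assignment when the manifest value
/// equals the default, leaving `sst_format` as `None` after reopen.
pub fn sanitize_before_fix(options: &mut RegionOptions, manifest_format: FormatType) {
    if manifest_format != FormatType::default() {
        options.sst_format = Some(manifest_format);
    }
}

/// Fixed behavior: the manifest value is always written to the options.
pub fn sanitize_after_fix(options: &mut RegionOptions, manifest_format: FormatType) {
    options.sst_format = Some(manifest_format);
}
```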
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: fix tests
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: show create table after alteration
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor!: rename default_experimental_flat_format to default_flat_format
The flat format is no longer experimental. Remove "experimental" from
the config field name, doc comments, and all references.
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
Allow flush edits with equal entry ids when flushed sequence advances, so close-time flush after truncate still succeeds for skip-wal regions while stale pre-truncate flushes are rejected. Add a regression test for create->truncate->write->close timing.
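The acceptance rule can be sketched as a pure function. `FlushState` and `accept_flush_edit` are hypothetical names, and the sketch assumes that an edit with a strictly larger entry id was already accepted before this change:

```rust
/// Flushed position of a region or of an incoming flush edit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FlushState {
    pub flushed_entry_id: u64,
    pub flushed_sequence: u64,
}

/// Accepts an edit that advances the entry id, or that keeps the entry id
/// but advances the flushed sequence (the skip-wal case after truncate).
/// Anything else is treated as stale and rejected.
pub fn accept_flush_edit(current: FlushState, edit: FlushState) -> bool {
    if edit.flushed_entry_id > current.flushed_entry_id {
        return true;
    }
    edit.flushed_entry_id == current.flushed_entry_id
        && edit.flushed_sequence > current.flushed_sequence
}
```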
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/flat-for-time-series:
Enhance `TimeSeriesMemtable` with Record Batch Support
- **`time_series.rs`**:
- Introduced `BatchToRecordBatchContext` to facilitate conversion of batch iterators to record batch iterators.
- Added `build_record_batch` method in `TimeSeriesIterBuilder` to support record batch creation.
- Implemented multiple test cases to validate the functionality of record batch creation, including tests for projections,
deduplication, sequence filtering, and data correctness.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/flat-for-time-series:
Refactor `TimeSeriesMemtable` and `TimeSeriesIterBuilder`
- Renamed `adapter_context` to `batch_to_record_batch` in `TimeSeriesMemtable` for clarity.
- Simplified `MemtableRangeContext` initialization by removing the `batch_to_record_batch` parameter.
- Added `is_record_batch` method to `TimeSeriesIterBuilder` to indicate record batch status.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/flat-for-time-series:
### Add Time Range Filtering and Predicate Group Enhancements
- **`memtable.rs`**: Updated `IterBuilder` to include `time_range` parameter in `build_record_batch` method, enhancing record batch iteration with time range filtering.
- **`time_series.rs`**: Modified `TimeSeriesIterBuilder` to use `PredicateGroup` instead of `Predicate`, and integrated `PruneTimeIterator` for time-based filtering.
- **`memtable_util.rs`**: Removed unused `Predicate` import, reflecting changes in predicate handling.
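Time-based pruning on top of a batch iterator can be pictured with the sketch below. `PruneTime` is a hypothetical simplification of the idea behind `PruneTimeIterator`: batches are plain timestamp vectors, and the inclusive range bounds are an assumption:

```rust
/// Wraps a batch iterator and keeps only timestamps inside `range`.
pub struct PruneTime<I> {
    inner: I,
    /// Inclusive `(start, end)` bounds; `None` disables pruning.
    range: Option<(i64, i64)>,
}

impl<I> PruneTime<I> {
    pub fn new(inner: I, range: Option<(i64, i64)>) -> Self {
        Self { inner, range }
    }
}

impl<I: Iterator<Item = Vec<i64>>> Iterator for PruneTime<I> {
    type Item = Vec<i64>;

    fn next(&mut self) -> Option<Vec<i64>> {
        for batch in self.inner.by_ref() {
            let kept: Vec<i64> = match self.range {
                Some((start, end)) => batch
                    .into_iter()
                    .filter(|ts| *ts >= start && *ts <= end)
                    .collect(),
                None => batch,
            };
            // Skip batches that are empty after pruning.
            if !kept.is_empty() {
                return Some(kept);
            }
        }
        None
    }
}
```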
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>