feat!: revise compaction picker (#6121)

* - **Refactor `RegionFilePathFactory` to `RegionFilePathProvider`:** Updated references and implementations in `access_layer.rs`, `write_cache.rs`, and related test files to use the new struct name.
 - **Add `max_file_size` support in compaction:** Introduced `max_file_size` option in `PickerOutput`, `SerializedPickerOutput`, and `WriteOptions` in `compactor.rs`, `picker.rs`, `twcs.rs`, and `window.rs`.
 - **Enhance Parquet writing logic:** Modified `parquet.rs` and `parquet/writer.rs` to support optional `max_file_size` and added a test case `test_write_multiple_files` to verify writing multiple files based on size constraints.

 **Refactor Parquet Writer Initialization and File Handling**
 - Updated `ParquetWriter` in `writer.rs` to handle `current_indexer` as an `Option`, allowing for more flexible initialization and management.
 - Introduced `finish_current_file` method to encapsulate logic for completing and transitioning between SST files, improving code clarity and maintainability.
 - Enhanced error handling and logging with `debug` statements for better traceability during file operations.

 - **Removed Output Size Enforcement in `twcs.rs`:**
   - Deleted the `enforce_max_output_size` function and related logic to simplify compaction input handling.

 - **Added Max File Size Option in `parquet.rs`:**
   - Introduced `max_file_size` in `WriteOptions` to control the maximum size of output files.

 - **Refactored Indexer Management in `parquet/writer.rs`:**
   - Changed `current_indexer` from an `Option` to a direct `Indexer` type.
   - Implemented `roll_to_next_file` to handle file transitions when exceeding `max_file_size`.
   - Simplified indexer initialization and management logic.

 - **Refactored SST File Handling**:
   - Introduced `FilePathProvider` trait and its implementations (`WriteCachePathProvider`, `RegionFilePathFactory`) to manage SST and index file paths.
   - Updated `AccessLayer`, `WriteCache`, and `ParquetWriter` to use `FilePathProvider` for path management.
   - Modified `SstWriteRequest` and `SstUploadRequest` to use path providers instead of direct paths.
   - Files affected: `access_layer.rs`, `write_cache.rs`, `parquet.rs`, `writer.rs`.

 - **Enhanced Indexer Management**:
   - Replaced `IndexerBuilder` with `IndexerBuilderImpl` and made it async to support dynamic indexer creation.
   - Updated `ParquetWriter` to handle multiple indexers and file IDs.
   - Files affected: `index.rs`, `parquet.rs`, `writer.rs`.

 - **Removed Redundant File ID Handling**:
   - Removed `file_id` from `SstWriteRequest` and `CompactionOutput`.
   - Updated related logic to dynamically generate file IDs where necessary.
   - Files affected: `compaction.rs`, `flush.rs`, `picker.rs`, `twcs.rs`, `window.rs`.

 - **Test Adjustments**:
   - Updated tests to align with new path and indexer management.
   - Introduced `FixedPathProvider` and `NoopIndexBuilder` for testing purposes.
   - Files affected: `sst_util.rs`, `version_util.rs`, `parquet.rs`.

* chore: rebase main

* feat/multiple-compaction-output:
 ### Add Benchmarking and Refactor Compaction Logic

 - **Benchmarking**: Added a new benchmark `run_bench` in `Cargo.toml` and implemented benchmarks in `benches/run_bench.rs` using Criterion for `find_sorted_runs` and `reduce_runs` functions.
 - **Compaction Module Enhancements**:
   - Made `run.rs` public and refactored the `Ranged` and `Item` traits to be public.
   - Simplified the logic in `find_sorted_runs` and `reduce_runs` by removing `MergeItems` and related functions.
   - Introduced `find_overlapping_items` for identifying overlapping items.
 - **Code Cleanup**: Removed redundant code and tests related to `MergeItems` in `run.rs`.

* feat/multiple-compaction-output:
 ### Enhance Compaction Logic and Add Benchmarks

 - **Compaction Logic Improvements**:
   - Updated `reduce_runs` function in `src/mito2/src/compaction/run.rs` to remove the target parameter and improve the logic for selecting files to merge based on minimum penalty.
   - Enhanced `find_overlapping_items` to handle unsorted inputs and improve overlap detection efficiency.

 - **Benchmark Enhancements**:
   - Added `bench_find_overlapping_items` in `src/mito2/benches/run_bench.rs` to benchmark the new `find_overlapping_items` function.
   - Extended existing benchmarks to include larger data sizes.

 - **Testing Enhancements**:
   - Updated tests in `src/mito2/src/compaction/run.rs` to reflect changes in `reduce_runs` and added new tests for `find_overlapping_items`.

 - **Logging and Debugging**:
   - Improved logging in `src/mito2/src/compaction/twcs.rs` to provide more detailed information about compaction decisions.

* feat/multiple-compaction-output:
 ### Refactor and Enhance Compaction Logic

 - **Refactor `find_overlapping_items` Function**: Changed the function signature to accept slices instead of mutable vectors in `run.rs`.
 - **Rename and Update Struct Fields**: Renamed `penalty` to `size` in `SortedRun` struct and updated related logic in `run.rs`.
 - **Enhance `reduce_runs` Function**: Improved logic to sort runs by size and limit probe runs to 100 in `run.rs`.
 - **Add `merge_seq_files` Function**: Introduced a new function `merge_seq_files` in `run.rs` for merging sequential files.
 - **Modify `TwcsPicker` Logic**: Updated the compaction logic to use `merge_seq_files` when only one run is found in `twcs.rs`.
 - **Remove `enforce_file_num` Function**: Deleted the `enforce_file_num` function and its related test cases in `twcs.rs`.

* feat/multiple-compaction-output:
 ### Enhance Compaction Logic and Testing

 - **Add `merge_seq_files` Functionality**: Implemented the `merge_seq_files` function in `run.rs` to optimize file merging based on scoring systems. Updated
 benchmarks in `run_bench.rs` to include `bench_merge_seq_files`.
 - **Improve Compaction Strategy in `twcs.rs`**: Modified the compaction logic to handle file merging more effectively, considering file size and overlap.
 - **Update Tests**: Enhanced test coverage in `compaction_test.rs` and `append_mode_test.rs` to validate new compaction logic and file merging strategies.
 - **Remove Unused Function**: Deleted `new_file_handles` from `test_util.rs` as it was no longer needed.

* feat/multiple-compaction-output:
 ### Refactor TWCS Compaction Options

 - **Refactor Compaction Logic**: Simplified the TWCS compaction logic by replacing multiple parameters (`max_active_window_runs`, `max_active_window_files`, `max_inactive_window_runs`, `max_inactive_window_files`) with a single `trigger_file_num` parameter in `picker.rs`, `twcs.rs`, and `options.rs`.
 - **Update Tests**: Adjusted test cases to reflect the new compaction logic in `append_mode_test.rs`, `compaction_test.rs`, `filter_deleted_test.rs`, `merge_mode_test.rs`, and various test files under `tests/cases`.
 - **Modify Engine Options**: Updated engine option keys to use `trigger_file_num` in `mito_engine_options.rs` and `region_request.rs`.
 - **Fuzz Testing**: Updated fuzz test generators and translators to accommodate the new compaction parameter in `alter_expr.rs` and related files.

 This refactor aims to streamline the compaction configuration by reducing the number of parameters and simplifying the codebase.

* chore: add trailing space

* fix license header

* feat/revise-compaction-picker:
 **Limit File Processing and Optimize Merge Logic in `run.rs`**

 - Introduced a limit to process a maximum of 100 files in `merge_seq_files` to control time complexity.
 - Adjusted logic to calculate `target_size` and iterate over files using the limited set of files.
 - Updated scoring calculations to use the limited file set, ensuring efficient file merging.

* feat/revise-compaction-picker:
 ### Add Compaction Metrics and Remove Debug Logging

 - **Compaction Metrics**: Introduced new histograms `COMPACTION_INPUT_BYTES` and `COMPACTION_OUTPUT_BYTES` to track compaction input and output file sizes in `metrics.rs`. Updated `compactor.rs` to observe these metrics during the compaction process.
 - **Logging Cleanup**: Removed debug logging of file ranges during the merge process in `twcs.rs`.

* feat/revise-compaction-picker:
 ## Enhance Compaction Logic and Metrics

 - **Compaction Logic Improvements**:
   - Added methods `input_file_size` and `output_file_size` to `MergeOutput` in `compactor.rs` to streamline file size calculations.
   - Updated `Compactor` implementation to use these methods for metrics tracking.
   - Modified `Ranged` trait logic in `run.rs` to improve range comparison.
   - Enhanced test cases in `run.rs` to reflect changes in compaction logic.

 - **Metrics Enhancements**:
   - Changed `COMPACTION_INPUT_BYTES` and `COMPACTION_OUTPUT_BYTES` from histograms to counters in `metrics.rs` for better performance tracking.

 - **Debugging and Logging**:
   - Added detailed logging for compaction pick results in `twcs.rs`.
   - Implemented custom `Debug` trait for `FileMeta` in `file.rs` to improve debugging output.

 - **Testing Enhancements**:
   - Added new test `test_compaction_overlapping_files` in `compaction_test.rs` to verify compaction behavior with overlapping files.
   - Updated `merge_mode_test.rs` to reflect changes in file handling during scans.

* feat/revise-compaction-picker:
 ### Update `FileHandle` Debug Implementation

 - **Refactor Debug Output**: Simplified the `fmt::Debug` implementation for `FileHandle` in `src/mito2/src/sst/file.rs` by consolidating multiple fields into a single `meta` field using `meta_ref()`.
 - **Atomic Operations**: Updated the `deleted` field to use atomic loading with `Ordering::Relaxed`.

* Trigger CI

* feat/revise-compaction-picker:
 **Update compaction logic and default options**

 - **`twcs.rs`**: Enhanced logging for compaction pick results by improving the formatting for better readability.
 - **`options.rs`**: Modified the default `max_output_file_size` in `TwcsOptions` from 2GB to 512MB to optimize file handling and performance.

* feat/revise-compaction-picker:
 Refactor `find_overlapping_items` to use an external result vector

 - Updated `find_overlapping_items` in `src/mito2/src/compaction/run.rs` to accept a mutable result vector instead of returning a new vector, improving memory efficiency.
 - Modified benchmarks in `src/mito2/benches/bench_compaction_picker.rs` to accommodate the new function signature.
 - Adjusted tests in `src/mito2/src/compaction/run.rs` to use the updated function signature, ensuring correct functionality with the new approach.

* feat/revise-compaction-picker:
 Improve file merging logic in `run.rs`

 - Refactor the loop logic in `merge_seq_files` to simplify the iteration over file groups.
 - Adjust the range for `end_idx` to include the endpoint, allowing for more flexible group selection.
 - Remove the condition that skips groups with only one file, enabling more comprehensive processing of file sequences.

* feat/revise-compaction-picker:
 Enhance `find_overlapping_items` with `SortedRun` and Update Tests

 - Refactor `find_overlapping_items` in `src/mito2/src/compaction/run.rs` to utilize the `SortedRun` struct for improved efficiency and clarity.
 - Introduce a `sorted` flag in `SortedRun` to optimize sorting operations.
 - Update test cases in `src/mito2/benches/bench_compaction_picker.rs` to accommodate changes in `find_overlapping_items` by using `SortedRun`.
 - Add `From<Vec<T>>` implementation for `SortedRun` to facilitate easy conversion from vectors.

* feat/revise-compaction-picker:
 **Enhancements in `compaction/run.rs`:**

 - Added `ReadableSize` import to handle size calculations.
 - Modified the logic in `merge_seq_files` to clamp the calculated target size to a maximum of 2GB when `max_file_size` is not provided.

* feat/revise-compaction-picker: Add Default Max Output Size Constant for Compaction

Introduce DEFAULT_MAX_OUTPUT_SIZE constant to define the default maximum compaction output file size as 2GB. Refactor the merge_seq_files function to utilize this constant, ensuring consistent and maintainable code for handling file size limits during compaction.
This commit is contained in:
Lei, HUANG
2025-05-23 11:29:08 +08:00
committed by GitHub
parent bf496e05cc
commit 4b71e493f7
40 changed files with 1172 additions and 1073 deletions

View File

@@ -238,21 +238,9 @@ impl<R: Rng> Generator<AlterTableExpr, R> for AlterExprSetTableOptionsGenerator<
let max_output_file_size: u64 = rng.random();
AlterTableOption::TwcsMaxOutputFileSize(ReadableSize(max_output_file_size))
}
AlterTableOption::TwcsMaxInactiveWindowRuns(_) => {
let max_inactive_window_runs: u64 = rng.random();
AlterTableOption::TwcsMaxInactiveWindowRuns(max_inactive_window_runs)
}
AlterTableOption::TwcsMaxActiveWindowFiles(_) => {
let max_active_window_files: u64 = rng.random();
AlterTableOption::TwcsMaxActiveWindowFiles(max_active_window_files)
}
AlterTableOption::TwcsMaxActiveWindowRuns(_) => {
let max_active_window_runs: u64 = rng.random();
AlterTableOption::TwcsMaxActiveWindowRuns(max_active_window_runs)
}
AlterTableOption::TwcsMaxInactiveWindowFiles(_) => {
let max_inactive_window_files: u64 = rng.random();
AlterTableOption::TwcsMaxInactiveWindowFiles(max_inactive_window_files)
AlterTableOption::TwcsTriggerFileNum(_) => {
let trigger_file_num: u64 = rng.random();
AlterTableOption::TwcsTriggerFileNum(trigger_file_num)
}
})
.collect();
@@ -365,7 +353,7 @@ mod tests {
.generate(&mut rng)
.unwrap();
let serialized = serde_json::to_string(&expr).unwrap();
let expected = r#"{"table_name":{"value":"quasi","quote_style":null},"alter_kinds":{"SetTableOptions":{"options":[{"TwcsMaxOutputFileSize":16770910638250818741}]}}}"#;
let expected = r#"{"table_name":{"value":"quasi","quote_style":null},"alter_kinds":{"SetTableOptions":{"options":[{"TwcsTimeWindow":{"value":2428665013,"unit":"Millisecond"}}]}}}"#;
assert_eq!(expected, serialized);
let expr = AlterExprUnsetTableOptionsGeneratorBuilder::default()
@@ -375,7 +363,7 @@ mod tests {
.generate(&mut rng)
.unwrap();
let serialized = serde_json::to_string(&expr).unwrap();
let expected = r#"{"table_name":{"value":"quasi","quote_style":null},"alter_kinds":{"UnsetTableOptions":{"keys":["compaction.twcs.max_active_window_runs","compaction.twcs.max_output_file_size","compaction.twcs.time_window","compaction.twcs.max_inactive_window_files","compaction.twcs.max_active_window_files"]}}}"#;
let expected = r#"{"table_name":{"value":"quasi","quote_style":null},"alter_kinds":{"UnsetTableOptions":{"keys":["compaction.twcs.trigger_file_num","compaction.twcs.time_window"]}}}"#;
assert_eq!(expected, serialized);
}
}

View File

@@ -21,9 +21,8 @@ use common_time::{Duration, FOREVER, INSTANT};
use derive_builder::Builder;
use serde::{Deserialize, Serialize};
use store_api::mito_engine_options::{
APPEND_MODE_KEY, COMPACTION_TYPE, TTL_KEY, TWCS_MAX_ACTIVE_WINDOW_FILES,
TWCS_MAX_ACTIVE_WINDOW_RUNS, TWCS_MAX_INACTIVE_WINDOW_FILES, TWCS_MAX_INACTIVE_WINDOW_RUNS,
TWCS_MAX_OUTPUT_FILE_SIZE, TWCS_TIME_WINDOW,
APPEND_MODE_KEY, COMPACTION_TYPE, TTL_KEY, TWCS_MAX_OUTPUT_FILE_SIZE, TWCS_TIME_WINDOW,
TWCS_TRIGGER_FILE_NUM,
};
use strum::EnumIter;
@@ -78,10 +77,7 @@ pub enum AlterTableOption {
Ttl(Ttl),
TwcsTimeWindow(Duration),
TwcsMaxOutputFileSize(ReadableSize),
TwcsMaxInactiveWindowFiles(u64),
TwcsMaxActiveWindowFiles(u64),
TwcsMaxInactiveWindowRuns(u64),
TwcsMaxActiveWindowRuns(u64),
TwcsTriggerFileNum(u64),
}
impl AlterTableOption {
@@ -90,10 +86,7 @@ impl AlterTableOption {
AlterTableOption::Ttl(_) => TTL_KEY,
AlterTableOption::TwcsTimeWindow(_) => TWCS_TIME_WINDOW,
AlterTableOption::TwcsMaxOutputFileSize(_) => TWCS_MAX_OUTPUT_FILE_SIZE,
AlterTableOption::TwcsMaxInactiveWindowFiles(_) => TWCS_MAX_INACTIVE_WINDOW_FILES,
AlterTableOption::TwcsMaxActiveWindowFiles(_) => TWCS_MAX_ACTIVE_WINDOW_FILES,
AlterTableOption::TwcsMaxInactiveWindowRuns(_) => TWCS_MAX_INACTIVE_WINDOW_RUNS,
AlterTableOption::TwcsMaxActiveWindowRuns(_) => TWCS_MAX_ACTIVE_WINDOW_RUNS,
AlterTableOption::TwcsTriggerFileNum(_) => TWCS_TRIGGER_FILE_NUM,
}
}
@@ -111,21 +104,9 @@ impl AlterTableOption {
};
Ok(AlterTableOption::Ttl(ttl))
}
TWCS_MAX_ACTIVE_WINDOW_RUNS => {
let runs = value.parse().unwrap();
Ok(AlterTableOption::TwcsMaxActiveWindowRuns(runs))
}
TWCS_MAX_ACTIVE_WINDOW_FILES => {
TWCS_TRIGGER_FILE_NUM => {
let files = value.parse().unwrap();
Ok(AlterTableOption::TwcsMaxActiveWindowFiles(files))
}
TWCS_MAX_INACTIVE_WINDOW_RUNS => {
let runs = value.parse().unwrap();
Ok(AlterTableOption::TwcsMaxInactiveWindowRuns(runs))
}
TWCS_MAX_INACTIVE_WINDOW_FILES => {
let files = value.parse().unwrap();
Ok(AlterTableOption::TwcsMaxInactiveWindowFiles(files))
Ok(AlterTableOption::TwcsTriggerFileNum(files))
}
TWCS_MAX_OUTPUT_FILE_SIZE => {
// may be "1M" instead of "1 MiB"
@@ -178,17 +159,8 @@ impl Display for AlterTableOption {
// Caution: to_string loses precision for ReadableSize
write!(f, "'{}' = '{}'", TWCS_MAX_OUTPUT_FILE_SIZE, s)
}
AlterTableOption::TwcsMaxInactiveWindowFiles(u) => {
write!(f, "'{}' = '{}'", TWCS_MAX_INACTIVE_WINDOW_FILES, u)
}
AlterTableOption::TwcsMaxActiveWindowFiles(u) => {
write!(f, "'{}' = '{}'", TWCS_MAX_ACTIVE_WINDOW_FILES, u)
}
AlterTableOption::TwcsMaxInactiveWindowRuns(u) => {
write!(f, "'{}' = '{}'", TWCS_MAX_INACTIVE_WINDOW_RUNS, u)
}
AlterTableOption::TwcsMaxActiveWindowRuns(u) => {
write!(f, "'{}' = '{}'", TWCS_MAX_ACTIVE_WINDOW_RUNS, u)
AlterTableOption::TwcsTriggerFileNum(u) => {
write!(f, "'{}' = '{}'", TWCS_TRIGGER_FILE_NUM, u)
}
}
}
@@ -212,21 +184,15 @@ mod tests {
]
);
let option_string = "compaction.twcs.max_active_window_files = '5030469694939972912',
compaction.twcs.max_active_window_runs = '8361168990283879099',
compaction.twcs.max_inactive_window_files = '6028716566907830876',
compaction.twcs.max_inactive_window_runs = '10622283085591494074',
let option_string = "compaction.twcs.trigger_file_num = '5030469694939972912',
compaction.twcs.max_output_file_size = '15686.4PiB',
compaction.twcs.time_window = '2061999256ms',
compaction.type = 'twcs',
ttl = '1month 3days 15h 49m 8s 279ms'";
let options = AlterTableOption::parse_kv_pairs(option_string).unwrap();
assert_eq!(options.len(), 7);
assert_eq!(options.len(), 4);
let expected = vec![
AlterTableOption::TwcsMaxActiveWindowFiles(5030469694939972912),
AlterTableOption::TwcsMaxActiveWindowRuns(8361168990283879099),
AlterTableOption::TwcsMaxInactiveWindowFiles(6028716566907830876),
AlterTableOption::TwcsMaxInactiveWindowRuns(10622283085591494074),
AlterTableOption::TwcsTriggerFileNum(5030469694939972912),
AlterTableOption::TwcsMaxOutputFileSize(ReadableSize::from_str("15686.4PiB").unwrap()),
AlterTableOption::TwcsTimeWindow(Duration::new_nanosecond(2_061_999_256_000_000)),
AlterTableOption::Ttl(Ttl::Duration(Duration::new_millisecond(

View File

@@ -191,10 +191,7 @@ mod tests {
AlterTableOption::Ttl(Ttl::Duration(Duration::new_second(60))),
AlterTableOption::TwcsTimeWindow(Duration::new_second(60)),
AlterTableOption::TwcsMaxOutputFileSize(ReadableSize::from_str("1GB").unwrap()),
AlterTableOption::TwcsMaxActiveWindowFiles(10),
AlterTableOption::TwcsMaxActiveWindowRuns(10),
AlterTableOption::TwcsMaxInactiveWindowFiles(5),
AlterTableOption::TwcsMaxInactiveWindowRuns(5),
AlterTableOption::TwcsTriggerFileNum(5),
],
},
};
@@ -204,10 +201,7 @@ mod tests {
"ALTER TABLE test SET 'ttl' = '60s', ",
"'compaction.twcs.time_window' = '60s', ",
"'compaction.twcs.max_output_file_size' = '1.0GiB', ",
"'compaction.twcs.max_active_window_files' = '10', ",
"'compaction.twcs.max_active_window_runs' = '10', ",
"'compaction.twcs.max_inactive_window_files' = '5', ",
"'compaction.twcs.max_inactive_window_runs' = '5';"
"'compaction.twcs.trigger_file_num' = '5';"
);
assert_eq!(expected, output);
}

View File

@@ -187,10 +187,7 @@ mod tests {
AlterTableOption::Ttl(Ttl::Duration(Duration::new_second(60))),
AlterTableOption::TwcsTimeWindow(Duration::new_second(60)),
AlterTableOption::TwcsMaxOutputFileSize(ReadableSize::from_str("1GB").unwrap()),
AlterTableOption::TwcsMaxActiveWindowFiles(10),
AlterTableOption::TwcsMaxActiveWindowRuns(10),
AlterTableOption::TwcsMaxInactiveWindowFiles(5),
AlterTableOption::TwcsMaxInactiveWindowRuns(5),
AlterTableOption::TwcsTriggerFileNum(10),
],
},
};
@@ -200,10 +197,7 @@ mod tests {
"ALTER TABLE test SET 'ttl' = '60s', ",
"'compaction.twcs.time_window' = '60s', ",
"'compaction.twcs.max_output_file_size' = '1.0GiB', ",
"'compaction.twcs.max_active_window_files' = '10', ",
"'compaction.twcs.max_active_window_runs' = '10', ",
"'compaction.twcs.max_inactive_window_files' = '5', ",
"'compaction.twcs.max_inactive_window_runs' = '5';"
"'compaction.twcs.trigger_file_num' = '10';",
);
assert_eq!(expected, output);
}