* feat: divide parquet and puffin index
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: download index files when we open the region
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: use different label for parquet/puffin
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: control parallelism and cache size by env
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: change gauge to counter
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: correct file type labels in file cache
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor: move env to config and change cache ratio to percent
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: check capacity before download and refine metrics
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor: change open to return MitoRegionRef
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor: extract download to FileCache
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: run load cache task in write cache
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: check region state before downloading files
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: update config docs and test
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: use file id from index_file_id to compute puffin key
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: skip loading cache in some states
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix/region-expire-state:
Refactor region state handling in compaction task and manifest updates
- Introduce a variable to hold the current region state for clarity in compaction task updates.
- Add an expected_region_state field to RegionEditResult to manage region state expectations during manifest handling.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix/region-expire-state:
Refactor region state handling in compaction task
- Replace direct assignment of `RegionLeaderState::Writable` with dynamic state retrieval and conditional check for leader state.
- Modify `RegionEditResult` to include a flag `update_region_state` instead of `expected_region_state` to indicate if the region state should be updated to writable.
- Adjust handling of `RegionEditResult` in `handle_manifest` to conditionally update region state based on the new flag.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
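The flag-based handling described in the fix/region-expire-state entries above might look roughly like the sketch below. It is only an illustration: the enum is a reduced stand-in for the real `RegionLeaderState`, and `state_after_edit` is a hypothetical helper name, not a function from the codebase.

```rust
/// Reduced stand-in for the region leader states (assumption: the real enum has more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegionLeaderState {
    Writable,
    Editing,
    Downgrading,
}

/// Simplified result of a region edit; only the flag named in the commit is modeled.
struct RegionEditResult {
    /// Whether the region should be switched back to writable after the edit.
    update_region_state: bool,
}

/// Hypothetical helper: returns the state the region should have after the
/// manifest edit has been handled.
fn state_after_edit(current: RegionLeaderState, result: &RegionEditResult) -> RegionLeaderState {
    if result.update_region_state {
        RegionLeaderState::Writable
    } else {
        // Leave the state untouched, e.g. when the region is being downgraded.
        current
    }
}
```

With the flag unset, a region that is in `Downgrading` stays there instead of being forced back to `Writable`.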
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/allow-fuzz-input-override:
Add environment override for fuzzing parameters and seed values
- Implement `get_fuzz_override` function to read override values from environment variables for fuzzing parameters.
- Allow overriding `SEED`, `ACTIONS`, `ROWS`, `TABLES`, `COLUMNS`, `INSERTS`, and `PARTITIONS` in various fuzzing targets.
- Introduce new constants `GT_FUZZ_INPUT_MAX_PARTITIONS` and `FUZZ_OVERRIDE_PREFIX`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
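A minimal sketch of how an environment override such as `get_fuzz_override` could work is shown below. The prefix value and the helper `parse_override` are assumptions made for illustration; the commit only names the `FUZZ_OVERRIDE_PREFIX` constant and the function, not their definitions.

```rust
use std::env;
use std::str::FromStr;

/// Assumed value; the commit names the constant but does not state its content.
const FUZZ_OVERRIDE_PREFIX: &str = "GT_FUZZ_OVERRIDE_";

/// Hypothetical helper: parses a raw override value, ignoring malformed input.
fn parse_override<T: FromStr>(raw: Option<String>) -> Option<T> {
    raw.and_then(|v| v.trim().parse().ok())
}

/// Reads `<prefix><name>` (for example `<prefix>SEED`) from the environment.
/// Returns `None` when the variable is unset or cannot be parsed as `T`.
fn get_fuzz_override<T: FromStr>(name: &str) -> Option<T> {
    parse_override(env::var(format!("{FUZZ_OVERRIDE_PREFIX}{name}")).ok())
}

/// Example of a fuzz target choosing between the override and a generated value.
fn seed_or(generated: u64) -> u64 {
    get_fuzz_override("SEED").unwrap_or(generated)
}
```

A fuzz target would call `get_fuzz_override` once per parameter (`SEED`, `ACTIONS`, `ROWS`, and so on) and fall back to the value derived from the fuzz input when no override is set.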
* feat/allow-fuzz-input-override: Remove GT_FUZZ_INPUT_MAX_PARTITIONS constant and usage from fuzzing utils and tests
- Deleted the `GT_FUZZ_INPUT_MAX_PARTITIONS` constant from fuzzing utility functions.
- Updated the `FuzzInput` struct in `fuzz_migrate_mito_regions.rs` to use a hardcoded range instead of `get_gt_fuzz_input_max_partitions` for determining the number of partitions.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/allow-fuzz-input-override:
Improve fuzzing documentation with environment variable overrides
Enhanced the fuzzing instructions in the README to include guidance on how to override fuzz input using environment variables, providing an example for better clarity.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat(expr): support vec_elem_avg function
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
* feat: support vec_avg function
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
* test: add more query tests for avg aggregator
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
* fix: fix the merge batch mode
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
* refactor: use sum and count as state for avg function
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
* refactor: refactor merge batch mode for avg function
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
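Keeping sum and count as the aggregate state, as the two refactor entries above describe, makes partial results mergeable. The sketch below illustrates the idea for an element-wise vector average; `VecAvgState` and its methods are hypothetical names, and the real accumulator works on the query engine's array types rather than plain slices.

```rust
/// Hypothetical accumulator state: element-wise sum plus the number of vectors seen.
#[derive(Debug, Default, Clone, PartialEq)]
struct VecAvgState {
    sum: Vec<f32>,
    count: u64,
}

impl VecAvgState {
    /// Adds one input vector to the state, rejecting mismatched dimensions.
    fn update(&mut self, v: &[f32]) -> Result<(), String> {
        if self.count == 0 {
            self.sum = vec![0.0; v.len()];
        }
        if v.len() != self.sum.len() {
            return Err(format!("dimension mismatch: {} vs {}", v.len(), self.sum.len()));
        }
        for (s, x) in self.sum.iter_mut().zip(v) {
            *s += *x;
        }
        self.count += 1;
        Ok(())
    }

    /// Merges a partial state produced elsewhere (the merge-batch path).
    fn merge(&mut self, other: &VecAvgState) -> Result<(), String> {
        if other.count == 0 {
            return Ok(());
        }
        if self.count == 0 {
            *self = other.clone();
            return Ok(());
        }
        if other.sum.len() != self.sum.len() {
            return Err("dimension mismatch between partial states".to_string());
        }
        for (s, x) in self.sum.iter_mut().zip(&other.sum) {
            *s += *x;
        }
        self.count += other.count;
        Ok(())
    }

    /// Final result: sum divided by count, or `None` when no input was seen.
    fn evaluate(&self) -> Option<Vec<f32>> {
        if self.count == 0 {
            return None;
        }
        Some(self.sum.iter().map(|s| s / self.count as f32).collect())
    }
}
```

Merging averages directly would weight each partial result equally regardless of how many rows it covers; carrying the count avoids that.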
* feat: add additional vector restrictions for validation
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
---------
Signed-off-by: Alan Tang <jmtangcs@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
* fix/pick-continue:
### Add Tests for TWCS Compaction Logic
- **`twcs.rs`**:
  - Modified the logic in `TwcsPicker` to handle cases with zero runs by using `continue` instead of `return`.
  - Added two new test cases: `test_build_output_multiple_windows_with_zero_runs` and `test_build_output_single_window_zero_runs` to verify the behavior of the compaction logic when there are zero runs in the windows.
- **`memtable_util.rs`**:
  - Removed unused import `PredicateGroup`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
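The difference between `continue` and `return` in the picker loop can be shown with a reduced sketch. `Window` and `build_output` here are simplified stand-ins, not the actual `TwcsPicker` types: the point is only that an empty window must not end the scan of the remaining windows.

```rust
/// Simplified stand-in for a time window and the number of sorted runs in it.
struct Window {
    start: i64,
    num_runs: usize,
}

/// Returns the start times of windows that would produce compaction output.
fn build_output(windows: &[Window]) -> Vec<i64> {
    let mut picked = Vec::new();
    for w in windows {
        if w.num_runs == 0 {
            // `continue` skips only this window; a `return` here would drop
            // every later window as well.
            continue;
        }
        picked.push(w.start);
    }
    picked
}
```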
* fix: clippy
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix/pick-continue:
Enhance Compaction Process with Expired SST Handling and Testing
- **`compactor.rs`**:
  - Introduced handling for expired SSTs by updating the manifest immediately upon task completion.
  - Added new test cases to verify the handling of expired SSTs and manifest updates.
- **`task.rs`**:
  - Implemented `remove_expired` function to handle expired SSTs by updating the manifest and notifying the region worker loop.
  - Refactored `handle_compaction` to `handle_expiration_and_compaction` to integrate expired SST removal before merging inputs.
  - Added logging and error handling for expired SST removal process.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/progressive-compaction:
**Enhance Compaction Task Error Handling**
- Updated `task.rs` to conditionally execute the removal of expired SST files only when they exist, improving error handling and performance.
- Added a check for non-empty `expired_ssts` before initiating the removal process, ensuring unnecessary operations are avoided.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/progressive-compaction:
### Refactor `DefaultCompactor` to Extract `merge_single_output` Method
- **File**: `src/mito2/src/compaction/compactor.rs`
  - Extracted the logic for merging a single compaction output into SST files into a new method `merge_single_output` within the `DefaultCompactor` struct.
  - Simplified the `merge_ssts` method by utilizing the new `merge_single_output` method, reducing code duplication and improving maintainability.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/progressive-compaction:
### Add Max Background Compaction Tasks Configuration
- **`compaction.rs`**: Added `max_background_compactions` to the compaction scheduler to limit background tasks.
- **`compaction/compactor.rs`**: Removed immediate manifest update logic after task completion.
- **`compaction/picker.rs`**: Introduced `max_background_tasks` parameter in `new_picker` to control task limits.
- **`compaction/twcs.rs`**: Updated `TwcsPicker` to include `max_background_tasks` and truncate inputs exceeding this limit. Added related test cases to ensure functionality.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix/pick-continue:
### Improve Error Handling and Task Management in Compaction
- **`task.rs`**: Enhanced error handling in `remove_expired` function by logging errors without halting the compaction process. Removed the return of `Result` type and added detailed logging for various failure scenarios.
- **`twcs.rs`**: Adjusted task management logic by removing input truncation based on `max_background_tasks` and instead discarding remaining tasks if the output size exceeds the limit. This ensures better control over task execution and resource management.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
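One way to read "discarding remaining tasks if the output size exceeds the limit" is sketched below. `pick_outputs` is a hypothetical function, and modeling `max_background_tasks` as an `Option<usize>` is an assumption; the sketch only shows the limit being applied to the produced outputs rather than to the inputs.

```rust
/// Collects compaction outputs until the optional task limit is reached.
/// Candidates beyond the limit are dropped and left for a later round.
fn pick_outputs<T>(
    candidates: impl IntoIterator<Item = T>,
    max_background_tasks: Option<usize>,
) -> Vec<T> {
    let mut outputs = Vec::new();
    for candidate in candidates {
        if let Some(limit) = max_background_tasks {
            if outputs.len() >= limit {
                // Limit reached: discard the remaining candidates.
                break;
            }
        }
        outputs.push(candidate);
    }
    outputs
}
```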
* fix/pick-continue:
### Add Unit Tests for Compaction Task and TWCS Picker
- **`task.rs`**: Added unit tests to verify the behavior of `PickerOutput` with and without expired SSTs.
- **`twcs.rs`**: Introduced tests for `TwcsPicker` to ensure correct handling of `max_background_tasks` during compaction, including scenarios with and without task truncation.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix/pick-continue:
**Improve Error Handling and Notification in Compaction Task**
- **File:** `task.rs`
  - Changed log level from `warn` to `error` for manifest update failures to enhance error visibility.
  - Refactored the notification mechanism for expired file removal by using `BackgroundNotify::RegionEdit` with `RegionEditResult` to streamline the process.
  - Simplified error handling by consolidating match cases into a single `if let Err` block for better readability and maintainability.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* test: add tests for scanning append mode before flush
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor: extract a function maybe_dedup_one
Signed-off-by: evenyag <realevenyag@gmail.com>
* ci: add flat format to docs.yml so we can make it required later
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* mito2: add unit test for flat single-range append_mode dedup behavior
Verify memtable_flat_sources skips dedup when append_mode is true and
performs dedup otherwise for single-range flat memtables, preventing
regressions in the new append_mode path.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
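The behavior that test guards can be illustrated with a small model. `Row` and `read_single_range` are hypothetical and much simpler than the real flat memtable sources; the assumption is that dedup keeps the row with the highest sequence for each (key, timestamp) pair, while append mode returns every row.

```rust
/// Hypothetical row: primary key, timestamp, write sequence, and a value.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Row {
    key: String,
    ts: i64,
    seq: u64,
    value: i64,
}

/// Reads a single range, deduplicating only when `append_mode` is false.
fn read_single_range(mut rows: Vec<Row>, append_mode: bool) -> Vec<Row> {
    // Order by (key, ts), newest sequence first within each pair.
    rows.sort_by(|a, b| {
        a.key
            .cmp(&b.key)
            .then(a.ts.cmp(&b.ts))
            .then(b.seq.cmp(&a.seq))
    });
    if !append_mode {
        // Keep the first (newest) row of every (key, ts) run.
        rows.dedup_by(|next, kept| next.key == kept.key && next.ts == kept.ts);
    }
    rows
}
```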
* fix/flat-source-merge:
### Improve Column Metadata Extraction Logic
- **File**: `src/common/meta/src/ddl/utils.rs`
  - Modified the `extract_column_metadatas` function to use `swap_remove` for extracting the first schema and to compare decoded column metadata instead of raw bytes. This ensures that the extension map is considered during verification, making metadata consistency checks across datanodes more robust.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: add arrow json extension type
* feat: add json structure settings to extension type
* refactor: store json structure settings as extension metadata
* chore: make binary an acceptable type for extension
* chore/add-region-insert-failure-metric: Add metric for failed insert requests to region server in datanode module
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore/add-region-insert-failure-metric:
Add metric for tracking failed region server requests
- Introduce a new metric `REGION_SERVER_REQUEST_FAILURE_COUNT` to count failed region server requests.
- Update `REGION_SERVER_INSERT_FAIL_COUNT` metric description for consistency.
- Implement error handling in `RegionServerHandler` to increment the new failure metric on request errors.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
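Counting failures at the handler boundary can be sketched as follows. This uses a plain `AtomicU64` as a stand-in for the real metric type, and `record_failure` is a hypothetical helper; only the metric name comes from the commit.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for the real counter metric (assumption: the actual metric is a
/// registered counter, not a bare atomic).
static REGION_SERVER_REQUEST_FAILURE_COUNT: AtomicU64 = AtomicU64::new(0);

/// Passes the result through unchanged, incrementing the counter on errors.
fn record_failure<T, E>(result: Result<T, E>) -> Result<T, E> {
    result.map_err(|e| {
        REGION_SERVER_REQUEST_FAILURE_COUNT.fetch_add(1, Ordering::Relaxed);
        e
    })
}
```

A counter only ever increases, so failure rates can be derived from it over any time window.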
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: potential failure in the test_index_build_type_compact test
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* fix: relax timestamp checking in test_timestamp_default_now
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>