greptimedb

mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-05-15 12:30:38 +00:00

Author	SHA1	Message	Date
Weny Xu	80c5af0ecf	fix: ignore incomplete WAL entries during read (#6251 ) * fix: ignore incomplete entry * fix: fix unit tests	2025-06-04 11:16:42 +00:00
Lei, HUANG	fdd164c0fa	fix(mito): revert initial builder capacity for TimeSeriesMemtable (#6231 ) * fix/initial-builder-cap: ### Enhance Series Initialization and Capacity Management - `simple_bulk_memtable.rs`: Updated the `Series` initialization to use `with_capacity` with a specified capacity of 8192, improving memory management. - `time_series.rs`: Introduced `with_capacity` method in `Series` to allow custom initial capacity for `ValueBuilder`. Adjusted `INITIAL_BUILDER_CAPACITY` to 16 for more efficient memory usage. Added a new `new` method to maintain backward compatibility. * fix/initial-builder-cap: ### Adjust Memory Allocation in Memtable - `simple_bulk_memtable.rs`: Reduced the initial capacity of `Series` from 8192 to 1024 to optimize memory usage. - `time_series.rs`: Decreased `INITIAL_BUILDER_CAPACITY` from 16 to 4 to improve efficiency in vector building.	2025-06-03 08:25:02 +00:00
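The capacity change in the commit above can be pictured with a small sketch. It assumes a simplified `Series` made of two plain `Vec`s. The real type in `time_series.rs` uses value builders, and the field names and `push` helper here are hypothetical. Only the names `new`, `with_capacity` and `INITIAL_BUILDER_CAPACITY`, and the values 4 and 1024, come from the commit message.

```rust
/// Small default taken from the commit message (16 first, later reduced to 4).
const INITIAL_BUILDER_CAPACITY: usize = 4;

/// Simplified, hypothetical stand-in for the memtable `Series`.
struct Series {
    timestamps: Vec<i64>,
    values: Vec<f64>,
}

impl Series {
    /// Keeps existing call sites working by delegating to the small default.
    fn new() -> Self {
        Self::with_capacity(INITIAL_BUILDER_CAPACITY)
    }

    /// Lets callers that expect large batches reserve space up front.
    fn with_capacity(capacity: usize) -> Self {
        Self {
            timestamps: Vec::with_capacity(capacity),
            values: Vec::with_capacity(capacity),
        }
    }

    /// Appends one (timestamp, value) pair; the vectors grow on demand.
    fn push(&mut self, ts: i64, value: f64) {
        self.timestamps.push(ts);
        self.values.push(value);
    }
}
```

Under this reading, a memtable holding many short series pays only for four slots per series, while a bulk path can still call `Series::with_capacity(1024)` where it expects large batches.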
Zhenchi	078afb2bd6	feat: bloom filter index applier support or eq chain (#6227 ) * feat: bloom filter index applier support or eq chain Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-06-03 08:08:19 +00:00
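An "or eq chain" is read here as a predicate such as `a = 1 OR a = 2 OR a = 3`, which a bloom filter can serve as a set-membership probe on one column. The sketch below shows one way to recognise such a chain. The `Expr` enum and `collect_or_eq_chain` are hypothetical and are not the applier's actual types.

```rust
/// Hypothetical, minimal predicate tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// `column = literal`
    Eq(String, String),
    /// `left OR right`
    Or(Box<Expr>, Box<Expr>),
    /// Placeholder for any expression the sketch does not model.
    Other,
}

/// Returns the column and the literals when `expr` is a chain of ORed
/// equalities that all target the same column; otherwise returns `None`.
fn collect_or_eq_chain(expr: &Expr) -> Option<(String, Vec<String>)> {
    match expr {
        Expr::Eq(column, literal) => Some((column.clone(), vec![literal.clone()])),
        Expr::Or(left, right) => {
            let (left_col, mut literals) = collect_or_eq_chain(left)?;
            let (right_col, right_literals) = collect_or_eq_chain(right)?;
            // A chain that mixes columns cannot become one membership probe.
            if left_col != right_col {
                return None;
            }
            literals.extend(right_literals);
            Some((left_col, literals))
        }
        Expr::Other => None,
    }
}
```

A caller could then probe the column's bloom filter once per collected literal and keep a row group if any probe reports a possible match.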
Lei, HUANG	4e615e8906	feat(wal): support bulk wal entries (#6178 ) * feat/bulk-wal: ### Refactor: Simplify Data Handling in LogStore Implementations - `kafka/log_store.rs`, `raft_engine/log_store.rs`, `wal.rs`, `raw_entry_reader.rs`, `logstore.rs`: - Refactored `entry` and `build_entry` functions to accept `Vec<u8>` directly instead of `&mut Vec<u8>`. - Removed usage of `std::mem::take` for data handling, simplifying the code and improving readability. - Updated test cases to align with the new function signatures. * feat/bulk-wal: ### Add Support for Bulk WAL Entries and Flight Data Encoding - Add `raw_data` field to `BulkPart` and related structs: Updated `BulkPart` and related structures in `src/mito2/src/memtable/bulk/part.rs`, `src/mito2/src/memtable/simple_bulk_memtable.rs`, `src/mito2/src/memtable/time_partition.rs`, `src/mito2/src/region_write_ctx.rs`, `src/mito2/src/worker/handle_bulk_insert.rs`, and `src/store-api/src/region_request.rs` to include a new `raw_data` field for handling Arrow IPC data. - Implement Flight Data Encoding: Added a new module `flight` in `src/common/test-util/src/flight.rs` to encode record batches to Flight data format. - Update `greptime-proto` dependency: Changed the revision of the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml`. - Enhance WAL Writer and Tests: Modified `src/mito2/src/wal.rs` and related test files to support bulk WAL entries and added tests for encoding and handling bulk data. * feat/bulk-wal: - Update `greptime-proto` Dependency: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`. - Add `common-grpc` Dependency: Added `common-grpc` as a dependency in `Cargo.lock` and `src/mito2/Cargo.toml`. - Refactor `BulkPart` Structure: Removed `num_rows` field and added `num_rows()` method in `src/mito2/src/memtable/bulk/part.rs`. 
Updated related usages in `src/mito2/src/memtable/simple_bulk_memtable.rs`, `src/mito2/src/memtable/time_partition.rs`, `src/mito2/src/memtable/time_series.rs`, `src/mito2/src/region_write_ctx.rs`, and `src/mito2/src/worker/handle_bulk_insert.rs`. - Implement `TryFrom` and `From` for `BulkWalEntry`: Added implementations for converting between `BulkPart` and `BulkWalEntry` in `src/mito2/src/memtable/bulk/part.rs`. - Handle Bulk Entries in Region Opener: Added logic to process bulk entries in `src/mito2/src/region/opener.rs`. - Fix `BulkInsertRequest` Handling: Corrected `region_id` handling in `src/operator/src/bulk_insert.rs` and `src/store-api/src/region_request.rs`. - Add Error Variant for `ConvertBulkWalEntry`: Added a new error variant in `src/mito2/src/error.rs` for handling bulk WAL entry conversion errors. * fix: ci * feat/bulk-wal: Add bulk write operation in `opener.rs` - Enhanced the region write context by adding a call to `write_bulk()` after `write_memtable()` in `opener.rs`. - This change aims to improve the efficiency of writing operations by enabling bulk writes. * feat/bulk-wal: Enhance error handling and metrics in `bulk_insert.rs` - Updated `Inserter` to improve error handling by capturing the result of `datanode.handle(request)` and incrementing the `DIST_INGEST_ROW_COUNT` metric with the number of affected rows. * feat/bulk-wal: ### Remove Encode Error Handling for WAL Entries - `error.rs`: Removed the `EncodeWal` error variant and its associated handling. - `wal.rs`: Eliminated the `entry_encode_buf` buffer and its usage for encoding WAL entries. Replaced with direct encoding to a vector using `encode_to_vec()`.	2025-05-29 09:10:30 +00:00
Lei, HUANG	4b71e493f7	feat!: revise compaction picker (#6121 ) * - Refactor `RegionFilePathFactory` to `RegionFilePathProvider`: Updated references and implementations in `access_layer.rs`, `write_cache.rs`, and related test files to use the new struct name. - Add `max_file_size` support in compaction: Introduced `max_file_size` option in `PickerOutput`, `SerializedPickerOutput`, and `WriteOptions` in `compactor.rs`, `picker.rs`, `twcs.rs`, and `window.rs`. - Enhance Parquet writing logic: Modified `parquet.rs` and `parquet/writer.rs` to support optional `max_file_size` and added a test case `test_write_multiple_files` to verify writing multiple files based on size constraints. Refactor Parquet Writer Initialization and File Handling - Updated `ParquetWriter` in `writer.rs` to handle `current_indexer` as an `Option`, allowing for more flexible initialization and management. - Introduced `finish_current_file` method to encapsulate logic for completing and transitioning between SST files, improving code clarity and maintainability. - Enhanced error handling and logging with `debug` statements for better traceability during file operations. - Removed Output Size Enforcement in `twcs.rs`: - Deleted the `enforce_max_output_size` function and related logic to simplify compaction input handling. - Added Max File Size Option in `parquet.rs`: - Introduced `max_file_size` in `WriteOptions` to control the maximum size of output files. - Refactored Indexer Management in `parquet/writer.rs`: - Changed `current_indexer` from an `Option` to a direct `Indexer` type. - Implemented `roll_to_next_file` to handle file transitions when exceeding `max_file_size`. - Simplified indexer initialization and management logic. - Refactored SST File Handling: - Introduced `FilePathProvider` trait and its implementations (`WriteCachePathProvider`, `RegionFilePathFactory`) to manage SST and index file paths. 
- Updated `AccessLayer`, `WriteCache`, and `ParquetWriter` to use `FilePathProvider` for path management. - Modified `SstWriteRequest` and `SstUploadRequest` to use path providers instead of direct paths. - Files affected: `access_layer.rs`, `write_cache.rs`, `parquet.rs`, `writer.rs`. - Enhanced Indexer Management: - Replaced `IndexerBuilder` with `IndexerBuilderImpl` and made it async to support dynamic indexer creation. - Updated `ParquetWriter` to handle multiple indexers and file IDs. - Files affected: `index.rs`, `parquet.rs`, `writer.rs`. - Removed Redundant File ID Handling: - Removed `file_id` from `SstWriteRequest` and `CompactionOutput`. - Updated related logic to dynamically generate file IDs where necessary. - Files affected: `compaction.rs`, `flush.rs`, `picker.rs`, `twcs.rs`, `window.rs`. - Test Adjustments: - Updated tests to align with new path and indexer management. - Introduced `FixedPathProvider` and `NoopIndexBuilder` for testing purposes. - Files affected: `sst_util.rs`, `version_util.rs`, `parquet.rs`. * chore: rebase main * feat/multiple-compaction-output: ### Add Benchmarking and Refactor Compaction Logic - Benchmarking: Added a new benchmark `run_bench` in `Cargo.toml` and implemented benchmarks in `benches/run_bench.rs` using Criterion for `find_sorted_runs` and `reduce_runs` functions. - Compaction Module Enhancements: - Made `run.rs` public and refactored the `Ranged` and `Item` traits to be public. - Simplified the logic in `find_sorted_runs` and `reduce_runs` by removing `MergeItems` and related functions. - Introduced `find_overlapping_items` for identifying overlapping items. - Code Cleanup: Removed redundant code and tests related to `MergeItems` in `run.rs`. 
* feat/multiple-compaction-output: ### Enhance Compaction Logic and Add Benchmarks - Compaction Logic Improvements: - Updated `reduce_runs` function in `src/mito2/src/compaction/run.rs` to remove the target parameter and improve the logic for selecting files to merge based on minimum penalty. - Enhanced `find_overlapping_items` to handle unsorted inputs and improve overlap detection efficiency. - Benchmark Enhancements: - Added `bench_find_overlapping_items` in `src/mito2/benches/run_bench.rs` to benchmark the new `find_overlapping_items` function. - Extended existing benchmarks to include larger data sizes. - Testing Enhancements: - Updated tests in `src/mito2/src/compaction/run.rs` to reflect changes in `reduce_runs` and added new tests for `find_overlapping_items`. - Logging and Debugging: - Improved logging in `src/mito2/src/compaction/twcs.rs` to provide more detailed information about compaction decisions. * feat/multiple-compaction-output: ### Refactor and Enhance Compaction Logic - Refactor `find_overlapping_items` Function: Changed the function signature to accept slices instead of mutable vectors in `run.rs`. - Rename and Update Struct Fields: Renamed `penalty` to `size` in `SortedRun` struct and updated related logic in `run.rs`. - Enhance `reduce_runs` Function: Improved logic to sort runs by size and limit probe runs to 100 in `run.rs`. - Add `merge_seq_files` Function: Introduced a new function `merge_seq_files` in `run.rs` for merging sequential files. - Modify `TwcsPicker` Logic: Updated the compaction logic to use `merge_seq_files` when only one run is found in `twcs.rs`. - Remove `enforce_file_num` Function: Deleted the `enforce_file_num` function and its related test cases in `twcs.rs`. * feat/multiple-compaction-output: ### Enhance Compaction Logic and Testing - Add `merge_seq_files` Functionality: Implemented the `merge_seq_files` function in `run.rs` to optimize file merging based on scoring systems. 
Updated benchmarks in `run_bench.rs` to include `bench_merge_seq_files`. - Improve Compaction Strategy in `twcs.rs`: Modified the compaction logic to handle file merging more effectively, considering file size and overlap. - Update Tests: Enhanced test coverage in `compaction_test.rs` and `append_mode_test.rs` to validate new compaction logic and file merging strategies. - Remove Unused Function: Deleted `new_file_handles` from `test_util.rs` as it was no longer needed. * feat/multiple-compaction-output: ### Refactor TWCS Compaction Options - Refactor Compaction Logic: Simplified the TWCS compaction logic by replacing multiple parameters (`max_active_window_runs`, `max_active_window_files`, `max_inactive_window_runs`, `max_inactive_window_files`) with a single `trigger_file_num` parameter in `picker.rs`, `twcs.rs`, and `options.rs`. - Update Tests: Adjusted test cases to reflect the new compaction logic in `append_mode_test.rs`, `compaction_test.rs`, `filter_deleted_test.rs`, `merge_mode_test.rs`, and various test files under `tests/cases`. - Modify Engine Options: Updated engine option keys to use `trigger_file_num` in `mito_engine_options.rs` and `region_request.rs`. - Fuzz Testing: Updated fuzz test generators and translators to accommodate the new compaction parameter in `alter_expr.rs` and related files. This refactor aims to streamline the compaction configuration by reducing the number of parameters and simplifying the codebase. * chore: add trailing space * fix license header * feat/revise-compaction-picker: Limit File Processing and Optimize Merge Logic in `run.rs` - Introduced a limit to process a maximum of 100 files in `merge_seq_files` to control time complexity. - Adjusted logic to calculate `target_size` and iterate over files using the limited set of files. - Updated scoring calculations to use the limited file set, ensuring efficient file merging. 
* feat/revise-compaction-picker: ### Add Compaction Metrics and Remove Debug Logging - Compaction Metrics: Introduced new histograms `COMPACTION_INPUT_BYTES` and `COMPACTION_OUTPUT_BYTES` to track compaction input and output file sizes in `metrics.rs`. Updated `compactor.rs` to observe these metrics during the compaction process. - Logging Cleanup: Removed debug logging of file ranges during the merge process in `twcs.rs`. * feat/revise-compaction-picker: ## Enhance Compaction Logic and Metrics - Compaction Logic Improvements: - Added methods `input_file_size` and `output_file_size` to `MergeOutput` in `compactor.rs` to streamline file size calculations. - Updated `Compactor` implementation to use these methods for metrics tracking. - Modified `Ranged` trait logic in `run.rs` to improve range comparison. - Enhanced test cases in `run.rs` to reflect changes in compaction logic. - Metrics Enhancements: - Changed `COMPACTION_INPUT_BYTES` and `COMPACTION_OUTPUT_BYTES` from histograms to counters in `metrics.rs` for better performance tracking. - Debugging and Logging: - Added detailed logging for compaction pick results in `twcs.rs`. - Implemented custom `Debug` trait for `FileMeta` in `file.rs` to improve debugging output. - Testing Enhancements: - Added new test `test_compaction_overlapping_files` in `compaction_test.rs` to verify compaction behavior with overlapping files. - Updated `merge_mode_test.rs` to reflect changes in file handling during scans. * feat/revise-compaction-picker: ### Update `FileHandle` Debug Implementation - Refactor Debug Output: Simplified the `fmt::Debug` implementation for `FileHandle` in `src/mito2/src/sst/file.rs` by consolidating multiple fields into a single `meta` field using `meta_ref()`. - Atomic Operations: Updated the `deleted` field to use atomic loading with `Ordering::Relaxed`. 
* Trigger CI * feat/revise-compaction-picker: Update compaction logic and default options - `twcs.rs`: Enhanced logging for compaction pick results by improving the formatting for better readability. - `options.rs`: Modified the default `max_output_file_size` in `TwcsOptions` from 2GB to 512MB to optimize file handling and performance. * feat/revise-compaction-picker: Refactor `find_overlapping_items` to use an external result vector - Updated `find_overlapping_items` in `src/mito2/src/compaction/run.rs` to accept a mutable result vector instead of returning a new vector, improving memory efficiency. - Modified benchmarks in `src/mito2/benches/bench_compaction_picker.rs` to accommodate the new function signature. - Adjusted tests in `src/mito2/src/compaction/run.rs` to use the updated function signature, ensuring correct functionality with the new approach. * feat/revise-compaction-picker: Improve file merging logic in `run.rs` - Refactor the loop logic in `merge_seq_files` to simplify the iteration over file groups. - Adjust the range for `end_idx` to include the endpoint, allowing for more flexible group selection. - Remove the condition that skips groups with only one file, enabling more comprehensive processing of file sequences. * feat/revise-compaction-picker: Enhance `find_overlapping_items` with `SortedRun` and Update Tests - Refactor `find_overlapping_items` in `src/mito2/src/compaction/run.rs` to utilize the `SortedRun` struct for improved efficiency and clarity. - Introduce a `sorted` flag in `SortedRun` to optimize sorting operations. - Update test cases in `src/mito2/benches/bench_compaction_picker.rs` to accommodate changes in `find_overlapping_items` by using `SortedRun`. - Add `From<Vec<T>>` implementation for `SortedRun` to facilitate easy conversion from vectors. * feat/revise-compaction-picker: Enhancements in `compaction/run.rs`: - Added `ReadableSize` import to handle size calculations. 
- Modified the logic in `merge_seq_files` to clamp the calculated target size to a maximum of 2GB when `max_file_size` is not provided. * feat/revise-compaction-picker: Add Default Max Output Size Constant for Compaction Introduce DEFAULT_MAX_OUTPUT_SIZE constant to define the default maximum compaction output file size as 2GB. Refactor the merge_seq_files function to utilize this constant, ensuring consistent and maintainable code for handling file size limits during compaction.	2025-05-23 03:29:08 +00:00
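The compaction commit above refers to "sorted runs", that is, groups of files whose time ranges do not overlap. As an illustration only, the sketch below uses a common greedy assignment; it is not claimed to match the logic in `run.rs`. `FileRange` is a hypothetical type, and the function reuses the name `find_sorted_runs` purely for readability.

```rust
/// Inclusive time range of a hypothetical SST file.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FileRange {
    start: i64,
    end: i64,
}

/// Greedily partitions files into runs so that the files inside one run
/// never overlap. Files are visited in order of start time and placed in
/// the first run whose last file ends before the new file starts.
fn find_sorted_runs(mut files: Vec<FileRange>) -> Vec<Vec<FileRange>> {
    files.sort_by_key(|f| (f.start, f.end));
    let mut runs: Vec<Vec<FileRange>> = Vec::new();
    for file in files {
        let slot = runs
            .iter_mut()
            .find(|run| run.last().map_or(true, |last| last.end < file.start));
        match slot {
            Some(run) => run.push(file),
            // Every existing run overlaps this file, so open a new run.
            None => runs.push(vec![file]),
        }
    }
    runs
}
```

With this picture, a time window that yields a single run has no overlapping files, which is the case where the commit message says `TwcsPicker` switches to `merge_seq_files`.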
Lei, HUANG	5a0da5b6bb	fix: region worker stall metrics (#6149 ) fix/stall-metrics: Improve stalled request handling in `handle_write.rs` - Updated logic to account for both `write_requests` and `bulk_requests` when adjusting `stalled_count`. - Modified `reject_region_stalled_requests` and `handle_region_stalled_requests` to correctly subtract the combined length of `requests` and `bulk` from `stalled_count`.	2025-05-21 13:21:50 +00:00
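The accounting fix described above can be sketched as follows. `StalledRequests`, `Worker` and `drain_stalled` are hypothetical names; only the idea of subtracting the combined length of `requests` and `bulk` from `stalled_count` comes from the commit message.

```rust
/// Hypothetical container for requests parked while a region is stalled.
#[derive(Default)]
struct StalledRequests {
    /// Stand-in for row-oriented write requests.
    requests: Vec<String>,
    /// Stand-in for bulk insert requests.
    bulk: Vec<String>,
}

/// Hypothetical worker state that holds the stall gauge.
struct Worker {
    stalled_count: i64,
}

impl Worker {
    /// Removes a region's stalled requests from the gauge. Both queues are
    /// counted, so bulk requests no longer leave the gauge too high.
    fn drain_stalled(&mut self, stalled: StalledRequests) -> usize {
        let drained = stalled.requests.len() + stalled.bulk.len();
        self.stalled_count -= drained as i64;
        drained
    }
}
```

If only `requests.len()` were subtracted, every stalled bulk request would stay in the gauge permanently after being handled or rejected.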
Lei, HUANG	eaf7b4b9dd	chore: update flush failure metric name and update grafana dashboard (#6138) * 1. rename `greptime_mito_flush_errors_total` metric to `greptime_mito_flush_errors_total` for consistency 2. update grafana dashboard to add the following panels: - compaction input/output bytes - bulk insert handle elapsed time in frontend and region worker	2025-05-20 12:05:54 +00:00
Zhenchi	400229c384	feat: introduce index result cache (#6110 ) * feat: introduce index result cache Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * Update src/mito2/src/sst/index/inverted_index/applier/builder.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * optimize selector_len Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-05-20 01:45:42 +00:00
Weny Xu	864cc117b3	fix: append noop entry when auto topic creation is disabled (#6092 ) * feat: improve topic management and add stale records cleanup * fix: fix unit tests * chore: apply suggestions from CR * chore: apply suggestions from CR	2025-05-16 11:26:47 +00:00
Yingwen	0ea9ab385d	fix: clean files under the atomic write dir on failure (#6112 ) * fix: remove files under atomic dir on failure * fix: clean atomic dir on download failure * chore: update comment * fix: clean if failed to write without write cache * feat: add a TempFileCleaner to clean files on failure * chore: after merge fix * chore: more fix --------- Co-authored-by: discord9 <55937128+discord9@users.noreply.github.com> Co-authored-by: discord9 <discord9@163.com>	2025-05-16 11:18:11 +00:00
Yingwen	c7e9485534	feat: New scanner `SeriesScan` to scan by series for querying metrics (#5968 ) * chore: basic methods for SeriesScan * chore: add to scanner enum * feat: implement scan logic of each partition * feat: use series scan when distribution is PerSeries * refactor: remove per series scan from SeqScan * fix: use series scan in PerSeries distribution * feat: keep parallelize_scan unchanged * fix: address compiler errors * fix: include build merge reader cost to scan cost * feat: use smallvec * chore: update comment * Revert "feat: keep parallelize_scan unchanged" This reverts commit `96ba00d175`. * assign partition_ranges Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * feat: try send before send reduce the send timeout to 10ms * chore: add comments * fix: add metrics to partition metrics list * fix: correct scan cost metrics * chore: reset instant * fix: scanner metrics init * chore: display more info in explain * feat: metrics for send series timeout * style: fix clippy * refactor: use ChainedRecordBatchStream to simplify codes * chore: fix typos * feat: separate distributor metrics * feat: remove parallelize hack * chore: fix warning * test: add test for series scan * test: update sqlness test --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com> Co-authored-by: Ruihang Xia <waynestxia@gmail.com>	2025-05-16 08:53:24 +00:00
Ruihang Xia	57b53211d9	feat: don't hide atomic write dir (#6109 ) * feat: don't hidden atomic write dir Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * compatible code Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * Update src/mito2/src/access_layer.rs Co-authored-by: Yingwen <realevenyag@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com> Co-authored-by: Yingwen <realevenyag@gmail.com>	2025-05-16 06:21:13 +00:00
Lei, HUANG	5a9023d6b3	feat(bulk): write to multiple time partitions (#6086 ) * add benchmark for splitting according to time partition * feat/write-to-multiple-time-partitions: Enhancements to Bulk Processing and Time Partitioning - `part.rs`: Added `Snafu` to imports and introduced `timestamp_index` in `BulkPart` struct. Implemented `timestamps` method for accessing timestamp columns. - `simple_bulk_memtable.rs`: Updated tests to include `timestamp_index` initialization. - `time_partition.rs`: Enhanced `TimePartition` to support partial writes with `write_record_batch_partial`. Implemented `split_record_batch` for filtering records by timestamp range. Added comprehensive tests for `split_record_batch`. - `handle_bulk_insert.rs`: Modified to retrieve timestamp index and column together, updating `BulkPart` initialization with `timestamp_index`. * feat/write-to-multiple-time-partitions: ### Enhance Time Partitioning Logic - `time_partition.rs`: - Introduced `HashSet` for efficient partition management. - Refactored `write_bulk` to handle multiple partitions and added `find_partitions_by_time_range` for identifying existing and missing partitions. - Updated `get_or_create_time_partition` to manage partition creation. - Added comprehensive tests for partition finding logic, covering various scenarios including overlapping and non-overlapping time ranges. - Tests: - Added `test_find_partitions_by_time_range` to validate new partitioning logic. - Updated `test_split_record_batch` to ensure correct record batch splitting behavior. * feat/write-to-multiple-time-partitions: ### Enhance Time Partitioning and Testing in `time_partition.rs` - Time Partitioning Enhancements: - Updated `split_record_batch` to handle multiple timestamp units (`Second`, `Millisecond`, `Microsecond`, `Nanosecond`) by matching on `DataType`. - Improved filtering logic for timestamp arrays to support various time units. 
- Testing Enhancements: - Added `test_write_bulk` to verify writing across multiple partitions and scenarios in `time_partition.rs`. - Updated `test_split_record_batch` to use `TimestampMillisecondArray` for testing timestamp partitioning. - Imports and Dependencies: - Added necessary imports for new timestamp array types and testing utilities. * feat/write-to-multiple-time-partitions: ### Refactor and Enhance Time Partition Filtering - Refactor Filtering Logic: Consolidated the filtering logic for timestamp arrays using macros in `time_partition.rs` and `bench_filter_time_partition.rs`. This reduces code duplication and improves maintainability. - Enhance `BulkPart` Struct: Made fields in `BulkPart` public to facilitate easier access and manipulation in `memtable.rs` and `part.rs`. - Rename Function: Renamed `split_record_batch` to `filter_record_batch` for clarity in `time_partition.rs` and `bench_filter_time_partition.rs`. - Add Feature Flag: Introduced `int_roundings` feature in `lib.rs` to support new functionality. * refactor tests * feat/write-to-multiple-time-partitions: Improve timestamp handling in `time_partition.rs` - Enhanced safety comments for timestamp conversion to ensure clarity. - Modified logic to prevent overflow by using `div_euclid` for `bulk_start_sec` and `bulk_end_sec` calculations. - Adjusted the `filter_map` logic to correctly compute timestamps using `start_sec` and `part_duration_sec`. * feat/write-to-multiple-time-partitions: Refactor timestamp handling and add utility function - Refactor `time_partition.rs`: Simplified timestamp handling by replacing direct type access with a utility function to retrieve the timestamp unit. Improved error handling for timestamp conversion. - Enhance `metadata.rs`: Added `time_index_type` function to `RegionMetadata` to retrieve the timestamp type of the time index column, ensuring safer and more readable code. 
* feat/write-to-multiple-time-partitions: Refactor time partition variable names in `time_partition.rs` - Renamed variables for clarity: `bulk_start_sec` to `start_bucket` and `bulk_end_sec` to `end_bucket`. - Updated related logic to use new variable names for improved readability and maintainability. * feat/write-to-multiple-time-partitions: Refactor variable names in `time_partition.rs` - Updated variable names from `matching` and `missing` to `matchings` and `missings` for clarity and consistency. - Modified function calls and loop iterations to align with the new variable names. - Affected file: `src/mito2/src/memtable/time_partition.rs` * feat/write-to-multiple-time-partitions: ### Refactor variable names in `time_partition.rs` - Updated variable names for clarity in `time_partition.rs`: - Renamed `matchings` to `matching_parts` - Renamed `missings` to `missing_parts` - Adjusted logic to use new variable names in methods `find_partitions_by_time_range` and `write_record_batch`. * feat/write-to-multiple-time-partitions: ### Enhance Time Partition Handling - `time_partition.rs`: - Added `ArrayRef` to handle timestamp arrays, improving the partitioning logic by allowing more efficient timestamp range checks. - Enhanced `find_partitions_by_time_range` to support sparse data and handle different timestamp units (`Second`, `Millisecond`, `Microsecond`, `Nanosecond`). - Updated test cases to cover new scenarios, including sparse data and edge cases, ensuring robustness of partition handling. --------- Co-authored-by: Lei <lei@Leis-MacBook-Pro.local>	2025-05-14 05:09:59 +00:00
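The bucket arithmetic mentioned in the commit above (`div_euclid`, `start_bucket`, `end_bucket`, `part_duration_sec`) can be illustrated with a short sketch. The function `bucket_starts` and its signature are hypothetical. The point it shows is that `div_euclid` rounds toward negative infinity, so timestamps before the epoch land in the correct partition.

```rust
/// Returns the start second of every partition bucket touched by the
/// inclusive range `[min_ts_sec, max_ts_sec]`.
fn bucket_starts(min_ts_sec: i64, max_ts_sec: i64, part_duration_sec: i64) -> Vec<i64> {
    assert!(part_duration_sec > 0, "partition duration must be positive");
    assert!(min_ts_sec <= max_ts_sec, "range must not be empty");
    // Plain `/` truncates toward zero and would map -1 to bucket 0.
    let start_bucket = min_ts_sec.div_euclid(part_duration_sec);
    let end_bucket = max_ts_sec.div_euclid(part_duration_sec);
    (start_bucket..=end_bucket)
        // Drop buckets whose start would overflow i64 instead of panicking.
        .filter_map(|bucket| bucket.checked_mul(part_duration_sec))
        .collect()
}
```

For example, `bucket_starts(-1, 25, 10)` yields the starts `-10, 0, 10, 20`. A bulk write could compare that list against the partitions that already exist to decide which ones are matching and which are missing.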
Ruihang Xia	bbb6f8685e	feat: implement commutativity rule for prom-related plans (#5875 ) * feat: implement commutativity rule for prom-related plans Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix range manipulate deserializer Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * blocklist in commutativity rule Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * change dictionary type Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * handle partition and ordering Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix clippy Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update tests Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * add rate, increase and delta Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update sqlness result Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * regexp_replace uses empty string instead of null value Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update sqlness result Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update sqlness result Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update sqlness result again Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2025-05-13 09:06:25 +00:00
Yingwen	ca1641d1c4	feat: implement PlainBatch struct (#6079 ) * feat: implement PlainBatch struct * chore: typo * style: fix clippy * feat: assert num columns	2025-05-13 05:56:12 +00:00
Zhenchi	36d9346ffc	refactor: introduce row group selection (#6075 ) Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-05-12 07:15:17 +00:00
discord9	6ab0f0cc5c	fix: alter table modify type should also modify default value (#6049 ) * fix: select after alter * fix: insert a proper row&catch a bug * fix: alter table modify type modify default value type too * refactor: per review * chore: per review * refactor: per review * refactor: per review	2025-05-09 03:40:59 +00:00
Lei, HUANG	8685ceb232	feat: impl bulk memtable and bridge bulk inserts (#6054 ) * feat/bridge-bulk-insert: ## Implement Bulk Insert and Update Dependencies - Bulk Insert Implementation: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`. - Dependency Updates: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`. - gRPC Enhancements: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`. - Error Handling: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data. - Miscellaneous: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields. * feat/bridge-bulk-insert: - Update `greptime-proto` Dependency: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`. - Refactor gRPC Query Handling: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling. - Enhance Bulk Insert Logic: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity. - Add `common-grpc` Dependency: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities. * fix: clippy * fix schema serialization * feat/bridge-bulk-insert: Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs` - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`. 
- Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`. - Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations. * fix: test * refactor: rename * allow empty app_metadata in FlightData * feat/bridge-bulk-insert: - Remove Logging: Removed unnecessary logging of affected rows in `region_server.rs`. - Error Handling Enhancement: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path. - Error Enum Cleanup: Removed unused `Arrow` error variant from `error.rs`. * fix: standalone test * feat/bridge-bulk-insert: ### Enhance Bulk Insert Handling and Metadata Management - `lib.rs`: Enabled the `result_flattening` feature for improved error handling. - `request.rs`: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility. - `handle_bulk_insert.rs`: - Added `handle_record_batch` function to streamline processing of bulk insert payloads. - Improved error handling and task management for bulk insert operations. - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access. * feat/bridge-bulk-insert: - Refactor `handle_bulk_insert.rs`: - Replaced `handle_record_batch` with `handle_payload` for handling payloads. - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution. - Optimize `multi_dim.rs`: - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`. * feat/bridge-bulk-insert: - Update `greptime-proto` Dependency: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`. - Optimize Memory Allocation: Increased initial and builder capacities in `time_series.rs` to improve performance. 
- Enhance Data Handling: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling. - Improve Bulk Insert Logic: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering. - String Handling Improvement: Updated string conversion in `helper.rs` for better performance. * fix: clippy warnings * feat/bridge-bulk-insert: Add Metrics and Improve Error Handling - Metrics Enhancements: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance. - Error Handling Improvements: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns. - Dependency Updates: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support. - Code Refactoring: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability. * chore: rebase main * implement simple bulk memtable * impl write_bulk * implement simple bulk memtable * feat/simple-bulk-memtable: ### Enhance Time-Series Memtable and Bulk Insert Handling - Visibility Modifications: Made `mutable_array` in `PrimitiveVectorBuilder` and `StringVectorBuilder` public in `primitive.rs` and `string.rs`. - New Module: Added `builder.rs` to `memtable` for time-series builders, including `FieldBuilder` and `StringBuilder` implementations. - Bulk Insert Enhancements: - Added `sequence` field to `BulkPart` in `part.rs` and updated its handling in `simple_bulk_memtable.rs` and `region_write_ctx.rs`. - Introduced metrics for bulk insert operations in `metrics.rs` and `bulk_insert.rs`. - Performance Metrics: Added timing metrics for write operations in `metrics.rs`, `region_write_ctx.rs`, and `handle_write.rs`. 
- Region Request Handling: Updated `make_region_bulk_inserts` in `region_request.rs` to include performance metrics. * feat/simple-bulk-memtable: Improve Memtable Stats Calculation and Add Metrics Timer - `simple_bulk_memtable.rs`: Refactored `stats` method to use `num_rows` for checking if rows have been written, improving accuracy in memory table statistics. - `handle_bulk_insert.rs`: Introduced a metrics timer to measure the elapsed time for processing bulk requests, enhancing performance monitoring. * feat/simple-bulk-memtable: ### Commit Message Enhancements and Bug Fixes - Dependency Update: Updated `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`. - Feature Addition: Implemented `to_mutation` method in `BulkPart` to convert `BulkPart` to `Mutation` for fallback `write_bulk` implementation in `src/mito2/src/memtable/bulk/part.rs`. - Functionality Improvement: Modified `write_bulk` method in `TimeSeriesMemtable` to support default implementation fallback to row iteration in `src/mito2/src/memtable/time_series.rs`. - Performance Optimization: Enhanced `bulk_insert` handling by optimizing region request processing and data partitioning in `src/operator/src/bulk_insert.rs`. - Error Handling: Added `ComputeArrow` error variant for better error management in `src/operator/src/error.rs`. - Code Refactoring: Simplified region bulk insert request processing in `src/store-api/src/region_request.rs`. * fix: some clippy warnings * feat/simple-bulk-memtable: ### Commit Summary - Refactor Return Types to `Result`: Updated the return type of the `ranges` method in `memtable.rs`, `bulk.rs`, `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `memtable_util.rs` to return `Result<MemtableRanges>` for better error handling. - Enhance Metrics Tracking: Improved metrics tracking by adding `num_rows` and `max_sequence` to `WriteMetrics` in `stats.rs`. 
Updated related methods in `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `scan_region.rs` to utilize these metrics. - Remove Unused Imports: Cleaned up unused imports in `time_series.rs` to streamline the codebase. * merge main * remove useless error vairant * use newer version of proto * feat/simple-bulk-memtable: Commit Message Summary Enhance FieldBuilder and StringBuilder functionality, add tests, and improve error handling. Key Changes • builder.rs: • Added documentation for FieldBuilder methods. • Renamed append_string_vector to append_vector in StringBuilder. • simple_bulk_memtable.rs: • Added new test cases for write_one, write_bulk, is_empty, stats, fork, and sequence_filter. • time_series.rs: • Improved error handling in ValueBuilder for type mismatches. • memtable_util.rs: • Removed unused imports and streamlined code. These changes enhance the robustness and test coverage of the memtable components. * feat/simple-bulk-memtable: Improve Time Partition Matching Logic in `time_partition.rs` - Enhanced the `write_bulk` method in `time_partition.rs` to improve the logic for matching partitions based on time ranges. - Introduced a new mechanism to filter and select partitions that overlap with the record batch's timestamp range before writing. * feat/simple-bulk-memtable: Improve Metrics Handling in `bulk_insert.rs` - Removed the `group_request_timer` and its associated metric observation to streamline the timing logic. - Moved the `BULK_REQUEST_ROWS` metric observation to occur after filtering, ensuring accurate row count metrics. * feat/simple-bulk-memtable: Enhance Stalled Requests Calculation and Update Metrics - `worker.rs`: Updated the `stalled_count` method to include both `reqs` and `bulk_reqs` in the calculation of stalled requests. - `bulk_insert.rs`: Removed duplicate observation of `BULK_REQUEST_MESSAGE_SIZE` metric. 
- `metrics.rs`: Changed the bucket strategy for `BULK_REQUEST_ROWS` from linear to exponential, improving the granularity of metrics collection. * feat/simple-bulk-memtable: Refactor `StringVector` Usage and Update Method Signatures - `src/datatypes/src/vectors/string.rs`: Changed `StringVector`'s `array` field from public to private. - `src/mito2/src/memtable/builder.rs`: Refactored `append_vector` method to `append_array`, updating its usage to work directly with `StringArray` instead of `StringVector`. - `src/mito2/src/memtable/time_series.rs`: Updated `ValueBuilder` to handle `StringArray` directly, replacing `StringVector` usage with `StringArray` in the `FieldBuilder::String` case. * feat/simple-bulk-memtable: - Refactor `PrimitiveVectorBuilder`: Made `mutable_array` private in `src/datatypes/src/vectors/primitive.rs`. - Optimize `ValueBuilder`: Replaced `UInt64VectorBuilder` and `UInt8VectorBuilder` with `Vec<u64>` and `Vec<u8>` for `sequence` and `op_type` in `src/mito2/src/memtable/time_series.rs`. - Improve Metrics Initialization: Updated histogram bucket initialization to use `exponential_buckets` in `src/mito2/src/metrics.rs`. * feat/simple-bulk-memtable: Improve error handling in `simple_bulk_memtable.rs` and `time_series.rs` - Enhanced error handling by using `OptionExt` for more concise error context management in `simple_bulk_memtable.rs` and `time_series.rs`. - Replaced `ok_or` with `with_context` to streamline error context creation in both files. * feat/simple-bulk-memtable: Enhance Time Partition Handling in `time_partition.rs` - Introduced `create_time_partition` function to streamline the creation of new time partitions, ensuring thread safety by acquiring a lock. - Modified logic to handle cases where no matching time partitions exist, creating new partitions as needed. - Updated `write_record_batch` and `write_one` methods to utilize the new partition creation logic, improving partition management and data writing efficiency. 
* replace proto * feat/simple-bulk-memtable: Update `metrics.rs` to adjust the range of exponential buckets for bulk insert message rows from `10 ~ 1_000_000` to `10 ~ 100_000`.	2025-05-09 02:56:09 +00:00
LFC	e787007eb5	feat: scan with sst minimal sequence (#6051 ) * feat: scan with sst minimal sequence * Update src/store-api/src/storage/requests.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * update proto --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-05-08 01:34:51 +00:00
Weny Xu	df31f0b9ec	fix: improve region migration error handling and optimize leader downgrade with lease check (#6026 ) * fix(meta): improve region migration error handling and lease management * chore: refine comments * chore: apply suggestions from CR * chore: apply suggestions from CR * feat: consume opening_region_guard	2025-05-07 00:54:35 +00:00
Lei, HUANG	f298a110f9	feat: bridge bulk insert (#5927 ) * feat/bridge-bulk-insert: ## Implement Bulk Insert and Update Dependencies - Bulk Insert Implementation: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`. - Dependency Updates: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`. - gRPC Enhancements: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`. - Error Handling: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data. - Miscellaneous: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields. * feat/bridge-bulk-insert: - Update `greptime-proto` Dependency: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`. - Refactor gRPC Query Handling: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling. - Enhance Bulk Insert Logic: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity. - Add `common-grpc` Dependency: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities. * fix: clippy * fix schema serialization * feat/bridge-bulk-insert: Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs` - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`. - Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`. 
- Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations. * fix: test * refactor: rename * allow empty app_metadata in FlightData * feat/bridge-bulk-insert: - Remove Logging: Removed unnecessary logging of affected rows in `region_server.rs`. - Error Handling Enhancement: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path. - Error Enum Cleanup: Removed unused `Arrow` error variant from `error.rs`. * fix: standalone test * feat/bridge-bulk-insert: ### Enhance Bulk Insert Handling and Metadata Management - `lib.rs`: Enabled the `result_flattening` feature for improved error handling. - `request.rs`: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility. - `handle_bulk_insert.rs`: - Added `handle_record_batch` function to streamline processing of bulk insert payloads. - Improved error handling and task management for bulk insert operations. - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access. * feat/bridge-bulk-insert: - Refactor `handle_bulk_insert.rs`: - Replaced `handle_record_batch` with `handle_payload` for handling payloads. - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution. - Optimize `multi_dim.rs`: - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`. * feat/bridge-bulk-insert: - Update `greptime-proto` Dependency: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`. - Optimize Memory Allocation: Increased initial and builder capacities in `time_series.rs` to improve performance. - Enhance Data Handling: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling. 
- Improve Bulk Insert Logic: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering. - String Handling Improvement: Updated string conversion in `helper.rs` for better performance. * fix: clippy warnings * feat/bridge-bulk-insert: Add Metrics and Improve Error Handling - Metrics Enhancements: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance. - Error Handling Improvements: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns. - Dependency Updates: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support. - Code Refactoring: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability. * chore: rebase main * chore: merge main	2025-05-06 09:53:25 +00:00
Yingwen	86aae6733d	fix: prune primary key with multiple columns may use default value as statistics (#5996 ) * test: incorrect test result when filtering pk with multiple columns * fix: prune non first tag correctly Distinguish no column and no stats and only use default value when no column * test: update test result * refactor: rename test file * test: add test for null filter * fix: use StatValues for null counts * test: drop table * test: fix unstable flow test	2025-04-28 04:53:30 +00:00
Lei, HUANG	1a517ec8ac	fix: check if memtable is empty by stats (#5989 ) fix/checking-memtable-empty-and-stats: - Refactor timestamp updates: Simplified timestamp range updates in `PartitionTreeMemtable` and `TimeSeriesMemtable` by replacing `update_timestamp_range` with `fetch_max` and `fetch_min` methods for `max_timestamp` and `min_timestamp`. - Affected files: `partition_tree.rs`, `time_series.rs` - Remove unused code: Deleted the `update_timestamp_range` method from `WriteMetrics` and removed unnecessary imports. - Affected file: `stats.rs` - Optimize memtable filtering: Streamlined the check for empty memtables in `ScanRegion` by directly using `time_range`. - Affected file: `scan_region.rs`	2025-04-28 01:57:17 +00:00
shuiyisong	3c943be189	chore: update rust toolchain (#5818 ) * chore: update nightly version * chore: sort lint lines * chore: minor fix * chore: update nix * chore: update toolchain to 2024-04-14 * chore: update toolchain to 2024-04-15 * chore: remove unnecessary test * chore: do not assert oid in sqlness test * chore: fix margin issue * chore: fix cr issues * chore: fix cr issues --------- Co-authored-by: Ning Sun <sunning@greptime.com>	2025-04-27 09:02:36 +00:00
Weny Xu	55cadcd2c0	feat: introduce flush metadata region task for metric engine (#5951 ) * feat: introduce flush metadata region task for metric engine * docs: generate config.md * chore: add header * test: fix unit test * fix: fix unit tests * chore: apply suggestions from CR * chore: remove docs * fix: fix unit tests	2025-04-23 04:51:22 +00:00
Yingwen	56f319a707	fix: filter doesn't consider default values after schema change (#5912 ) * test: sqlness test case * feat: use correct default while pruning row groups * fix: consider default in SimpleFilterContext * test: update sqlness test * test: add order by	2025-04-21 06:32:26 +00:00
Yuhan Wang	41814bb49f	feat: introduce `high_watermark` for remote wal logstore (#5877 ) * feat: introduce high_watermark_since_flush * test: add unit test for high watermark * refactor: submit a request instead * fix: send reply before submit request * fix: no need to update twice * feat: update high watermark in background periodically * test: update unit tests * fix: update high watermark periodically * test: update unit tests * chore: apply review comments * chore: rename * chore: apply review comments * chore: clean up * chore: apply review comments	2025-04-18 12:10:47 +00:00
Weny Xu	b8c6f1c8ed	feat: sync region followers after altering regions (#5901 ) * feat: close follower regions after dropping leader regions * chore: upgrade greptime-proto * feat: sync region followers after alter region operations * test: add tests * chore: apply suggestions from CR * chore: apply suggestions from CR	2025-04-18 10:21:35 +00:00
Lei, HUANG	799c7cbfa9	feat(mito): bulk insert request handling on datanode (#5831 ) * wip: implement basic request handling * feat/bulk-insert: ### Add Error Handling and Enhance Bulk Insert Functionality - Error Handling: Introduced a new error variant `ConvertDataType` in `error.rs` to handle conversion failures from `ConcreteDataType` to `ColumnDataType`. - Bulk Insert Enhancements: - Updated `WorkerRequest::BulkInserts` in `request.rs` to include metadata and sender. - Implemented `handle_bulk_inserts` in `worker.rs` to process bulk insert requests with region metadata. - Added functions `region_metadata_to_column_schema` and `record_batch_to_rows` in `handle_bulk_insert.rs` for schema conversion and row processing. - API Changes: Modified `RegionBulkInsertsRequest` in `region_request.rs` to include `region_id`. Files affected: `error.rs`, `request.rs`, `worker.rs`, `handle_bulk_insert.rs`, `region_request.rs`. * feat/bulk-insert: Enhance Error Handling and Add Unit Tests - Improved error handling in `record_batch_to_rows` function within `handle_bulk_insert.rs` by returning `Result` and handling errors with `context`. - Added unit tests for `region_metadata_to_column_schema` and `record_batch_to_rows` functions in `handle_bulk_insert.rs` to ensure correct functionality and error handling. * chore: update proto version * feat/bulk-insert: - Refactor Error Handling: Updated error handling in `error.rs` by modifying the `ConvertDataType` error handling. - Improve Logging and Error Reporting: Enhanced logging and error reporting in `worker.rs` by adding error messages for missing region metadata. - Add New Error Type: Introduced `DecodeArrowIpc` error in `metadata.rs` to handle Arrow IPC decoding failures. - Handle Arrow IPC Decoding: Updated `region_request.rs` to handle Arrow IPC decoding errors using the new `DecodeArrowIpc` error type. 
* chore: update proto version * feat/bulk-insert: Refactor `handle_bulk_insert.rs` to simplify row construction - Removed the mutable `current_row` vector and refactored `row_at` function to return a new vector directly. - Updated `record_batch_to_rows` to utilize the refactored `row_at` function for constructing rows. * feat/bulk-insert: ### Commit Summary Enhancements in Region Server Request Handling - Updated `region_server.rs` to include `RegionRequest::BulkInserts(_)` in the `RegionChange::Ingest` category, improving the handling of bulk insert operations. - Refined the categorization of region requests to ensure accurate mapping to `RegionChange` actions.	2025-04-15 14:11:50 +00:00
Lei, HUANG	6a50d71920	fix: memtable panic (#5894 ) * fix: memtable panic * fix: ci	2025-04-14 13:15:56 +00:00
Weny Xu	c522893552	fix: ensure logical regions are synced during region sync (#5878 ) * fix: ensure logical regions are synced during region sync * chore: apply suggestions from CR * chore: apply suggestions from CR	2025-04-14 12:37:31 +00:00
Zhenchi	e3675494b4	feat: apply terms with fulltext bloom backend (#5884 ) * feat: apply terms with fulltext bloom backend Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * perf: preload jieba Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * polish doc Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-04-14 07:08:59 +00:00
Weny Xu	5a36fa5e18	fix: always rejects write while downgrading region (#5842 ) * fix: always rejects write while downgrading region * chore: apply suggestions from CR	2025-04-11 06:42:41 +00:00
Zhenchi	dce5e35d7c	feat: apply terms with fulltext tantivy backend (#5869 ) * feat: apply terms with fulltext tantivy backend Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix test Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-04-10 07:32:15 +00:00
Zhenchi	6c66ec3ffc	refactor: abstract index source from fulltext index applier (#5845 ) * feat: add term as fulltext index request Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix fmt Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * refactor: abstract index source from fulltext index applier Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-04-09 04:27:41 +00:00
LFC	311727939d	chore: update datafusion family (#5814 )	2025-04-09 02:20:55 +00:00
Ruihang Xia	c26e165887	refactor: check and fix super import (#5846 ) * refactor: check and fix super import Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * add to makefile Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * change dir Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2025-04-08 11:48:52 +00:00
Zhenchi	7335293983	feat: add term as fulltext index request (#5843 ) * feat: add term as fulltext index request Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix fmt Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-04-08 11:19:32 +00:00
Weny Xu	981d51785b	fix: throw errors instead of ignoring (#5792 ) * fix: throw errors instead of ignoring * fix: fix unit tests * refactor: remove schema version check * fix: fix clippy * chore: remove unused error * refactor: remove schema version check * feat: handle multiple results * feat: introduce consistency guard * fix: release consistency guard on datanode operation completion * test: add tests * chore: remove schema version * refactor: rename * test: add more tests * chore: print all error * tests: query table after alteration * log ignored request * refine fuzz test * chore: fix clippy and log mailbox message * chore: close prepared statement after execution * chore: add comment * chore: remove log * chore: rename to `ConsistencyPoison` * chore: remove unused error * fix: fix unit tests * chore: apply suggestions from CR	2025-04-07 13:51:00 +00:00
Weny Xu	eab702cc02	feat: implement `sync_region` for metric engine (#5826 ) * feat: implement `sync_region` for metric engine * chore: apply suggestions from CR * chore: upgrade proto	2025-04-03 12:46:20 +00:00
Zhenchi	f797de3497	feat: add backend field to fulltext options (#5806 ) * feat: add backend field to fulltext options Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * update proto Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix option conv Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix display Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * polish Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-04-02 09:15:54 +00:00
Weny Xu	dbb79c9671	feat: introduce `CollectLeaderRegionHandler` (#5811 ) * feat: introduce `CollectLeaderRegionHandler` * feat: add to default handler group * fix: correct unit test * chore: rename	2025-04-02 04:47:00 +00:00
Zhenchi	aa486db8b7	refactor: allow bloom filter search to apply `and` conjunction (#5770 ) * refactor: change bloom filter search from any to all match Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * polish Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * place back in list Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * nit Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-04-01 12:50:34 +00:00
Weny Xu	d701c18150	feat: introduce `CustomizedRegionLeaseRenewer` (#5762 ) * feat: add manifest_version to `GrantedRegion` * chore: upgrade proto * chore: apply review suggestions * chore: apply suggestions from CR * feat: introduce `CustomizedRegionLeaseRenewerRef` * chore: upgrade to `103948`	2025-03-31 13:25:05 +00:00
Weny Xu	d3a60d8821	feat: add limit for the number of running procedures (#5793 ) * refactor: remove unused `messages` * feat: introduce running procedure num limit * feat: update config * chore: apply suggestions from CR * feat: impl `status_code` for `log-store` crate	2025-03-31 06:14:21 +00:00
Weny Xu	41aee1f1b7	feat: implement `sync_region` for mito engine (#5765 ) * chore: upgrade proto to `2d52b` * feat: add `SyncRegion` to `WorkerRequest` * feat: impl `sync_region` for `Engine` trait * test: add tests * chore: fmt code * chore: upgrade proto * chore: unify `RegionLeaderState` and `RegionFollowerState` * chore: check immutable memtable * chore: fix clippy * chore: apply suggestions from CR	2025-03-31 03:53:47 +00:00
LFC	a9e990768d	refactor: skip re-taking arrays in memtable if possible (#5779 ) experiment: skip sorting and re-taking arrays if possible when scanning memtable	2025-03-28 09:58:55 +00:00
Yingwen	dbc25dd8da	feat: expose scanner metrics to df execution metrics (#5699 ) * feat: add metrics list to scanner * chore: add report metrics method * feat: use df metrics in PartitionMetrics * feat: pass execution metrics to scan partition * refactor: remove PartitionMetricsList * feat: better debug format for ScanMetricsSet * feat: do not expose all metrics to execution metrics by default * refactor: use struct destruction * feat: add metrics list to scanner * chore: Add custom Debug for ScanMetricsSet and partition metrics display * test: update sqlness result	2025-03-27 23:40:39 +00:00
fys	2b2ea5bf72	chore: upgrade some dependencies (#5777 ) * chore: upgrade some dependencies * chore: upgrade some dependencies * fix: cr * fix: ci * fix: test * fix: cargo fmt	2025-03-27 02:48:44 +00:00
Lei, HUANG	40b52f3b13	feat(mito): allow skipping wal while creating tables (#5740 ) * chore: add Noop Wal option * remove: WalOptionsAllocator::alloc method * feat/no-op-wal: ### Add Noop WAL Option - `engine.rs`, `opener.rs`, `wal.rs`, `entry_reader.rs`, `handle_write.rs`, `provider.rs`: - Introduced a new `WalOptions::Noop` variant to handle scenarios where no write-ahead logging is required. - Implemented `NoopEntryReader` to provide a no-operation entry reader. - Updated logic to skip WAL operations for regions with `Noop` option. - Added `Provider::Noop` to handle `Noop` operations in the provider logic. * feat/no-op-wal: ### Add `skip_wal` Option to Table Metadata - Enhancements in `table_meta.rs`: - Added a `skip_wal` parameter to the `create_wal_options` function to allow skipping WAL writes. - Updated the `create_table_route` function to utilize the `skip_wal` option from `table_info.meta.options`. - Updates in `wal_options_allocator.rs`: - Modified `alloc_batch` to handle the `skip_wal` flag, setting WAL options to `Noop` when true. - Added a test case `test_allocator_with_skip_wal` to verify the `skip_wal` functionality. - Changes in `requests.rs`: - Introduced `skip_wal` in `TableOptions` and added parsing logic. - Updated `TableOptions` display to include `skip_wal`. These changes introduce the ability to skip WAL writes for tables, enhancing flexibility in table metadata management. * feat/no-op-wal: Add WAL Option Handling and Table Option Validation - `handle_write.rs`: Introduced a check for `WalOptions::Noop` in the `RegionWorkerLoop` to skip WAL writing for regions with this option. - `requests.rs`: Added `SKIP_WAL_KEY` to the list of valid table options for enhanced table configuration validation. * feat/no-op-wal: ### Update WAL Options Allocation - `key.rs`: Modified the `allocate_region_wal_options` function to include an additional boolean parameter, enhancing the allocation logic. 
- `wal_options_allocator.rs`: Simplified the `test_allocator_with_skip_wal` test by removing unnecessary variable declarations and directly using `WalOptionsAllocator::RaftEngine`. These changes improve the flexibility and efficiency of WAL options allocation in the system. * chore: reformat code * feat/no-op-wal: Enhancement: Conditional Addition of `SKIP_WAL_KEY` in `requests.rs` - Updated `TableOptions` implementation in `requests.rs` to conditionally add `SKIP_WAL_KEY` to `key_vals` only when `self.skip_wal` is true, optimizing the key-value pair generation. * feat/no-op-wal: Update `requests.rs` tests to reflect changes in `skip_wal` option - Modified test assertions in `requests.rs` to remove `skip_wal=false` from expected strings. - Added a new test case to verify `skip_wal=true` is correctly represented in `TableOptions`. * feat/no-op-wal: Add Debug Logging and Improve Error Handling for WAL and Table Options • Introduced debug logging in wal.rs to skip obsolete regions, enhancing traceability. • Improved error handling in requests.rs by replacing warn with error propagation for invalid skip_wal values. • Added new test cases for skip_wal functionality, including SQL scripts and expected results, to ensure correct behavior and validation of the changes.	2025-03-26 07:53:52 +00:00
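The `skip_wal` handling described in commit 40b52f3b13 (parse the option, propagate an error for invalid values, and emit the key only when the flag is true) can be sketched roughly as follows. This is a minimal illustration, not the code in `requests.rs`: the `TableOptions` struct here is a stripped-down stand-in, the value `"skip_wal"` for `SKIP_WAL_KEY` is assumed from the `skip_wal=true` display string mentioned in the log, and `try_from_kvs` / `to_kvs` are hypothetical names.

```rust
use std::collections::HashMap;

/// Assumed key name; the log only names the constant `SKIP_WAL_KEY`.
const SKIP_WAL_KEY: &str = "skip_wal";

/// Stripped-down stand-in for the real `TableOptions` (hypothetical).
#[derive(Debug, Default, PartialEq)]
struct TableOptions {
    skip_wal: bool,
}

impl TableOptions {
    /// Parses options from key-value pairs. An invalid `skip_wal` value
    /// is returned as an error rather than logged and ignored.
    fn try_from_kvs(kvs: &HashMap<String, String>) -> Result<Self, String> {
        let skip_wal = match kvs.get(SKIP_WAL_KEY) {
            None => false,
            Some(raw) => raw
                .parse::<bool>()
                .map_err(|e| format!("invalid {} value {:?}: {}", SKIP_WAL_KEY, raw, e))?,
        };
        Ok(Self { skip_wal })
    }

    /// Serializes options back to key-value pairs. The key is added only
    /// when the flag is true, so the default never appears as `skip_wal=false`.
    fn to_kvs(&self) -> HashMap<String, String> {
        let mut kvs = HashMap::new();
        if self.skip_wal {
            kvs.insert(SKIP_WAL_KEY.to_string(), "true".to_string());
        }
        kvs
    }
}

fn main() {
    let mut kvs = HashMap::new();
    kvs.insert(SKIP_WAL_KEY.to_string(), "true".to_string());
    let opts = TableOptions::try_from_kvs(&kvs).expect("valid option");
    println!("{:?} -> {:?}", opts, opts.to_kvs());
}
```

Omitting the key for the default value keeps serialized options of existing tables unchanged, which matches the test update in the log that removed `skip_wal=false` from the expected strings.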


581 Commits
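Commit 1a517ec8ac in the list above replaces `update_timestamp_range` with `fetch_max` and `fetch_min` on `max_timestamp` and `min_timestamp`, and checks for an empty memtable through `time_range`. A minimal sketch of that pattern, assuming the fields are `AtomicI64` values initialized to sentinel extremes (the `TimestampRange` type, the `observe` method, and the sentinel choice are illustrative assumptions, not the mito2 code):

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical holder for a memtable's timestamp bounds.
struct TimestampRange {
    min_timestamp: AtomicI64,
    max_timestamp: AtomicI64,
}

impl TimestampRange {
    /// Starts with inverted sentinels so that "nothing written" is detectable.
    fn new() -> Self {
        Self {
            min_timestamp: AtomicI64::new(i64::MAX),
            max_timestamp: AtomicI64::new(i64::MIN),
        }
    }

    /// Records one written timestamp without a separate read-compare-write step.
    fn observe(&self, ts: i64) {
        self.min_timestamp.fetch_min(ts, Ordering::Relaxed);
        self.max_timestamp.fetch_max(ts, Ordering::Relaxed);
    }

    /// Returns `None` while no timestamp has been observed, which lets a
    /// caller skip the memtable as empty.
    fn time_range(&self) -> Option<(i64, i64)> {
        let min = self.min_timestamp.load(Ordering::Relaxed);
        let max = self.max_timestamp.load(Ordering::Relaxed);
        if min > max {
            None
        } else {
            Some((min, max))
        }
    }
}

fn main() {
    let range = TimestampRange::new();
    println!("before writes: {:?}", range.time_range());
    for ts in [30, 10, 20] {
        range.observe(ts);
    }
    println!("after writes: {:?}", range.time_range());
}
```

`Ordering::Relaxed` is used here only because the sketch tracks independent counters; whether the real code needs stronger ordering depends on how readers synchronize with writers.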