* fix(mito2): handle corner case in catchup where compacted entry id exceeds region last entry id
Signed-off-by: WenyXu <wenymedia@gmail.com>
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com>
---------
Signed-off-by: WenyXu <wenymedia@gmail.com>
* feat/bulk-wal:
### Refactor: Simplify Data Handling in LogStore Implementations
- **`kafka/log_store.rs`, `raft_engine/log_store.rs`, `wal.rs`, `raw_entry_reader.rs`, `logstore.rs`:**
- Refactored `entry` and `build_entry` functions to accept `Vec<u8>` directly instead of `&mut Vec<u8>`.
- Removed usage of `std::mem::take` for data handling, simplifying the code and improving readability.
- Updated test cases to align with the new function signatures.
* feat/bulk-wal:
### Add Support for Bulk WAL Entries and Flight Data Encoding
- **Add `raw_data` field to `BulkPart` and related structs**: Updated `BulkPart` and related structures in `src/mito2/src/memtable/bulk/part.rs`, `src/mito2/src/memtable/simple_bulk_memtable.rs`, `src/mito2/src/memtable/time_partition.rs`, `src/mito2/src/region_write_ctx.rs`,
`src/mito2/src/worker/handle_bulk_insert.rs`, and `src/store-api/src/region_request.rs` to include a new `raw_data` field for handling Arrow IPC data.
- **Implement Flight Data Encoding**: Added a new module `flight` in `src/common/test-util/src/flight.rs` to encode record batches to Flight data format.
- **Update `greptime-proto` dependency**: Changed the revision of the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml`.
- **Enhance WAL Writer and Tests**: Modified `src/mito2/src/wal.rs` and related test files to support bulk WAL entries and added tests for encoding and handling bulk data.
* feat/bulk-wal:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a dependency in `Cargo.lock` and `src/mito2/Cargo.toml`.
- **Refactor `BulkPart` Structure**: Removed `num_rows` field and added `num_rows()` method in `src/mito2/src/memtable/bulk/part.rs`. Updated related usages in `src/mito2/src/memtable/simple_bulk_memtable.rs`, `src/mito2/src/memtable/time_partition.rs`, `src/mito2/src/memtable/time_series.rs`,
`src/mito2/src/region_write_ctx.rs`, and `src/mito2/src/worker/handle_bulk_insert.rs`.
- **Implement `TryFrom` and `From` for `BulkWalEntry`**: Added implementations for converting between `BulkPart` and `BulkWalEntry` in `src/mito2/src/memtable/bulk/part.rs`.
- **Handle Bulk Entries in Region Opener**: Added logic to process bulk entries in `src/mito2/src/region/opener.rs`.
- **Fix `BulkInsertRequest` Handling**: Corrected `region_id` handling in `src/operator/src/bulk_insert.rs` and `src/store-api/src/region_request.rs`.
- **Add Error Variant for `ConvertBulkWalEntry`**: Added a new error variant in `src/mito2/src/error.rs` for handling bulk WAL entry conversion errors.
* fix: ci
* feat/bulk-wal:
Add bulk write operation in `opener.rs`
- Enhanced the region write context by adding a call to `write_bulk()` after `write_memtable()` in `opener.rs`.
- This change aims to improve the efficiency of writing operations by enabling bulk writes.
* feat/bulk-wal:
Enhance error handling and metrics in `bulk_insert.rs`
- Updated `Inserter` to improve error handling by capturing the result of `datanode.handle(request)` and incrementing the `DIST_INGEST_ROW_COUNT` metric with the number of affected rows.
* feat/bulk-wal:
### Remove Encode Error Handling for WAL Entries
- **`error.rs`**: Removed the `EncodeWal` error variant and its associated handling.
- **`wal.rs`**: Eliminated the `entry_encode_buf` buffer and its usage for encoding WAL entries. Replaced with direct encoding to a vector using `encode_to_vec()`.
* feat: improve topic management and add stale records cleanup
* fix: fix unit tests
* chore: apply suggestions from CR
* chore: apply suggestions from CR
chore/move-wal-sync-to-bg:
### Refactor Log Store Task Management
- **Error Handling Enhancements**: Updated error handling for task management in `error.rs` by renaming `StartGcTask` and `StopGcTask` to `StartWalTask` and `StopWalTask`, respectively, and added a `name` field for more descriptive error messages.
- **Task Management Improvements**: Introduced `SyncWalTaskFunction` in `log_store.rs` to handle periodic synchronization of WAL tasks, replacing the previous atomic-based sync logic.
- **Backend Adjustments**: Modified `backend.rs` to use the new `StartWalTaskSnafu` for starting tasks, ensuring consistency with the updated error handling approach.
* feat: set max log files to 720 by default, info log only
* expose max_log_files in tomls
* include dir info when panicing, limit max_log_files of err_log to 30, and that of slow_queries to opt.max_log_files
* fix clippy
* update config.md
* update expected config str
* limit err_log max files size to `max_log_files` too, include err info when panicing, put `max_l_f` in right position
* fix typos
* chore: config
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* chore: cherrypick 52e8eebb2dbbbe81179583c05094004a5eedd7fd
* refactor/tables: Change variable from immutable to mutable in KvBackendCatalogManager's method
* refactor/tables: Replace unbounded channel with bounded and use semaphore for concurrency control in KvBackendCatalogManager
* refactor/tables: Add common-runtime dependency and update KvBackendCatalogManager to use common_runtime::spawn_global
* refactor/tables: Await on sending error through channel in KvBackendCatalogManager
* Refactor RaftEngineLogStore to use references for config
- Updated `RaftEngineLogStore::try_new` to accept a reference to `RaftEngineConfig` instead of taking ownership.
- Replaced direct usage of `config` with individual fields (`sync_write`, `sync_period`, `read_batch_size`).
- Adjusted test cases to pass references to `RaftEngineConfig`.
* Add parallelism configuration for WAL recovery
- Introduced `recovery_parallelism` setting in `datanode.example.toml` and `standalone.example.toml` for configuring parallelism during WAL recovery.
- Updated `Cargo.lock` and `Cargo.toml` to include `num_cpus` dependency.
- Modified `RaftEngineConfig` to include `recovery_parallelism` with a default value set to the number of CP
* feat/wal-recovery-parallelism:
Add `wal.recovery_parallelism` configuration option
- Introduced `wal.recovery_parallelism` to config.md for specifying parallelism during WAL recovery.
- Updated `RaftEngineLogStore` to include `recovery_threads` from the new configuration.
* fix: ut
* feat(log_store): use new `Consumer`
* feat: add `from_peer_id`
* feat: read WAL entries respect index
* test: add test for `build_region_wal_index_iterator`
* fix: keep the handle
* fix: incorrect last index
* fix: replay last entry id may be greater than expected
* chore: remove unused code
* chore: apply suggestions from CR
* chore: rename `datanode_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: apply suggestions from CR