mirror of
https://github.com/GreptimeTeam/greptimedb.git
synced 2026-01-04 20:32:56 +00:00
* feat/bridge-bulk-insert: ## Implement Bulk Insert and Update Dependencies - **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`. - **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`. - **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`. - **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data. - **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields. * feat/bridge-bulk-insert: - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`. - **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling. - **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity. - **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities. * fix: clippy * fix schema serialization * feat/bridge-bulk-insert: Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs` - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`. - Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`. - Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations. * fix: test * refactor: rename * allow empty app_metadata in FlightData * feat/bridge-bulk-insert: - **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`. - **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path. - **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`. * fix: standalone test * feat/bridge-bulk-insert: ### Enhance Bulk Insert Handling and Metadata Management - **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling. - **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility. - **`handle_bulk_insert.rs`**: - Added `handle_record_batch` function to streamline processing of bulk insert payloads. - Improved error handling and task management for bulk insert operations. - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access. * feat/bridge-bulk-insert: - **Refactor `handle_bulk_insert.rs`:** - Replaced `handle_record_batch` with `handle_payload` for handling payloads. - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution. - **Optimize `multi_dim.rs`:** - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`. * feat/bridge-bulk-insert: - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`. - **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance. - **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling. - **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering. - **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance. * fix: clippy warnings * feat/bridge-bulk-insert: **Add Metrics and Improve Error Handling** - **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance. - **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns. - **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support. - **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability. * chore: rebase main * chore: merge main
Setup tests for multiple storage backend
To run the integration test, please copy .env.example to .env in the project root folder and change the values on need.
Take s3 for example. You need to set your S3 bucket, access key id and secret key:
# Settings for s3 test
GT_S3_BUCKET=S3 bucket
GT_S3_REGION=S3 region
GT_S3_ACCESS_KEY_ID=S3 access key id
GT_S3_ACCESS_KEY=S3 secret access key
Run
Execute the following command in the project root folder:
cargo test integration
Test s3 storage:
cargo test s3
Test oss storage:
cargo test oss
Test azblob storage:
cargo test azblob
Setup tests with Kafka wal
To run the integration test, please copy .env.example to .env in the project root folder and change the values on need.
GT_KAFKA_ENDPOINTS = localhost:9092
Setup kafka standalone
cd tests-integration/fixtures
docker compose -f docker-compose-standalone.yml up kafka