* fix/fast-path-for-single-region-bulk-insert:
### Commit Summary
- **Refactor `try_decode` Method**: Updated the `try_decode` method in `FlightDecoder` to accept a reference to `FlightData` instead of consuming it. This change affects multiple files including `database.rs`, `region.rs`, `flight.rs`, `bulk_insert.rs`, `stream.rs`, and `region_request.rs`.
- **Optimize Bulk Insert Handling**: Added a fast path for handling bulk inserts when only one region is involved in `bulk_insert.rs`.
* fix/fast-path-for-single-region-bulk-insert:
Improve `FlightDecoder` usage in tests
- Updated `try_decode` method calls in `flight.rs` to remove unnecessary references for `d1`, `d2`, and `d3`.
- Ensured consistency in handling `FlightMessage` variants within test cases.
* fix/fast-path-for-single-region-bulk-insert:
**Enhancement: Skip Empty Regions in Bulk Insert**
- Updated `bulk_insert.rs` to improve efficiency by skipping regions without data during the bulk insert process. This change ensures that regions with a `true_count` of zero are not processed, optimizing resource usage and performance.
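As a rough illustration of the skip described above, the sketch below drops regions whose selection mask has no `true` entries before any per-region work is done. It uses plain `Vec<bool>` masks rather than Arrow's `BooleanArray`, and `count_selected` and `regions_to_process` are hypothetical names, not functions taken from `bulk_insert.rs`.

```rust
/// Counts the rows a mask selects (the analogue of a `true_count`).
fn count_selected(mask: &[bool]) -> usize {
    mask.iter().filter(|&&selected| selected).count()
}

/// Returns the ids of regions that have at least one selected row.
/// `masks` pairs a hypothetical region id with its row-selection mask.
fn regions_to_process(masks: &[(u32, Vec<bool>)]) -> Vec<u32> {
    masks
        .iter()
        // Regions with zero selected rows are skipped entirely.
        .filter(|(_, mask)| count_selected(mask) > 0)
        .map(|(region_id, _)| *region_id)
        .collect()
}
```

A region with an all-`false` or empty mask never reaches the request-building stage in this sketch, which is the saving the commit describes.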
* fix/fast-path-for-single-region-bulk-insert:
### Commit Summary
- **Refactor `RegionMask` Handling**:
  - Introduced `RegionMask` struct to encapsulate boolean array and selected rows count.
  - Updated methods to use `RegionMask` instead of `BooleanArray` for region selection.
  - Affected files: `bulk_insert.rs`, `multi_dim.rs`, `partition.rs`, `splitter.rs`.
- **Optimize Region Selection**:
  - Removed unnecessary checks for empty regions in `bulk_insert.rs`.
  - Improved logic for handling default regions in `multi_dim.rs`.
- **Update Tests**:
  - Modified test cases to accommodate `RegionMask` changes.
  - Affected files: `multi_dim.rs`, `splitter.rs`.
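One way to picture a mask that carries its own selected-row count is the minimal sketch below. It is an assumption-laden stand-in: the real `RegionMask` wraps an Arrow boolean array, while this version uses `Vec<bool>`, and the method names `select_all` and `select_none` are illustrative only.

```rust
/// Hypothetical stand-in for the `RegionMask` described above:
/// a row mask plus a cached count of selected rows.
#[derive(Debug, Clone, PartialEq)]
struct RegionMask {
    mask: Vec<bool>,
    selected_rows: usize,
}

impl RegionMask {
    /// Builds a mask and counts the selected rows once, up front.
    fn new(mask: Vec<bool>) -> Self {
        let selected_rows = mask.iter().filter(|&&b| b).count();
        Self { mask, selected_rows }
    }

    /// Number of rows this region selects, without rescanning the mask.
    fn selected_rows(&self) -> usize {
        self.selected_rows
    }

    /// True when no row is selected, so the region can be skipped.
    fn select_none(&self) -> bool {
        self.selected_rows == 0
    }

    /// True when every row is selected, so no filtering is needed.
    fn select_all(&self) -> bool {
        self.selected_rows == self.mask.len()
    }
}
```

Caching the count at construction means callers can decide between "skip", "forward the whole batch", and "filter" without rescanning the mask each time.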
* fix/fast-path-for-single-region-bulk-insert:
**Enhancements to MultiDimPartitionRule Logic and Tests**
- **`multi_dim.rs`**: Improved the logic for selecting rows in `MultiDimPartitionRule` by optimizing the selection process when only one region is present.
- **Tests**: Added new test cases to verify the behavior of default regions with unselected rows, existing default regions, and scenarios where all rows are selected. These tests ensure robust handling of partition rules and validate the correct assignment of rows to regions.
* feat: update pgwire to 0.29
* chore: only build default binary in nix ci
* Update src/servers/Cargo.toml
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat/bridge-bulk-insert:
## Implement Bulk Insert and Update Dependencies
- **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
- **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
- **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
- **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
- **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.
* fix: clippy
* fix schema serialization
* feat/bridge-bulk-insert:
Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`
- Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
- Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
- Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.
* fix: test
* refactor: rename
* allow empty app_metadata in FlightData
* feat/bridge-bulk-insert:
- **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
- **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling the single-datanode fast path.
- **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.
* fix: standalone test
* feat/bridge-bulk-insert:
### Enhance Bulk Insert Handling and Metadata Management
- **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
- **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
- **`handle_bulk_insert.rs`**:
  - Added `handle_record_batch` function to streamline processing of bulk insert payloads.
  - Improved error handling and task management for bulk insert operations.
  - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.
* feat/bridge-bulk-insert:
- **Refactor `handle_bulk_insert.rs`:**
  - Replaced `handle_record_batch` with `handle_payload` for handling payloads.
  - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.
- **Optimize `multi_dim.rs`:**
  - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
- **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
- **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
- **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
- **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.
* fix: clippy warnings
* feat/bridge-bulk-insert:
**Add Metrics and Improve Error Handling**
- **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance.
- **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
- **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
- **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.
* chore: rebase main
* chore: merge main
* chore: support string-to-numeric auto cast on insert
* test: add sqlness test
* chore: remove log
* test: fix sql test
* style: fix clippy
* test: test invalid number
* feat: do not convert to default if unable to parse
* chore: update comment
* test: update sqlness test
* test: update prepare test
* feat: support auto transform
* refactor: replace hashbrown with ahash
* refactor: params of run identity pipeline
* refactor: minor update
* test: add test for auto transform
* chore: fix cr issues
* feat: use flow batching engine
broken: try using logical plan
fix: use dummy catalog for logical plan
fix: insert plan exec & sqlness grpc addr
feat: use frontend instance in flownode in standalone
feat: flow type in metasrv & fix: flush flow out of sync & column name alias
tests: sqlness update
tests: sqlness flow rebuild update
chore: per review
refactor: keep chnl mgr
refactor: use catalog mgr for get table
tests: use valid sql
fix: add more check
refactor: put flow type determine to frontend
* chore: update proto
* chore: update proto to main branch
* fix: add locks for create/drop flow & docs: update docs
* feat: flush_flow flush all ranges now
* test: add align time window test
* docs: explain `nodeid` use in check task
* refactor: AddAutoColumnRewriter check for Projection
* refactor: per review
* fix: query without time window also clean dirty time window
* chore: better logging
* chore: add comments per review
* refactor: per review
* chore: per review
* chore: per review rename args
* refactor: per review partially
* chore: update docs
* chore: use better error variant
* chore: better error variant
* refactor: rename FlowWorkerManager to FlowStreamingEngine
* rename again
* refactor: per review
* chore: rebase after #5963 merged
* refactor: rename all flow_worker_manager occurrences
* docs: rm resolved TODO
* [wip]: implement arrow service
* add service
* feat/otel-arrow:
### Add OpenTelemetry Arrow Support
- **`Cargo.toml`, `Cargo.lock`**: Updated `otel-arrow-rust` dependency to use a local path and added `arrow-ipc` as a dependency.
- **`src/servers/src/grpc.rs`, `src/servers/src/grpc/builder.rs`**: Integrated `ArrowMetricsServiceServer` with gRPC server, including support for custom header interception and message compression.
- **`src/servers/src/otel_arrow.rs`**: Implemented `OtelArrowServiceHandler` for handling OpenTelemetry Arrow metrics and added `HeaderInterceptor` for custom header handling.
* feat/otel-arrow:
Add error handling for OpenTelemetry Arrow requests
- **`src/error.rs`**: Introduced a new error variant `HandleOtelArrowRequest` to handle failures in processing OpenTelemetry Arrow requests.
- **`src/otel_arrow.rs`**: Implemented error handling for receiving and consuming batches from the OpenTelemetry Arrow client. Added logging for errors and updated the response status accordingly.
* feat/otel-arrow:
Remove `otel_arrow` Module from gRPC Server
- Deleted the `otel_arrow` module from the gRPC server implementation.
- Removed the `otel_arrow` module import from `grpc.rs`.
- Deleted the `otel_arrow.rs` file, which contained the `OtelArrowServer` struct and its implementation.
* feat/otel-arrow:
## Remove `Arc` Implementations for Protocol and Pipeline Handlers
- **Removed `Arc` Implementations**: Deleted `Arc` implementations for `OpenTelemetryProtocolHandler` and `PipelineHandler` traits in `query_handler.rs`. This change simplifies the code by removing redundant async trait implementations for `Arc<T>`.
- **File Affected**: `src/servers/src/query_handler.rs`
* feat/otel-arrow:
Improve error handling and metadata processing in `otel_arrow.rs`
- Updated error handling by ignoring the result of `sender.send` to prevent panic on failure.
- Enhanced metadata processing in `HeaderInterceptor` by using `Ok` to safely handle `grpc-encoding` entry retrieval.
* fix dependency
* feat/otel-arrow:
- **Update Dependencies**:
  - Moved `otel-arrow-rust` dependency in `Cargo.toml`.
  - Adjusted workspace dependencies in `src/frontend/Cargo.toml`.
- **Error Handling**:
  - Removed `MissingQueryContext` error variant from `src/servers/src/error.rs`.
* fix: toml format
* remove useless code
* chore: resolve conflicts
* fix/pg-timestamp-diff:
### Add Support for `Duration` Type in PostgreSQL Encoding
- **Enhanced `encode_value` Functionality**: Updated `src/servers/src/postgres/types.rs` to support encoding of `Value::Duration` using `PgInterval`.
- **Implemented `Duration` Conversion**: Added conversion logic from `Duration` to `PgInterval` in `src/servers/src/postgres/types/interval.rs`.
- **Added Unit Tests**: Introduced tests for `Duration` to `PgInterval` conversion in `src/servers/src/postgres/types/interval.rs`.
- **Updated SQL Test Cases**: Modified `tests/cases/standalone/common/types/timestamp/timestamp.sql` and `timestamp.result` to include tests for timestamp subtraction using PostgreSQL protocol.
* fix: overflow
* fix/pg-timestamp-diff:
Update `timestamp.sql` to ensure newline consistency
- Modified `timestamp.sql` to add a newline at the end of the file for consistency.
* fix/pg-timestamp-diff:
### Add Documentation for Month Approximation in Interval Calculation
- **File Modified**: `src/servers/src/postgres/types/interval.rs`
- **Key Change**: Added a comment explaining the approximation of one month as 30.44 days in the interval calculations.
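To make the 30.44-day approximation concrete (it is roughly 365.25 / 12), here is a minimal sketch that splits a duration in seconds into months, days, and microseconds. This is an assumption about how such a split could look, not the code in `interval.rs`: `ApproxInterval` and `seconds_to_interval` are hypothetical names, and the source does not say which fields the real conversion populates.

```rust
use std::convert::TryFrom;

const MICROS_PER_SECOND: i64 = 1_000_000;
const MICROS_PER_DAY: i64 = 86_400 * MICROS_PER_SECOND;
/// 30.44 days expressed in microseconds (30.44 * 86_400_000_000).
const MICROS_PER_MONTH: i64 = 2_630_016_000_000;

/// Hypothetical interval with the months/days/microseconds layout.
#[derive(Debug, PartialEq)]
struct ApproxInterval {
    months: i32,
    days: i32,
    microseconds: i64,
}

/// Splits a duration given in seconds, treating one month as 30.44 days.
/// Returns `None` if converting to microseconds would overflow `i64`.
fn seconds_to_interval(seconds: i64) -> Option<ApproxInterval> {
    let total = seconds.checked_mul(MICROS_PER_SECOND)?;
    let months = i32::try_from(total / MICROS_PER_MONTH).ok()?;
    let rem = total % MICROS_PER_MONTH;
    let days = i32::try_from(rem / MICROS_PER_DAY).ok()?;
    Some(ApproxInterval {
        months,
        days,
        microseconds: rem % MICROS_PER_DAY,
    })
}
```

Under these assumptions a 45-day duration becomes 1 month, 14 days, and 48,384,000,000 microseconds (0.56 of a day), which shows why the month length is only an approximation: calendar months vary between 28 and 31 days.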
* feat: implement Arrow Flight "DoPut" in Frontend
* support auth for "do_put"
* set request_id in DoPut requests and responses
* set "db" in request header
* refactor: improve jaeger '/api/services' performance by adding the trace services table
* chore: refine some logic
* chore: compatible v0
* test: add integration test
* chore: expand default limit from 100 to 2000
* test: fix integration test
* refactor: make trace service table configurable
* refactor: use as large a timestamp as possible (2100-01-01 00:00:00)
* refactor: use '<trace_table>_services' as trace services table name
* perf: introduce simd_json for parsing ndjson
* fix: some tests
* fix: some tests
* fix: es test case
* chore: use `as_bytes_mut()`
* chore: remove unnecessary `to_string`
* chore: add safety comment
* chore: add table name template in pipeline yaml
* chore: implement apply function and add simple test
* chore: add comment and integration test
* chore: minor update
* fix: typos
* chore: change to table suffix
* chore: update comment and test
* chore: change name to table_suffix
* chore: minor refactor
* chore: minor refactor
* chore: support custom ts for identity pipeline
* chore: fix clippy
* chore: minor refactor & update tests
* chore: use ref on identity pipeline param
* refactor: remove trace id in primary key
* refactor: remove trace id in primary key in v0 model
* refactor: add span id in v1
* fix: integration test
* feat: update to disable http timeout by default
* feat: make http timeout default to 0
* test: correct test case
* chore: generate new config doc
* test: correct tests
* refactor: update jaeger api implementation
* test: add tests for v1 data model
* feat: customize trace table name
* fix: update column requirements to use Column type instead of String
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: lint fix
* refactor: accumulate resource attributes for v1
* fix: add empty check for additional string
* feat: add table option to mark data model version
* fix: do not overwrite all tags
* feat: use table option to mark table data model version and process accordingly
* chore: update comments to reflect query changes
* feat: use header for jaeger table name
* feat: update index for service_name, drop index for span_name
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: zyy17 <zyylsxm@gmail.com>