* feat: support auto transform
* refactor: replace hashbrown with ahash
* refactor: params of run identity pipeline
* refactor: minor update
* test: add test for auto transform
* feat: add select processor
* test: select processor
* chore: use include and exclude for key
* fix: typos
* chore: address CR comment
* chore: typo
* chore: typo
* chore: address CR comment
* chore: use with_context
* fix: do not add projection for cast
Use cast to build time filter directly instead of adding a projection,
which will cause column not found
* feat: cast before creating plan
* feat/bridge-bulk-insert:
## Implement Bulk Insert and Update Dependencies
- **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
- **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
- **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
- **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
- **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.
* fix: clippy
* fix schema serialization
* feat/bridge-bulk-insert:
Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`
- Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
- Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
- Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.
* fix: test
* refactor: rename
* allow empty app_metadata in FlightData
* feat/bridge-bulk-insert:
- **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
- **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
- **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.
* fix: standalone test
* feat/bridge-bulk-insert:
### Enhance Bulk Insert Handling and Metadata Management
- **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
- **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
- **`handle_bulk_insert.rs`**:
- Added `handle_record_batch` function to streamline processing of bulk insert payloads.
- Improved error handling and task management for bulk insert operations.
- Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.
* feat/bridge-bulk-insert:
- **Refactor `handle_bulk_insert.rs`:**
- Replaced `handle_record_batch` with `handle_payload` for handling payloads.
- Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.
- **Optimize `multi_dim.rs`:**
- Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
- **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
- **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
- **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
- **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.
* fix: clippy warnings
* feat/bridge-bulk-insert:
**Add Metrics and Improve Error Handling**
- **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to
monitor performance.
- **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
- **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
- **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.
* chore: rebase main
* chore: merge main
* feat: support auto transform
* refactor: replace hashbrown with ahash
* refactor: params of run identity pipeline
* refactor: minor update
* test: add test for auto transform
* chore: fix cr issues
* feat: use flow batching engine
broken: try using logical plan
fix: use dummy catalog for logical plan
fix: insert plan exec&sqlness grpc addr
feat: use frontend instance in flownode in standalone
feat: flow type in metasrv&fix: flush flow out of sync& column name alias
tests: sqlness update
tests: sqlness flow rebuild udpate
chore: per review
refactor: keep chnl mgr
refactor: use catalog mgr for get table
tests: use valid sql
fix: add more check
refactor: put flow type determine to frontend
* chore: update proto
* chore: update proto to main branch
* fix: add locks for create/drop flow&docs: update docs
* feat: flush_flow flush all ranges now
* test: add align time window test
* docs: explain `nodeid` use in check task
* refactor: AddAutoColumnRewriter check for Projection
* refactor: per review
* fix: query without time window also clean dirty time window
* chore: better logging
* chore: add comments per review
* refactor: per review
* chore: per review
* chore: per review rename args
* refactor: per review partially
* chore: update docs
* chore: use better error variant
* chore: better error variant
* refactor: rename FlowWorkerManager to FlowStreamingEngine
* rename again
* refactor: per review
* chore: rebase after #5963 merged
* refactor: rename all flow_worker_manager occurs
* docs: rm resolved TODO
* fix: remove obsolete failover detectors after region leader change
* chore: apply suggestions from CR
* fix: fix unit tests
* fix: fix unit test
* fix: failover logic
* feat: implement Arrow Flight "DoPut" in Frontend
* support auth for "do_put"
* set request_id in DoPut requests and responses
* set "db" in request header
* refactor: improve jaeger '/api/services' performance by adding the trace services table
* chore: refine some logic
* chore: compatible v0
* test: add integration test
* chore: expand default limit from 100 to 2000
* test: fix integration test
* refactor: make trace service table configurable
* refactor: use a timestamp(2100-01-01 00:00:00) as large as possible
* refactor: use '<trace_table>_services' as trace services table name
* refactor: remove mode option in configuration files
* chore: remove mode in configuration file
* remvoe mode field in FlownodeOptions
* add comment for test
* update config.md
* remove mode field in standalone options
* fix: ci
* chore: add table name template in pipeline yaml
* chore: implement apply function and add simple test
* chore: add comment and integration test
* chore: minor update
* fix: typos
* chore: change to table suffix
* chore: update comment and test
* chore: change name to table_suffix
* chore: minor refactor
* chore: minor refactor
* chore: support custom ts for identity pipeline
* chore: fix clippy
* chore: minor refactor & update tests
* chore: use ref on identity pipeline param
* refactor: remove trace id in primary key
* refactor: remove trace id in primary key in v0 model
* refactor: add span id in v1
* fix: integration test
* feat: update to disable http timeout by default
* feat: make http timeout default to 0
* test: correct test case
* chore: generate new config doc
* test: correct tests
* refactor: update jaeger api implementation
* test: add tests for v1 data model
* feat: customize trace table name
* fix: update column requirements to use Column type instead of String
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: lint fix
* refactor: accumulate resource attributes for v1
* fix: add empty check for additional string
* feat: add table option to mark data model version
* fix: do not overwrite all tags
* feat: use table option to mark table data model version and process accordingly
* chore: update comments to reflect query changes
* feat: use header for jaeger table name
* feat: update index for service_name, drop index for span_name
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: zyy17 <zyylsxm@gmail.com>
* chore: resolve conflicts
* chore: merge main
* test: add compatibility test for DatanodeLeaseKey with missing cluster_id
* test: add compatibility test for DatanodeLeaseKey without cluster_id
* refactor/remove-cluster-id:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml` to a new revision.
- **Remove `cluster_id` Usage**: Removed the `cluster_id` field and its related logic from various files, including `cluster.rs`, `datanode.rs`, `rpc.rs`,
`adapter.rs`, `client.rs`, `ask_leader.rs`, `heartbeat.rs`, `procedure.rs`, `store.rs`, `handler.rs`, `response_header_handler.rs`, `key.rs`, `datanode.rs`,
`lease.rs`, `metrics.rs`, `cluster.rs`, `heartbeat.rs`, `procedure.rs`, and `store.rs`.
- **Refactor Tests**: Updated tests in `client.rs`, `response_header_handler.rs`, `store.rs`, and `service` modules to reflect the removal of `cluster_id`.
* fix: clippy
* refactor/remove-cluster-id:
**Refactor and Cleanup in Meta Server**
- **`response_header_handler.rs`**: Removed unused import of `HeartbeatResponse` and cleaned up the test function by eliminating the creation of an unused `HeartbeatResponse` object.
- **`node_lease.rs`**: Simplified parameter handling in `HttpHandler` implementation by using an underscore for unused parameters.
* refactor/remove-cluster-id:
### Remove `TableMetadataAllocatorContext` and Refactor Code
- **Removed `TableMetadataAllocatorContext`**: Eliminated the `TableMetadataAllocatorContext` struct and its usage across multiple files, including `ddl.rs`, `create_table.rs`, `create_view.rs`, `table_meta.rs`, `test_util.rs`, `create_logical_tables.rs`,
`drop_table.rs`, and `table_meta_alloc.rs`.
- **Refactored Function Signatures**: Updated function signatures to remove the `TableMetadataAllocatorContext` parameter in methods like `create`, `create_view`, and `alloc` in `table_meta.rs` and `table_meta_alloc.rs`.
- **Updated Imports**: Adjusted import statements to reflect the removal of `TableMetadataAllocatorContext` in affected files.
These changes simplify the codebase by removing an unnecessary context struct and updating related function calls.
* refactor/remove-cluster-id:
### Update `datanode.rs` to Modify Key Prefix
- **File Modified**: `src/common/meta/src/datanode.rs`
- **Key Changes**:
- Updated `DatanodeStatKey::prefix_key` and `From<DatanodeStatKey>` to remove the cluster ID from the key prefix.
- Adjusted comments to reflect the changes in key prefix handling.
* reformat code
* refactor/remove-cluster-id:
### Commit Summary
- **Refactor `Pusher` Initialization**: Removed the `RequestHeader` parameter from the `Pusher::new` method across multiple files, including `handler.rs`, `test_util.rs`, and `heartbeat.rs`. This change simplifies the `Pusher` initialization process by eliminating th
unnecessary parameter.
- **Update Imports**: Adjusted import statements in `handler.rs` and `test_util.rs` to remove unused `RequestHeader` references, ensuring cleaner and more efficient code.
* chore: update proto
* feat: include trace v1 encoding
* feat: add trace ingestion in inserter
* feat: add partition rules and index for trace_id
* chore: format
* chore: fmt
* fix: issue introduced with merge
* feat: adjust index and add integration test for v1
* refactor: remove comment key
* fix: update default value of skip index granularity
* fix: update default value of skip index granularity
* refactor: rename some functions
* feat: remove skipping index from span_id
* refactor: made span_id part of primary key for potential dedup purpose
* feat: move the special attribute resource_attribute.service.name to top level
---------
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>