Commit Graph

1055 Commits

Author SHA1 Message Date
discord9
e2c7dfcaba fix: choose frontend randomly 2025-05-09 17:39:20 +08:00
Lei, HUANG
8685ceb232 feat: impl bulk memtable and bridge bulk inserts (#6054)
* feat/bridge-bulk-insert:
 ## Implement Bulk Insert and Update Dependencies

 - **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
 - **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
 - **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
 - **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
 - **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.

* feat/bridge-bulk-insert:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
 - **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
 - **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
 - **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.

* fix: clippy

* fix schema serialization

* feat/bridge-bulk-insert:
 Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`

 - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
 - Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
 - Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.

* fix: test

* refactor: rename

* allow empty app_metadata in FlightData

* feat/bridge-bulk-insert:
 - **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
 - **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
 - **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.

* fix: standalone test

* feat/bridge-bulk-insert:
 ### Enhance Bulk Insert Handling and Metadata Management

 - **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
 - **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
 - **`handle_bulk_insert.rs`**:
   - Added `handle_record_batch` function to streamline processing of bulk insert payloads.
   - Improved error handling and task management for bulk insert operations.
   - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.

* feat/bridge-bulk-insert:
 - **Refactor `handle_bulk_insert.rs`:**
   - Replaced `handle_record_batch` with `handle_payload` for handling payloads.
   - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.

 - **Optimize `multi_dim.rs`:**
   - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.

* feat/bridge-bulk-insert:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
 - **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
 - **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
 - **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
 - **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.

* fix: clippy warnings

* feat/bridge-bulk-insert:
 **Add Metrics and Improve Error Handling**

 - **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to
 monitor performance.
 - **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
 - **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
 - **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.

* chore: rebase main

* implement simple bulk memtable

* impl write_bulk

* implement simple bulk memtable

* feat/simple-bulk-memtable:
 ### Enhance Time-Series Memtable and Bulk Insert Handling

 - **Visibility Modifications**: Made `mutable_array` in `PrimitiveVectorBuilder` and `StringVectorBuilder` public in `primitive.rs` and `string.rs`.
 - **New Module**: Added `builder.rs` to `memtable` for time-series builders, including `FieldBuilder` and `StringBuilder` implementations.
 - **Bulk Insert Enhancements**:
   - Added `sequence` field to `BulkPart` in `part.rs` and updated its handling in `simple_bulk_memtable.rs` and `region_write_ctx.rs`.
   - Introduced metrics for bulk insert operations in `metrics.rs` and `bulk_insert.rs`.
 - **Performance Metrics**: Added timing metrics for write operations in `metrics.rs`, `region_write_ctx.rs`, and `handle_write.rs`.
 - **Region Request Handling**: Updated `make_region_bulk_inserts` in `region_request.rs` to include performance metrics.

* feat/simple-bulk-memtable:
 **Improve Memtable Stats Calculation and Add Metrics Timer**

 - **`simple_bulk_memtable.rs`**: Refactored `stats` method to use `num_rows` for checking if rows have been written, improving accuracy in memory table statistics.
 - **`handle_bulk_insert.rs`**: Introduced a metrics timer to measure the elapsed time for processing bulk requests, enhancing performance monitoring.

* feat/simple-bulk-memtable:
 ### Commit Message

 **Enhancements and Bug Fixes**

 - **Dependency Update**: Updated `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
 - **Feature Addition**: Implemented `to_mutation` method in `BulkPart` to convert `BulkPart` to `Mutation` for fallback `write_bulk` implementation in `src/mito2/src/memtable/bulk/part.rs`.
 - **Functionality Improvement**: Modified `write_bulk` method in `TimeSeriesMemtable` to support default implementation fallback to row iteration in `src/mito2/src/memtable/time_series.rs`.
 - **Performance Optimization**: Enhanced `bulk_insert` handling by optimizing region request processing and data partitioning in `src/operator/src/bulk_insert.rs`.
 - **Error Handling**: Added `ComputeArrow` error variant for better error management in `src/operator/src/error.rs`.
 - **Code Refactoring**: Simplified region bulk insert request processing in `src/store-api/src/region_request.rs`.

* fix: some clippy warnings

* feat/simple-bulk-memtable:
 ### Commit Summary

 - **Refactor Return Types to `Result`:**
   Updated the return type of the `ranges` method in `memtable.rs`, `bulk.rs`, `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `memtable_util.rs` to return `Result<MemtableRanges>` for better error handling.

 - **Enhance Metrics Tracking:**
   Improved metrics tracking by adding `num_rows` and `max_sequence` to `WriteMetrics` in `stats.rs`. Updated related methods in `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `scan_region.rs` to utilize these metrics.

 - **Remove Unused Imports:**
   Cleaned up unused imports in `time_series.rs` to streamline the codebase.

* merge main

* remove useless error vairant

* use newer version of proto

* feat/simple-bulk-memtable:
                                                                                                                                 Commit Message

                                                                                                                                     Summary

Enhance FieldBuilder and StringBuilder functionality, add tests, and improve error handling.

                                                                                                                                   Key Changes

 • builder.rs:
    • Added documentation for FieldBuilder methods.
    • Renamed append_string_vector to append_vector in StringBuilder.
 • simple_bulk_memtable.rs:
    • Added new test cases for write_one, write_bulk, is_empty, stats, fork, and sequence_filter.
 • time_series.rs:
    • Improved error handling in ValueBuilder for type mismatches.
 • memtable_util.rs:
    • Removed unused imports and streamlined code.

These changes enhance the robustness and test coverage of the memtable components.

* feat/simple-bulk-memtable:
 Improve Time Partition Matching Logic in `time_partition.rs`

 - Enhanced the `write_bulk` method in `time_partition.rs` to improve the logic for matching partitions based on time ranges.
 - Introduced a new mechanism to filter and select partitions that overlap with the record batch's timestamp range before writing.

* feat/simple-bulk-memtable:
 Improve Metrics Handling in `bulk_insert.rs`

 - Removed the `group_request_timer` and its associated metric observation to streamline the timing logic.
 - Moved the `BULK_REQUEST_ROWS` metric observation to occur after filtering, ensuring accurate row count metrics.

* feat/simple-bulk-memtable:
 **Enhance Stalled Requests Calculation and Update Metrics**

 - **`worker.rs`**: Updated the `stalled_count` method to include both `reqs` and `bulk_reqs` in the calculation of stalled requests.
 - **`bulk_insert.rs`**: Removed duplicate observation of `BULK_REQUEST_MESSAGE_SIZE` metric.
 - **`metrics.rs`**: Changed the bucket strategy for `BULK_REQUEST_ROWS` from linear to exponential, improving the granularity of metrics collection.

* feat/simple-bulk-memtable:
 **Refactor `StringVector` Usage and Update Method Signatures**

 - **`src/datatypes/src/vectors/string.rs`**: Changed `StringVector`'s `array` field from public to private.
 - **`src/mito2/src/memtable/builder.rs`**: Refactored `append_vector` method to `append_array`, updating its usage to work directly with `StringArray` instead of `StringVector`.
 - **`src/mito2/src/memtable/time_series.rs`**: Updated `ValueBuilder` to handle `StringArray` directly, replacing `StringVector` usage with `StringArray` in the `FieldBuilder::String` case.

* feat/simple-bulk-memtable:
 - **Refactor `PrimitiveVectorBuilder`**: Made `mutable_array` private in `src/datatypes/src/vectors/primitive.rs`.
 - **Optimize `ValueBuilder`**: Replaced `UInt64VectorBuilder` and `UInt8VectorBuilder` with `Vec<u64>` and `Vec<u8>` for `sequence` and `op_type` in `src/mito2/src/memtable/time_series.rs`.
 - **Improve Metrics Initialization**: Updated histogram bucket initialization to use `exponential_buckets` in `src/mito2/src/metrics.rs`.

* feat/simple-bulk-memtable:
 Improve error handling in `simple_bulk_memtable.rs` and `time_series.rs`

 - Enhanced error handling by using `OptionExt` for more concise error context management in `simple_bulk_memtable.rs` and `time_series.rs`.
 - Replaced `ok_or` with `with_context` to streamline error context creation in both files.

* feat/simple-bulk-memtable:
 **Enhance Time Partition Handling in `time_partition.rs`**

 - Introduced `create_time_partition` function to streamline the creation of new time partitions, ensuring thread safety by acquiring a lock.
 - Modified logic to handle cases where no matching time partitions exist, creating new partitions as needed.
 - Updated `write_record_batch` and `write_one` methods to utilize the new partition creation logic, improving partition management and data writing efficiency.

* replace proto

* feat/simple-bulk-memtable:
 Update `metrics.rs` to adjust the range of exponential buckets for bulk insert message rows from `10 ~ 1_000_000` to `10 ~ 100_000`.
2025-05-09 02:56:09 +00:00
dennis zhuang
fbf50c594e fix: csv format escaping (#6061)
* fix: csv format escaping

* chore: change status code

* fix: crate version
2025-05-08 05:52:20 +00:00
Ning Sun
5739302845 feat: update pgwire to 0.29 (#6058)
* feat: update pgwire to 0.29

* chore: only build default binary in nix ci

* Update src/servers/Cargo.toml

Co-authored-by: dennis zhuang <killme2008@gmail.com>

---------

Co-authored-by: dennis zhuang <killme2008@gmail.com>
2025-05-08 04:21:13 +00:00
LFC
e787007eb5 feat: scan with sst minimal sequence (#6051)
* feat: scan with sst minimal sequence

* Update src/store-api/src/storage/requests.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* update proto

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-08 01:34:51 +00:00
discord9
cce1285b16 feat: flow add static user/pwd auth (#6048)
* feat: flow add static user/pwd auth

* fix: not print password

* chore: rm explict Any bound

* refactor: per review

* refactor: move away from plugin

* refactor: not use any

* chore: per revieww

* chore: complete a todo

* chore: fix after rebase
2025-05-07 05:20:37 +00:00
Lei, HUANG
f298a110f9 feat: bridge bulk insert (#5927)
* feat/bridge-bulk-insert:
 ## Implement Bulk Insert and Update Dependencies

 - **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
 - **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
 - **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
 - **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
 - **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.

* feat/bridge-bulk-insert:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
 - **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
 - **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
 - **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.

* fix: clippy

* fix schema serialization

* feat/bridge-bulk-insert:
 Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`

 - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
 - Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
 - Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.

* fix: test

* refactor: rename

* allow empty app_metadata in FlightData

* feat/bridge-bulk-insert:
 - **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
 - **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
 - **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.

* fix: standalone test

* feat/bridge-bulk-insert:
 ### Enhance Bulk Insert Handling and Metadata Management

 - **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
 - **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
 - **`handle_bulk_insert.rs`**:
   - Added `handle_record_batch` function to streamline processing of bulk insert payloads.
   - Improved error handling and task management for bulk insert operations.
   - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.

* feat/bridge-bulk-insert:
 - **Refactor `handle_bulk_insert.rs`:**
   - Replaced `handle_record_batch` with `handle_payload` for handling payloads.
   - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.

 - **Optimize `multi_dim.rs`:**
   - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.

* feat/bridge-bulk-insert:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
 - **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
 - **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
 - **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
 - **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.

* fix: clippy warnings

* feat/bridge-bulk-insert:
 **Add Metrics and Improve Error Handling**

 - **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to
 monitor performance.
 - **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
 - **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
 - **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.

* chore: rebase main

* chore: merge main
2025-05-06 09:53:25 +00:00
discord9
6a5936468e chore: rm unnecessary depend for flow (#6047) 2025-05-06 03:36:19 +00:00
LFC
bb4890cff8 refactor: datanode instance builder (#6034)
remove another piece of REPL codes
2025-05-03 00:28:32 +00:00
shuiyisong
a706edbb73 feat(pipeline): auto transform (#6013)
* feat: support auto transform

* refactor: replace hashbrown with ahash

* refactor: params of run identity pipeline

* refactor: minor update

* test: add test for auto transform

* chore: fix cr issues
2025-04-30 07:40:26 +00:00
discord9
6166f2072e chore: upgrade hydroflow depend (#6011)
* chore: `hydroflow` -> `dfir`

* chore: refine log msg
2025-04-29 21:30:06 +00:00
shuiyisong
3c943be189 chore: update rust toolchain (#5818)
* chore: update nightly version

* chore: sort lint lines

* chore: minor fix

* chore: update nix

* chore: update toolchain to 2024-04-14

* chore: update toolchain to 2024-04-15

* chore: remove unnecessory test

* chore: do not assert oid in sqlness test

* chore: fix margin issue

* chore: fix cr issues

* chore: fix cr issues

---------

Co-authored-by: Ning Sun <sunning@greptime.com>
2025-04-27 09:02:36 +00:00
Zhenchi
2ff54486d3 chore: bump main branch version to 0.15 (#5984)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-27 01:39:44 +00:00
discord9
66e2242e46 fix: conn timeout&refactor: better err msg (#5974)
* fix: conn timeout&refactor: better err msg

* chore: clippy

* chore: make test work

* chore: comment

* todo: fix null cast

* fix: retry conn&udd_calc

* chore: comment

* chore: apply suggestion

---------

Co-authored-by: dennis zhuang <killme2008@gmail.com>
2025-04-25 19:12:30 +00:00
Ning Sun
489b16ae30 fix: security update (#5982) 2025-04-25 18:11:09 +00:00
dennis zhuang
85d564b0fb fix: upgrade sqlparse and validate align in range query (#5958)
* fix: upgrade sqlparse and validate align in range query

* update sqlparser to the merged commit

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-25 17:34:49 +00:00
discord9
a0900f5b90 feat(flow): use batching mode&fix sqlness (#5903)
* feat: use flow batching engine

broken: try using logical plan

fix: use dummy catalog for logical plan

fix: insert plan exec&sqlness grpc addr

feat: use frontend instance in flownode in standalone

feat: flow type in metasrv&fix: flush flow out of sync& column name alias

tests: sqlness update

tests: sqlness flow rebuild udpate

chore: per review

refactor: keep chnl mgr

refactor: use catalog mgr for get table

tests: use valid sql

fix: add more check

refactor: put flow type determine to frontend

* chore: update proto

* chore: update proto to main branch

* fix: add locks for create/drop flow&docs: update docs

* feat: flush_flow flush all ranges now

* test: add align time window test

* docs: explain `nodeid` use in check task

* refactor: AddAutoColumnRewriter check for Projection

* refactor: per review

* fix: query without time window also clean dirty time window

* chore: better logging

* chore: add comments per review

* refactor: per review

* chore: per review

* chore: per review rename args

* refactor: per review partially

* chore: update docs

* chore: use better error variant

* chore: better error variant

* refactor: rename FlowWorkerManager to FlowStreamingEngine

* rename again

* refactor: per review

* chore: rebase after #5963 merged

* refactor: rename all flow_worker_manager occurs

* docs: rm resolved TODO
2025-04-23 15:12:16 +00:00
Weny Xu
55cadcd2c0 feat: introduce flush metadata region task for metric engine (#5951)
* feat: introduce flush metadata region task for metric engine

* docs: generate config.md

* chore: add header

* test: fix unit test

* fix: fix unit tests

* chore: apply suggestions from CR

* chore: remove docs

* fix: fix unit tests
2025-04-23 04:51:22 +00:00
Lei, HUANG
90ffaa8a62 feat: implement otel-arrow protocol for GreptimeDB (#5840)
* [wip]: implement arrow service

* add service

* feat/otel-arrow:
 ### Add OpenTelemetry Arrow Support

 - **`Cargo.toml`, `Cargo.lock`**: Updated `otel-arrow-rust` dependency to use a local path and added `arrow-ipc` as a dependency.
 - **`src/servers/src/grpc.rs`, `src/servers/src/grpc/builder.rs`**: Integrated `ArrowMetricsServiceServer` with gRPC server, including support for custom header interception and message compression.
 - **`src/servers/src/otel_arrow.rs`**: Implemented `OtelArrowServiceHandler` for handling OpenTelemetry Arrow metrics and added `HeaderInterceptor` for custom header handling.

* feat/otel-arrow:
 Add error handling for OpenTelemetry Arrow requests

 - **`src/error.rs`**: Introduced a new error variant `HandleOtelArrowRequest` to handle failures in processing OpenTelemetry Arrow requests.
 - **`src/otel_arrow.rs`**: Implemented error handling for receiving and consuming batches from the OpenTelemetry Arrow client. Added logging for errors and updated the response status accordingly.

* feat/otel-arrow:
 Remove `otel_arrow` Module from gRPC Server

 - Deleted the `otel_arrow` module from the gRPC server implementation.
 - Removed the `otel_arrow` module import from `grpc.rs`.
 - Deleted the `otel_arrow.rs` file, which contained the `OtelArrowServer` struct and its implementation.

* feat/otel-arrow:
 ## Remove `Arc` Implementations for Protocol and Pipeline Handlers

 - **Removed `Arc` Implementations**: Deleted `Arc` implementations for `OpenTelemetryProtocolHandler` and `PipelineHandler` traits in `query_handler.rs`. This change simplifies the code by removing redundant async trait implementations for `Arc<T>`.
 - **File Affected**: `src/servers/src/query_handler.rs`

* feat/otel-arrow:
 Improve error handling and metadata processing in `otel_arrow.rs`

 - Updated error handling by ignoring the result of `sender.send` to prevent panic on failure.
 - Enhanced metadata processing in `HeaderInterceptor` by using `Ok` to safely handle `grpc-encoding` entry retrieval.

* fix dependency

* feat/otel-arrow:
 - **Update Dependencies**:
   - Moved `otel-arrow-rust` dependency in `Cargo.toml`.
   - Adjusted workspace dependencies in `src/frontend/Cargo.toml`.

 - **Error Handling**:
   - Removed `MissingQueryContext` error variant from `src/servers/src/error.rs`.

* fix: toml format

* remove useless code

* chore: resolve conflicts
2025-04-21 07:24:23 +00:00
Yuhan Wang
41814bb49f feat: introduce high_watermark for remote wal logstore (#5877)
* feat: introduce high_watermark_since_flush

* test: add unit test for high watermark

* refactor: submit a request instead

* fix: send reply before submit request

* fix: no need to update twice

* feat: update high watermark in background periodically

* test: update unit tests

* fix: update high watermark periodically

* test: update unit tests

* chore: apply review comments

* chore: rename

* chore: apply review comments

* chore: clean up

* chore: apply review comments
2025-04-18 12:10:47 +00:00
Weny Xu
b8c6f1c8ed feat: sync region followers after altering regions (#5901)
* feat: close follower regions after dropping leader regions

* chore: upgrade greptime-proto

* feat: sync region followers after alter region operations

* test: add tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2025-04-18 10:21:35 +00:00
Ruihang Xia
115e5a03a8 fix: anchor regex string to fully match in promql (#5920)
* fix: anchor regex string to fully match in promql

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test result again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-04-18 10:13:45 +00:00
Yingwen
a5c443f734 perf: keep compiled regex in SimpleFilterEvaluator to avoid re-compiling (#5919)
* feat: cache regex in evaluator

* chore: fix warnings

* chore: add reference

* refactor: address CR comments

* Add negative to state
* Don't create the evaluator if the regex is invalid

* test: add test for maybe_build_regex
2025-04-18 09:36:28 +00:00
LFC
d27b9fc3a1 feat: implement Arrow Flight "DoPut" in Frontend (#5836)
* feat: implement Arrow Flight "DoPut" in Frontend

* support auth for "do_put"

* set request_id in DoPut requests and responses

* set "db" in request header
2025-04-17 03:46:19 +00:00
Lin Yihai
7274ceba30 feat: Add query pipeline http api (#5819)
* feat(pipeline): add query pipeline http api.

* chore(pipeline): rename get pipepile method

* refactor(pipeline): Also insert string piple  into cache after inserting into table.

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-04-16 10:17:20 +00:00
Weny Xu
55c9a0de42 chore: upgrade opendal to 0.52 (#5857)
* chore: upgrade opendal to 0.52

* chore: ugprade object_store_opendal to 0.50

* Update Cargo.toml

Co-authored-by: dennis zhuang <killme2008@gmail.com>

---------

Co-authored-by: dennis zhuang <killme2008@gmail.com>
2025-04-15 18:48:42 +00:00
Lei, HUANG
799c7cbfa9 feat(mito): bulk insert request handling on datanode (#5831)
* wip: implement basic request handling

* feat/bulk-insert:
 ### Add Error Handling and Enhance Bulk Insert Functionality

 - **Error Handling**: Introduced a new error variant `ConvertDataType` in `error.rs` to handle conversion failures from `ConcreteDataType` to `ColumnDataType`.
 - **Bulk Insert Enhancements**:
   - Updated `WorkerRequest::BulkInserts` in `request.rs` to include metadata and sender.
   - Implemented `handle_bulk_inserts` in `worker.rs` to process bulk insert requests with region metadata.
   - Added functions `region_metadata_to_column_schema` and `record_batch_to_rows` in `handle_bulk_insert.rs` for schema conversion and row processing.
 - **API Changes**: Modified `RegionBulkInsertsRequest` in `region_request.rs` to include `region_id`.

 Files affected: `error.rs`, `request.rs`, `worker.rs`, `handle_bulk_insert.rs`, `region_request.rs`.

* feat/bulk-insert:
 **Enhance Error Handling and Add Unit Tests**

 - Improved error handling in `record_batch_to_rows` function within `handle_bulk_insert.rs` by returning `Result` and handling errors with `context`.
 - Added unit tests for `region_metadata_to_column_schema` and `record_batch_to_rows` functions in `handle_bulk_insert.rs` to ensure correct functionality and error handling.

* chore: update proto version

* feat/bulk-insert:
 - **Refactor Error Handling**: Updated error handling in `error.rs` by modifying the `ConvertDataType` error handling.
 - **Improve Logging and Error Reporting**: Enhanced logging and error reporting in `worker.rs` by adding error messages for missing region metadata.
 - **Add New Error Type**: Introduced `DecodeArrowIpc` error in `metadata.rs` to handle Arrow IPC decoding failures.
 - **Handle Arrow IPC Decoding**: Updated `region_request.rs` to handle Arrow IPC decoding errors using the new `DecodeArrowIpc` error type.

* chore: update proto version

* feat/bulk-insert:
 Refactor `handle_bulk_insert.rs` to simplify row construction

 - Removed the mutable `current_row` vector and refactored `row_at` function to return a new vector directly.
 - Updated `record_batch_to_rows` to utilize the refactored `row_at` function for constructing rows.

* feat/bulk-insert:
 ### Commit Summary

 **Enhancements in Region Server Request Handling**

 - Updated `region_server.rs` to include `RegionRequest::BulkInserts(_)` in the `RegionChange::Ingest` category, improving the handling of bulk insert operations.
 - Refined the categorization of region requests to ensure accurate mapping to `RegionChange` actions.
2025-04-15 14:11:50 +00:00
Ruihang Xia
dcf1a486f6 feat: support @@ (AtAt) operator for term matching (#5902)
* update dep and sqlness case

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* implement transcribe rule

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-04-15 11:05:17 +00:00
Lei, HUANG
6700c0762d feat: Column-wise partition rule implementation (#5804)
* wip: naive impl

* feat/column-partition:
 ### Add support for DataFusion physical expressions

 - **`Cargo.lock` & `Cargo.toml`**: Added `datafusion-physical-expr` as a dependency to support physical expression creation.
 - **`expr.rs`**: Implemented conversion methods `try_as_logical_expr` and `try_as_physical_expr` for `Operand` and `PartitionExpr` to facilitate logical and physical expression handling.
 - **`multi_dim.rs`**: Enhanced `MultiDimPartitionRule` to utilize physical expressions for partitioning logic, including new methods for evaluating record batches.
 - **Tests**: Added unit tests for logical and physical expression conversions and partitioning logic in `expr.rs` and `multi_dim.rs`.

* feat/column-partition:
 ### Refactor and Enhance Partition Handling

 - **Refactor Partition Parsing Logic**: Moved partition parsing logic from `src/operator/src/statement/ddl.rs` to a new utility module `src/partition/src/utils.rs`. This includes functions like `parse_partitions`, `find_partition_bounds`, and `convert_one_expr`.
 - **Error Handling Improvements**: Added new error variants `ColumnNotFound`, `InvalidPartitionRule`, and `ParseSqlValue` in `src/partition/src/error.rs` to improve error reporting for partition-related operations.
 - **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to include new dependencies `common-time` and `session`.
 - **Code Cleanup**: Removed redundant partition parsing functions from `src/operator/src/error.rs` and `src/operator/src/statement/ddl.rs`.

* feat/column-partition:
 ## Refactor and Enhance SQL and Table Handling

 - **Refactor Column Definitions and Error Handling**
   - Made `FULLTEXT_GRPC_KEY`, `INVERTED_INDEX_GRPC_KEY`, and `SKIPPING_INDEX_GRPC_KEY` public in `column_def.rs`.
   - Removed `IllegalPrimaryKeysDef` error from `error.rs` and moved it to `sql/src/error.rs`.
   - Updated error handling in `fill_impure_default.rs` and `expr_helper.rs`.

 - **Enhance SQL Utility Functions**
   - Moved and refactored functions like `create_to_expr`, `find_primary_keys`, and `validate_create_expr` to `sql/src/util.rs`.
   - Added new utility functions for SQL parsing and validation in `sql/src/util.rs`.

 - **Improve Partition Handling**
   - Added `parse_partition_columns_and_exprs` function in `partition/src/utils.rs`.
   - Updated partition rule tests in `partition/src/multi_dim.rs` to use SQL-based partitioning.

 - **Simplify Table Name Handling**
   - Re-exported `table_idents_to_full_name` from `sql::util` in `session/src/table_name.rs`.

 - **Test Enhancements**
   - Updated tests in `partition/src/multi_dim.rs` to use SQL for partition rule creation.

* feat/column-partition:
 **Add Benchmarking and Enhance Partitioning Logic**

 - **Benchmarking**: Introduced a new benchmark for `split_record_batch` in `bench_split_record_batch.rs` using `criterion` and `rand` as development dependencies in `Cargo.toml`.
 - **Partitioning Logic**: Enhanced `MultiDimPartitionRule` in `multi_dim.rs` to include a default region for unmatched partition expressions and optimized the `split_record_batch` method.
 - **Refactoring**: Moved `sql_to_partition_rule` function to a public scope for reuse in `multi_dim.rs`.
 - **Testing**: Added new test module `test_split_record_batch` to validate the partitioning logic.

* Revert "feat/column-partition:  ### Refactor and Enhance Partition Handling"

This reverts commit 183fa19f

* fix: revert refctoring parse_partition

* revert some refactor

* feat/column-partition:
 ### Enhance Partitioning and Error Handling

 - **Benchmark Enhancements**: Added new benchmark `bench_split_record_batch_vs_row` in `bench_split_record_batch.rs` to compare row and column-based splitting.
 - **Error Handling Improvements**: Introduced new error variants in `error.rs` for better error reporting related to record batch evaluation and arrow kernel computation.
 - **Expression Handling**: Updated `expr.rs` to improve error context when converting schemas and creating physical expressions.
 - **Partition Rule Enhancements**: Made `row_at` and `record_batch_to_cols` methods public in `multi_dim.rs` and improved error handling for physical expression evaluation and boolean operations.

* feat/column-partition:
 ### Add `eq` Method and Optimize Expression Caching

 - **`expr.rs`**: Added a new `eq` method to the `Operand` struct for equality comparisons.
 - **`multi_dim.rs`**: Introduced a caching mechanism for physical expressions using `RwLock` to improve performance in `MultiDimPartitionRule`.
 - **`lib.rs`**: Enabled the `let_chains` feature for more concise code.
 - **`multi_dim.rs` Tests**: Enhanced test coverage with new test cases for multi-dimensional partitioning, including random record batch generation and default region handling.

* feat/column-partition:
 ### Add `split_record_batch` Method to `PartitionRule` Trait

 - **Files Modified**:
   - `src/partition/src/multi_dim.rs`
   - `src/partition/src/partition.rs`
   - `src/partition/src/splitter.rs`

 Added a new method `split_record_batch` to the `PartitionRule` trait, allowing record batches to be split into multiple regions based on partition values. Implemented this method in `MultiDimPartitionRule` and provided unimplemented stubs in test modules.

 ### Dependency Update

 - **File Modified**:
   - `src/operator/src/expr_helper.rs`

 Removed unused import `ColumnDataType` and `Timezone` from the test module.

 ### Miscellaneous

 - **File Modified**:
   - `src/partition/Cargo.toml`

 No functional changes; only minor formatting adjustments.

* chore: add license header

* chore: remove useless fules

* feat/column-partition:
 Add support for handling unsupported partition expression values

 - **`error.rs`**: Introduced a new error variant `UnsupportedPartitionExprValue` to handle unsupported partition expression values, and updated `ErrorExt` to map this error to `StatusCode::InvalidArguments`.
 - **`expr.rs`**: Modified the `Operand` implementation to return the new error when encountering unsupported partition expression values.
 - **`multi_dim.rs`**: Added a fast path to optimize the selection process when all rows are selected.

* feat/column-partition: Add validation for expression and region length in MultiDimPartitionRule constructor

 • Ensure the lengths of exprs and regions match to prevent mismatches.
 • Introduce error handling for length discrepancies with a descriptive error message.

* chore: add debug log

* feat/column-partition: Removed the validation check for matching lengths between exprs and regions in MultiDimPartitionRule constructor, simplifying the initialization process.

* fix: unit tests
2025-04-15 10:42:07 +00:00
zyy17
7b13376239 refactor: add partition_rules_for_uuid() (#5743)
* refactor: add partition_rules_for_uuid()

* refactor: support up to 65536 partitions for partition_rules_for_uuid()
2025-04-15 06:46:31 +00:00
Zhenchi
e3675494b4 feat: apply terms with fulltext bloom backend (#5884)
* feat: apply terms with fulltext bloom backend

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* perf: preload jieba

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish doc

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-14 07:08:59 +00:00
fys
84e2bc52c2 fix: gRPC connection pool leak (#5876)
* fix: gRPC connection pool leak

* use .config() instead of .inner.config

* cancel the bg task if it is running

* fix: cr

* add unit test for pool release

* Avoid potential data races
2025-04-11 05:54:28 +00:00
LFC
71255b3cbd refactor: avoid empty display in errors (#5858)
* refactor: avoid empty display in errors

* fix: resolve PR comments
2025-04-10 10:08:45 +00:00
Weny Xu
54ef29f394 feat: add catalog_manager to ProcedureServiceHandler (#5873) 2025-04-10 06:55:46 +00:00
LFC
e052c65a58 chore: remove repl (#5860) 2025-04-10 06:30:29 +00:00
discord9
df362be012 feat(flow): batching mode engine (#5807)
* feat: partial impl of rr task/state

* feat: recording rule engine

* chore: rm unused

* chore: per review partially

* test: gen create table

* chore: rm some unused

* test: merge time window

* refactor: rename to batching mode

* refactor: per review

* refactor(partially): per review

* refactor: split engine.rs into three files

* refactor: use plan not sql

* chore: per review

* chore: per review

* refactor: per review

* refactor: per review

* chore: more per review

* refactor: per review

* refactor(partial): per review

* refactor: per review

* chore: clone task cheaper&more comments

* chore: fmt

* chore: typo
2025-04-09 09:53:32 +00:00
LFC
311727939d chore: update datafusion family (#5814) 2025-04-09 02:20:55 +00:00
Ruihang Xia
c16bae32c4 perf: evolve promql execution engine (#5691)
* use the same sort option across every prom plan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tweak plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* wip

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix merge compile

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Revert "wip"

This reverts commit db58884236.

* tweak merge scan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* pass distribution rule

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* reverse sort order

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refine plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more optimizations for plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* check logical table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* wierd tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add comment

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add test for series_divide

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix scalar calculation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: workaround join partition

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update proto

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-04-08 08:12:15 +00:00
Weny Xu
917510ffd0 feat: introduce poison mechanism for procedure (#5822)
* feat: introduce poison for procedure

* tests: add unit tests

* refactor: minor refactor

* fix: unit tests

* chore: fix unit tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR

* chore: update comments

* chore: introduce `ProcedureStatus::Poisoned`

* chore: upgrade greptime-proto to `2be0f`

* chore: apply suggestions from CR
2025-04-07 08:25:13 +00:00
fys
7b48ef1e97 chore: remove patch.crates-io for rustls (#5832)
* chore: remove patch.crates-io for rustls

* enable default-rustls-ring feature for mysql_sync

* fix: build error

* add comment

* update comment
2025-04-07 07:51:50 +00:00
Weny Xu
eab702cc02 feat: implement sync_region for metric engine (#5826)
* feat: implement `sync_region` for metric engine

* chore: apply suggestions from CR

* chore: upgrade proto
2025-04-03 12:46:20 +00:00
Zhenchi
dd63068df6 feat: add matches_term function (#5817)
* feat: add `matches_term` function

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* merge & fix

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix & skip char after boundary mismatch

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-03 09:09:41 +00:00
Yuhan Wang
f73b61e767 feat(remote-wal): add remote wal prune procedure (#5714)
* feat: add remote wal prune procedure

* feat: add retry logic and remove rollback

* chore: simplify the logic

* fix: remove REMOTE_WAL_LOCK

* fix: use in-memory kv

* perf: O(n) judgement

* chore: add single write lock

* test: add unit test

* chore: remove unused function

* chore: update comments

* chore: apply comments

* chore: apply comments
2025-04-03 08:11:51 +00:00
Zhenchi
f797de3497 feat: add backend field to fulltext options (#5806)
* feat: add backend field to fulltext options

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* update proto

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix option conv

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix display

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-02 09:15:54 +00:00
Weny Xu
4ef9afd8d8 feat: introduce read preference (#5783)
* feat: introduce read preference

* feat: introduce `RegionQueryHandlerFactory`

* feat: extract ReadPreference from http header

* test: add more tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2025-04-01 09:17:01 +00:00
shuiyisong
f9221e9e66 perf: introduce simd_json for parsing ndjson (#5794)
* perf: introduce simd_json for parsing ndjson

* fix: some tests

* fix: some tests

* fix: es test case

* chore: use `as_bytes_mut()`

* chore: remove unnecessary `to_string`

* chore: add safety comment
2025-04-01 08:17:26 +00:00
Weny Xu
d701c18150 feat: introduce CustomizedRegionLeaseRenewer (#5762)
* feat: add manifest_version to `GrantedRegion`

* chore: upgrade proto

* chore: apply review suggestions

* chore: apply suggestions from CR

* feat: introduce `CustomizedRegionLeaseRenewerRef`

* chore: upgrade to `103948`
2025-03-31 13:25:05 +00:00
Weny Xu
41aee1f1b7 feat: implement sync_region for mito engine (#5765)
* chore: upgrade proto to `2d52b`

* feat: add `SyncRegion` to `WorkerRequest`

* feat: impl `sync_region` for `Engine` trait

* test: add tests

* chore: fmt code

* chore: upgrade proto

* chore: unify `RegionLeaderState` and `RegionFollowerState`

* chore: check immutable memtable

* chore: fix clippy

* chore: apply suggestions from CR
2025-03-31 03:53:47 +00:00
shuiyisong
bef45ed0e8 feat(pipeline): support table name suffix templating in pipeline (#5775)
* chore: add table name template in pipeline yaml

* chore: implement apply function and add simple test

* chore: add comment and integration test

* chore: minor update

* fix: typos

* chore: change to table suffix

* chore: update comment and test

* chore: change name to table_suffix
2025-03-28 18:12:46 +00:00
fys
2b2ea5bf72 chore: upgrade some dependencies (#5777)
* chore: upgrade some dependencies

* chore: upgrade some dependencies

* fix: cr

* fix: ci

* fix: test

* fix: cargo fmt
2025-03-27 02:48:44 +00:00