* fix: select after alter
* fix: insert a proper row & catch a bug
* fix: alter table modify type should modify the default value type too
* refactor: per review
* chore: per review
* refactor: per review
* refactor: per review
* feat/bridge-bulk-insert:
## Implement Bulk Insert and Update Dependencies
- **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
- **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
- **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
- **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
- **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.
* fix: clippy
* fix schema serialization
* feat/bridge-bulk-insert:
Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`
- Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
- Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
- Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.
* fix: test
* refactor: rename
* allow empty app_metadata in FlightData
* feat/bridge-bulk-insert:
- **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
- **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
- **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.
* fix: standalone test
* feat/bridge-bulk-insert:
### Enhance Bulk Insert Handling and Metadata Management
- **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
- **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
- **`handle_bulk_insert.rs`**:
  - Added `handle_record_batch` function to streamline processing of bulk insert payloads.
  - Improved error handling and task management for bulk insert operations.
  - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.
* feat/bridge-bulk-insert:
- **Refactor `handle_bulk_insert.rs`:**
  - Replaced `handle_record_batch` with `handle_payload` for handling payloads.
  - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.
- **Optimize `multi_dim.rs`:**
  - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.
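The single-region fast path can be pictured with a minimal sketch. It is not the actual `MultiDimPartitionRule` code: `RegionId`, `partition_rows`, and the `route` closure are hypothetical names introduced only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical region identifier; the real type is not shown in this log.
type RegionId = u32;

/// Assigns row indices to regions. `route` is a hypothetical stand-in for
/// evaluating the partition rule on a single row.
fn partition_rows<F>(
    num_rows: usize,
    regions: &[RegionId],
    route: F,
) -> HashMap<RegionId, Vec<usize>>
where
    F: Fn(usize) -> RegionId,
{
    let mut out = HashMap::new();

    // Fast path: with exactly one region every row belongs to it,
    // so the rule is never evaluated.
    if let [only] = regions {
        out.insert(*only, (0..num_rows).collect());
        return out;
    }

    // Slow path: evaluate the rule row by row and group the indices.
    for row in 0..num_rows {
        out.entry(route(row)).or_insert_with(Vec::new).push(row);
    }
    out
}
```

With one region the closure is never called, which is where the saving comes from.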
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
- **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
- **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
- **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
- **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.
* fix: clippy warnings
* feat/bridge-bulk-insert:
**Add Metrics and Improve Error Handling**
- **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance.
- **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
- **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
- **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.
* chore: rebase main
* implement simple bulk memtable
* impl write_bulk
* implement simple bulk memtable
* feat/simple-bulk-memtable:
### Enhance Time-Series Memtable and Bulk Insert Handling
- **Visibility Modifications**: Made `mutable_array` in `PrimitiveVectorBuilder` and `StringVectorBuilder` public in `primitive.rs` and `string.rs`.
- **New Module**: Added `builder.rs` to `memtable` for time-series builders, including `FieldBuilder` and `StringBuilder` implementations.
- **Bulk Insert Enhancements**:
  - Added `sequence` field to `BulkPart` in `part.rs` and updated its handling in `simple_bulk_memtable.rs` and `region_write_ctx.rs`.
  - Introduced metrics for bulk insert operations in `metrics.rs` and `bulk_insert.rs`.
- **Performance Metrics**: Added timing metrics for write operations in `metrics.rs`, `region_write_ctx.rs`, and `handle_write.rs`.
- **Region Request Handling**: Updated `make_region_bulk_inserts` in `region_request.rs` to include performance metrics.
* feat/simple-bulk-memtable:
**Improve Memtable Stats Calculation and Add Metrics Timer**
- **`simple_bulk_memtable.rs`**: Refactored `stats` method to use `num_rows` for checking if rows have been written, improving accuracy in memory table statistics.
- **`handle_bulk_insert.rs`**: Introduced a metrics timer to measure the elapsed time for processing bulk requests, enhancing performance monitoring.
* feat/simple-bulk-memtable:
**Enhancements and Bug Fixes**
- **Dependency Update**: Updated `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Feature Addition**: Implemented `to_mutation` method in `BulkPart` to convert `BulkPart` to `Mutation` for fallback `write_bulk` implementation in `src/mito2/src/memtable/bulk/part.rs`.
- **Functionality Improvement**: Modified `write_bulk` method in `TimeSeriesMemtable` to support default implementation fallback to row iteration in `src/mito2/src/memtable/time_series.rs`.
- **Performance Optimization**: Enhanced `bulk_insert` handling by optimizing region request processing and data partitioning in `src/operator/src/bulk_insert.rs`.
- **Error Handling**: Added `ComputeArrow` error variant for better error management in `src/operator/src/error.rs`.
- **Code Refactoring**: Simplified region bulk insert request processing in `src/store-api/src/region_request.rs`.
* fix: some clippy warnings
* feat/simple-bulk-memtable:
- **Refactor Return Types to `Result`:**
Updated the return type of the `ranges` method in `memtable.rs`, `bulk.rs`, `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `memtable_util.rs` to return `Result<MemtableRanges>` for better error handling.
- **Enhance Metrics Tracking:**
Improved metrics tracking by adding `num_rows` and `max_sequence` to `WriteMetrics` in `stats.rs`. Updated related methods in `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `scan_region.rs` to utilize these metrics.
- **Remove Unused Imports:**
Cleaned up unused imports in `time_series.rs` to streamline the codebase.
* merge main
* remove useless error variant
* use newer version of proto
* feat/simple-bulk-memtable:
Enhance `FieldBuilder` and `StringBuilder` functionality, add tests, and improve error handling.
- **`builder.rs`**:
  - Added documentation for `FieldBuilder` methods.
  - Renamed `append_string_vector` to `append_vector` in `StringBuilder`.
- **`simple_bulk_memtable.rs`**:
  - Added new test cases for `write_one`, `write_bulk`, `is_empty`, `stats`, `fork`, and `sequence_filter`.
- **`time_series.rs`**:
  - Improved error handling in `ValueBuilder` for type mismatches.
- **`memtable_util.rs`**:
  - Removed unused imports and streamlined code.

These changes enhance the robustness and test coverage of the memtable components.
* feat/simple-bulk-memtable:
Improve Time Partition Matching Logic in `time_partition.rs`
- Enhanced the `write_bulk` method in `time_partition.rs` to improve the logic for matching partitions based on time ranges.
- Introduced a new mechanism to filter and select partitions that overlap with the record batch's timestamp range before writing.
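The overlap filter described above can be sketched as follows. This is only an illustration under assumptions: the `TimePartition` struct, its half-open `[start, end)` range, and `overlapping_partitions` are hypothetical and do not reflect the actual types in `time_partition.rs`.

```rust
/// Hypothetical time partition covering the half-open range `[start, end)`.
#[derive(Debug, Clone, PartialEq)]
struct TimePartition {
    start: i64,
    end: i64,
}

/// Returns the partitions that overlap a batch whose timestamps span the
/// inclusive range `[min_ts, max_ts]`.
fn overlapping_partitions(
    parts: &[TimePartition],
    min_ts: i64,
    max_ts: i64,
) -> Vec<&TimePartition> {
    parts
        .iter()
        // A partition overlaps when it starts no later than the batch ends
        // and ends strictly after the batch starts.
        .filter(|p| p.start <= max_ts && min_ts < p.end)
        .collect()
}
```

Only the selected partitions then need to see the batch; partitions outside the timestamp range are skipped.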
* feat/simple-bulk-memtable:
Improve Metrics Handling in `bulk_insert.rs`
- Removed the `group_request_timer` and its associated metric observation to streamline the timing logic.
- Moved the `BULK_REQUEST_ROWS` metric observation to occur after filtering, ensuring accurate row count metrics.
* feat/simple-bulk-memtable:
**Enhance Stalled Requests Calculation and Update Metrics**
- **`worker.rs`**: Updated the `stalled_count` method to include both `reqs` and `bulk_reqs` in the calculation of stalled requests.
- **`bulk_insert.rs`**: Removed duplicate observation of `BULK_REQUEST_MESSAGE_SIZE` metric.
- **`metrics.rs`**: Changed the bucket strategy for `BULK_REQUEST_ROWS` from linear to exponential, improving the granularity of metrics collection.
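To make the linear-versus-exponential difference concrete, here is a standalone sketch of the two bucket layouts. The functions below are re-implemented for illustration and are not the metrics library's API; the start, width, and factor values in the note that follows are assumptions, not the values used in `metrics.rs`.

```rust
/// Upper bounds spaced by a constant width: start, start + width, ...
fn linear_buckets(start: f64, width: f64, count: usize) -> Vec<f64> {
    (0..count).map(|i| start + width * i as f64).collect()
}

/// Upper bounds that grow by a constant factor: start, start * factor, ...
fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    let mut bounds = Vec::with_capacity(count);
    let mut next = start;
    for _ in 0..count {
        bounds.push(next);
        next *= factor;
    }
    bounds
}
```

With five buckets starting at 10, a linear layout with width 10 tops out at 50, while an exponential layout with factor 10 reaches 100,000, so the same number of buckets covers row counts spanning several orders of magnitude.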
* feat/simple-bulk-memtable:
**Refactor `StringVector` Usage and Update Method Signatures**
- **`src/datatypes/src/vectors/string.rs`**: Changed `StringVector`'s `array` field from public to private.
- **`src/mito2/src/memtable/builder.rs`**: Refactored `append_vector` method to `append_array`, updating its usage to work directly with `StringArray` instead of `StringVector`.
- **`src/mito2/src/memtable/time_series.rs`**: Updated `ValueBuilder` to handle `StringArray` directly, replacing `StringVector` usage with `StringArray` in the `FieldBuilder::String` case.
* feat/simple-bulk-memtable:
- **Refactor `PrimitiveVectorBuilder`**: Made `mutable_array` private in `src/datatypes/src/vectors/primitive.rs`.
- **Optimize `ValueBuilder`**: Replaced `UInt64VectorBuilder` and `UInt8VectorBuilder` with `Vec<u64>` and `Vec<u8>` for `sequence` and `op_type` in `src/mito2/src/memtable/time_series.rs`.
- **Improve Metrics Initialization**: Updated histogram bucket initialization to use `exponential_buckets` in `src/mito2/src/metrics.rs`.
* feat/simple-bulk-memtable:
Improve error handling in `simple_bulk_memtable.rs` and `time_series.rs`
- Enhanced error handling by using `OptionExt` for more concise error context management in `simple_bulk_memtable.rs` and `time_series.rs`.
- Replaced `ok_or` with `with_context` to streamline error context creation in both files.
* feat/simple-bulk-memtable:
**Enhance Time Partition Handling in `time_partition.rs`**
- Introduced `create_time_partition` function to streamline the creation of new time partitions, ensuring thread safety by acquiring a lock.
- Modified logic to handle cases where no matching time partitions exist, creating new partitions as needed.
- Updated `write_record_batch` and `write_one` methods to utilize the new partition creation logic, improving partition management and data writing efficiency.
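A minimal sketch of the create-on-miss pattern under a lock is shown below. It assumes fixed-width partitions aligned to a positive `duration`; `TimeRange`, `Partitions`, and `get_or_create` are hypothetical names, not the actual `create_time_partition` implementation.

```rust
use std::sync::Mutex;

/// Hypothetical half-open time range `[start, end)` owned by one partition.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TimeRange {
    start: i64,
    end: i64,
}

/// Hypothetical partition set; `duration` is assumed to be positive.
struct Partitions {
    duration: i64,
    parts: Mutex<Vec<TimeRange>>,
}

impl Partitions {
    /// Returns the partition containing `ts`, creating it if none matches.
    /// Lookup and creation happen under the same lock, so two writers
    /// cannot both create a partition for the same range.
    fn get_or_create(&self, ts: i64) -> TimeRange {
        let mut parts = self.parts.lock().unwrap();
        if let Some(found) = parts.iter().find(|p| p.start <= ts && ts < p.end) {
            return *found;
        }
        // Align the new partition to a multiple of `duration`;
        // `div_euclid` rounds toward negative infinity for negative timestamps.
        let start = ts.div_euclid(self.duration) * self.duration;
        let created = TimeRange { start, end: start + self.duration };
        parts.push(created);
        created
    }
}
```

Holding the lock across both the lookup and the insert is the design point: checking first and locking only to insert would let concurrent writers create duplicate partitions.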
* replace proto
* feat/simple-bulk-memtable:
Update `metrics.rs` to adjust the range of exponential buckets for bulk insert message rows from `10 ~ 1_000_000` to `10 ~ 100_000`.
* feat: include trace v1 encoding
* feat: add trace ingestion in inserter
* feat: add partition rules and index for trace_id
* chore: format
* chore: fmt
* fix: issue introduced with merge
* feat: adjust index and add integration test for v1
* refactor: remove comment key
* fix: update default value of skip index granularity
* refactor: rename some functions
* feat: remove skipping index from span_id
* refactor: made span_id part of primary key for potential dedup purpose
* feat: move the special attribute resource_attribute.service.name to top level
---------
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
* change dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: adapt to arrow's interval array
* chore: fix compile errors in datatypes crate
* chore: fix api crate compiler errors
* chore: fix compiler errors in common-grpc
* chore: fix common-datasource errors
* chore: fix deprecated code in common-datasource
* fix promql and physical plan related
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* wip: upgrading network deps
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* block on updating `sqlparser`
* upgrade sqlparser
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* adapt new df's trait requirements
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix compiler errors in mito2
* chore: fix common-function crate errors
* chore: fix catalog errors
* change import path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix some errors in query crate
* chore: fix some errors in query crate
* aggr expr and some other tiny fixes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix expr related errors in query crate
* chore: fix query serializer and admin command
* chore: fix grpc services
* feat: axum serve
* chore: fix http server
* remove handle_error handler
* refactor timeout layer
* serve axum
* chore: fix flow aggr functions
* chore: fix flow
* feat: fix errors in meta-srv
* boxed()
* use TokioIo
* feat!: Remove script crate and python feature (#5321)
* feat: exclude script crate
* chore: simplify feature
* feat: remove the script crate
* chore: remove python feature and some comments
* chore: fix warning
* chore: fix servers tests compiler errors
* feat: fix tests-integration errors
* chore: fix unused
* test: fix catalog test
* chore: fix compiler errors for crates using common-meta
The testing feature is enabled when checking with `--workspace`.
* test: use display for logical plan test
* test: implement rewrite for ScanHintRule
* fix: http server build panic
* test: fix mito test
* fix: sql parser type alias error
* test: fix TestClient not listening
* test: some flow tests
* test(flow): more fix
* fix: test_otlp_logs
* test: fix promql test that uses the deprecated method fun()
* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64
* test: fix infer schema test case
* test: fix tests related to plan display
* chore: fix last flow test
* test: fix function format related assertion
* test: use larger port range for tests
* fix: test_otlp_traces
* fix: test_otlp_metrics
* fix range query and dist plan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: flow distinct handling used a deprecated field
* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs
* test: fix deserialize test
* test: reduce split key case num
* tests: lower case aggr func name
* test: fix some sqlness tests
* tests: more sqlness fix
* tests: fixed sqlness test
* commit non-bug changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: make our udf correct
* fix: implement empty methods of ContextProvider for DfContextProviderAdapter
* test: update sqlness test result
* chore: remove unused
* fix: provide alias name for AggregateExprBuilder in range plan
* test: update range query result
* fix: implement missing ContextProvider methods for DfContextProviderAdapter
* test: update timestamps, cte result
* fix: support empty projection in mito
* test: update comment for cte test
* fix: support projection for numbers
* test: update test cases after projection fix
* fix: fix range select first_value/last_value
* fix: handle CAST and time index conflict
* fix: handle order by correctly in range first_value/last_value
* test: update sqlness result
* test: update view test result
* test: update decimal test
wait for https://github.com/apache/datafusion/pull/14126 to fix this
* feat: remove redundant physical optimization
todo(ruihang): Check if we can remove this.
* test: update sqlness test result
* chore: range select default sort use nulls_first = false
* test: update filter push down test result
* test: comment out decimal test to avoid differing panic messages
* test: update some distributed test result
* test: update test for distributed count and filter push down
* test: update subqueries test
* fix: SessionState may overwrite our UDFs
* chore: fix compiler errors after merging main
* fix: fix elasticsearch and dashboard router panic
* chore: fix common-functions tests
* chore: update sqlness result
* test: fix id keyword and update sqlness result
* test: fix flow_null test
* fix: enlarge thread size in debug mode to avoid overflow
* chore: fix warnings in common-function
* chore: fix warning in flow
* chore: fix warnings in query crate
* chore: remove unused warnings
* chore: fix deprecated warnings for parquet
* chore: fix deprecated warning in servers crate
* style: fix clippy
* test: enlarge the TTL in the mito cache TTL test
* chore: fix typo
* style: fmt toml
* refactor: reimplement PartialOrd for RangeSelect
* chore: remove script crate files introduced by merge
* fix: return error if sql option is not kv
* chore: do not use ..default::default()
* chore: per review
* chore: update error message in BuildAdminFunctionArgsSnafu
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* refactor: typed precision
* update sqlness view case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: flow per review
* chore: add example in comment
* chore: warn if parquet stats of timestamp is not INT64
* style: add a newline before derive to make the comment more clear
* test: update sqlness result
* fix: flow from substrait
* chore: change update_range_context log to debug level
* chore: move axum-extra axum-macros to workspace
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: luofucong <luofc@foxmail.com>
Co-authored-by: discord9 <discord9@163.com>
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* chore: otlp logs api
* feat: add API to write OpenTelemetry logs to GreptimeDB
* chore: fix test data schema error
* chore: change the underlying data structure of the pipeline value map type from HashMap to BTreeMap to keep key order
* chore: fix by pr comment
* chore: resolve conflicts and add some tests
* chore: remove useless error
* chore: change otlp header name
* chore: fmt code
* chore: fix integration test for otlp log write api
* chore: fix by pr comment
* chore: set otlp body with fulltext default
* feat: add geojson function to aggregate paths
* test: add sqlness results
* test: add sqlness
* refactor: corrected to aggregation function
* chore: update comments
* fix: make linter happy again
* refactor: rename to remove `geo` from `geojson` function name
The return type is not geojson at all. It's just compatible with geojson's
coordinates part and superset's deckgl path plugin.
* feat: add json type and vector
* fix: allow to create and insert json data
* feat: udf to query json as string
* refactor: remove JsonbValue and JsonVector
* feat: show json value as strings
* chore: make ci happy
* test: add unit test and sqlness test
* refactor: use binary as grpc value of json
* fix: use non-preserve-order jsonb
* test: revert changed test
* refactor: change udf get_by_path to jq
* chore: make ci happy
* fix: distinguish binary and json in proto
* chore: delete udf for future pr
* refactor: remove Value(Json)
* chore: follow review comments
* test: some tests and checks
* test: fix unit tests
* chore: follow review comments
* chore: corresponding changes to proto
* fix: change grpc and pgsql server behavior along with sqlness/crud tests
* chore: follow review comments
* feat: udf of conversions between json and strings, used for grpc server
* refactor: rename to_string to json_to_string
* test: add more sqlness test for json
* chore: thanks for review :)
* Apply suggestions from code review
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* chore: add test pipeline api
* chore: add test for test pipeline api
* chore: fix taplo check
* chore: change pipeline dryrun api path
* chore: add more info for pipeline dryrun api