Commit Graph

4213 Commits

Author SHA1 Message Date
yinheli
8d36ffb4e1 chore: enable github folder typo check and fix typos (#6128) 2025-05-20 04:20:07 +00:00
Yingwen
955ad644f7 ci: add pull requests permissions to semantic check job (#6130)
* ci: add pull requests permissions

* ci: reduce permissions
2025-05-20 03:33:33 +00:00
localhost
c2e3c3d398 chore: Add more data format support to the pipeline dryrun api. (#6115)
* chore: supporting more data type for pipeline dryrun API

* chore: add docs for parse_dryrun_data

* chore: fix by pr comment

* chore: add user-friendly error message

* chore: change EventPayloadResolver content_type field type from owner to ref

* Apply suggestions from code review

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-05-20 03:29:28 +00:00
Zhenchi
400229c384 feat: introduce index result cache (#6110)
* feat: introduce index result cache

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/mito2/src/sst/index/inverted_index/applier/builder.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* optimize selector_len

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-20 01:45:42 +00:00
Ruihang Xia
cd9b6990bf feat: implement clamp_min and clamp_max (#6116)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-19 21:32:03 +00:00
Ruihang Xia
a56e6e04c2 chore: remove etcd from acknowledgement as not recommended (#6127)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-19 12:42:30 +00:00
Ning Sun
d324439014 ci: fix release job dependencies (#6125) 2025-05-19 11:48:57 +00:00
discord9
038acda7cd fix: flow update use proper update (#6108)
* fix: flow update use proper update

* refactor: per review

* fix: flow cache

* chore: per copilot review

* refactor: rm flow node id

* refactor: per review

* chore: per review

* refactor: per review

* chore: per review
2025-05-19 11:30:10 +00:00
shuiyisong
a0d89c9ed1 feat: Prometheus remote write with pipeline (#5981)
* chore: update nightly version

* chore: sort lint lines

* chore: minor fix

* chore: update nix

* chore: update toolchain to 2024-04-14

* chore: update toolchain to 2024-04-15

* chore: remove unnecessory test

* chore: do not assert oid in sqlness test

* chore: fix margin issue

* chore: fix cr issues

* chore: fix cr issues

* chore: add pipelie handler to prom state

* chore: add prom series processor to merge function

* chore: add run pipeline in decode

* chore: add channel to pipeline ctx

* chore: add pipeline info to remote wirte hander

* chore: minor update

* chore: minor update

* chore: add test

* chore: add comment

* refactor: simplify identity pipeline params

* fix: test

* refactor: remove is_prometheus

---------

Co-authored-by: Ning Sun <sunning@greptime.com>
2025-05-19 08:00:59 +00:00
discord9
3a5534722c feat: export to s3 add more options (#6091)
* feat: export to s3 add more options

* chore: rm output dir override logic

* fix: s3 root export data

* feat: use output_dir and s3 at same time

* refactor: per review

* fix: keep same behavior
0.15.0-nightly-20250519
2025-05-16 20:58:14 +00:00
Ruihang Xia
1010a0c2ad fix: update promql-parser for regex anchor fix (#6117)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-16 20:33:35 +00:00
Lei, HUANG
f46cdbd66b fix: fast path for single region bulk insert (#6104)
* fix/fast-path-for-single-region-bulk-insert:
 ### Commit Summary

 - **Refactor `try_decode` Method**: Updated the `try_decode` method in `FlightDecoder` to accept a reference to `FlightData` instead of consuming it. This change affects multiple files including `database.rs`, `region.rs`, `flight.rs`, `bulk_insert.rs`, `stream.rs`, and `region_request.rs`.
 - **Optimize Bulk Insert Handling**: Added a fast path for handling bulk inserts when only one region is involved in `bulk_insert.rs`.

* fix/fast-path-for-single-region-bulk-insert:
 Improve `FlightDecoder` usage in tests

 - Updated `try_decode` method calls in `flight.rs` to remove unnecessary references for `d1`, `d2`, and `d3`.
 - Ensured consistency in handling `FlightMessage` variants within test cases.

* fix/fast-path-for-single-region-bulk-insert:
 **Enhancement: Skip Empty Regions in Bulk Insert**

 - Updated `bulk_insert.rs` to improve efficiency by skipping regions without data during the bulk insert process. This change ensures that regions with a `true_count` of zero are not processed, optimizing resource usage and performance.

* fix/fast-path-for-single-region-bulk-insert:
 ### Commit Summary

 - **Refactor `RegionMask` Handling**:
   - Introduced `RegionMask` struct to encapsulate boolean array and selected rows count.
   - Updated methods to use `RegionMask` instead of `BooleanArray` for region selection.
   - Affected files: `bulk_insert.rs`, `multi_dim.rs`, `partition.rs`, `splitter.rs`.

 - **Optimize Region Selection**:
   - Removed unnecessary checks for empty regions in `bulk_insert.rs`.
   - Improved logic for handling default regions in `multi_dim.rs`.

 - **Update Tests**:
   - Modified test cases to accommodate `RegionMask` changes.
   - Affected files: `multi_dim.rs`, `splitter.rs`.

* fix/fast-path-for-single-region-bulk-insert:
 **Enhancements to MultiDimPartitionRule Logic and Tests**

 - **`multi_dim.rs`**: Improved the logic for selecting rows in `MultiDimPartitionRule` by optimizing the selection process when only one region is present.
 - **Tests**: Added new test cases to verify the behavior of default regions with unselected rows, existing default regions, and scenarios where all rows are selected. These tests ensure robust handling of partition rules and validate the correct assignment of rows to regions.
2025-05-16 20:26:56 +00:00
Weny Xu
864cc117b3 fix: append noop entry when auto topic creation is disabled (#6092)
* feat: improve topic management and add stale records cleanup

* fix: fix unit tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2025-05-16 11:26:47 +00:00
Yingwen
0ea9ab385d fix: clean files under the atomic write dir on failure (#6112)
* fix: remove files under atomic dir on failure

* fix: clean atomic dir on download failure

* chore: update comment

* fix: clean if failed to write without write cache

* feat: add a TempFileCleaner to clean files on failure

* chore: after merge fix

* chore: more fix

---------

Co-authored-by: discord9 <55937128+discord9@users.noreply.github.com>
Co-authored-by: discord9 <discord9@163.com>
2025-05-16 11:18:11 +00:00
Yingwen
c7e9485534 feat: New scanner SeriesScan to scan by series for querying metrics (#5968)
* chore: basic methods for SeriesScan

* chore: add to scanner enum

* feat: implement scan logic of each partition

* feat: use series scan when distribution is PerSeries

* refactor: remove per series scan from SeqScan

* fix: use series scan in PerSeries distribution

* feat: keep parallelize_scan unchanged

* fix: address compiler errors

* fix: include build merge reader cost to scan cost

* feat: use smallvec

* chore: update comment

* Revert "feat: keep parallelize_scan unchanged"

This reverts commit 96ba00d175.

* assign partition_ranges

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: try send before send

reduce the send timeout to 10ms

* chore: add comments

* fix: add metrics to partition metrics list

* fix: correct scan cost metrics

* chore: reset instant

* fix: scanner metrics init

* chore: display more info in explain

* feat: metrics for send series timeout

* style: fix clippy

* refactor: use ChainedRecordBatchStream to simplify codes

* chore: fix typos

* feat: separate distributor metrics

* feat: remove parallelize hack

* chore: fix warning

* test: add test for series scan

* test: update sqlness test

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-16 08:53:24 +00:00
Ruihang Xia
57b53211d9 feat: don't hide atomic write dir (#6109)
* feat: don't hidden atomic write dir

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* compatible code

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/mito2/src/access_layer.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-05-16 06:21:13 +00:00
zyy17
01076069a3 chore: modify default slow_query.threshold from 5s to 30s (#6107)
chore: modify slow_query.threshold from 5s to 30s
2025-05-15 20:16:13 +00:00
Ning Sun
73b4b710cd ci: update nix build linker (#6103)
* ci: update nix build linker

* ci: use mold for nix ci
2025-05-15 19:02:58 +00:00
zyy17
14b655ea57 refactor: add SlowQueryRecorder to record slow query in system table and refactor slow query options (#6008)
* refactor: add common-slow-query crate

* refactor: refine the naming

* chore: fix clippy

* chore: fix typo

* chore: sperate SlowQueryOptions From Logging

* chore: fix clippy

* chore: fix ci

* chore: refine the code

* chore: update config example

* refactor: use drop() to end the slow query timer

* refactor: move common-slow-query to frontend crate

* chore: polish some code

* refactor: code review

* refactor: add promql_range/promql_step/promql_start/promql_end fields in slow_queries

* refactor: add build_slow_query_logger()

* refactor: turn on slow query on frontend by default
2025-05-15 04:18:48 +00:00
Ruihang Xia
c780746171 perf: avoid some atomic operation on array slice (#6101)
* perf: avoid some atomic operation on array slice

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* finilise

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-15 02:29:07 +00:00
Weny Xu
1f62c3b545 fix: table metadata collection (#6102)
fix: fix collect metadata
2025-05-14 12:19:54 +00:00
Lei, HUANG
5a9023d6b3 feat(bulk): write to multiple time partitions (#6086)
* add benchmark for splitting according to time partition

* feat/write-to-multiple-time-partitions:
 **Enhancements to Bulk Processing and Time Partitioning**

 - **`part.rs`**: Added `Snafu` to imports and introduced `timestamp_index` in `BulkPart` struct. Implemented `timestamps` method for accessing timestamp columns.
 - **`simple_bulk_memtable.rs`**: Updated tests to include `timestamp_index` initialization.
 - **`time_partition.rs`**: Enhanced `TimePartition` to support partial writes with `write_record_batch_partial`. Implemented `split_record_batch` for filtering records by timestamp range. Added comprehensive tests for `split_record_batch`.
 - **`handle_bulk_insert.rs`**: Modified to retrieve timestamp index and column together, updating `BulkPart` initialization with `timestamp_index`.

* feat/write-to-multiple-time-partitions:
 ### Enhance Time Partitioning Logic

 - **`time_partition.rs`**:
   - Introduced `HashSet` for efficient partition management.
   - Refactored `write_bulk` to handle multiple partitions and added `find_partitions_by_time_range` for identifying existing and missing partitions.
   - Updated `get_or_create_time_partition` to manage partition creation.
   - Added comprehensive tests for partition finding logic, covering various scenarios including overlapping and non-overlapping time ranges.

 - **Tests**:
   - Added `test_find_partitions_by_time_range` to validate new partitioning logic.
   - Updated `test_split_record_batch` to ensure correct record batch splitting behavior.

* feat/write-to-multiple-time-partitions:
 ### Enhance Time Partitioning and Testing in `time_partition.rs`

 - **Time Partitioning Enhancements**:
   - Updated `split_record_batch` to handle multiple timestamp units (`Second`, `Millisecond`, `Microsecond`, `Nanosecond`) by matching on `DataType`.
   - Improved filtering logic for timestamp arrays to support various time units.

 - **Testing Enhancements**:
   - Added `test_write_bulk` to verify writing across multiple partitions and scenarios in `time_partition.rs`.
   - Updated `test_split_record_batch` to use `TimestampMillisecondArray` for testing timestamp partitioning.

 - **Imports and Dependencies**:
   - Added necessary imports for new timestamp array types and testing utilities.

* feat/write-to-multiple-time-partitions:
 ### Refactor and Enhance Time Partition Filtering

 - **Refactor Filtering Logic**: Consolidated the filtering logic for timestamp arrays using macros in `time_partition.rs` and `bench_filter_time_partition.rs`. This reduces code duplication and improves maintainability.
 - **Enhance `BulkPart` Struct**: Made fields in `BulkPart` public to facilitate easier access and manipulation in `memtable.rs` and `part.rs`.
 - **Rename Function**: Renamed `split_record_batch` to `filter_record_batch` for clarity in `time_partition.rs` and `bench_filter_time_partition.rs`.
 - **Add Feature Flag**: Introduced `int_roundings` feature in `lib.rs` to support new functionality.

* refactor tests

* feat/write-to-multiple-time-partitions:
 Improve timestamp handling in `time_partition.rs`

 - Enhanced safety comments for timestamp conversion to ensure clarity.
 - Modified logic to prevent overflow by using `div_euclid` for `bulk_start_sec` and `bulk_end_sec` calculations.
 - Adjusted the `filter_map` logic to correctly compute timestamps using `start_sec` and `part_duration_sec`.

* feat/write-to-multiple-time-partitions:
 **Refactor timestamp handling and add utility function**

 - **Refactor `time_partition.rs`:** Simplified timestamp handling by replacing direct type access with a utility function to retrieve the timestamp unit. Improved error handling for timestamp conversion.
 - **Enhance `metadata.rs`:** Added `time_index_type` function to `RegionMetadata` to retrieve the timestamp type of the time index column, ensuring safer and more readable code.

* feat/write-to-multiple-time-partitions:
 Refactor time partition variable names in `time_partition.rs`

 - Renamed variables for clarity: `bulk_start_sec` to `start_bucket` and `bulk_end_sec` to `end_bucket`.
 - Updated related logic to use new variable names for improved readability and maintainability.

* feat/write-to-multiple-time-partitions:
 **Refactor variable names in `time_partition.rs`**

 - Updated variable names from `matching` and `missing` to `matchings` and `missings` for clarity and consistency.
 - Modified function calls and loop iterations to align with the new variable names.
 - Affected file: `src/mito2/src/memtable/time_partition.rs`

* feat/write-to-multiple-time-partitions:
 ### Refactor variable names in `time_partition.rs`

 - Updated variable names for clarity in `time_partition.rs`:
   - Renamed `matchings` to `matching_parts`
   - Renamed `missings` to `missing_parts`
 - Adjusted logic to use new variable names in methods `find_partitions_by_time_range` and `write_record_batch`.

* feat/write-to-multiple-time-partitions:
 ### Enhance Time Partition Handling

 - **`time_partition.rs`**:
   - Added `ArrayRef` to handle timestamp arrays, improving the partitioning logic by allowing more efficient timestamp range checks.
   - Enhanced `find_partitions_by_time_range` to support sparse data and handle different timestamp units (`Second`, `Millisecond`, `Microsecond`, `Nanosecond`).
   - Updated test cases to cover new scenarios, including sparse data and edge cases, ensuring robustness of partition handling.

---------

Co-authored-by: Lei <lei@Leis-MacBook-Pro.local>
2025-05-14 05:09:59 +00:00
Ruihang Xia
209f8371f2 fix: promql regex escape behavior (#6094)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-13 18:19:17 +00:00
Weny Xu
30f1cbf0bf chore: bump rskafka version (#6090)
* chore: upgrade rskafka

* chore(test): bump kafka version
2025-05-13 11:57:31 +00:00
Ruihang Xia
bbb6f8685e feat: implement commutativity rule for prom-related plans (#5875)
* feat: implement commutativity rule for prom-related plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix range manipulate deserializer

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* blocklist in commutativity rule

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change dictionary type

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle partition and ordering

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add rate, increase and delta

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* regexp_replace uses empty string instead of null value

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-13 09:06:25 +00:00
Weny Xu
29540b55ee feat(meta): add pusher deregister signal to mailbox receiver (#6072) 2025-05-13 08:04:43 +00:00
Yingwen
ca1641d1c4 feat: implement PlainBatch struct (#6079)
* feat: implement PlainBatch struct

* chore: typo

* style: fix clippy

* feat: assert num columns
2025-05-13 05:56:12 +00:00
omahs
b275793b36 fix: typos (#6084) 2025-05-12 12:12:47 +00:00
discord9
265b144ca2 fix: flownode chose fe randomly&not starve lock (#6077)
* fix: choose frontend randomly

* docs: update comment

* chore: more logs

* fix: ignore inserts until recovering flow is done

* chore: resolve TODO

* fix: rm unused code&set done in correct location

* refactor: speed up create flow
2025-05-12 12:11:28 +00:00
Weny Xu
2ce5631d3c chore: fix clippy error by feature-gating Query import (#6085) 2025-05-12 09:27:29 +00:00
Zhenchi
36d9346ffc refactor: introduce row group selection (#6075)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
v0.15.0-nightly-20250512
2025-05-12 07:15:17 +00:00
liyang
36ff36e094 ci: update homebrew greptime version when release (#6082)
Co-authored-by: update-helm-charts-version <update-helm-charts-version@greptime.com>
2025-05-12 07:13:09 +00:00
discord9
9cf5f0e940 chore: more cfg stuff on windows (#6083)
chore: more cfg stuff
2025-05-12 07:12:15 +00:00
discord9
2a0e9c930d chore: mv anyhow depend out of cfg (#6081) 2025-05-12 04:54:54 +00:00
liyang
787a50631b ci: automatically update helm-charts when release (#6071)
* ci: automatically update helm-charts when release

* Update .github/workflows/release.yml

Co-authored-by: Ning Sun <classicning@gmail.com>

* Update update-helm-charts-version.sh

---------

Co-authored-by: Ning Sun <classicning@gmail.com>
2025-05-12 02:10:22 +00:00
zyy17
50df275097 fix!: disable append mode in trace services table (#6066)
fix: disable append mode in trace services table and make 'service_name' as primary key
2025-05-09 19:06:51 +00:00
Weny Xu
8dca448baf feat: add datanode workloads support (#6055)
* feat: add datanode workload type support

* refactor: enhance datanode lease filtering with mode conditions

* chore: update config.md

* fix: fix clippy

* chore: apply suggestions from CR

* feat: add feature gate

* fix: fmt and clippy

* refactor: minor refactor

* chore: apply suggestions from CR

* chore: apply suggestions from CR

* refactor: minior refactor

* test: fix unit test
2025-05-09 10:16:21 +00:00
Ning Sun
828f69a562 ci: only trigger downstream when release success (#6074) 2025-05-09 09:48:07 +00:00
discord9
04cae4b21e feat: mem prof can gen flamegraph directly (#6073)
* feat: mem-prof

* fix: use enum&update how to
2025-05-09 09:43:24 +00:00
LFC
79f584316e feat: set read-preference for grpc client (#6069)
* feat: set read-preference for grpc client

* todo

* address PR comments

* fix ci
2025-05-09 08:51:51 +00:00
discord9
6ab0f0cc5c fix: alter table modify type should also modify default value (#6049)
* fix: select after alter

* fix: insert a proper row&catch a bug

* fix: alter table modify type modify default value type too

* refactor: per review

* chore: per review

* refactor: per review

* refactor: per review
2025-05-09 03:40:59 +00:00
Lei, HUANG
8685ceb232 feat: impl bulk memtable and bridge bulk inserts (#6054)
* feat/bridge-bulk-insert:
 ## Implement Bulk Insert and Update Dependencies

 - **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
 - **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
 - **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
 - **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
 - **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.

* feat/bridge-bulk-insert:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
 - **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
 - **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
 - **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.

* fix: clippy

* fix schema serialization

* feat/bridge-bulk-insert:
 Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`

 - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
 - Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
 - Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.

* fix: test

* refactor: rename

* allow empty app_metadata in FlightData

* feat/bridge-bulk-insert:
 - **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
 - **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
 - **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.

* fix: standalone test

* feat/bridge-bulk-insert:
 ### Enhance Bulk Insert Handling and Metadata Management

 - **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
 - **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
 - **`handle_bulk_insert.rs`**:
   - Added `handle_record_batch` function to streamline processing of bulk insert payloads.
   - Improved error handling and task management for bulk insert operations.
   - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.

* feat/bridge-bulk-insert:
 - **Refactor `handle_bulk_insert.rs`:**
   - Replaced `handle_record_batch` with `handle_payload` for handling payloads.
   - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.

 - **Optimize `multi_dim.rs`:**
   - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.

* feat/bridge-bulk-insert:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
 - **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
 - **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
 - **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
 - **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.

* fix: clippy warnings

* feat/bridge-bulk-insert:
 **Add Metrics and Improve Error Handling**

 - **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to
 monitor performance.
 - **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
 - **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
 - **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.

* chore: rebase main

* implement simple bulk memtable

* impl write_bulk

* implement simple bulk memtable

* feat/simple-bulk-memtable:
 ### Enhance Time-Series Memtable and Bulk Insert Handling

 - **Visibility Modifications**: Made `mutable_array` in `PrimitiveVectorBuilder` and `StringVectorBuilder` public in `primitive.rs` and `string.rs`.
 - **New Module**: Added `builder.rs` to `memtable` for time-series builders, including `FieldBuilder` and `StringBuilder` implementations.
 - **Bulk Insert Enhancements**:
   - Added `sequence` field to `BulkPart` in `part.rs` and updated its handling in `simple_bulk_memtable.rs` and `region_write_ctx.rs`.
   - Introduced metrics for bulk insert operations in `metrics.rs` and `bulk_insert.rs`.
 - **Performance Metrics**: Added timing metrics for write operations in `metrics.rs`, `region_write_ctx.rs`, and `handle_write.rs`.
 - **Region Request Handling**: Updated `make_region_bulk_inserts` in `region_request.rs` to include performance metrics.

* feat/simple-bulk-memtable:
 **Improve Memtable Stats Calculation and Add Metrics Timer**

 - **`simple_bulk_memtable.rs`**: Refactored `stats` method to use `num_rows` for checking if rows have been written, improving accuracy in memory table statistics.
 - **`handle_bulk_insert.rs`**: Introduced a metrics timer to measure the elapsed time for processing bulk requests, enhancing performance monitoring.

* feat/simple-bulk-memtable:
 ### Commit Message

 **Enhancements and Bug Fixes**

 - **Dependency Update**: Updated `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
 - **Feature Addition**: Implemented `to_mutation` method in `BulkPart` to convert `BulkPart` to `Mutation` for fallback `write_bulk` implementation in `src/mito2/src/memtable/bulk/part.rs`.
 - **Functionality Improvement**: Modified `write_bulk` method in `TimeSeriesMemtable` to support default implementation fallback to row iteration in `src/mito2/src/memtable/time_series.rs`.
 - **Performance Optimization**: Enhanced `bulk_insert` handling by optimizing region request processing and data partitioning in `src/operator/src/bulk_insert.rs`.
 - **Error Handling**: Added `ComputeArrow` error variant for better error management in `src/operator/src/error.rs`.
 - **Code Refactoring**: Simplified region bulk insert request processing in `src/store-api/src/region_request.rs`.

* fix: some clippy warnings

* feat/simple-bulk-memtable:
 ### Commit Summary

 - **Refactor Return Types to `Result`:**
   Updated the return type of the `ranges` method in `memtable.rs`, `bulk.rs`, `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `memtable_util.rs` to return `Result<MemtableRanges>` for better error handling.

 - **Enhance Metrics Tracking:**
   Improved metrics tracking by adding `num_rows` and `max_sequence` to `WriteMetrics` in `stats.rs`. Updated related methods in `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `scan_region.rs` to utilize these metrics.

 - **Remove Unused Imports:**
   Cleaned up unused imports in `time_series.rs` to streamline the codebase.

* merge main

* remove useless error vairant

* use newer version of proto

* feat/simple-bulk-memtable:
                                                                                                                                 Commit Message

                                                                                                                                     Summary

Enhance FieldBuilder and StringBuilder functionality, add tests, and improve error handling.

                                                                                                                                   Key Changes

 • builder.rs:
    • Added documentation for FieldBuilder methods.
    • Renamed append_string_vector to append_vector in StringBuilder.
 • simple_bulk_memtable.rs:
    • Added new test cases for write_one, write_bulk, is_empty, stats, fork, and sequence_filter.
 • time_series.rs:
    • Improved error handling in ValueBuilder for type mismatches.
 • memtable_util.rs:
    • Removed unused imports and streamlined code.

These changes enhance the robustness and test coverage of the memtable components.

* feat/simple-bulk-memtable:
 Improve Time Partition Matching Logic in `time_partition.rs`

 - Enhanced the `write_bulk` method in `time_partition.rs` to improve the logic for matching partitions based on time ranges.
 - Introduced a new mechanism to filter and select partitions that overlap with the record batch's timestamp range before writing.

* feat/simple-bulk-memtable:
 Improve Metrics Handling in `bulk_insert.rs`

 - Removed the `group_request_timer` and its associated metric observation to streamline the timing logic.
 - Moved the `BULK_REQUEST_ROWS` metric observation to occur after filtering, ensuring accurate row count metrics.

* feat/simple-bulk-memtable:
 **Enhance Stalled Requests Calculation and Update Metrics**

 - **`worker.rs`**: Updated the `stalled_count` method to include both `reqs` and `bulk_reqs` in the calculation of stalled requests.
 - **`bulk_insert.rs`**: Removed duplicate observation of `BULK_REQUEST_MESSAGE_SIZE` metric.
 - **`metrics.rs`**: Changed the bucket strategy for `BULK_REQUEST_ROWS` from linear to exponential, improving the granularity of metrics collection.

* feat/simple-bulk-memtable:
 **Refactor `StringVector` Usage and Update Method Signatures**

 - **`src/datatypes/src/vectors/string.rs`**: Changed `StringVector`'s `array` field from public to private.
 - **`src/mito2/src/memtable/builder.rs`**: Refactored `append_vector` method to `append_array`, updating its usage to work directly with `StringArray` instead of `StringVector`.
 - **`src/mito2/src/memtable/time_series.rs`**: Updated `ValueBuilder` to handle `StringArray` directly, replacing `StringVector` usage with `StringArray` in the `FieldBuilder::String` case.

* feat/simple-bulk-memtable:
 - **Refactor `PrimitiveVectorBuilder`**: Made `mutable_array` private in `src/datatypes/src/vectors/primitive.rs`.
 - **Optimize `ValueBuilder`**: Replaced `UInt64VectorBuilder` and `UInt8VectorBuilder` with `Vec<u64>` and `Vec<u8>` for `sequence` and `op_type` in `src/mito2/src/memtable/time_series.rs`.
 - **Improve Metrics Initialization**: Updated histogram bucket initialization to use `exponential_buckets` in `src/mito2/src/metrics.rs`.

* feat/simple-bulk-memtable:
 Improve error handling in `simple_bulk_memtable.rs` and `time_series.rs`

 - Enhanced error handling by using `OptionExt` for more concise error context management in `simple_bulk_memtable.rs` and `time_series.rs`.
 - Replaced `ok_or` with `with_context` to streamline error context creation in both files.

* feat/simple-bulk-memtable:
 **Enhance Time Partition Handling in `time_partition.rs`**

 - Introduced `create_time_partition` function to streamline the creation of new time partitions, ensuring thread safety by acquiring a lock.
 - Modified logic to handle cases where no matching time partitions exist, creating new partitions as needed.
 - Updated `write_record_batch` and `write_one` methods to utilize the new partition creation logic, improving partition management and data writing efficiency.

* replace proto

* feat/simple-bulk-memtable:
 Update `metrics.rs` to adjust the range of exponential buckets for bulk insert message rows from `10 ~ 1_000_000` to `10 ~ 100_000`.
2025-05-09 02:56:09 +00:00
shuiyisong
b442414422 chore: support rename syntax in field (#6065)
* chore: support rename syntax in field

* test: rename in transform
2025-05-09 00:12:23 +00:00
liyang
51f2cb1053 ci: run only in the GreptimeTeam/greptimedb repository (#6064)
ci: run only in the GreptimeTeam/greptimedb repository
2025-05-08 08:39:13 +00:00
dennis zhuang
fbf50c594e fix: csv format escaping (#6061)
* fix: csv format escaping

* chore: change status code

* fix: crate version
2025-05-08 05:52:20 +00:00
Ning Sun
5739302845 feat: update pgwire to 0.29 (#6058)
* feat: update pgwire to 0.29

* chore: only build default binary in nix ci

* Update src/servers/Cargo.toml

Co-authored-by: dennis zhuang <killme2008@gmail.com>

---------

Co-authored-by: dennis zhuang <killme2008@gmail.com>
2025-05-08 04:21:13 +00:00
Yingwen
148d96fc38 fix: ensures logical and physical region have the same timestamp unit (#6041)
* fix: check time unit of logical region

* test: enlarge ttl for alter test to avoid data expired during test

* chore: fix unused
2025-05-08 03:40:21 +00:00
LFC
e787007eb5 feat: scan with sst minimal sequence (#6051)
* feat: scan with sst minimal sequence

* Update src/store-api/src/storage/requests.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* update proto

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-08 01:34:51 +00:00
Ruihang Xia
60acf28f3c feat: try cast timestamp types from number string (#6060)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-05-07 11:29:35 +00:00
Yingwen
06126147d2 fix: reset tags when creating an empty metric in prom call (#6056)
* Revert "chore: remove debug logs"

This reverts commit f73f3a7373c83db974d8ed80cb47f5f87317b490.

* chore: more logs

* fix: reset tags and fields

* test: add binary time fn test

* chore: remove logs

* test: sort result
2025-05-07 08:08:51 +00:00