Compare commits

...

127 Commits

Author SHA1 Message Date
evenyag
0d5b423eb7 feat: opendal metrics
Signed-off-by: evenyag <realevenyag@gmail.com>
2025-09-29 18:23:41 +08:00
evenyag
26bdb6a413 fix: disable on compaction
Signed-off-by: evenyag <realevenyag@gmail.com>
2025-09-26 17:40:00 +08:00
evenyag
2fe21469f8 chore: also print infos
Signed-off-by: evenyag <realevenyag@gmail.com>
2025-09-26 17:29:37 +08:00
evenyag
3aa67c7af4 feat: add series num to metrics
Signed-off-by: evenyag <realevenyag@gmail.com>
2025-09-26 17:08:27 +08:00
evenyag
e0d3e6ae97 chore: disable fulltext index
Signed-off-by: evenyag <realevenyag@gmail.com>
2025-09-26 17:01:53 +08:00
evenyag
2ce476dc42 feat: add prof-file flag to get flamegraph
Signed-off-by: evenyag <realevenyag@gmail.com>
2025-09-26 15:36:15 +08:00
Lei, HUANG
69a816fa0c feat/objbench:
### Update Metrics and Command Output

 - **`objbench.rs`**:
   - Renamed "Write time" to "Total time" in output.
   - Enhanced metrics output to include a sum of all metrics.

 - **`access_layer.rs`**:
   - Split `index` duration into `index_update` and `index_finish`.
   - Added a `sum` method to `Metrics` to calculate the total duration.

 - **`writer.rs`**:
   - Updated metrics to use `index_update` and `index_finish` for more granular tracking of index operations.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-09-24 20:46:27 +08:00
Lei, HUANG
dcf5a62014 feat/objbench:
### Add Metrics for Indexing and Conversion in `access_layer.rs` and `writer.rs`

 - **Enhancements in `access_layer.rs`:**
   - Added new metrics `convert` and `index` to the `Metrics` struct to track conversion and indexing durations.

 - **Updates in `writer.rs`:**
   - Implemented tracking of indexing duration by measuring the time taken for `update` in the indexer.
   - Added measurement of conversion duration for `convert_batch` to enhance performance monitoring.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-09-24 18:14:00 +08:00
Lei, HUANG
f3aa967aae fix storage config
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-09-24 16:01:19 +08:00
Lei, HUANG
93e8510b2a pretty print 2025-09-23 16:01:50 +08:00
Lei, HUANG
53c58494fd feat/objbench:
### Add verbose logging and file deletion in `objbench.rs`

 - **Verbose Logging**: Introduced a `--verbose` flag in `Command` to enable detailed logging using `common_telemetry::init_default_ut_logging()`.
 - **File Deletion**: Implemented automatic deletion of the destination file after processing in `Command::run()`.

 ### Update tests in `parquet.rs`

 - Removed unused parameters in test functions to streamline the code.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-09-23 15:42:21 +08:00
Lei, HUANG
741c5e2fb1 feat/objbench:
### Update `objbench.rs` and `parquet.rs` for Improved File Handling

 - **`objbench.rs`:**
   - Simplified target access layer initialization by directly using `self.target`.
   - Added assertion to ensure single file info and constructed destination file path for reporting.
   - Enhanced logging to include destination file path in write completion message.

 - **`parquet.rs`:**
   - Updated test cases to include `None` for additional parameter in function calls.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-09-23 15:26:09 +08:00
Lei, HUANG
d68215dc88 feat/objbench:
### Add `objbench` Binary and Enhance Metrics Collection

 - **New Binary**: Introduced a new binary `objbench` in `src/cmd/src/bin/objbench.rs` for benchmarking object store operations.
 - **Metrics Collection**: Enhanced metrics collection by adding a `Metrics` struct in `access_layer.rs` and integrating it into SST writing processes across multiple files, including `write_cache.rs`, `compactor.rs`, `flush

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-09-23 14:49:33 +08:00
Yingwen
bcd63fdb87 chore: cherry pick #6821 and bump version to v0.12.2 (#6832)
* fix: correct heartbeat stream handling logic  (#6821)

* fix: correct heartbeat stream handling logic

Signed-off-by: WenyXu <wenymedia@gmail.com>

* Update src/meta-srv/src/service/heartbeat.rs

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: bump version to v0.12.2

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix typos

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-08-27 08:56:09 +00:00
Yingwen
f4c527cddf chore: cherry pick #5625 to v0.12 branch (#6831)
* ci: update 0.12 release ci

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: out of bound during bloom search (#5625)

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2025-08-27 16:12:25 +08:00
Yingwen
8da5949fc5 ci: update 0.12 ci to latest (#6376)
* ci: update 0.12 ci to latest

Except:
- Remove mysql_backend
- Remove workflows/grafana.json

Signed-off-by: evenyag <realevenyag@gmail.com>

* ci: update typos

Signed-off-by: evenyag <realevenyag@gmail.com>

* ci: ignore more words

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-06-21 18:20:43 +08:00
Ruihang Xia
db6a63ef6c chore: bump version to 0.12.1
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-06-21 16:52:28 +08:00
Yingwen
f166b93b02 feat: expose virtual_host_style config for s3 storage (#5696)
* feat: expose enable_virtual_host_style for s3 storage

* docs: update examples

* test: fix config test
2025-06-21 16:34:15 +08:00
Weny Xu
904d560175 feat(promql-planner): introduce vector matching binary operation (#5578)
* feat(promql-planner): support vector matching for binary operation

* test: add sqlness tests
2025-02-27 07:39:19 +00:00
Lei, HUANG
765d1277ee fix(metasrv): clean expired nodes in memory (#5592)
* fix/frontend-node-state: Refactor NodeInfoKey and Context Handling in Meta Server

 • Removed unused cluster_id from NodeInfoKey struct.
 • Updated HeartbeatHandlerGroup to return Context alongside HeartbeatResponse.
 • Added current_node_info to Context for tracking node information.
 • Implemented on_node_disconnect in Context to handle node disconnection events, specifically for Frontend roles.
 • Adjusted register_pusher function to return PusherId directly.
 • Updated tests to accommodate changes in Context structure.

* fix/frontend-node-state: Refactor Heartbeat Handler Context Management

Refactored the HeartbeatHandlerGroup::handle method to use a mutable reference for Context instead of passing it by value. This change simplifies the
context management by eliminating the need to return the context with the response. Updated the Metasrv implementation to align with this new context
handling approach, improving code clarity and reducing unnecessary context cloning.

* revert: clean cluster info on disconnect

* fix/frontend-node-state: Add Frontend Expiry Listener and Update NodeInfoKey Conversion

 • Introduced FrontendExpiryListener to manage the expiration of frontend nodes, including its integration with leadership change notifications.
 • Modified NodeInfoKey conversion to use references, enhancing efficiency and consistency across the codebase.
 • Updated collect_cluster_info_handler and metasrv to incorporate the new listener and conversion changes.
 • Added frontend_expiry module to the project structure for better organization and maintainability.

* chore: add config for node expiry

* add some doc

* fix: clippy

* fix/frontend-node-state:
 ### Refactor Node Expiry Handling
 - **Configuration Update**: Removed `node_expiry_tick` from `metasrv.example.toml` and `MetasrvOptions` in `metasrv.rs`.
 - **Module Renaming**: Renamed `frontend_expiry.rs` to `node_expiry_listener.rs` and updated references in `lib.rs`.
 - **Code Refactoring**: Replaced `FrontendExpiryListener` with `NodeExpiryListener` in `node_expiry_listener.rs` and `metasrv.rs`, removing the tick     interval and adjusting logic to use a fixed 60-second interval for node expiry checks.

* fix/frontend-node-state:
 Improve logging in `node_expiry_listener.rs`

 - Enhanced warning message to include peer information when an unrecognized node info key is encountered in `node_expiry_listener.rs`.

* docs: update config docs

* fix/frontend-node-state:
 **Refactor Context Handling in Heartbeat Services**

 - Updated `HeartbeatHandlerGroup` in `handler.rs` to pass `Context` by value instead of by mutable reference, allowing for more flexible context
 management.
 - Modified `Metasrv` implementation in `heartbeat.rs` to clone `Context` when passing to `handle` method, ensuring thread safety and consistency in
 asynchronous operations.
2025-02-27 06:16:36 +00:00
discord9
ccf42a9d97 fix: flow heartbeat retry (#5600)
* fix: flow heartbeat retry

* fix?: not sure if fixed

* chore: per review
2025-02-27 03:58:21 +00:00
Weny Xu
71e2fb895f feat: introduce prom_round fn (#5604)
* feat: introduce `prom_round` fn

* test: add sqlness tests
2025-02-27 03:30:15 +00:00
Ruihang Xia
c9671fd669 feat(promql): implement subquery (#5606)
* feat: initial implement for promql subquery

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl and test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-27 03:28:04 +00:00
Ruihang Xia
b5efc75aab feat(promql): ignore invalid input in histogram plan (#5607)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-27 03:18:20 +00:00
Weny Xu
c1d18d9980 fix(prom): preserve the order of series in PromQueryResult (#5601)
fix(prom): keep the order of tags
2025-02-26 13:40:09 +00:00
Lei, HUANG
5d9faaaf39 fix(metasrv): reject ddl when metasrv is follower (#5599)
* fix/reject-ddl-in-follower-metasrv:
 Add leader check and logging for gRPC requests in `procedure.rs`

 - Implemented leader verification for `query_procedure_state`, `ddl`, and `procedure_details` gRPC requests in `procedure.rs`.
 - Added logging with `warn` for requests reaching a non-leader node.
 - Introduced `ResponseHeader` and `Error::is_not_leader()` to handle non-leader responses.

* fix/reject-ddl-in-follower-metasrv:
 Improve leader address handling in `heartbeat.rs`

 - Refactor leader address retrieval by renaming `leader` to `leader_addr` for clarity.
 - Update `make_client` function to use a reference to `leader_addr`.
 - Enhance logging to include the leader address in the success message for creating a heartbeat stream.

* fmt

* fix/reject-ddl-in-follower-metasrv:
 **Enhance Leader Check in `procedure.rs`**

 - Updated the leader verification logic in `procedure.rs` to return a failed `MigrateRegionResponse` when the server is not the leader.
 - Added logging to warn when a migrate request is received by a non-leader server.
2025-02-26 08:10:40 +00:00
ZonaHe
538875abee feat: update dashboard to v0.7.11 (#5597)
Co-authored-by: sunchanglong <sunchanglong@users.noreply.github.com>
2025-02-26 07:57:59 +00:00
jeremyhi
5ed09c4584 fix: all heartbeat channel need to check leader (#5593) 2025-02-25 10:45:30 +00:00
Yingwen
3f6a41eac5 fix: update show create table output for fulltext index (#5591)
* fix: update full index syntax in show create table

* test: update fulltext sqlness result
2025-02-25 09:36:27 +00:00
yihong
ff0dcf12c5 perf: close issue 4974 by do not delete columns when drop logical region about 100 times faster (#5561)
* perf: do not delete columns when drop logical region in drop database

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: make ci happy

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address review comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address some comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: drop stupid comments by copilot

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* chore: minor refactor

* chore: minor refactor

* chore: update grpetime-proto

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: WenyXu <wenymedia@gmail.com>
2025-02-25 09:00:49 +00:00
Yingwen
5b1fca825a fix: remove cached and uploaded files on failure (#5590) 2025-02-25 08:51:37 +00:00
Ruihang Xia
7bd108e2be feat: impl hll_state, hll_merge and hll_calc for incremental distinct counting (#5579)
* basic impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update with more test and logs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl merge fn

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename function names

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-24 19:07:37 +00:00
Weny Xu
286f225e50 fix: correct inverted_indexed_column_ids behavior (#5586)
* fix: correct `inverted_indexed_column_ids`

* fix: fix unit tests
2025-02-23 07:17:38 +00:00
Ruihang Xia
4f988b5ba9 feat: remove default inverted index for physical table (#5583)
* feat: remove default inverted index for physical table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-22 06:48:05 +00:00
Ruihang Xia
500d0852eb fix: avoid run labeler job concurrently (#5584)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-22 05:18:26 +00:00
Zhenchi
8d05fb3503 feat: unify puffin name passed to stager (#5564)
* feat: purge a given puffin file in staging area

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish log

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* ttl set to 2d

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: expose staging_ttl to index config

* feat: unify puffin name passed to stager

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fallback to remote index

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-02-21 09:27:03 +00:00
Ruihang Xia
d7b6718be0 feat: run sqlness in parallel (#5499)
* define server mode

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* bump sqlness

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* all good

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor: Move config generation logic from Env to ServerMode

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* finalize

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change license header

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename variables

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* override parallelism

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename more variables

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-21 07:05:19 +00:00
Ruihang Xia
6f0783e17e fix: broken link in AUTHOR.md (#5581) 2025-02-21 07:01:41 +00:00
Ruihang Xia
d69e93b91a feat: support to generate json output for explain analyze in http api (#5567)
* impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* integration test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/servers/src/http/hints.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor: with FORMAT option for explain format

* lift some well-known metrics

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
2025-02-21 05:13:09 +00:00
Ruihang Xia
76083892cd feat: support UNNEST (#5580)
* feat: support UNNEST

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy and sqlness

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-21 04:53:56 +00:00
Ruihang Xia
7981c06989 feat: implement uddsketch function to calculate percentile (#5574)
* basic impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update with more test and logs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-20 18:59:20 +00:00
beryl678
97bb1519f8 docs: revise the author list (#5575) 2025-02-20 18:04:23 +00:00
Weny Xu
1d8c9c1843 feat: enable gzip for prometheus query handlers and ignore NaN values in prometheus response (#5576)
* feat: enable gzip for prometheus query handlers and ignore nan values in prometheus response

* Apply suggestions from code review

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-02-20 11:34:32 +00:00
jeremyhi
71007e200c feat: remap flow route address (#5565)
* feat: remap fow peers

* refactor: not stream

* feat: remap flownode addr on FlowRoute and TableFlow

* fix: unit test

* Update src/meta-srv/src/handler/remap_flow_peer_handler.rs

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>

* chore: by comment

* Update src/meta-srv/src/handler/remap_flow_peer_handler.rs

* Update src/common/meta/src/key/flow/table_flow.rs

* Update src/common/meta/src/key/flow/flow_route.rs

* chore: remove duplicate field

---------

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
2025-02-20 08:21:32 +00:00
jeremyhi
a0ff9e751e feat: flow type on creating procedure (#5572)
feat: flow type on creating
2025-02-20 08:12:02 +00:00
LFC
f6f617d667 feat: submit node's cpu cores number to metasrv in heartbeat (#5571)
* feat: submit node's cpu cores number to metasrv in heartbeat

* update greptime-proto dep
2025-02-20 03:55:18 +00:00
Ruihang Xia
e8788088a8 feat(log-query): implement the first part of log query expr (#5548)
* feat(log-query): implement the first part of log query expr

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-19 18:25:41 +00:00
shuiyisong
53b25c04a2 chore: support Loki's structured metadata for ingestion (#5541)
* chore: support loki's structured metadata

* test: update test

* chore: revert some code change

* chore: address CR comment
2025-02-19 16:44:26 +00:00
dennis zhuang
62a8b8b9dc feat(promql): supports sort, sort_desc etc. functions (#5542)
* feat(promql): supports sort, sort_desc etc. functions

* chore: fix toml format and tests

* chore: update deps

Co-authored-by: Weny Xu <wenymedia@gmail.com>

* chore: remove fixme

* fix: cargo lock

* chore: style

---------

Co-authored-by: Weny Xu <wenymedia@gmail.com>
2025-02-19 13:13:49 +00:00
Weny Xu
c8bdeaaa6a fix(promql-planner): update ctx field columns of OR operator (#5556)
* fix(promql-planner): update ctx field columns of OR operator

* test: add sqlness test
2025-02-19 11:18:58 +00:00
Ning Sun
81da18e5df refactor: use global type alias for pipeline input (#5568)
* refactor: use global type alias for pipeline input

* fmt: reformat
2025-02-19 10:41:33 +00:00
Weny Xu
7c65fddb30 fix(promql-planner): correct AND/UNLESS operator behavior (#5557)
* fix(promql-planner): keep field column in left input for AND operator

* test: add sqlness test

* fix: fix unless operator
2025-02-19 09:07:39 +00:00
Zhenchi
421e38c481 feat: allow purging a given puffin file in staging area (#5558)
* feat: purge a given puffin file in staging area

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish log

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* ttl set to 2d

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: expose staging_ttl to index config

* fix test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* use `invalidate_entries_if` instead of maintaining map

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* run_pending_tasks after purging

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-02-19 08:58:30 +00:00
Weny Xu
aada5c1706 fix(promql-planner): remove le tag in ctx (#5560)
* fix(promql-planner): remove le tag in ctx

* test: add sqlness test

* chore: apply suggestions from CR
2025-02-19 03:51:27 +00:00
yihong
aa8f119bbb chore: format all toml files (#5529)
fix: format some cargo files

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-18 12:09:01 +00:00
ZonaHe
19a6d15849 feat: update dashboard to v0.7.10 (#5562)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2025-02-18 12:06:22 +00:00
liyang
073aaefe65 chore: improve grafana dashboard (#5559) 2025-02-18 11:36:27 +00:00
Yingwen
77223a0f3e fix: window sort support alias time index (#5543)
* fix: use alias expr to check commutativity

* chore: debug sort

* feat: consider alias in window sort optimizer

* test: sqlness test

* test: update sqlness result
2025-02-18 10:35:43 +00:00
Ruihang Xia
4ef038d098 fix: correct promql behavior on nonexistent columns (#5547)
* Revert "fix(promql): ignore filters for non-existent labels (#5519)"

This reverts commit 33a2485f54.

* reimplement

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* state safety

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-17 18:43:50 +00:00
jeremyhi
deb9520970 fix: information_schema.cluster_info be covered by the same id (#5555)
* fix: information_schema.cluster_info be coverd by the same id

* chore: by comment
2025-02-17 11:51:02 +00:00
Yingwen
6bba5e0afa feat: collect stager metrics (#5553)
* feat: collect stager metrics

* Apply suggestions from code review

Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/mito2/src/metrics.rs

---------

Co-authored-by: Weny Xu <wenymedia@gmail.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-17 07:09:15 +00:00
Ruihang Xia
f359eeb667 feat(log-query): support specifying exclusive/inclusive for between filter (#5546)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-17 04:40:47 +00:00
liyang
009dbad581 ci: don't push nightly latest image (#5551)
* ci: don't push nightly latest image

* add push release latest image
2025-02-17 04:34:49 +00:00
liyang
a2047b096c ci: use s5cmd upload artifacts (#5550) 2025-02-17 02:57:13 +00:00
Ruihang Xia
6e8b1ba004 feat: drop noneffective regex filter (#5544)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-15 04:20:26 +00:00
Ruihang Xia
7fc935c61c feat!: support alter skipping index (#5538)
* feat: support alter skipping index

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test results

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* cargo fmt

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* finalize

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-14 18:43:21 +00:00
discord9
1e6d2fb1fa feat: add snapshot seqs field to query context (#5477)
* TODO: snapshot read

* feat: RegionEngine get last seq

* feat: query context snapshot

* chore: use new proto

* feat: get_region_seqs in region engine

* chore: typo

* chore: toml

* feat: make snapshots modifiable

* feat: add hint for snapshot read

* chore: some typo

* refactor: remove hint as not used

* fix: use commited seqs

* refactor: remove sequences variant on RegionRequest

* refactor: per review

* chore: rebase solve conflict

* refactor: rm unused key

* chore: per review

* chore: per review
2025-02-14 09:07:48 +00:00
Ruihang Xia
0d19e8f089 fix: promql join operation won't consider time index (#5535)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
2025-02-14 08:21:05 +00:00
Weny Xu
c56106b883 perf: optimize table alteration speed in metric engine (#5526)
* feat(metric-engine): introduce batch alter request handling

* refactor: minor refactor

* refactor: push down filter to mito

* chore: apply suggestions from CR
2025-02-14 08:11:48 +00:00
Yohan Wal
edb040dea3 refactor: refactor pg kvbackend impl in preparation for other rds kvbackend (#5494)
* refactor: unify rds kvbackend impl

* fix: licence header

* refactor: use unique sql template set

* fix: fix deps

* chore: apply optimization patch

* chore: apply optimization patch(2)

* chore: follow review comments
2025-02-14 08:10:09 +00:00
Ruihang Xia
7bbc87b3c0 feat(promql): add series count metrics (#5534)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
2025-02-14 07:49:28 +00:00
Zhenchi
858dae7b23 feat: add stager nofitier to collect metrics (#5530)
* feat: add stager nofitier to collect metrics

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* apply prev commit

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* remove dup size

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* add load cost

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-14 07:49:26 +00:00
Weny Xu
33a2485f54 fix(promql): ignore filters for non-existent labels (#5519)
* fix(promql): ignore filters for non-existent labels

* chore: add comments

* test: add sqlness test
2025-02-14 06:40:15 +00:00
zyy17
8ebf454bc1 fix(jaeger): return error when no tracing table (#5539)
fix: return error when no tracing table
2025-02-14 06:20:56 +00:00
Ning Sun
f5b9ade6df chore: add section marker for extenal dependencies (#5536)
* chore: add section marker for extenal dependencies

* chore: update cargo.lock

* Update Cargo.toml

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

* chore: update meter-core

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-02-14 06:16:57 +00:00
Ruihang Xia
9c1834accd fix: old typo (#5532)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-14 02:18:43 +00:00
Yingwen
918517d221 feat: window sort supports where on fields and time index (#5527)
* feat: handle filter for window sort

* test: sqlness filter test for window sort

* test: add test on tag column filter

* test: test for filter on ts

* test: update sqlness test
2025-02-14 01:38:15 +00:00
liyang
92d9e81a9f ci: use the repository variable to pass to image-name (#5517)
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-02-13 18:14:49 +00:00
yihong
224b1d15cd chore: use the same version of chrono-tz (#5523)
* fix: use the same version of chrono-tz

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-13 17:23:29 +00:00
Yingwen
b4d5393080 feat: speed up read/write cache and stager eviction (#5531)
* feat: change cache policy for file cache

* feat: file cache run pending task after put

* feat: run pending task in put_dir

* feat: run pending task after stager recovered

* feat: purge recycle bin periodically

* feat: use lru policy for read cache
2025-02-13 17:13:24 +00:00
Weny Xu
73c29bb482 fix(promql): unescape matcher values (#5521)
* fix(promql): unescape matcher values

* test: add sqlness tests

* chore: apply suggestions from CR

* feat: use unescaper
2025-02-13 09:42:25 +00:00
Ning Sun
198ee87675 feat: alias database matcher for promql (#5522)
* feat: provide an alias db matcher for promql

* refactor: rename __db__ to __database__

* chore: fix sqlness test
2025-02-13 08:37:37 +00:00
jeremyhi
02af9dd21a refactor!: remove datetime type (#5506)
* feat remove datetime type

* chore: fix unit test

* chore: add column test

* refactor: move create and alter validation to one place

* chore: minor refactor ut

* refactor: rename expr_factory to expr_helper

* chore: remove unnecessary args
2025-02-13 08:01:16 +00:00
Weny Xu
bb97f1bf16 perf: optimize table creation speed in metric engine (#5503)
* feat(metric-engine): introduce batch create request handling

* chore: remove unused code

* test: add more tests

* chore: remove unused error

* chore: apply suggestions from CR
2025-02-13 07:39:04 +00:00
yihong
fbd5316fdb perf: better performance for LastNonNullIter close #5229 about 10x times faster (#5518)
* fix: better performance for LastNonNullIter close #LastNonNullIter

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: add Safety comments for the unwrap

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-13 05:14:39 +00:00
Weny Xu
63d5a69a31 fix(query_range): skip data field on errors (#5520)
* fix: skip serializing PrometheusResponse when None

* fix: fix unit test

* chore: clippy
2025-02-13 04:32:24 +00:00
zyy17
954310f917 feat: implement Jaeger query APIs (#5452)
* feat: implement jaeger query api

* test: add some unit tests

* test: add integration tests for jaeger query APIs

* refactor: parse tags from url parameters

* refactor: support to query traces by tags

* refactor: add limit parameter

* refactor: add jaeger query api metrics

* chore: add some comment docs and default limit value

* test: add more unit tests

* docs: add jaeger options in config docs

* refactor: code review

* wip

* refactor: use datafusion's dataframe APIs to query traces

* refactor: code review

* chore: format test cases

* refactor: add check_schema()

* chore: fix clippy errors and rename function name

* refactor: throw error when covert start_time and duration error

* chore: modify incorrect request type name

* chore: remove unecessary serde rename

* refactor: add some important comments

* refactor: add SPAN_KIND_PREFIX

* refactor: code review
2025-02-12 23:36:38 +00:00
zyy17
58c6274bf6 fix: use fixed tonistiigi/binfmt:qemu-v7.0.0-28 image version instead of latest version to avoid segmentation fault (#5516)
fix: use fixed tonistiigi/binfmt:qemu-v7.0.0-28 image version instead of latest version to avoid segmentation fault

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-02-12 19:29:49 +00:00
Ning Sun
46947fd1de ci: docbot requires pull_request_target (#5514) 2025-02-12 09:46:04 +00:00
Weny Xu
44fffdec8b refactor: refactor region server request handling (#5504)
* refactor: refactor region server requests handling

* chore: apply suggestions from CR
2025-02-12 08:34:42 +00:00
Ruihang Xia
8026b1d72c feat!: unify all index creation grammars (#5486)
* column options

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle table constrain

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test assertions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change inverted index table constrain usage

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* don't create inverted index for pk on alter table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove remaining pk-as-inverted-index

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more inverted index magic

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/sql/src/statements.rs

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

* drop support for index def in table constrain

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-02-12 06:54:09 +00:00
Ruihang Xia
e22aa819be feat: support server-side keep-alive for mysql and pg protocols (#5496)
* feat: support server-side keep-alive for mysql and pg protocols

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update config.md

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update config to use humantime for keep-alive configuration

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: Update socket2 dependency

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-11 19:22:10 +00:00
localhost
beb9c0a797 chore: set now as timestamp field default value (#5502)
* chore: set now as timestamp field default value

* chore: import pipeline default value
2025-02-11 17:41:44 +00:00
ZonaHe
5f6f5e980a feat: update dashboard to v0.7.10-rc (#5512)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2025-02-11 11:00:10 +00:00
LFC
ccfa40dc41 ci: run nightly jobs only on greptimedb repo (#5505)
ci: skip nightly ci jobs (#9)

(cherry picked from commit 345b4c30474f47a0477263bfba9894d7b4acda2d)
(cherry picked from commit dcd779cd668802fb1ea12fefb4dc3f83f34e30a2)
2025-02-11 10:57:43 +00:00
Zhenchi
336b941113 feat: change puffin stager eviction policy (#5511)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-11 08:16:27 +00:00
yihong
de3f817596 fix: drop useless clone and for loop second (#5507)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-11 06:23:49 +00:00
ZonaHe
d094f48822 feat: update dashboard to v0.7.9 (#5508)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2025-02-11 06:19:58 +00:00
yihong
342883e922 ci: safe ci using zizmor check (#5491)
* ci: safe ci using zizmor check

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: lines empty

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: delete useless code

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-11 02:38:14 +00:00
Zhenchi
5be81abba3 feat: add metadata method to puffin reader (#5501)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-10 09:14:54 +00:00
Zhenchi
c19ecd7ea2 refactor: change traversal order during index construction (#5498)
* refactor: change traversal order during index construction

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chain

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-02-10 06:31:35 +00:00
Ning Sun
15f4b10065 chore: revert "docs: add TM to logos" (#5495)
* Revert "docs: add TM to logos (#4789)"

This reverts commit caf5f2c7a5.

* chore: transparent
2025-02-10 04:00:59 +00:00
yihong
c100a2d1a6 fix: refactor pgkv using prepare_cache about 10% better (#5497)
fix: refactor pgkv using prepare_cache about 15% better

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-10 03:59:18 +00:00
yihong
ccb1978c98 fix: close issue #5466 by do not shortcut the drop command (#5467)
fix: close issue #5466 by do not shortcut by back it to READY when fail

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-10 03:28:34 +00:00
Ning Sun
480b05c590 feat: pipeline dispatcher part 2: execution (#5409)
* fmt: correct format

* test: add negative tests

* feat: Add pipeline dispatching and execution output handling

* refactor: Enhance ingest function to correctly process original data values

custom table names during pipeline execution while optimizing the management of
transformed rows and multiple dispatched pipelines

* refactor: call greptime_identity with intermediate values

* fix: typo

* test: port tests to refactored apis

* refactor: adapt dryrun api call

* refactor: move pipeline execution code to a separated module

* refactor: update otlp pipeline execution path

* fmt: format imports

* fix: compilation

* fix: resolve residual issues

* refactor: address review comments

* chore: use btreemap as pipeline intermediate status trait modify

* refactor: update dispatcher to accept BTreeMap

* refactor: update identity pipeline

* refactor: use new input for pipeline

* chore: wip

* refactor: use updated prepare api

* refactor: improve error and header name

* feat: port flatten to new api

* chore: update pipeline api

* chore: fix transform and some pipeline test

* refactor: reimplement cmcd

* refactor: update csv processor

* fmt: update format

* chore: fix regex and dissect processor

* chore: fix test

* test: add integration test for http pipeline

* refactor: improve regex pipeline

* refactor: improve required field check

* refactor: rename table_part to table_suffix

* fix: resolve merge issue

---------

Co-authored-by: paomian <xpaomian@gmail.com>
2025-02-08 09:01:54 +00:00
Ruihang Xia
0de0fd80b0 feat: move pipelines to the first-class endpoint (#5480)
* feat: move pipelines to the first-class endpoint

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change endpoints

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* prefix path with /

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update integration result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-08 03:46:31 +00:00
Yohan Wal
059cb6fdc3 feat: update topic-region map when create and drop table (#5423)
* feat: update topic-region map

* fix: parse topic correctly

* test: add unit test forraft engine wal

* Update src/common/meta/src/ddl/drop_table.rs

Co-authored-by: Weny Xu <wenymedia@gmail.com>

* test: fix unit tests

* test: fix unit tests

* chore: error handling and tests

* refactor: manage region-topic map in table_metadata_keys

* refactor: use WalOptions instead of String in deletion

* chore: revert unused change

* chore: follow review comments

* Apply suggestions from code review

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

* chore: follow review comments

---------

Co-authored-by: Weny Xu <wenymedia@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-02-07 15:09:37 +00:00
jeremyhi
29218b5fe7 refactor!: unify the option names across all components part2 (#5476)
* refactor: part2, replace old options in doc yaml

* chore: remove deprecated options

* chore: update config.md

* fix: ut
2025-02-07 13:06:50 +00:00
discord9
59e6ec0395 chore: update pprof (#5488)
dep: update pprof
2025-02-07 11:43:40 +00:00
Lei, HUANG
79ee230f2a fix: cross compiling for aarch64 targets and allow customizing page size (#5487) 2025-02-07 11:21:16 +00:00
ozewr
0e4bd59fac build: Update Loki proto (#5484)
* build: mv loki-api to loki-proto

* fmt: fmt toml

* fix: loki-proto using rev

---------

Co-authored-by: wangrui <wangrui@baihai.ai>
2025-02-07 09:09:39 +00:00
Yingwen
6eccadbf73 fix: force recycle region dir after gc duration (#5485) 2025-02-07 08:39:04 +00:00
discord9
f29a1c56e9 fix: unquote flow_name in create flow expr (#5483)
* fix: unquote flow_name in create flow expr

* chore: per review

* fix: compat with older version
2025-02-07 08:26:14 +00:00
shuiyisong
88c3d331a1 refactor: otlp logs insertion (#5479)
* chore: add test for selector overlapping

* refactor: simplify otlp logs insertion

* fix: use layered extracted value array

* fix: wrong len

* chore: minor renaming and update

* chore: rename

* fix: clippy

* fix: typos

* chore: update test

* chore: address CR comment & update meter-deps version
2025-02-07 07:21:20 +00:00
yihong
79acc9911e fix: Delete statement not supported in metric engine close #4649 (#5473)
* fix: Delete statement not supported in metric engine close #4649

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: do not include Truncate address review comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comment again

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-07 06:47:53 +00:00
Yingwen
0a169980b7 fix: lose decimal precision when using decimal type as tag (#5481)
* fix: replicate() of decimal vector lose precision

* test: add sqlness test

* test: drop table
2025-02-06 13:17:05 +00:00
Weny Xu
c80d2a3222 fix: introduce gc task for metadata store (#5461)
* fix: introduce gc task for metadata kvbackend

* refactor: refine KvbackendConfig

* chore: apply suggestions from CR
2025-02-06 12:12:43 +00:00
Ruihang Xia
116bdaf690 refactor: pull column filling logic out of mito worker loop (#5455)
* avoid duplicated req catagorisation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* pull column filling up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fill columns instead of fill column

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add test with metadata

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-06 11:43:28 +00:00
Ruihang Xia
6341fb86c7 feat: write memtable in parallel (#5456)
* feat: write memtable in parallel

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* some comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unwrap

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* unwrap spawn result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* use FuturesUnordered

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-06 09:29:57 +00:00
Ruihang Xia
fa09e181be perf: optimize time series memtable ingestion (#5451)
* initialize with capacity

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* avoid collect

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* optimize zip

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename variable

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* ignore type checking in the upper level

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change to two-step capacity

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-06 09:12:29 +00:00
Zhenchi
ab4663ec2b feat: add vec_add function (#5471)
* feat: add vec_add function

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix unexpected utf8

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-06 06:48:50 +00:00
jeremyhi
fac22575aa refactor!: unify the option names across all components (#5457)
* refactor: rename grpc options

* refactor: make the arg clearly

* chore: comments on server_addr

* chore: fix test

* chore: remove the store_addr alias

* refactor: cli option rpc_server_addr

* chore: keep store-addr alias

* chore: by comment
2025-02-06 06:37:14 +00:00
Yingwen
0e249f69cd fix: don't transform Limit in TypeConversionRule, StringNormalizationRule and DistPlannerAnalyzer (#5472)
* fix: do not transform exprs in the limit plan

* chore: keep some logs for debug

* feat: workaround for limit in other rules

* test: add sqlness tests for offset 0

* chore: add fixme
2025-02-05 11:30:24 +00:00
yihong
5d1761f3e5 docs: fix memory perf command wrong (#5470)
* docs: fix memory perf command wrong

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: better format

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: make macos right

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* docs: add jeprof install info

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-02-05 10:45:51 +00:00
Lei, HUANG
dba6da4d00 refactor(mito): Allow creating multiple files in ParquetWriter (#5291)
* - **Refactored SST File Handling**:
   - Introduced `FilePathProvider` trait and its implementations (`WriteCachePathProvider`, `RegionFilePathFactory`) to manage SST and index file paths.
   - Updated `AccessLayer`, `WriteCache`, and `ParquetWriter` to use `FilePathProvider` for path management.
   - Modified `SstWriteRequest` and `SstUploadRequest` to use path providers instead of direct paths.
   - Files affected: `access_layer.rs`, `write_cache.rs`, `parquet.rs`, `writer.rs`.

 - **Enhanced Indexer Management**:
   - Replaced `IndexerBuilder` with `IndexerBuilderImpl` and made it async to support dynamic indexer creation.
   - Updated `ParquetWriter` to handle multiple indexers and file IDs.
   - Files affected: `index.rs`, `parquet.rs`, `writer.rs`.

 - **Removed Redundant File ID Handling**:
   - Removed `file_id` from `SstWriteRequest` and `CompactionOutput`.
   - Updated related logic to dynamically generate file IDs where necessary.
   - Files affected: `compaction.rs`, `flush.rs`, `picker.rs`, `twcs.rs`, `window.rs`.

 - **Test Adjustments**:
   - Updated tests to align with new path and indexer management.
   - Introduced `FixedPathProvider` and `NoopIndexBuilder` for testing purposes.
   - Files affected: `sst_util.rs`, `version_util.rs`, `parquet.rs`.

* chore: merge main

* refactor/generate-file-id-in-parquet-writer:
 **Enhance Logging in Compactor**

 - Updated `compactor.rs` to improve logging of compaction process.
   - Added `itertools::Itertools` for efficient string joining.
   - Moved logging of compaction inputs and outputs to the async block for better context.
   - Enhanced log message to include both input and output file names for better traceability.
2025-02-05 09:00:54 +00:00
discord9
59b31372aa feat(cli): add proxy options (#5459)
* feat(cli): proxy options

* refactor: map
2025-02-05 03:24:22 +00:00
yihong
d6b8672e63 docs: the year is better to show in 2025 (#5468)
doc: the year is better to show in 2025

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-05 03:01:45 +00:00
453 changed files with 24376 additions and 11111 deletions

2
.github/CODEOWNERS vendored
View File

@@ -4,7 +4,7 @@
* @GreptimeTeam/db-approver
## [Module] Databse Engine
## [Module] Database Engine
/src/index @zhongzc
/src/mito2 @evenyag @v0y4g3r @waynexia
/src/query @evenyag

View File

@@ -41,7 +41,14 @@ runs:
username: ${{ inputs.dockerhub-image-registry-username }}
password: ${{ inputs.dockerhub-image-registry-token }}
- name: Build and push dev-builder-ubuntu image
- name: Set up qemu for multi-platform builds
uses: docker/setup-qemu-action@v3
with:
platforms: linux/amd64,linux/arm64
# The latest version will lead to segmentation fault.
image: tonistiigi/binfmt:qemu-v7.0.0-28
- name: Build and push dev-builder-ubuntu image # Build image for amd64 and arm64 platform.
shell: bash
if: ${{ inputs.build-dev-builder-ubuntu == 'true' }}
run: |
@@ -52,7 +59,7 @@ runs:
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
DEV_BUILDER_IMAGE_TAG=${{ inputs.version }}
- name: Build and push dev-builder-centos image
- name: Build and push dev-builder-centos image # Only build image for amd64 platform.
shell: bash
if: ${{ inputs.build-dev-builder-centos == 'true' }}
run: |
@@ -69,8 +76,7 @@ runs:
run: |
make dev-builder \
BASE_IMAGE=android \
BUILDX_MULTI_PLATFORM_BUILD=amd64 \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
DEV_BUILDER_IMAGE_TAG=${{ inputs.version }} && \
docker push ${{ inputs.dockerhub-image-registry }}/${{ inputs.dockerhub-image-namespace }}/dev-builder-android:${{ inputs.version }}
DEV_BUILDER_IMAGE_TAG=${{ inputs.version }}

View File

@@ -34,8 +34,8 @@ inputs:
required: true
push-latest-tag:
description: Whether to push the latest tag
required: false
default: 'true'
required: true
default: 'false'
runs:
using: composite
steps:
@@ -47,7 +47,11 @@ runs:
password: ${{ inputs.image-registry-password }}
- name: Set up qemu for multi-platform builds
uses: docker/setup-qemu-action@v2
uses: docker/setup-qemu-action@v3
with:
platforms: linux/amd64,linux/arm64
# The latest version will lead to segmentation fault.
image: tonistiigi/binfmt:qemu-v7.0.0-28
- name: Set up buildx
uses: docker/setup-buildx-action@v2

View File

@@ -22,8 +22,8 @@ inputs:
required: true
push-latest-tag:
description: Whether to push the latest tag
required: false
default: 'true'
required: true
default: 'false'
dev-mode:
description: Enable dev mode, only build standard greptime
required: false

View File

@@ -52,7 +52,7 @@ runs:
uses: ./.github/actions/build-greptime-binary
with:
base-image: ubuntu
features: servers/dashboard,pg_kvbackend
features: servers/dashboard
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-${{ inputs.version }}
version: ${{ inputs.version }}
@@ -70,7 +70,7 @@ runs:
if: ${{ inputs.arch == 'amd64' && inputs.dev-mode == 'false' }} # Builds greptime for centos if the host machine is amd64.
with:
base-image: centos
features: servers/dashboard,pg_kvbackend
features: servers/dashboard
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-centos-${{ inputs.version }}
version: ${{ inputs.version }}

View File

@@ -47,7 +47,6 @@ runs:
shell: pwsh
run: make test sqlness-test
env:
RUSTUP_WINDOWS_PATH_ADD_BIN: 1 # Workaround for https://github.com/nextest-rs/nextest/issues/1493
RUST_BACKTRACE: 1
SQLNESS_OPTS: "--preserve-state"

View File

@@ -51,8 +51,8 @@ inputs:
required: true
upload-to-s3:
description: Upload to S3
required: false
default: 'true'
required: true
default: 'false'
artifacts-dir:
description: Directory to store artifacts
required: false
@@ -64,11 +64,11 @@ inputs:
upload-max-retry-times:
description: Max retry times for uploading artifacts to S3
required: false
default: "20"
default: "30"
upload-retry-timeout:
description: Timeout for uploading artifacts to S3
required: false
default: "30" # minutes
default: "120" # minutes
runs:
using: composite
steps:
@@ -77,13 +77,21 @@ runs:
with:
path: ${{ inputs.artifacts-dir }}
- name: Install s5cmd
shell: bash
run: |
wget https://github.com/peak/s5cmd/releases/download/v2.3.0/s5cmd_2.3.0_Linux-64bit.tar.gz
tar -xzf s5cmd_2.3.0_Linux-64bit.tar.gz
sudo mv s5cmd /usr/local/bin/
sudo chmod +x /usr/local/bin/s5cmd
- name: Release artifacts to cn region
uses: nick-invision/retry@v2
if: ${{ inputs.upload-to-s3 == 'true' }}
env:
AWS_ACCESS_KEY_ID: ${{ inputs.aws-cn-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-cn-secret-access-key }}
AWS_DEFAULT_REGION: ${{ inputs.aws-cn-region }}
AWS_REGION: ${{ inputs.aws-cn-region }}
UPDATE_VERSION_INFO: ${{ inputs.update-version-info }}
with:
max_attempts: ${{ inputs.upload-max-retry-times }}

View File

@@ -8,15 +8,15 @@ inputs:
default: 2
description: "Number of Datanode replicas"
meta-replicas:
default: 1
default: 2
description: "Number of Metasrv replicas"
image-registry:
image-registry:
default: "docker.io"
description: "Image registry"
image-repository:
image-repository:
default: "greptime/greptimedb"
description: "Image repository"
image-tag:
image-tag:
default: "latest"
description: 'Image tag'
etcd-endpoints:
@@ -32,12 +32,12 @@ runs:
steps:
- name: Install GreptimeDB operator
uses: nick-fields/retry@v3
with:
with:
timeout_minutes: 3
max_attempts: 3
shell: bash
command: |
helm repo add greptime https://greptimeteam.github.io/helm-charts/
helm repo add greptime https://greptimeteam.github.io/helm-charts/
helm repo update
helm upgrade \
--install \
@@ -48,10 +48,10 @@ runs:
--wait-for-jobs
- name: Install GreptimeDB cluster
shell: bash
run: |
run: |
helm upgrade \
--install my-greptimedb \
--set meta.etcdEndpoints=${{ inputs.etcd-endpoints }} \
--set meta.backendStorage.etcd.endpoints=${{ inputs.etcd-endpoints }} \
--set meta.enableRegionFailover=${{ inputs.enable-region-failover }} \
--set image.registry=${{ inputs.image-registry }} \
--set image.repository=${{ inputs.image-repository }} \
@@ -59,7 +59,7 @@ runs:
--set base.podTemplate.main.resources.requests.cpu=50m \
--set base.podTemplate.main.resources.requests.memory=256Mi \
--set base.podTemplate.main.resources.limits.cpu=2000m \
--set base.podTemplate.main.resources.limits.memory=2Gi \
--set base.podTemplate.main.resources.limits.memory=3Gi \
--set frontend.replicas=${{ inputs.frontend-replicas }} \
--set datanode.replicas=${{ inputs.datanode-replicas }} \
--set meta.replicas=${{ inputs.meta-replicas }} \
@@ -72,7 +72,7 @@ runs:
- name: Wait for GreptimeDB
shell: bash
run: |
while true; do
while true; do
PHASE=$(kubectl -n my-greptimedb get gtc my-greptimedb -o jsonpath='{.status.clusterPhase}')
if [ "$PHASE" == "Running" ]; then
echo "Cluster is ready"
@@ -86,10 +86,10 @@ runs:
- name: Print GreptimeDB info
if: always()
shell: bash
run: |
run: |
kubectl get all --show-labels -n my-greptimedb
- name: Describe Nodes
if: always()
shell: bash
run: |
run: |
kubectl describe nodes

View File

@@ -2,13 +2,14 @@ meta:
configData: |-
[runtime]
global_rt_size = 4
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc.cluster.local:9092"]
num_topics = 3
auto_prune_interval = "30s"
trigger_flush_threshold = 100
[datanode]
[datanode.client]
timeout = "120s"
@@ -21,7 +22,7 @@ datanode:
[wal]
provider = "kafka"
broker_endpoints = ["kafka.kafka-cluster.svc.cluster.local:9092"]
linger = "2ms"
overwrite_entry_start_id = true
frontend:
configData: |-
[runtime]

View File

@@ -56,7 +56,7 @@ runs:
- name: Start EC2 runner
if: startsWith(inputs.runner, 'ec2')
uses: machulav/ec2-github-runner@v2
uses: machulav/ec2-github-runner@v2.3.8
id: start-linux-arm64-ec2-runner
with:
mode: start

View File

@@ -33,7 +33,7 @@ runs:
- name: Stop EC2 runner
if: ${{ inputs.label && inputs.ec2-instance-id }}
uses: machulav/ec2-github-runner@v2
uses: machulav/ec2-github-runner@v2.3.8
with:
mode: stop
label: ${{ inputs.label }}

15
.github/labeler.yaml vendored Normal file
View File

@@ -0,0 +1,15 @@
ci:
- changed-files:
- any-glob-to-any-file: .github/**
docker:
- changed-files:
- any-glob-to-any-file: docker/**
documentation:
- changed-files:
- any-glob-to-any-file: docs/**
dashboard:
- changed-files:
- any-glob-to-any-file: grafana/**

42
.github/scripts/check-version.sh vendored Executable file
View File

@@ -0,0 +1,42 @@
#!/bin/bash
# Get current version
CURRENT_VERSION=$1
if [ -z "$CURRENT_VERSION" ]; then
echo "Error: Failed to get current version"
exit 1
fi
# Get the latest version from GitHub Releases
API_RESPONSE=$(curl -s "https://api.github.com/repos/GreptimeTeam/greptimedb/releases/latest")
if [ -z "$API_RESPONSE" ] || [ "$(echo "$API_RESPONSE" | jq -r '.message')" = "Not Found" ]; then
echo "Error: Failed to fetch latest version from GitHub"
exit 1
fi
# Get the latest version
LATEST_VERSION=$(echo "$API_RESPONSE" | jq -r '.tag_name')
if [ -z "$LATEST_VERSION" ] || [ "$LATEST_VERSION" = "null" ]; then
echo "Error: No valid version found in GitHub releases"
exit 1
fi
# Cleaned up version number format (removed possible 'v' prefix and -nightly suffix)
CLEAN_CURRENT=$(echo "$CURRENT_VERSION" | sed 's/^v//' | sed 's/-nightly-.*//')
CLEAN_LATEST=$(echo "$LATEST_VERSION" | sed 's/^v//' | sed 's/-nightly-.*//')
echo "Current version: $CLEAN_CURRENT"
echo "Latest release version: $CLEAN_LATEST"
# Use sort -V to compare versions
HIGHER_VERSION=$(printf "%s\n%s" "$CLEAN_CURRENT" "$CLEAN_LATEST" | sort -V | tail -n1)
if [ "$HIGHER_VERSION" = "$CLEAN_CURRENT" ]; then
echo "Current version ($CLEAN_CURRENT) is NEWER than or EQUAL to latest ($CLEAN_LATEST)"
echo "should-push-latest-tag=true" >> $GITHUB_OUTPUT
else
echo "Current version ($CLEAN_CURRENT) is OLDER than latest ($CLEAN_LATEST)"
echo "should-push-latest-tag=false" >> $GITHUB_OUTPUT
fi

View File

@@ -8,24 +8,25 @@ set -e
# - If it's a nightly build, the version is 'nightly-YYYYMMDD-$(git rev-parse --short HEAD)', like 'nightly-20230712-e5b243c'.
# create_version ${GIHUB_EVENT_NAME} ${NEXT_RELEASE_VERSION} ${NIGHTLY_RELEASE_PREFIX}
function create_version() {
# Read from envrionment variables.
# Read from environment variables.
if [ -z "$GITHUB_EVENT_NAME" ]; then
echo "GITHUB_EVENT_NAME is empty"
echo "GITHUB_EVENT_NAME is empty" >&2
exit 1
fi
if [ -z "$NEXT_RELEASE_VERSION" ]; then
echo "NEXT_RELEASE_VERSION is empty"
exit 1
echo "NEXT_RELEASE_VERSION is empty, use version from Cargo.toml" >&2
# NOTE: Need a `v` prefix for the version string.
export NEXT_RELEASE_VERSION=v$(grep '^version = ' Cargo.toml | cut -d '"' -f 2 | head -n 1)
fi
if [ -z "$NIGHTLY_RELEASE_PREFIX" ]; then
echo "NIGHTLY_RELEASE_PREFIX is empty"
echo "NIGHTLY_RELEASE_PREFIX is empty" >&2
exit 1
fi
# Reuse $NEXT_RELEASE_VERSION to identify whether it's a nightly build.
# It will be like 'nigtly-20230808-7d0d8dc6'.
# It will be like 'nightly-20230808-7d0d8dc6'.
if [ "$NEXT_RELEASE_VERSION" = nightly ]; then
echo "$NIGHTLY_RELEASE_PREFIX-$(date "+%Y%m%d")-$(git rev-parse --short HEAD)"
exit 0
@@ -35,7 +36,7 @@ function create_version() {
# It will be like 'dev-2023080819-f0e7216c'.
if [ "$NEXT_RELEASE_VERSION" = dev ]; then
if [ -z "$COMMIT_SHA" ]; then
echo "COMMIT_SHA is empty in dev build"
echo "COMMIT_SHA is empty in dev build" >&2
exit 1
fi
echo "dev-$(date "+%Y%m%d-%s")-$(echo "$COMMIT_SHA" | cut -c1-8)"
@@ -45,7 +46,7 @@ function create_version() {
# Note: Only output 'version=xxx' to stdout when everything is ok, so that it can be used in GitHub Actions Outputs.
if [ "$GITHUB_EVENT_NAME" = push ]; then
if [ -z "$GITHUB_REF_NAME" ]; then
echo "GITHUB_REF_NAME is empty in push event"
echo "GITHUB_REF_NAME is empty in push event" >&2
exit 1
fi
echo "$GITHUB_REF_NAME"
@@ -54,15 +55,15 @@ function create_version() {
elif [ "$GITHUB_EVENT_NAME" = schedule ]; then
echo "$NEXT_RELEASE_VERSION-$NIGHTLY_RELEASE_PREFIX-$(date "+%Y%m%d")"
else
echo "Unsupported GITHUB_EVENT_NAME: $GITHUB_EVENT_NAME"
echo "Unsupported GITHUB_EVENT_NAME: $GITHUB_EVENT_NAME" >&2
exit 1
fi
}
# You can run as following examples:
# GITHUB_EVENT_NAME=push NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nigtly GITHUB_REF_NAME=v0.3.0 ./create-version.sh
# GITHUB_EVENT_NAME=workflow_dispatch NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=schedule NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=schedule NEXT_RELEASE_VERSION=nightly NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=workflow_dispatch COMMIT_SHA=f0e7216c4bb6acce9b29a21ec2d683be2e3f984a NEXT_RELEASE_VERSION=dev NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=push NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nightly GITHUB_REF_NAME=v0.3.0 ./create-version.sh
# GITHUB_EVENT_NAME=workflow_dispatch NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nightly ./create-version.sh
# GITHUB_EVENT_NAME=schedule NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nightly ./create-version.sh
# GITHUB_EVENT_NAME=schedule NEXT_RELEASE_VERSION=nightly NIGHTLY_RELEASE_PREFIX=nightly ./create-version.sh
# GITHUB_EVENT_NAME=workflow_dispatch COMMIT_SHA=f0e7216c4bb6acce9b29a21ec2d683be2e3f984a NEXT_RELEASE_VERSION=dev NIGHTLY_RELEASE_PREFIX=nightly ./create-version.sh
create_version

View File

@@ -10,7 +10,7 @@ GREPTIMEDB_IMAGE_TAG=${GREPTIMEDB_IMAGE_TAG:-latest}
ETCD_CHART="oci://registry-1.docker.io/bitnamicharts/etcd"
GREPTIME_CHART="https://greptimeteam.github.io/helm-charts/"
# Ceate a cluster with 1 control-plane node and 5 workers.
# Create a cluster with 1 control-plane node and 5 workers.
function create_kind_cluster() {
cat <<EOF | kind create cluster --name "${CLUSTER}" --image kindest/node:"$KUBERNETES_VERSION" --config=-
kind: Cluster
@@ -68,7 +68,7 @@ function deploy_greptimedb_cluster() {
helm install "$cluster_name" greptime/greptimedb-cluster \
--set image.tag="$GREPTIMEDB_IMAGE_TAG" \
--set meta.etcdEndpoints="etcd.$install_namespace:2379" \
--set meta.backendStorage.etcd.endpoints="etcd.$install_namespace:2379" \
-n "$install_namespace"
# Wait for greptimedb cluster to be ready.
@@ -103,7 +103,7 @@ function deploy_greptimedb_cluster_with_s3_storage() {
helm install "$cluster_name" greptime/greptimedb-cluster -n "$install_namespace" \
--set image.tag="$GREPTIMEDB_IMAGE_TAG" \
--set meta.etcdEndpoints="etcd.$install_namespace:2379" \
--set meta.backendStorage.etcd.endpoints="etcd.$install_namespace:2379" \
--set storage.s3.bucket="$AWS_CI_TEST_BUCKET" \
--set storage.s3.region="$AWS_REGION" \
--set storage.s3.root="$DATA_ROOT" \

37
.github/scripts/update-dev-builder-version.sh vendored Executable file
View File

@@ -0,0 +1,37 @@
#!/bin/bash
DEV_BUILDER_IMAGE_TAG=$1
update_dev_builder_version() {
if [ -z "$DEV_BUILDER_IMAGE_TAG" ]; then
echo "Error: Should specify the dev-builder image tag"
exit 1
fi
# Configure Git configs.
git config --global user.email greptimedb-ci@greptime.com
git config --global user.name greptimedb-ci
# Checkout a new branch.
BRANCH_NAME="ci/update-dev-builder-$(date +%Y%m%d%H%M%S)"
git checkout -b $BRANCH_NAME
# Update the dev-builder image tag in the Makefile.
sed -i "s/DEV_BUILDER_IMAGE_TAG ?=.*/DEV_BUILDER_IMAGE_TAG ?= ${DEV_BUILDER_IMAGE_TAG}/g" Makefile
# Commit the changes.
git add Makefile
git commit -m "ci: update dev-builder image tag"
git push origin $BRANCH_NAME
# Create a Pull Request.
gh pr create \
--title "ci: update dev-builder image tag" \
--body "This PR updates the dev-builder image tag" \
--base main \
--head $BRANCH_NAME \
--reviewer zyy17 \
--reviewer daviderli614
}
update_dev_builder_version

46
.github/scripts/update-helm-charts-version.sh vendored Executable file
View File

@@ -0,0 +1,46 @@
#!/bin/bash
set -e
VERSION=${VERSION}
GITHUB_TOKEN=${GITHUB_TOKEN}
update_helm_charts_version() {
# Configure Git configs.
git config --global user.email update-helm-charts-version@greptime.com
git config --global user.name update-helm-charts-version
# Clone helm-charts repository.
git clone "https://x-access-token:${GITHUB_TOKEN}@github.com/GreptimeTeam/helm-charts.git"
cd helm-charts
# Set default remote for gh CLI
gh repo set-default GreptimeTeam/helm-charts
# Checkout a new branch.
BRANCH_NAME="chore/greptimedb-${VERSION}"
git checkout -b $BRANCH_NAME
# Update version.
make update-version CHART=greptimedb-cluster VERSION=${VERSION}
make update-version CHART=greptimedb-standalone VERSION=${VERSION}
# Update docs.
make docs
# Commit the changes.
git add .
git commit -s -m "chore: Update GreptimeDB version to ${VERSION}"
git push origin $BRANCH_NAME
# Create a Pull Request.
gh pr create \
--title "chore: Update GreptimeDB version to ${VERSION}" \
--body "This PR updates the GreptimeDB version." \
--base main \
--head $BRANCH_NAME \
--reviewer zyy17 \
--reviewer daviderli614
}
update_helm_charts_version

View File

@@ -0,0 +1,42 @@
#!/bin/bash
set -e
VERSION=${VERSION}
GITHUB_TOKEN=${GITHUB_TOKEN}
update_homebrew_greptime_version() {
# Configure Git configs.
git config --global user.email update-greptime-version@greptime.com
git config --global user.name update-greptime-version
# Clone helm-charts repository.
git clone "https://x-access-token:${GITHUB_TOKEN}@github.com/GreptimeTeam/homebrew-greptime.git"
cd homebrew-greptime
# Set default remote for gh CLI
gh repo set-default GreptimeTeam/homebrew-greptime
# Checkout a new branch.
BRANCH_NAME="chore/greptimedb-${VERSION}"
git checkout -b $BRANCH_NAME
# Update version.
make update-greptime-version VERSION=${VERSION}
# Commit the changes.
git add .
git commit -s -m "chore: Update GreptimeDB version to ${VERSION}"
git push origin $BRANCH_NAME
# Create a Pull Request.
gh pr create \
--title "chore: Update GreptimeDB version to ${VERSION}" \
--body "This PR updates the GreptimeDB version." \
--base main \
--head $BRANCH_NAME \
--reviewer zyy17 \
--reviewer daviderli614
}
update_homebrew_greptime_version

View File

@@ -33,7 +33,7 @@ function upload_artifacts() {
# ├── greptime-darwin-amd64-v0.2.0.sha256sum
# └── greptime-darwin-amd64-v0.2.0.tar.gz
find "$ARTIFACTS_DIR" -type f \( -name "*.tar.gz" -o -name "*.sha256sum" \) | while IFS= read -r file; do
aws s3 cp \
s5cmd cp \
"$file" "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/$VERSION/$(basename "$file")"
done
}
@@ -41,11 +41,11 @@ function upload_artifacts() {
# Updates the latest version information in AWS S3 if UPDATE_VERSION_INFO is true.
function update_version_info() {
if [ "$UPDATE_VERSION_INFO" == "true" ]; then
# If it's the officail release(like v1.0.0, v1.0.1, v1.0.2, etc.), update latest-version.txt.
# If it's the official release(like v1.0.0, v1.0.1, v1.0.2, etc.), update latest-version.txt.
if [[ "$VERSION" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "Updating latest-version.txt"
echo "$VERSION" > latest-version.txt
aws s3 cp \
s5cmd cp \
latest-version.txt "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/latest-version.txt"
fi
@@ -53,7 +53,7 @@ function update_version_info() {
if [[ "$VERSION" == *"nightly"* ]]; then
echo "Updating latest-nightly-version.txt"
echo "$VERSION" > latest-nightly-version.txt
aws s3 cp \
s5cmd cp \
latest-nightly-version.txt "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/latest-nightly-version.txt"
fi
fi

View File

@@ -14,9 +14,11 @@ name: Build API docs
jobs:
apidoc:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -12,6 +12,8 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
persist-credentials: false
- name: Set up Rust
uses: actions-rust-lang/setup-rust-toolchain@v1

View File

@@ -16,11 +16,11 @@ on:
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ubuntu-22.04
- ubuntu-22.04-8-cores
- ubuntu-22.04-16-cores
- ubuntu-22.04-32-cores
- ubuntu-22.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
@@ -55,6 +55,11 @@ on:
description: Build and push images to DockerHub and ACR
required: false
default: true
upload_artifacts_to_s3:
type: boolean
description: Whether upload artifacts to s3
required: false
default: false
cargo_profile:
type: choice
description: The cargo profile to use in building GreptimeDB.
@@ -76,20 +81,14 @@ env:
NIGHTLY_RELEASE_PREFIX: nightly
# Use the different image name to avoid conflict with the release images.
IMAGE_NAME: greptimedb-dev
# The source code will check out in the following path: '${WORKING_DIR}/dev/greptime'.
CHECKOUT_GREPTIMEDB_PATH: dev/greptimedb
permissions:
issues: write
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
@@ -107,6 +106,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Create version
id: create-version
@@ -161,6 +161,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Checkout greptimedb
uses: actions/checkout@v4
@@ -168,6 +169,7 @@ jobs:
repository: ${{ inputs.repository }}
ref: ${{ inputs.commit }}
path: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
persist-credentials: true
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -192,6 +194,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Checkout greptimedb
uses: actions/checkout@v4
@@ -199,6 +202,7 @@ jobs:
repository: ${{ inputs.repository }}
ref: ${{ inputs.commit }}
path: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
persist-credentials: true
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -219,26 +223,34 @@ jobs:
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
build-result: ${{ steps.set-build-result.outputs.build-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ env.IMAGE_NAME }}
image-name: ${{ vars.DEV_BUILD_IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: false # Don't push the latest tag to registry.
dev-mode: true # Only build the standard images.
- name: Echo Docker image tag to step summary
run: |
echo "## Docker Image Tag" >> $GITHUB_STEP_SUMMARY
echo "Image Tag: \`${{ needs.allocate-runners.outputs.version }}\`" >> $GITHUB_STEP_SUMMARY
echo "Full Image Name: \`docker.io/${{ vars.IMAGE_NAMESPACE }}/${{ vars.DEV_BUILD_IMAGE_NAME }}:${{ needs.allocate-runners.outputs.version }}\`" >> $GITHUB_STEP_SUMMARY
echo "Pull Command: \`docker pull docker.io/${{ vars.IMAGE_NAMESPACE }}/${{ vars.DEV_BUILD_IMAGE_NAME }}:${{ needs.allocate-runners.outputs.version }}\`" >> $GITHUB_STEP_SUMMARY
- name: Set build result
id: set-build-result
run: |
@@ -251,19 +263,20 @@ jobs:
allocate-runners,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
continue-on-error: true
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: ${{ env.IMAGE_NAME }}
src-image-name: ${{ vars.DEV_BUILD_IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
@@ -273,6 +286,7 @@ jobs:
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-to-s3: ${{ inputs.upload_artifacts_to_s3 }}
dev-mode: true # Only build the standard images(exclude centos images).
push-latest-tag: false # Don't push the latest tag to registry.
update-version-info: false # Don't update the version info in S3.
@@ -281,7 +295,7 @@ jobs:
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-amd64-artifacts,
@@ -291,6 +305,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -306,7 +321,7 @@ jobs:
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-arm64-artifacts,
@@ -316,6 +331,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -333,11 +349,17 @@ jobs:
needs: [
release-images-to-dockerhub
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
issues: write
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status

View File

@@ -22,10 +22,13 @@ concurrency:
jobs:
check-typos-and-docs:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Check typos and docs
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: crate-ci/typos@master
- name: Check the config docs
run: |
@@ -34,21 +37,27 @@ jobs:
|| (echo "'config/config.md' is not up-to-date, please run 'make config-docs'." && exit 1)
license-header-check:
runs-on: ubuntu-20.04
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-latest
name: Check License Header
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: korandoru/hawkeye@v5
check:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Check
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -65,11 +74,14 @@ jobs:
run: cargo check --locked --workspace --all-targets
toml:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Toml Check
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: actions-rust-lang/setup-rust-toolchain@v1
- name: Install taplo
run: cargo +stable install taplo-cli --version ^0.9 --locked --force
@@ -77,14 +89,17 @@ jobs:
run: taplo format --check
build:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Build GreptimeDB binaries
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -117,6 +132,7 @@ jobs:
version: current
fuzztest:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Fuzz Test
needs: build
runs-on: ubuntu-latest
@@ -139,6 +155,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -171,11 +189,13 @@ jobs:
max-total-time: 120
unstable-fuzztest:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Unstable Fuzz Test
needs: build-greptime-ci
runs-on: ubuntu-latest
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
target: [ "unstable_fuzz_create_table_standalone" ]
steps:
@@ -192,6 +212,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -201,12 +223,12 @@ jobs:
run: |
sudo apt update && sudo apt install -y libfuzzer-14-dev
cargo install cargo-fuzz cargo-gc-bin --force
- name: Download pre-built binariy
- name: Download pre-built binary
uses: actions/download-artifact@v4
with:
name: bin
path: .
- name: Unzip bianry
- name: Unzip binary
run: |
tar -xvf ./bin.tar.gz
rm ./bin.tar.gz
@@ -228,16 +250,24 @@ jobs:
name: unstable-fuzz-logs
path: /tmp/unstable-greptime/
retention-days: 3
- name: Describe pods
if: failure()
shell: bash
run: |
kubectl describe pod -n my-greptimedb
build-greptime-ci:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Build GreptimeDB binary (profile-CI)
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -251,7 +281,7 @@ jobs:
- name: Install cargo-gc-bin
shell: bash
run: cargo install cargo-gc-bin --force
- name: Build greptime bianry
- name: Build greptime binary
shell: bash
# `cargo gc` will invoke `cargo build` with specified args
run: cargo gc --profile ci -- --bin greptime --features pg_kvbackend
@@ -269,11 +299,13 @@ jobs:
version: current
distributed-fuzztest:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Fuzz Test (Distributed, ${{ matrix.mode.name }}, ${{ matrix.target }})
runs-on: ubuntu-latest
needs: build-greptime-ci
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
target: [ "fuzz_create_table", "fuzz_alter_table", "fuzz_create_database", "fuzz_create_logical_table", "fuzz_alter_logical_table", "fuzz_insert", "fuzz_insert_logical_table" ]
mode:
@@ -295,15 +327,17 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- name: Setup Kind
uses: ./.github/actions/setup-kind
- if: matrix.mode.minio
name: Setup Minio
uses: ./.github/actions/setup-minio
- if: matrix.mode.kafka
name: Setup Kafka cluser
name: Setup Kafka cluster
uses: ./.github/actions/setup-kafka-cluster
- name: Setup Etcd cluser
- name: Setup Etcd cluster
uses: ./.github/actions/setup-etcd-cluster
# Prepares for fuzz tests
- uses: arduino/setup-protoc@v3
@@ -376,6 +410,11 @@ jobs:
shell: bash
run: |
kubectl describe nodes
- name: Describe pod
if: failure()
shell: bash
run: |
kubectl describe pod -n my-greptimedb
- name: Export kind logs
if: failure()
shell: bash
@@ -398,11 +437,13 @@ jobs:
docker system prune -f
distributed-fuzztest-with-chaos:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Fuzz Test with Chaos (Distributed, ${{ matrix.mode.name }}, ${{ matrix.target }})
runs-on: ubuntu-latest
needs: build-greptime-ci
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
target: ["fuzz_migrate_mito_regions", "fuzz_migrate_metric_regions", "fuzz_failover_mito_regions", "fuzz_failover_metric_regions"]
mode:
@@ -437,6 +478,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- name: Setup Kind
uses: ./.github/actions/setup-kind
- name: Setup Chaos Mesh
@@ -445,9 +488,9 @@ jobs:
name: Setup Minio
uses: ./.github/actions/setup-minio
- if: matrix.mode.kafka
name: Setup Kafka cluser
name: Setup Kafka cluster
uses: ./.github/actions/setup-kafka-cluster
- name: Setup Etcd cluser
- name: Setup Etcd cluster
uses: ./.github/actions/setup-etcd-cluster
# Prepares for fuzz tests
- uses: arduino/setup-protoc@v3
@@ -521,6 +564,11 @@ jobs:
shell: bash
run: |
kubectl describe nodes
- name: Describe pods
if: failure()
shell: bash
run: |
kubectl describe pod -n my-greptimedb
- name: Export kind logs
if: failure()
shell: bash
@@ -543,12 +591,14 @@ jobs:
docker system prune -f
sqlness:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Sqlness Test (${{ matrix.mode.name }})
needs: build
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
mode:
- name: "Basic"
opts: ""
@@ -556,12 +606,14 @@ jobs:
- name: "Remote WAL"
opts: "-w kafka -k 127.0.0.1:9092"
kafka: true
- name: "Pg Kvbackend"
- name: "PostgreSQL KvBackend"
opts: "--setup-pg"
kafka: false
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- if: matrix.mode.kafka
name: Setup kafka server
working-directory: tests-integration/fixtures
@@ -584,11 +636,14 @@ jobs:
retention-days: 3
fmt:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Rustfmt
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -599,11 +654,14 @@ jobs:
run: make fmt-check
clippy:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Clippy
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -622,20 +680,25 @@ jobs:
run: make clippy
conflict-check:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
name: Check for conflict
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- name: Merge Conflict Finder
uses: olivernybroe/action-conflict-finder@v4.0
test:
if: github.event_name != 'merge_group'
if: ${{ github.repository == 'GreptimeTeam/greptimedb' && github.event_name != 'merge_group' }}
runs-on: ubuntu-22.04-arm
timeout-minutes: 60
needs: [conflict-check, clippy, fmt]
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -643,7 +706,7 @@ jobs:
- name: Install toolchain
uses: actions-rust-lang/setup-rust-toolchain@v1
with:
cache: false
cache: false
- name: Rust Cache
uses: Swatinem/rust-cache@v2
with:
@@ -674,16 +737,19 @@ jobs:
GT_MINIO_ENDPOINT_URL: http://127.0.0.1:9000
GT_ETCD_ENDPOINTS: http://127.0.0.1:2379
GT_POSTGRES_ENDPOINTS: postgres://greptimedb:admin@127.0.0.1:5432/postgres
GT_MYSQL_ENDPOINTS: mysql://greptimedb:admin@127.0.0.1:3306/mysql
GT_KAFKA_ENDPOINTS: 127.0.0.1:9092
GT_KAFKA_SASL_ENDPOINTS: 127.0.0.1:9093
UNITTEST_LOG_DIR: "__unittest_logs"
coverage:
if: github.event_name == 'merge_group'
runs-on: ubuntu-20.04-8-cores
if: ${{ github.repository == 'GreptimeTeam/greptimedb' && github.event_name == 'merge_group' }}
runs-on: ubuntu-22.04-8-cores
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -723,6 +789,7 @@ jobs:
GT_MINIO_ENDPOINT_URL: http://127.0.0.1:9000
GT_ETCD_ENDPOINTS: http://127.0.0.1:2379
GT_POSTGRES_ENDPOINTS: postgres://greptimedb:admin@127.0.0.1:5432/postgres
GT_MYSQL_ENDPOINTS: mysql://greptimedb:admin@127.0.0.1:3306/mysql
GT_KAFKA_ENDPOINTS: 127.0.0.1:9092
GT_KAFKA_SASL_ENDPOINTS: 127.0.0.1:9093
UNITTEST_LOG_DIR: "__unittest_logs"
@@ -736,9 +803,10 @@ jobs:
verbose: true
# compat:
# if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
# name: Compatibility Test
# needs: build
# runs-on: ubuntu-20.04
# runs-on: ubuntu-22.04
# timeout-minutes: 60
# steps:
# - uses: actions/checkout@v4

View File

@@ -3,16 +3,21 @@ on:
pull_request_target:
types: [opened, edited]
permissions:
pull-requests: write
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
docbot:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
pull-requests: write
contents: read
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Maybe Follow Up Docs Issue
working-directory: cyborg

View File

@@ -31,43 +31,47 @@ name: CI
jobs:
typos:
name: Spell Check with Typos
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: crate-ci/typos@master
license-header-check:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
name: Check License Header
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: korandoru/hawkeye@v5
check:
name: Check
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
fmt:
name: Rustfmt
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
clippy:
name: Clippy
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
coverage:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
test:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
@@ -76,7 +80,7 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
mode:
- name: "Basic"
- name: "Remote WAL"

View File

@@ -14,11 +14,11 @@ on:
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ubuntu-22.04
- ubuntu-22.04-8-cores
- ubuntu-22.04-16-cores
- ubuntu-22.04-32-cores
- ubuntu-22.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
@@ -66,18 +66,11 @@ env:
NIGHTLY_RELEASE_PREFIX: nightly
# Use the different image name to avoid conflict with the release images.
# The DockerHub image will be greptime/greptimedb-nightly.
IMAGE_NAME: greptimedb-nightly
permissions:
issues: write
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
@@ -95,6 +88,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Create version
id: create-version
@@ -147,6 +141,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -168,6 +163,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -186,24 +182,25 @@ jobs:
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
nightly-build-result: ${{ steps.set-nightly-build-result.outputs.nightly-build-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ env.IMAGE_NAME }}
image-name: ${{ vars.NIGHTLY_BUILD_IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: true
push-latest-tag: false
- name: Set nightly build result
id: set-nightly-build-result
@@ -217,7 +214,7 @@ jobs:
allocate-runners,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
# When we push to ACR, it's easy to fail due to some unknown network issues.
# However, we don't want to fail the whole workflow because of this.
# The ACR have daily sync with DockerHub, so don't worry about the image not being updated.
@@ -226,13 +223,14 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: ${{ env.IMAGE_NAME }}
src-image-name: ${{ vars.NIGHTLY_BUILD_IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
@@ -242,15 +240,16 @@ jobs:
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-to-s3: false
dev-mode: false
update-version-info: false # Don't update version info in S3.
push-latest-tag: true
push-latest-tag: false
stop-linux-amd64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-amd64-artifacts,
@@ -260,6 +259,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -275,7 +275,7 @@ jobs:
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-arm64-artifacts,
@@ -285,6 +285,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -302,11 +303,15 @@ jobs:
needs: [
release-images-to-dockerhub
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
issues: write
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status

View File

@@ -9,19 +9,17 @@ concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
permissions:
issues: write
jobs:
sqlness-test:
name: Run sqlness test
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Check install.sh
run: ./.github/scripts/check-install-script.sh
@@ -46,9 +44,14 @@ jobs:
name: Sqlness tests on Windows
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: windows-2022-8-cores
permissions:
issues: write
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- uses: arduino/setup-protoc@v3
with:
@@ -76,6 +79,9 @@ jobs:
steps:
- run: git config --global core.autocrlf false
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- uses: arduino/setup-protoc@v3
with:
@@ -101,7 +107,6 @@ jobs:
CARGO_BUILD_RUSTFLAGS: "-C linker=lld-link"
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
RUSTUP_WINDOWS_PATH_ADD_BIN: 1 # Workaround for https://github.com/nextest-rs/nextest/issues/1493
GT_S3_BUCKET: ${{ vars.AWS_CI_TEST_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.AWS_CI_TEST_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.AWS_CI_TEST_SECRET_ACCESS_KEY }}
@@ -111,19 +116,23 @@ jobs:
cleanbuild-linux-nix:
name: Run clean build on Linux
runs-on: ubuntu-latest
timeout-minutes: 60
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
timeout-minutes: 45
steps:
- uses: actions/checkout@v4
- uses: cachix/install-nix-action@v27
with:
nix_path: nixpkgs=channel:nixos-24.11
- run: nix develop --command cargo build
fetch-depth: 0
persist-credentials: false
- uses: cachix/install-nix-action@v31
- run: nix develop --command cargo check --bin greptime
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
check-status:
name: Check status
needs: [sqlness-test, sqlness-windows, test-on-windows]
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
check-result: ${{ steps.set-check-result.outputs.check-result }}
steps:
@@ -136,11 +145,14 @@ jobs:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' && always() }} # Not requiring successful dependent jobs, always run.
name: Send notification to Greptime team
needs: [check-status]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status

42
.github/workflows/pr-labeling.yaml vendored Normal file
View File

@@ -0,0 +1,42 @@
name: 'PR Labeling'
on:
pull_request_target:
types:
- opened
- synchronize
- reopened
permissions:
contents: read
pull-requests: write
issues: write
jobs:
labeler:
runs-on: ubuntu-latest
steps:
- name: Checkout sources
uses: actions/checkout@v4
- uses: actions/labeler@v5
with:
configuration-path: ".github/labeler.yaml"
repo-token: "${{ secrets.GITHUB_TOKEN }}"
size-label:
runs-on: ubuntu-latest
steps:
- uses: pascalgn/size-label-action@v0.5.5
env:
GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
with:
sizes: >
{
"0": "XS",
"100": "S",
"300": "M",
"1000": "L",
"1500": "XL",
"2000": "XXL"
}

View File

@@ -24,12 +24,20 @@ on:
description: Release dev-builder-android image
required: false
default: false
update_dev_builder_image_tag:
type: boolean
description: Update the DEV_BUILDER_IMAGE_TAG in Makefile and create a PR
required: false
default: false
jobs:
release-dev-builder-images:
name: Release dev builder images
if: ${{ inputs.release_dev_builder_ubuntu_image || inputs.release_dev_builder_centos_image || inputs.release_dev_builder_android_image }} # Only manually trigger this job.
runs-on: ubuntu-20.04-16-cores
# The jobs are triggered by the following events:
# 1. Manually triggered workflow_dispatch event
# 2. Push event when the PR that modifies the `rust-toolchain.toml` or `docker/dev-builder/**` is merged to main
if: ${{ github.event_name == 'push' || inputs.release_dev_builder_ubuntu_image || inputs.release_dev_builder_centos_image || inputs.release_dev_builder_android_image }}
runs-on: ubuntu-latest
outputs:
version: ${{ steps.set-version.outputs.version }}
steps:
@@ -37,6 +45,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Configure build image version
id: set-version
@@ -56,13 +65,13 @@ jobs:
version: ${{ env.VERSION }}
dockerhub-image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
dockerhub-image-registry-token: ${{ secrets.DOCKERHUB_TOKEN }}
build-dev-builder-ubuntu: ${{ inputs.release_dev_builder_ubuntu_image }}
build-dev-builder-centos: ${{ inputs.release_dev_builder_centos_image }}
build-dev-builder-android: ${{ inputs.release_dev_builder_android_image }}
build-dev-builder-ubuntu: ${{ inputs.release_dev_builder_ubuntu_image || github.event_name == 'push' }}
build-dev-builder-centos: ${{ inputs.release_dev_builder_centos_image || github.event_name == 'push' }}
build-dev-builder-android: ${{ inputs.release_dev_builder_android_image || github.event_name == 'push' }}
release-dev-builder-images-ecr:
name: Release dev builder images to AWS ECR
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
release-dev-builder-images
]
@@ -84,52 +93,70 @@ jobs:
- name: Push dev-builder-ubuntu image
shell: bash
if: ${{ inputs.release_dev_builder_ubuntu_image }}
if: ${{ inputs.release_dev_builder_ubuntu_image || github.event_name == 'push' }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ECR_IMAGE_REGISTRY: ${{ vars.ECR_IMAGE_REGISTRY }}
ECR_IMAGE_NAMESPACE: ${{ vars.ECR_IMAGE_NAMESPACE }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:latest \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-ubuntu:latest
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-ubuntu:latest \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-ubuntu:latest
- name: Push dev-builder-centos image
shell: bash
if: ${{ inputs.release_dev_builder_centos_image }}
if: ${{ inputs.release_dev_builder_centos_image || github.event_name == 'push' }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ECR_IMAGE_REGISTRY: ${{ vars.ECR_IMAGE_REGISTRY }}
ECR_IMAGE_NAMESPACE: ${{ vars.ECR_IMAGE_NAMESPACE }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:latest \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-centos:latest
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-centos:latest \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-centos:latest
- name: Push dev-builder-android image
shell: bash
if: ${{ inputs.release_dev_builder_android_image }}
if: ${{ inputs.release_dev_builder_android_image || github.event_name == 'push' }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ECR_IMAGE_REGISTRY: ${{ vars.ECR_IMAGE_REGISTRY }}
ECR_IMAGE_NAMESPACE: ${{ vars.ECR_IMAGE_NAMESPACE }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:latest \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-android:latest
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-android:latest \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-android:latest
release-dev-builder-images-cn: # Note: Be careful issue: https://github.com/containers/skopeo/issues/1874 and we decide to use the latest stable skopeo container.
name: Release dev builder images to CN region
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
release-dev-builder-images
]
@@ -143,30 +170,63 @@ jobs:
- name: Push dev-builder-ubuntu image
shell: bash
if: ${{ inputs.release_dev_builder_ubuntu_image }}
if: ${{ inputs.release_dev_builder_ubuntu_image || github.event_name == 'push' }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ACR_IMAGE_REGISTRY: ${{ vars.ACR_IMAGE_REGISTRY }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION \
docker://$ACR_IMAGE_REGISTRY/$IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION
- name: Push dev-builder-centos image
shell: bash
if: ${{ inputs.release_dev_builder_centos_image }}
if: ${{ inputs.release_dev_builder_centos_image || github.event_name == 'push' }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ACR_IMAGE_REGISTRY: ${{ vars.ACR_IMAGE_REGISTRY }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION \
docker://$ACR_IMAGE_REGISTRY/$IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION
- name: Push dev-builder-android image
shell: bash
if: ${{ inputs.release_dev_builder_android_image }}
if: ${{ inputs.release_dev_builder_android_image || github.event_name == 'push' }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ACR_IMAGE_REGISTRY: ${{ vars.ACR_IMAGE_REGISTRY }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION \
docker://$ACR_IMAGE_REGISTRY/$IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION
update-dev-builder-image-tag:
name: Update dev-builder image tag
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
if: ${{ github.event_name == 'push' || inputs.update_dev_builder_image_tag }}
needs: [
release-dev-builder-images
]
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Update dev-builder image tag
shell: bash
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
./.github/scripts/update-dev-builder-version.sh ${{ needs.release-dev-builder-images.outputs.version }}

View File

@@ -18,11 +18,11 @@ on:
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ubuntu-22.04
- ubuntu-22.04-8-cores
- ubuntu-22.04-16-cores
- ubuntu-22.04-32-cores
- ubuntu-22.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
@@ -88,21 +88,14 @@ env:
# Controls whether to run tests, include unit-test, integration-test and sqlness.
DISABLE_RUN_TESTS: ${{ inputs.skip_test || vars.DEFAULT_SKIP_TEST }}
# The scheduled version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD', like v0.2.0-nigthly-20230313;
# The scheduled version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD', like v0.2.0-nightly-20230313;
NIGHTLY_RELEASE_PREFIX: nightly
# Note: The NEXT_RELEASE_VERSION should be modified manually by every formal release.
NEXT_RELEASE_VERSION: v0.12.0
# Permission reference: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
permissions:
issues: write # Allows the action to create issues for cyborg.
contents: write # Allows the action to create a release.
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
@@ -117,11 +110,14 @@ jobs:
# The 'version' use as the global tag name of the release workflow.
version: ${{ steps.create-version.outputs.version }}
should-push-latest-tag: ${{ steps.check-version.outputs.should-push-latest-tag }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Check Rust toolchain version
shell: bash
@@ -130,7 +126,7 @@ jobs:
# The create-version will create a global variable named 'version' in the global workflows.
# - If it's a tag push release, the version is the tag name(${{ github.ref_name }});
# - If it's a scheduled release, the version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-$buildTime', like v0.2.0-nigthly-20230313;
# - If it's a scheduled release, the version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-$buildTime', like v0.2.0-nightly-20230313;
# - If it's a manual release, the version is '${{ env.NEXT_RELEASE_VERSION }}-<short-git-sha>-YYYYMMDDSS', like v0.2.0-e5b243c-2023071245;
- name: Create version
id: create-version
@@ -139,9 +135,13 @@ jobs:
env:
GITHUB_EVENT_NAME: ${{ github.event_name }}
GITHUB_REF_NAME: ${{ github.ref_name }}
NEXT_RELEASE_VERSION: ${{ env.NEXT_RELEASE_VERSION }}
NIGHTLY_RELEASE_PREFIX: ${{ env.NIGHTLY_RELEASE_PREFIX }}
- name: Check version
id: check-version
run: |
./.github/scripts/check-version.sh "${{ steps.create-version.outputs.version }}"
- name: Allocate linux-amd64 runner
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
@@ -181,6 +181,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -202,6 +203,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -237,6 +239,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-macos-artifacts
with:
@@ -276,6 +279,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-windows-artifacts
with:
@@ -299,22 +303,25 @@ jobs:
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-2004-16-cores
runs-on: ubuntu-latest
outputs:
build-image-result: ${{ steps.set-build-image-result.outputs.build-image-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ vars.GREPTIMEDB_IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: ${{ needs.allocate-runners.outputs.should-push-latest-tag == 'true' && github.ref_type == 'tag' && !contains(github.ref_name, 'nightly') && github.event_name != 'schedule' }}
- name: Set build image result
id: set-build-image-result
@@ -332,7 +339,7 @@ jobs:
build-windows-artifacts,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest-16-cores
# When we push to ACR, it's easy to fail due to some unknown network issues.
# However, we don't want to fail the whole workflow because of this.
# The ACR have daily sync with DockerHub, so don't worry about the image not being updated.
@@ -341,13 +348,14 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: greptimedb
src-image-name: ${{ vars.GREPTIMEDB_IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
@@ -358,8 +366,9 @@ jobs:
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
dev-mode: false
upload-to-s3: true
update-version-info: true
push-latest-tag: true
push-latest-tag: ${{ needs.allocate-runners.outputs.should-push-latest-tag == 'true' && github.ref_type == 'tag' && !contains(github.ref_name, 'nightly') && github.event_name != 'schedule' }}
publish-github-release:
name: Create GitHub release and upload artifacts
@@ -372,11 +381,12 @@ jobs:
build-windows-artifacts,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Publish GitHub release
uses: ./.github/actions/publish-github-release
@@ -385,12 +395,12 @@ jobs:
### Stop runners ###
# It's very necessary to split the job of releasing runners into 'stop-linux-amd64-runner' and 'stop-linux-arm64-runner'.
# Because we can terminate the specified EC2 instance immediately after the job is finished without uncessary waiting.
# Because we can terminate the specified EC2 instance immediately after the job is finished without unnecessary waiting.
stop-linux-amd64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-amd64-artifacts,
@@ -400,6 +410,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -415,7 +426,7 @@ jobs:
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-arm64-artifacts,
@@ -425,6 +436,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -436,21 +448,73 @@ jobs:
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
bump-doc-version:
name: Bump doc version
bump-downstream-repo-versions:
name: Bump downstream repo versions
if: ${{ github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [allocate-runners]
runs-on: ubuntu-20.04
needs: [allocate-runners, publish-github-release]
runs-on: ubuntu-latest
# Permission reference: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
permissions:
issues: write # Allows the action to create issues for cyborg.
contents: write # Allows the action to create a release.
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Bump doc version
- name: Bump downstream repo versions
working-directory: cyborg
run: pnpm tsx bin/bump-doc-version.ts
run: pnpm tsx bin/bump-versions.ts
env:
TARGET_REPOS: website,docs,demo
VERSION: ${{ needs.allocate-runners.outputs.version }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
WEBSITE_REPO_TOKEN: ${{ secrets.WEBSITE_REPO_TOKEN }}
DOCS_REPO_TOKEN: ${{ secrets.DOCS_REPO_TOKEN }}
DEMO_REPO_TOKEN: ${{ secrets.DEMO_REPO_TOKEN }}
bump-helm-charts-version:
name: Bump helm charts version
if: ${{ github.ref_type == 'tag' && !contains(github.ref_name, 'nightly') && github.event_name != 'schedule' }}
needs: [allocate-runners, publish-github-release]
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Bump helm charts version
env:
GITHUB_TOKEN: ${{ secrets.HELM_CHARTS_REPO_TOKEN }}
VERSION: ${{ needs.allocate-runners.outputs.version }}
run: |
./.github/scripts/update-helm-charts-version.sh
bump-homebrew-greptime-version:
name: Bump homebrew greptime version
if: ${{ github.ref_type == 'tag' && !contains(github.ref_name, 'nightly') && github.event_name != 'schedule' }}
needs: [allocate-runners, publish-github-release]
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Bump homebrew greptime version
env:
GITHUB_TOKEN: ${{ secrets.HOMEBREW_GREPTIME_REPO_TOKEN }}
VERSION: ${{ needs.allocate-runners.outputs.version }}
run: |
./.github/scripts/update-homebrew-greptme-version.sh
notification:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' && (github.event_name == 'push' || github.event_name == 'schedule') && always() }}
@@ -460,11 +524,18 @@ jobs:
build-macos-artifacts,
build-windows-artifacts,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
# Permission reference: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
permissions:
issues: write # Allows the action to create issues for cyborg.
contents: write # Allows the action to create a release.
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status

View File

@@ -4,18 +4,20 @@ on:
- cron: '4 2 * * *'
workflow_dispatch:
permissions:
contents: read
issues: write
pull-requests: write
jobs:
maintenance:
name: Periodic Maintenance
runs-on: ubuntu-latest
permissions:
contents: read
issues: write
pull-requests: write
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Do Maintenance
working-directory: cyborg

View File

@@ -1,15 +1,24 @@
name: "Semantic Pull Request"
on:
pull_request_target:
pull_request:
types:
- opened
- reopened
- edited
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
permissions:
issues: write
contents: write
pull-requests: write
jobs:
check:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4

7
.gitignore vendored
View File

@@ -54,3 +54,10 @@ tests-fuzz/corpus/
# Nix
.direnv
.envrc
## default data home
greptimedb_data
# github
!/.github

View File

@@ -3,30 +3,28 @@
## Individual Committers (in alphabetical order)
* [CookiePieWw](https://github.com/CookiePieWw)
* [KKould](https://github.com/KKould)
* [NiwakaDev](https://github.com/NiwakaDev)
* [etolbakov](https://github.com/etolbakov)
* [irenjj](https://github.com/irenjj)
* [tisonkun](https://github.com/tisonkun)
* [KKould](https://github.com/KKould)
* [Lanqing Yang](https://github.com/lyang24)
* [NiwakaDev](https://github.com/NiwakaDev)
* [tisonkun](https://github.com/tisonkun)
## Team Members (in alphabetical order)
* [Breeze-P](https://github.com/Breeze-P)
* [GrepTime](https://github.com/GrepTime)
* [MichaelScofield](https://github.com/MichaelScofield)
* [Wenjie0329](https://github.com/Wenjie0329)
* [WenyXu](https://github.com/WenyXu)
* [ZonaHex](https://github.com/ZonaHex)
* [apdong2022](https://github.com/apdong2022)
* [beryl678](https://github.com/beryl678)
* [Breeze-P](https://github.com/Breeze-P)
* [daviderli614](https://github.com/daviderli614)
* [discord9](https://github.com/discord9)
* [evenyag](https://github.com/evenyag)
* [fengjiachun](https://github.com/fengjiachun)
* [fengys1996](https://github.com/fengys1996)
* [GrepTime](https://github.com/GrepTime)
* [holalengyu](https://github.com/holalengyu)
* [killme2008](https://github.com/killme2008)
* [MichaelScofield](https://github.com/MichaelScofield)
* [nicecui](https://github.com/nicecui)
* [paomian](https://github.com/paomian)
* [shuiyisong](https://github.com/shuiyisong)
@@ -34,11 +32,14 @@
* [sunng87](https://github.com/sunng87)
* [v0y4g3r](https://github.com/v0y4g3r)
* [waynexia](https://github.com/waynexia)
* [Wenjie0329](https://github.com/Wenjie0329)
* [WenyXu](https://github.com/WenyXu)
* [xtang](https://github.com/xtang)
* [zhaoyingnan01](https://github.com/zhaoyingnan01)
* [zhongzc](https://github.com/zhongzc)
* [ZonaHex](https://github.com/ZonaHex)
* [zyy17](https://github.com/zyy17)
## All Contributors
[![All Contributors](https://contrib.rocks/image?repo=GreptimeTeam/greptimedb)](https://github.com/GreptimeTeam/greptimedb/graphs/contributors)
To see the full list of contributors, please visit our [Contributors page](https://github.com/GreptimeTeam/greptimedb/graphs/contributors)

964
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -67,7 +67,7 @@ members = [
resolver = "2"
[workspace.package]
version = "0.12.0"
version = "0.12.2"
edition = "2021"
license = "Apache-2.0"
@@ -81,6 +81,7 @@ rust.unknown_lints = "deny"
rust.unexpected_cfgs = { level = "warn", check-cfg = ['cfg(tokio_unstable)'] }
[workspace.dependencies]
# DO_NOT_REMOVE_THIS: BEGIN_OF_EXTERNAL_DEPENDENCIES
# We turn off default-features for some dependencies here so the workspaces which inherit them can
# selectively turn them on if needed, since we can override default-features = true (from false)
# for the inherited dependency but cannot do the reverse (override from true to false).
@@ -106,6 +107,7 @@ bitflags = "2.4.1"
bytemuck = "1.12"
bytes = { version = "1.7", features = ["serde"] }
chrono = { version = "0.4", features = ["serde"] }
chrono-tz = "0.10.1"
clap = { version = "4.4", features = ["derive"] }
config = "0.13.0"
crossbeam-utils = "0.8"
@@ -127,7 +129,7 @@ etcd-client = "0.14"
fst = "0.4.7"
futures = "0.3"
futures-util = "0.3"
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "683e9d10ae7f3dfb8aaabd89082fc600c17e3795" }
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "072ce580502e015df1a6b03a185b60309a7c2a7a" }
hex = "0.4"
http = "1"
humantime = "2.1"
@@ -138,8 +140,8 @@ itertools = "0.10"
jsonb = { git = "https://github.com/databendlabs/jsonb.git", rev = "8c8d2fc294a39f3ff08909d60f718639cfba3875", default-features = false }
lazy_static = "1.4"
local-ip-address = "0.6"
loki-api = { git = "https://github.com/shuiyisong/tracing-loki", branch = "chore/prost_version" }
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "a10facb353b41460eeb98578868ebf19c2084fac" }
loki-proto = { git = "https://github.com/GreptimeTeam/loki-proto.git", rev = "1434ecf23a2654025d86188fb5205e7a74b225d3" }
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "5618e779cf2bb4755b499c630fba4c35e91898cb" }
mockall = "0.11.4"
moka = "0.12"
nalgebra = "0.33"
@@ -158,7 +160,9 @@ parquet = { version = "53.0.0", default-features = false, features = ["arrow", "
paste = "1.0"
pin-project = "1.0"
prometheus = { version = "0.13.3", features = ["process"] }
promql-parser = { version = "0.4.3", features = ["ser"] }
promql-parser = { git = "https://github.com/GreptimeTeam/promql-parser.git", features = [
"ser",
], rev = "27abb8e16003a50c720f00d6c85f41f5fa2a2a8e" }
prost = "0.13"
raft-engine = { version = "0.4.1", default-features = false }
rand = "0.8"
@@ -207,6 +211,7 @@ tracing-subscriber = { version = "0.3", features = ["env-filter", "json", "fmt"]
typetag = "0.2"
uuid = { version = "1.7", features = ["serde", "v4", "fast-rng"] }
zstd = "0.13"
# DO_NOT_REMOVE_THIS: END_OF_EXTERNAL_DEPENDENCIES
## workspaces members
api = { path = "src/api" }
@@ -278,12 +283,10 @@ tokio-rustls = { git = "https://github.com/GreptimeTeam/tokio-rustls", rev = "46
# This is commented, since we are not using aws-lc-sys, if we need to use it, we need to uncomment this line or use a release after this commit, or it wouldn't compile with gcc < 8.1
# see https://github.com/aws/aws-lc-rs/pull/526
# aws-lc-sys = { git ="https://github.com/aws/aws-lc-rs", rev = "556558441e3494af4b156ae95ebc07ebc2fd38aa" }
# Apply a fix for pprof for unaligned pointer access
pprof = { git = "https://github.com/GreptimeTeam/pprof-rs", rev = "1bd1e21" }
[workspace.dependencies.meter-macros]
git = "https://github.com/GreptimeTeam/greptime-meter.git"
rev = "a10facb353b41460eeb98578868ebf19c2084fac"
rev = "5618e779cf2bb4755b499c630fba4c35e91898cb"
[profile.release]
debug = 1

View File

@@ -1,3 +1,6 @@
[target.aarch64-unknown-linux-gnu]
image = "ghcr.io/cross-rs/aarch64-unknown-linux-gnu:0.2.5"
[build]
pre-build = [
"dpkg --add-architecture $CROSS_DEB_ARCH",
@@ -5,3 +8,8 @@ pre-build = [
"curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip && unzip protoc-3.15.8-linux-x86_64.zip -d /usr/",
"chmod a+x /usr/bin/protoc && chmod -R a+rx /usr/include/google",
]
[build.env]
passthrough = [
"JEMALLOC_SYS_WITH_LG_PAGE",
]

View File

@@ -13,7 +13,7 @@
<a href="https://greptime.com/product/cloud">GreptimeCloud</a> |
<a href="https://docs.greptime.com/">User Guide</a> |
<a href="https://greptimedb.rs/">API Docs</a> |
<a href="https://github.com/GreptimeTeam/greptimedb/issues/3412">Roadmap 2024</a>
<a href="https://github.com/GreptimeTeam/greptimedb/issues/5446">Roadmap 2025</a>
</h4>
<a href="https://github.com/GreptimeTeam/greptimedb/releases/latest">
@@ -116,7 +116,7 @@ docker run -p 127.0.0.1:4000-4003:4000-4003 \
--name greptime --rm \
greptime/greptimedb:latest standalone start \
--http-addr 0.0.0.0:4000 \
--rpc-addr 0.0.0.0:4001 \
--rpc-bind-addr 0.0.0.0:4001 \
--mysql-addr 0.0.0.0:4002 \
--postgres-addr 0.0.0.0:4003
```

View File

@@ -29,7 +29,7 @@
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `http.cors_allowed_origins` | Array | Unset | Customize allowed origins for HTTP CORS. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.mode` | String | `disable` | TLS mode. |
@@ -40,6 +40,7 @@
| `mysql.enable` | Bool | `true` | Whether to enable. |
| `mysql.addr` | String | `127.0.0.1:4002` | The addr to bind the MySQL server. |
| `mysql.runtime_size` | Integer | `2` | The number of server worker threads. |
| `mysql.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `mysql.tls` | -- | -- | -- |
| `mysql.tls.mode` | String | `disable` | TLS mode, refer to https://www.postgresql.org/docs/current/libpq-ssl.html<br/>- `disable` (default value)<br/>- `prefer`<br/>- `require`<br/>- `verify-ca`<br/>- `verify-full` |
| `mysql.tls.cert_path` | String | Unset | Certificate file path. |
@@ -49,6 +50,7 @@
| `postgres.enable` | Bool | `true` | Whether to enable |
| `postgres.addr` | String | `127.0.0.1:4003` | The addr to bind the PostgresSQL server. |
| `postgres.runtime_size` | Integer | `2` | The number of server worker threads. |
| `postgres.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `postgres.tls` | -- | -- | PostgresSQL server TLS options, see `mysql.tls` section. |
| `postgres.tls.mode` | String | `disable` | TLS mode. |
| `postgres.tls.cert_path` | String | Unset | Certificate file path. |
@@ -58,6 +60,8 @@
| `opentsdb.enable` | Bool | `true` | Whether to enable OpenTSDB put in HTTP API. |
| `influxdb` | -- | -- | InfluxDB protocol options. |
| `influxdb.enable` | Bool | `true` | Whether to enable InfluxDB protocol in HTTP API. |
| `jaeger` | -- | -- | Jaeger protocol options. |
| `jaeger.enable` | Bool | `true` | Whether to enable Jaeger protocol in HTTP API. |
| `prom_store` | -- | -- | Prometheus remote storage options |
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
@@ -65,8 +69,8 @@
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
@@ -88,8 +92,9 @@
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries during read WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `metadata_store` | -- | -- | Metadata storage options. |
| `metadata_store.file_size` | String | `256MB` | Kv file size in bytes. |
| `metadata_store.purge_threshold` | String | `4GB` | Kv purge threshold. |
| `metadata_store.file_size` | String | `64MB` | The size of the metadata store log file. |
| `metadata_store.purge_threshold` | String | `256MB` | The threshold of the metadata store size to trigger a purge. |
| `metadata_store.purge_interval` | String | `1m` | The interval of the metadata store to trigger a purge. |
| `procedure` | -- | -- | Procedure storage options. |
| `procedure.max_retry_times` | Integer | `3` | Procedure max retry time. |
| `procedure.retry_delay` | String | `500ms` | Initial retry delay of procedures, increases exponentially |
@@ -147,6 +152,7 @@
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
@@ -221,8 +227,8 @@
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `http.cors_allowed_origins` | Array | Unset | Customize allowed origins for HTTP CORS. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1:4001` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:4001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.mode` | String | `disable` | TLS mode. |
@@ -233,6 +239,7 @@
| `mysql.enable` | Bool | `true` | Whether to enable. |
| `mysql.addr` | String | `127.0.0.1:4002` | The addr to bind the MySQL server. |
| `mysql.runtime_size` | Integer | `2` | The number of server worker threads. |
| `mysql.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `mysql.tls` | -- | -- | -- |
| `mysql.tls.mode` | String | `disable` | TLS mode, refer to https://www.postgresql.org/docs/current/libpq-ssl.html<br/>- `disable` (default value)<br/>- `prefer`<br/>- `require`<br/>- `verify-ca`<br/>- `verify-full` |
| `mysql.tls.cert_path` | String | Unset | Certificate file path. |
@@ -242,6 +249,7 @@
| `postgres.enable` | Bool | `true` | Whether to enable |
| `postgres.addr` | String | `127.0.0.1:4003` | The addr to bind the PostgresSQL server. |
| `postgres.runtime_size` | Integer | `2` | The number of server worker threads. |
| `postgres.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `postgres.tls` | -- | -- | PostgresSQL server TLS options, see `mysql.tls` section. |
| `postgres.tls.mode` | String | `disable` | TLS mode. |
| `postgres.tls.cert_path` | String | Unset | Certificate file path. |
@@ -251,6 +259,8 @@
| `opentsdb.enable` | Bool | `true` | Whether to enable OpenTSDB put in HTTP API. |
| `influxdb` | -- | -- | InfluxDB protocol options. |
| `influxdb.enable` | Bool | `true` | Whether to enable InfluxDB protocol in HTTP API. |
| `jaeger` | -- | -- | Jaeger protocol options. |
| `jaeger.enable` | Bool | `true` | Whether to enable Jaeger protocol in HTTP API. |
| `prom_store` | -- | -- | Prometheus remote storage options |
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
@@ -300,7 +310,7 @@
| --- | -----| ------- | ----------- |
| `data_home` | String | `/tmp/metasrv/` | The working home directory. |
| `bind_addr` | String | `127.0.0.1:3002` | The bind address of metasrv. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for the frontend and datanode to connect to metasrv.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `bind_addr`. |
| `store_addrs` | Array | -- | Store server address default to etcd store.<br/>For postgres store, the format is:<br/>"password=password dbname=postgres user=postgres host=localhost port=5432"<br/>For etcd store, the format is:<br/>"127.0.0.1:2379" |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `backend` | String | `etcd_store` | The datastore for meta server.<br/>Available values:<br/>- `etcd_store` (default value)<br/>- `memory_store`<br/>- `postgres_store` |
@@ -309,6 +319,7 @@
| `selector` | String | `round_robin` | Datanode selector type.<br/>- `round_robin` (default value)<br/>- `lease_based`<br/>- `load_based`<br/>For details, please see "https://docs.greptime.com/developer-guide/metasrv/selector". |
| `use_memory_store` | Bool | `false` | Store data in memory. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running on cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., s3). |
| `node_max_idle_time` | String | `24hours` | Max allowed idle time before removing node info from metasrv memory. |
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. Enabled by default. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
@@ -376,19 +387,14 @@
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum current queries allowed to be executed. Zero means unlimited. |
| `rpc_addr` | String | Unset | Deprecated, use `grpc.addr` instead. |
| `rpc_hostname` | String | Unset | Deprecated, use `grpc.hostname` instead. |
| `rpc_runtime_size` | Integer | Unset | Deprecated, use `grpc.runtime_size` instead. |
| `rpc_max_recv_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_recv_message_size` instead. |
| `rpc_max_send_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_send_message_size` instead. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1:3001` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:3001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
@@ -487,6 +493,7 @@
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
@@ -549,8 +556,8 @@
| `flow` | -- | -- | flow engine options. |
| `flow.num_workers` | Integer | `0` | The number of flow worker in flownode.<br/>Not setting(or set to 0) this value will use the number of CPU cores divided by 2. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:6800` | The address advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.runtime_size` | Integer | `2` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |

View File

@@ -19,26 +19,6 @@ init_regions_parallelism = 16
## The maximum current queries allowed to be executed. Zero means unlimited.
max_concurrent_queries = 0
## Deprecated, use `grpc.addr` instead.
## @toml2docs:none-default
rpc_addr = "127.0.0.1:3001"
## Deprecated, use `grpc.hostname` instead.
## @toml2docs:none-default
rpc_hostname = "127.0.0.1"
## Deprecated, use `grpc.runtime_size` instead.
## @toml2docs:none-default
rpc_runtime_size = 8
## Deprecated, use `grpc.rpc_max_recv_message_size` instead.
## @toml2docs:none-default
rpc_max_recv_message_size = "512MB"
## Deprecated, use `grpc.rpc_max_send_message_size` instead.
## @toml2docs:none-default
rpc_max_send_message_size = "512MB"
## Enable telemetry to collect anonymous usage data. Enabled by default.
#+ enable_telemetry = true
@@ -56,10 +36,11 @@ body_limit = "64MB"
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:3001"
## The hostname advertised to the metasrv,
## and used for connections from outside the host
hostname = "127.0.0.1:3001"
bind_addr = "127.0.0.1:3001"
## The address advertised to the metasrv, and used for connections from outside the host.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `grpc.bind_addr`.
server_addr = "127.0.0.1:3001"
## The number of server worker threads.
runtime_size = 8
## The maximum receive message size for gRPC server.
@@ -250,6 +231,7 @@ overwrite_entry_start_id = false
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# enable_virtual_host_style = false
# Example of using Oss as the storage.
# [storage]
@@ -516,6 +498,11 @@ aux_path = ""
## The max capacity of the staging directory.
staging_size = "2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Setting it to "0s" to disable TTL.
staging_ttl = "7d"
## Cache size for inverted index metadata.
metadata_cache_size = "64MiB"

View File

@@ -14,10 +14,10 @@ node_id = 14
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:6800"
## The hostname advertised to the metasrv,
bind_addr = "127.0.0.1:6800"
## The address advertised to the metasrv,
## and used for connections from outside the host
hostname = "127.0.0.1"
server_addr = "127.0.0.1:6800"
## The number of server worker threads.
runtime_size = 2
## The maximum receive message size for gRPC server.

View File

@@ -41,10 +41,11 @@ cors_allowed_origins = ["https://example.com"]
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:4001"
## The hostname advertised to the metasrv,
## and used for connections from outside the host
hostname = "127.0.0.1:4001"
bind_addr = "127.0.0.1:4001"
## The address advertised to the metasrv, and used for connections from outside the host.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `grpc.bind_addr`.
server_addr = "127.0.0.1:4001"
## The number of server worker threads.
runtime_size = 8
@@ -73,6 +74,9 @@ enable = true
addr = "127.0.0.1:4002"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
# MySQL server TLS options.
[mysql.tls]
@@ -104,6 +108,9 @@ enable = true
addr = "127.0.0.1:4003"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
## PostgresSQL server TLS options, see `mysql.tls` section.
[postgres.tls]
@@ -131,6 +138,11 @@ enable = true
## Whether to enable InfluxDB protocol in HTTP API.
enable = true
## Jaeger protocol options.
[jaeger]
## Whether to enable Jaeger protocol in HTTP API.
enable = true
## Prometheus remote storage options
[prom_store]
## Whether to enable Prometheus remote write and read in HTTP API.

View File

@@ -4,7 +4,9 @@ data_home = "/tmp/metasrv/"
## The bind address of metasrv.
bind_addr = "127.0.0.1:3002"
## The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost.
## The communication server address for the frontend and datanode to connect to metasrv.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `bind_addr`.
server_addr = "127.0.0.1:3002"
## Store server address default to etcd store.
@@ -48,6 +50,9 @@ use_memory_store = false
## - Using shared storage (e.g., s3).
enable_region_failover = false
## Max allowed idle time before removing node info from metasrv memory.
node_max_idle_time = "24hours"
## Whether to enable greptimedb telemetry. Enabled by default.
#+ enable_telemetry = true

View File

@@ -49,7 +49,7 @@ cors_allowed_origins = ["https://example.com"]
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:4001"
bind_addr = "127.0.0.1:4001"
## The number of server worker threads.
runtime_size = 8
@@ -78,6 +78,9 @@ enable = true
addr = "127.0.0.1:4002"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
# MySQL server TLS options.
[mysql.tls]
@@ -109,6 +112,9 @@ enable = true
addr = "127.0.0.1:4003"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
## PostgresSQL server TLS options, see `mysql.tls` section.
[postgres.tls]
@@ -136,6 +142,11 @@ enable = true
## Whether to enable InfluxDB protocol in HTTP API.
enable = true
## Jaeger protocol options.
[jaeger]
## Whether to enable Jaeger protocol in HTTP API.
enable = true
## Prometheus remote storage options
[prom_store]
## Whether to enable Prometheus remote write and read in HTTP API.
@@ -159,11 +170,11 @@ dir = "/tmp/greptimedb/wal"
## **It's only used when the provider is `raft_engine`**.
file_size = "128MB"
## The threshold of the WAL size to trigger a flush.
## The threshold of the WAL size to trigger a purge.
## **It's only used when the provider is `raft_engine`**.
purge_threshold = "1GB"
## The interval to trigger a flush.
## The interval to trigger a purge.
## **It's only used when the provider is `raft_engine`**.
purge_interval = "1m"
@@ -278,10 +289,12 @@ overwrite_entry_start_id = false
## Metadata storage options.
[metadata_store]
## Kv file size in bytes.
file_size = "256MB"
## Kv purge threshold.
purge_threshold = "4GB"
## The size of the metadata store log file.
file_size = "64MB"
## The threshold of the metadata store size to trigger a purge.
purge_threshold = "256MB"
## The interval of the metadata store to trigger a purge.
purge_interval = "1m"
## Procedure storage options.
[procedure]
@@ -305,6 +318,7 @@ retry_delay = "500ms"
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# enable_virtual_host_style = false
# Example of using Oss as the storage.
# [storage]
@@ -571,6 +585,11 @@ aux_path = ""
## The max capacity of the staging directory.
staging_size = "2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Setting it to "0s" to disable TTL.
staging_ttl = "7d"
## Cache size for inverted index metadata.
metadata_cache_size = "64MiB"

156
cyborg/bin/bump-versions.ts Normal file
View File

@@ -0,0 +1,156 @@
/*
* Copyright 2023 Greptime Team
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import * as core from "@actions/core";
import {obtainClient} from "@/common";
interface RepoConfig {
tokenEnv: string;
repo: string;
workflowLogic: (version: string) => [string, string] | null;
}
const REPO_CONFIGS: Record<string, RepoConfig> = {
website: {
tokenEnv: "WEBSITE_REPO_TOKEN",
repo: "website",
workflowLogic: (version: string) => {
// Skip nightly versions for website
if (version.includes('nightly')) {
console.log('Nightly version detected for website, skipping workflow trigger.');
return null;
}
return ['bump-patch-version.yml', version];
}
},
demo: {
tokenEnv: "DEMO_REPO_TOKEN",
repo: "demo-scene",
workflowLogic: (version: string) => {
// Skip nightly versions for demo
if (version.includes('nightly')) {
console.log('Nightly version detected for demo, skipping workflow trigger.');
return null;
}
return ['bump-patch-version.yml', version];
}
},
docs: {
tokenEnv: "DOCS_REPO_TOKEN",
repo: "docs",
workflowLogic: (version: string) => {
// Check if it's a nightly version
if (version.includes('nightly')) {
return ['bump-nightly-version.yml', version];
}
const parts = version.split('.');
if (parts.length !== 3) {
throw new Error('Invalid version format');
}
// If patch version (last number) is 0, it's a major version
// Return only major.minor version
if (parts[2] === '0') {
return ['bump-version.yml', `${parts[0]}.${parts[1]}`];
}
// Otherwise it's a patch version, use full version
return ['bump-patch-version.yml', version];
}
}
};
async function triggerWorkflow(repoConfig: RepoConfig, workflowId: string, version: string) {
const client = obtainClient(repoConfig.tokenEnv);
try {
await client.rest.actions.createWorkflowDispatch({
owner: "GreptimeTeam",
repo: repoConfig.repo,
workflow_id: workflowId,
ref: "main",
inputs: {
version,
},
});
console.log(`Successfully triggered ${workflowId} workflow for ${repoConfig.repo} with version ${version}`);
} catch (error) {
core.setFailed(`Failed to trigger workflow for ${repoConfig.repo}: ${error.message}`);
throw error;
}
}
async function processRepo(repoName: string, version: string) {
const repoConfig = REPO_CONFIGS[repoName];
if (!repoConfig) {
throw new Error(`Unknown repository: ${repoName}`);
}
try {
const workflowResult = repoConfig.workflowLogic(version);
if (workflowResult === null) {
// Skip this repo (e.g., nightly version for website)
return;
}
const [workflowId, apiVersion] = workflowResult;
await triggerWorkflow(repoConfig, workflowId, apiVersion);
} catch (error) {
core.setFailed(`Error processing ${repoName} with version ${version}: ${error.message}`);
throw error;
}
}
async function main() {
const version = process.env.VERSION;
if (!version) {
core.setFailed("VERSION environment variable is required");
process.exit(1);
}
// Remove 'v' prefix if exists
const cleanVersion = version.startsWith('v') ? version.slice(1) : version;
// Get target repositories from environment variable
// Default to both if not specified
const targetRepos = process.env.TARGET_REPOS?.split(',').map(repo => repo.trim()) || ['website', 'docs'];
console.log(`Processing version ${cleanVersion} for repositories: ${targetRepos.join(', ')}`);
const errors: string[] = [];
// Process each repository
for (const repo of targetRepos) {
try {
await processRepo(repo, cleanVersion);
} catch (error) {
errors.push(`${repo}: ${error.message}`);
}
}
if (errors.length > 0) {
core.setFailed(`Failed to process some repositories: ${errors.join('; ')}`);
process.exit(1);
}
console.log('All repositories processed successfully');
}
// Execute main function
main().catch((error) => {
core.setFailed(`Unexpected error: ${error.message}`);
process.exit(1);
});

View File

@@ -55,12 +55,25 @@ async function main() {
await client.rest.issues.addLabels({
owner, repo, issue_number: number, labels: [labelDocsRequired],
})
// Get available assignees for the docs repo
const assigneesResponse = await docsClient.rest.issues.listAssignees({
owner: 'GreptimeTeam',
repo: 'docs',
})
const validAssignees = assigneesResponse.data.map(assignee => assignee.login)
core.info(`Available assignees: ${validAssignees.join(', ')}`)
// Check if the actor is a valid assignee, otherwise fallback to fengjiachun
const assignee = validAssignees.includes(actor) ? actor : 'fengjiachun'
core.info(`Assigning issue to: ${assignee}`)
await docsClient.rest.issues.create({
owner: 'GreptimeTeam',
repo: 'docs',
title: `Update docs for ${title}`,
body: `A document change request is generated from ${html_url}`,
assignee: actor,
assignee: assignee,
}).then((res) => {
core.info(`Created issue ${res.data}`)
})

View File

@@ -43,8 +43,8 @@ services:
command:
- metasrv
- start
- --bind-addr=0.0.0.0:3002
- --server-addr=metasrv:3002
- --rpc-bind-addr=0.0.0.0:3002
- --rpc-server-addr=metasrv:3002
- --store-addrs=etcd0:2379
- --http-addr=0.0.0.0:3000
healthcheck:
@@ -68,8 +68,8 @@ services:
- datanode
- start
- --node-id=0
- --rpc-addr=0.0.0.0:3001
- --rpc-hostname=datanode0:3001
- --rpc-bind-addr=0.0.0.0:3001
- --rpc-server-addr=datanode0:3001
- --metasrv-addrs=metasrv:3002
- --http-addr=0.0.0.0:5000
volumes:
@@ -98,7 +98,7 @@ services:
- start
- --metasrv-addrs=metasrv:3002
- --http-addr=0.0.0.0:4000
- --rpc-addr=0.0.0.0:4001
- --rpc-bind-addr=0.0.0.0:4001
- --mysql-addr=0.0.0.0:4002
- --postgres-addr=0.0.0.0:4003
healthcheck:
@@ -123,8 +123,8 @@ services:
- start
- --node-id=0
- --metasrv-addrs=metasrv:3002
- --rpc-addr=0.0.0.0:4004
- --rpc-hostname=flownode0:4004
- --rpc-bind-addr=0.0.0.0:4004
- --rpc-server-addr=flownode0:4004
- --http-addr=0.0.0.0:4005
depends_on:
frontend0:

View File

@@ -4,6 +4,16 @@ This crate provides an easy approach to dump memory profiling info.
## Prerequisites
### jemalloc
jeprof is already compiled in the target directory of GreptimeDB. You can find the binary and use it.
```
# find jeprof binary
find . -name 'jeprof'
# add executable permission
chmod +x <path_to_jeprof>
```
The path is usually under `./target/${PROFILE}/build/tikv-jemalloc-sys-${HASH}/out/build/bin/jeprof`.
The default version of jemalloc installed from the package manager may not have the `--collapsed` option.
You may need to check the whether the `jeprof` version is >= `5.3.0` if you want to install it from the package manager.
```bash
# for macOS
brew install jemalloc
@@ -23,7 +33,11 @@ curl https://raw.githubusercontent.com/brendangregg/FlameGraph/master/flamegraph
Start GreptimeDB instance with environment variables:
```bash
# for Linux
MALLOC_CONF=prof:true ./target/debug/greptime standalone start
# for macOS
_RJEM_MALLOC_CONF=prof:true ./target/debug/greptime standalone start
```
Dump memory profiling data through HTTP API:

Binary file not shown.

Before

Width:  |  Height:  |  Size: 36 KiB

After

Width:  |  Height:  |  Size: 25 KiB

BIN
docs/logo-text-padding.png Executable file → Normal file

Binary file not shown.

Before

Width:  |  Height:  |  Size: 25 KiB

After

Width:  |  Height:  |  Size: 21 KiB

File diff suppressed because it is too large Load Diff

View File

@@ -384,8 +384,8 @@
"rowHeight": 0.9,
"showValue": "auto",
"tooltip": {
"mode": "none",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -483,8 +483,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "10.2.3",
@@ -578,8 +578,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "10.2.3",
@@ -601,7 +601,7 @@
"type": "timeseries"
},
{
"collapsed": true,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
@@ -684,8 +684,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -878,8 +878,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1124,8 +1124,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1223,8 +1223,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1322,8 +1322,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1456,8 +1456,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1573,8 +1573,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1673,8 +1673,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1773,8 +1773,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1890,8 +1890,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2002,8 +2002,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2120,8 +2120,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2233,8 +2233,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2334,8 +2334,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2435,8 +2435,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2548,8 +2548,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2661,8 +2661,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2788,8 +2788,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2889,8 +2889,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2990,8 +2990,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3091,8 +3091,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3191,8 +3191,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3302,8 +3302,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3432,8 +3432,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3543,8 +3543,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3657,8 +3657,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3808,8 +3808,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3909,8 +3909,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -4011,8 +4011,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -4113,8 +4113,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [

View File

@@ -53,6 +53,54 @@ get_arch_type() {
esac
}
# Verify SHA256 checksum
verify_sha256() {
file="$1"
expected_sha256="$2"
if command -v sha256sum >/dev/null 2>&1; then
actual_sha256=$(sha256sum "$file" | cut -d' ' -f1)
elif command -v shasum >/dev/null 2>&1; then
actual_sha256=$(shasum -a 256 "$file" | cut -d' ' -f1)
else
echo "Warning: No SHA256 verification tool found (sha256sum or shasum). Skipping checksum verification."
return 0
fi
if [ "$actual_sha256" = "$expected_sha256" ]; then
echo "SHA256 checksum verified successfully."
return 0
else
echo "Error: SHA256 checksum verification failed!"
echo "Expected: $expected_sha256"
echo "Actual: $actual_sha256"
return 1
fi
}
# Prompt for user confirmation (compatible with different shells)
prompt_confirmation() {
message="$1"
printf "%s (y/N): " "$message"
# Try to read user input, fallback if read fails
answer=""
if read answer </dev/tty 2>/dev/null; then
case "$answer" in
[Yy]|[Yy][Ee][Ss])
return 0
;;
*)
return 1
;;
esac
else
echo ""
echo "Cannot read user input. Defaulting to No."
return 1
fi
}
download_artifact() {
if [ -n "${OS_TYPE}" ] && [ -n "${ARCH_TYPE}" ]; then
# Use the latest stable released version.
@@ -71,17 +119,104 @@ download_artifact() {
fi
echo "Downloading ${BIN}, OS: ${OS_TYPE}, Arch: ${ARCH_TYPE}, Version: ${VERSION}"
PACKAGE_NAME="${BIN}-${OS_TYPE}-${ARCH_TYPE}-${VERSION}.tar.gz"
PKG_NAME="${BIN}-${OS_TYPE}-${ARCH_TYPE}-${VERSION}"
PACKAGE_NAME="${PKG_NAME}.tar.gz"
SHA256_FILE="${PKG_NAME}.sha256sum"
if [ -n "${PACKAGE_NAME}" ]; then
wget "https://github.com/${GITHUB_ORG}/${GITHUB_REPO}/releases/download/${VERSION}/${PACKAGE_NAME}"
# Check if files already exist and prompt for override
if [ -f "${PACKAGE_NAME}" ]; then
echo "File ${PACKAGE_NAME} already exists."
if prompt_confirmation "Do you want to override it?"; then
echo "Overriding existing file..."
rm -f "${PACKAGE_NAME}"
else
echo "Skipping download. Using existing file."
fi
fi
if [ -f "${BIN}" ]; then
echo "Binary ${BIN} already exists."
if prompt_confirmation "Do you want to override it?"; then
echo "Will override existing binary..."
rm -f "${BIN}"
else
echo "Installation cancelled."
exit 0
fi
fi
# Download package if not exists
if [ ! -f "${PACKAGE_NAME}" ]; then
echo "Downloading ${PACKAGE_NAME}..."
# Use curl instead of wget for better compatibility
if command -v curl >/dev/null 2>&1; then
if ! curl -L -o "${PACKAGE_NAME}" "https://github.com/${GITHUB_ORG}/${GITHUB_REPO}/releases/download/${VERSION}/${PACKAGE_NAME}"; then
echo "Error: Failed to download ${PACKAGE_NAME}"
exit 1
fi
elif command -v wget >/dev/null 2>&1; then
if ! wget -O "${PACKAGE_NAME}" "https://github.com/${GITHUB_ORG}/${GITHUB_REPO}/releases/download/${VERSION}/${PACKAGE_NAME}"; then
echo "Error: Failed to download ${PACKAGE_NAME}"
exit 1
fi
else
echo "Error: Neither curl nor wget is available for downloading."
exit 1
fi
fi
# Download and verify SHA256 checksum
echo "Downloading SHA256 checksum..."
sha256_download_success=0
if command -v curl >/dev/null 2>&1; then
if curl -L -s -o "${SHA256_FILE}" "https://github.com/${GITHUB_ORG}/${GITHUB_REPO}/releases/download/${VERSION}/${SHA256_FILE}" 2>/dev/null; then
sha256_download_success=1
fi
elif command -v wget >/dev/null 2>&1; then
if wget -q -O "${SHA256_FILE}" "https://github.com/${GITHUB_ORG}/${GITHUB_REPO}/releases/download/${VERSION}/${SHA256_FILE}" 2>/dev/null; then
sha256_download_success=1
fi
fi
if [ $sha256_download_success -eq 1 ] && [ -f "${SHA256_FILE}" ]; then
expected_sha256=$(cat "${SHA256_FILE}" | cut -d' ' -f1)
if [ -n "$expected_sha256" ]; then
if ! verify_sha256 "${PACKAGE_NAME}" "${expected_sha256}"; then
echo "SHA256 verification failed. Removing downloaded file."
rm -f "${PACKAGE_NAME}" "${SHA256_FILE}"
exit 1
fi
else
echo "Warning: Could not parse SHA256 checksum from file."
fi
rm -f "${SHA256_FILE}"
else
echo "Warning: Could not download SHA256 checksum file. Skipping verification."
fi
# Extract the binary and clean the rest.
tar xvf "${PACKAGE_NAME}" && \
mv "${PACKAGE_NAME%.tar.gz}/${BIN}" "${PWD}" && \
rm -r "${PACKAGE_NAME}" && \
rm -r "${PACKAGE_NAME%.tar.gz}" && \
echo "Run './${BIN} --help' to get started"
echo "Extracting ${PACKAGE_NAME}..."
if ! tar xf "${PACKAGE_NAME}"; then
echo "Error: Failed to extract ${PACKAGE_NAME}"
exit 1
fi
# Find the binary in the extracted directory
extracted_dir="${PACKAGE_NAME%.tar.gz}"
if [ -f "${extracted_dir}/${BIN}" ]; then
mv "${extracted_dir}/${BIN}" "${PWD}/"
rm -f "${PACKAGE_NAME}"
rm -rf "${extracted_dir}"
chmod +x "${BIN}"
echo "Installation completed successfully!"
echo "Run './${BIN} --help' to get started"
else
echo "Error: Binary ${BIN} not found in extracted archive"
rm -f "${PACKAGE_NAME}"
rm -rf "${extracted_dir}"
exit 1
fi
fi
fi
}

View File

@@ -15,13 +15,10 @@ common-macro.workspace = true
common-time.workspace = true
datatypes.workspace = true
greptime-proto.workspace = true
paste = "1.0"
paste.workspace = true
prost.workspace = true
serde_json.workspace = true
snafu.workspace = true
[build-dependencies]
tonic-build = "0.11"
[dev-dependencies]
paste = "1.0"

View File

@@ -15,10 +15,10 @@
use std::collections::HashMap;
use datatypes::schema::{
ColumnDefaultConstraint, ColumnSchema, FulltextAnalyzer, FulltextOptions, COMMENT_KEY,
FULLTEXT_KEY, INVERTED_INDEX_KEY, SKIPPING_INDEX_KEY,
ColumnDefaultConstraint, ColumnSchema, FulltextAnalyzer, FulltextOptions, SkippingIndexType,
COMMENT_KEY, FULLTEXT_KEY, INVERTED_INDEX_KEY, SKIPPING_INDEX_KEY,
};
use greptime_proto::v1::Analyzer;
use greptime_proto::v1::{Analyzer, SkippingIndexType as PbSkippingIndexType};
use snafu::ResultExt;
use crate::error::{self, Result};
@@ -121,6 +121,13 @@ pub fn as_fulltext_option(analyzer: Analyzer) -> FulltextAnalyzer {
}
}
/// Tries to construct a `SkippingIndexType` from the given skipping index type.
pub fn as_skipping_index_type(skipping_index_type: PbSkippingIndexType) -> SkippingIndexType {
match skipping_index_type {
PbSkippingIndexType::BloomFilter => SkippingIndexType::BloomFilter,
}
}
#[cfg(test)]
mod tests {

View File

@@ -15,7 +15,7 @@ api.workspace = true
arrow.workspace = true
arrow-schema.workspace = true
async-stream.workspace = true
async-trait = "0.1"
async-trait.workspace = true
bytes.workspace = true
common-catalog.workspace = true
common-error.workspace = true
@@ -31,7 +31,7 @@ common-version.workspace = true
dashmap.workspace = true
datafusion.workspace = true
datatypes.workspace = true
futures = "0.3"
futures.workspace = true
futures-util.workspace = true
humantime.workspace = true
itertools.workspace = true
@@ -39,7 +39,7 @@ lazy_static.workspace = true
meta-client.workspace = true
moka = { workspace = true, features = ["future", "sync"] }
partition.workspace = true
paste = "1.0"
paste.workspace = true
prometheus.workspace = true
rustc-hash.workspace = true
serde_json.workspace = true
@@ -49,7 +49,7 @@ sql.workspace = true
store-api.workspace = true
table.workspace = true
tokio.workspace = true
tokio-stream = "0.1"
tokio-stream.workspace = true
[dev-dependencies]
cache.workspace = true

View File

@@ -228,12 +228,6 @@ impl InformationSchemaKeyColumnUsageBuilder {
let keys = &table_info.meta.primary_key_indices;
let schema = table.schema();
// For compatibility, use primary key columns as inverted index columns.
let pk_as_inverted_index = !schema
.column_schemas()
.iter()
.any(|c| c.has_inverted_index_key());
for (idx, column) in schema.column_schemas().iter().enumerate() {
let mut constraints = vec![];
if column.is_time_index() {
@@ -251,10 +245,6 @@ impl InformationSchemaKeyColumnUsageBuilder {
// TODO(dimbtp): foreign key constraint not supported yet
if keys.contains(&idx) {
constraints.push(PRI_CONSTRAINT_NAME);
if pk_as_inverted_index {
constraints.push(INVERTED_INDEX_CONSTRAINT_NAME);
}
}
if column.is_inverted_indexed() {
constraints.push(INVERTED_INDEX_CONSTRAINT_NAME);

View File

@@ -24,10 +24,11 @@ use common_meta::key::{TableMetadataManager, TableMetadataManagerRef};
use common_meta::kv_backend::etcd::EtcdStore;
use common_meta::kv_backend::memory::MemoryKvBackend;
#[cfg(feature = "pg_kvbackend")]
use common_meta::kv_backend::postgres::PgStore;
use common_meta::kv_backend::rds::PgStore;
use common_meta::peer::Peer;
use common_meta::rpc::router::{Region, RegionRoute};
use common_telemetry::info;
use common_wal::options::WalOptions;
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, RawSchema};
use rand::Rng;
@@ -184,7 +185,7 @@ fn create_region_routes(regions: Vec<RegionNumber>) -> Vec<RegionRoute> {
region_routes
}
fn create_region_wal_options(regions: Vec<RegionNumber>) -> HashMap<RegionNumber, String> {
fn create_region_wal_options(regions: Vec<RegionNumber>) -> HashMap<RegionNumber, WalOptions> {
// TODO(niebayes): construct region wal options for benchmark.
let _ = regions;
HashMap::default()

View File

@@ -49,7 +49,12 @@ impl TableMetadataBencher {
let regions: Vec<_> = (0..64).collect();
let region_routes = create_region_routes(regions.clone());
let region_wal_options = create_region_wal_options(regions);
let region_wal_options = create_region_wal_options(regions)
.into_iter()
.map(|(region_id, wal_options)| {
(region_id, serde_json::to_string(&wal_options).unwrap())
})
.collect();
let start = Instant::now();
@@ -109,9 +114,17 @@ impl TableMetadataBencher {
let table_info = table_info.unwrap();
let table_route = table_route.unwrap();
let table_id = table_info.table_info.ident.table_id;
let regions: Vec<_> = (0..64).collect();
let region_wal_options = create_region_wal_options(regions);
let _ = self
.table_metadata_manager
.delete_table_metadata(table_id, &table_info.table_name(), &table_route)
.delete_table_metadata(
table_id,
&table_info.table_name(),
&table_route,
&region_wal_options,
)
.await;
start.elapsed()
},

View File

@@ -17,6 +17,7 @@ use std::time::Duration;
use base64::engine::general_purpose;
use base64::Engine;
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use common_error::ext::BoxedError;
use humantime::format_duration;
use serde_json::Value;
use servers::http::header::constants::GREPTIME_DB_HEADER_TIMEOUT;
@@ -24,7 +25,9 @@ use servers::http::result::greptime_result_v1::GreptimedbV1Response;
use servers::http::GreptimeQueryOutput;
use snafu::ResultExt;
use crate::error::{HttpQuerySqlSnafu, Result, SerdeJsonSnafu};
use crate::error::{
BuildClientSnafu, HttpQuerySqlSnafu, ParseProxyOptsSnafu, Result, SerdeJsonSnafu,
};
#[derive(Debug, Clone)]
pub struct DatabaseClient {
@@ -32,6 +35,23 @@ pub struct DatabaseClient {
catalog: String,
auth_header: Option<String>,
timeout: Duration,
proxy: Option<reqwest::Proxy>,
}
pub fn parse_proxy_opts(
proxy: Option<String>,
no_proxy: bool,
) -> std::result::Result<Option<reqwest::Proxy>, BoxedError> {
if no_proxy {
return Ok(None);
}
proxy
.map(|proxy| {
reqwest::Proxy::all(proxy)
.context(ParseProxyOptsSnafu)
.map_err(BoxedError::new)
})
.transpose()
}
impl DatabaseClient {
@@ -40,6 +60,7 @@ impl DatabaseClient {
catalog: String,
auth_basic: Option<String>,
timeout: Duration,
proxy: Option<reqwest::Proxy>,
) -> Self {
let auth_header = if let Some(basic) = auth_basic {
let encoded = general_purpose::STANDARD.encode(basic);
@@ -48,11 +69,18 @@ impl DatabaseClient {
None
};
if let Some(ref proxy) = proxy {
common_telemetry::info!("Using proxy: {:?}", proxy);
} else {
common_telemetry::info!("Using system proxy(if any)");
}
Self {
addr,
catalog,
auth_header,
timeout,
proxy,
}
}
@@ -67,7 +95,13 @@ impl DatabaseClient {
("db", format!("{}-{}", self.catalog, schema)),
("sql", sql.to_string()),
];
let mut request = reqwest::Client::new()
let client = self
.proxy
.clone()
.map(|proxy| reqwest::Client::builder().proxy(proxy).build())
.unwrap_or_else(|| Ok(reqwest::Client::new()))
.context(BuildClientSnafu)?;
let mut request = client
.post(&url)
.form(&params)
.header("Content-Type", "application/x-www-form-urlencoded");

View File

@@ -86,6 +86,22 @@ pub enum Error {
location: Location,
},
#[snafu(display("Failed to parse proxy options: {}", error))]
ParseProxyOpts {
#[snafu(source)]
error: reqwest::Error,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to build reqwest client: {}", error))]
BuildClient {
#[snafu(implicit)]
location: Location,
#[snafu(source)]
error: reqwest::Error,
},
#[snafu(display("Invalid REPL command: {reason}"))]
InvalidReplCommand { reason: String },
@@ -278,7 +294,8 @@ impl ErrorExt for Error {
| Error::InitTimezone { .. }
| Error::ConnectEtcd { .. }
| Error::CreateDir { .. }
| Error::EmptyResult { .. } => StatusCode::InvalidArguments,
| Error::EmptyResult { .. }
| Error::ParseProxyOpts { .. } => StatusCode::InvalidArguments,
Error::StartProcedureManager { source, .. }
| Error::StopProcedureManager { source, .. } => source.status_code(),
@@ -298,7 +315,8 @@ impl ErrorExt for Error {
Error::SerdeJson { .. }
| Error::FileIo { .. }
| Error::SpawnThread { .. }
| Error::InitTlsProvider { .. } => StatusCode::Unexpected,
| Error::InitTlsProvider { .. }
| Error::BuildClient { .. } => StatusCode::Unexpected,
Error::Other { source, .. } => source.status_code(),

View File

@@ -28,7 +28,7 @@ use tokio::io::{AsyncWriteExt, BufWriter};
use tokio::sync::Semaphore;
use tokio::time::Instant;
use crate::database::DatabaseClient;
use crate::database::{parse_proxy_opts, DatabaseClient};
use crate::error::{EmptyResultSnafu, Error, FileIoSnafu, Result, SchemaNotFoundSnafu};
use crate::{database, Tool};
@@ -91,19 +91,30 @@ pub struct ExportCommand {
/// The default behavior will disable server-side default timeout(i.e. `0s`).
#[clap(long, value_parser = humantime::parse_duration)]
timeout: Option<Duration>,
/// The proxy server address to connect, if set, will override the system proxy.
///
/// The default behavior will use the system proxy if neither `proxy` nor `no_proxy` is set.
#[clap(long)]
proxy: Option<String>,
/// Disable proxy server, if set, will not use any proxy.
#[clap(long)]
no_proxy: bool,
}
impl ExportCommand {
pub async fn build(&self) -> std::result::Result<Box<dyn Tool>, BoxedError> {
let (catalog, schema) =
database::split_database(&self.database).map_err(BoxedError::new)?;
let proxy = parse_proxy_opts(self.proxy.clone(), self.no_proxy)?;
let database_client = DatabaseClient::new(
self.addr.clone(),
catalog.clone(),
self.auth_basic.clone(),
// Treats `None` as `0s` to disable server-side default timeout.
self.timeout.unwrap_or_default(),
proxy,
);
Ok(Box::new(Export {

View File

@@ -25,7 +25,7 @@ use snafu::{OptionExt, ResultExt};
use tokio::sync::Semaphore;
use tokio::time::Instant;
use crate::database::DatabaseClient;
use crate::database::{parse_proxy_opts, DatabaseClient};
use crate::error::{Error, FileIoSnafu, Result, SchemaNotFoundSnafu};
use crate::{database, Tool};
@@ -76,18 +76,30 @@ pub struct ImportCommand {
/// The default behavior will disable server-side default timeout(i.e. `0s`).
#[clap(long, value_parser = humantime::parse_duration)]
timeout: Option<Duration>,
/// The proxy server address to connect, if set, will override the system proxy.
///
/// The default behavior will use the system proxy if neither `proxy` nor `no_proxy` is set.
#[clap(long)]
proxy: Option<String>,
/// Disable proxy server, if set, will not use any proxy.
#[clap(long, default_value = "false")]
no_proxy: bool,
}
impl ImportCommand {
pub async fn build(&self) -> std::result::Result<Box<dyn Tool>, BoxedError> {
let (catalog, schema) =
database::split_database(&self.database).map_err(BoxedError::new)?;
let proxy = parse_proxy_opts(self.proxy.clone(), self.no_proxy)?;
let database_client = DatabaseClient::new(
self.addr.clone(),
catalog.clone(),
self.auth_basic.clone(),
// Treats `None` as `0s` to disable server-side default timeout.
self.timeout.unwrap_or_default(),
proxy,
);
Ok(Box::new(Import {

View File

@@ -9,6 +9,10 @@ default-run = "greptime"
name = "greptime"
path = "src/bin/greptime.rs"
[[bin]]
name = "objbench"
path = "src/bin/objbench.rs"
[features]
default = ["servers/pprof", "servers/mem-prof"]
tokio-console = ["common-telemetry/tokio-console"]
@@ -20,6 +24,7 @@ workspace = true
async-trait.workspace = true
auth.workspace = true
base64.workspace = true
colored = "2.0"
cache.workspace = true
catalog.workspace = true
chrono.workspace = true
@@ -55,6 +60,9 @@ futures.workspace = true
human-panic = "2.0"
humantime.workspace = true
lazy_static.workspace = true
object-store.workspace = true
parquet = "53"
pprof = "0.14"
meta-client.workspace = true
meta-srv.workspace = true
metric-engine.workspace = true

View File

@@ -21,6 +21,8 @@ use cmd::{cli, datanode, flownode, frontend, metasrv, standalone, App};
use common_version::version;
use servers::install_ring_crypto_provider;
pub mod objbench;
#[derive(Parser)]
#[command(name = "greptime", author, version, long_version = version(), about)]
#[command(propagate_version = true)]

602
src/cmd/src/bin/objbench.rs Normal file
View File

@@ -0,0 +1,602 @@
// Copyright 2025 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::path::{Path, PathBuf};
use std::time::Instant;
use clap::Parser;
use cmd::error::{self, Result};
use colored::Colorize;
use datanode::config::ObjectStoreConfig;
use mito2::config::{FulltextIndexConfig, MitoConfig, Mode};
use mito2::read::Source;
use mito2::sst::file::{FileHandle, FileId, FileMeta};
use mito2::sst::file_purger::{FilePurger, FilePurgerRef, PurgeRequest};
use mito2::sst::parquet::{WriteOptions, PARQUET_METADATA_KEY};
use mito2::{build_access_layer, Metrics, OperationType, SstWriteRequest};
use object_store::ObjectStore;
use serde::{Deserialize, Serialize};
use store_api::metadata::{RegionMetadata, RegionMetadataRef};
#[tokio::main]
pub async fn main() {
// common_telemetry::init_default_ut_logging();
let cmd = Command::parse();
if let Err(e) = cmd.run().await {
eprintln!("{}: {}", "Error".red().bold(), e);
std::process::exit(1);
}
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Default)]
#[serde(default)]
pub struct StorageConfigWrapper {
storage: StorageConfig,
}
/// Storage engine config
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Default)]
#[serde(default)]
pub struct StorageConfig {
/// The working directory of database
pub data_home: String,
#[serde(flatten)]
pub store: ObjectStoreConfig,
}
#[derive(Debug, Parser)]
pub struct Command {
/// Path to the object-store config file (TOML). Must deserialize into datanode::config::ObjectStoreConfig.
#[clap(long, value_name = "FILE")]
pub config: PathBuf,
/// Source SST file path in object-store (e.g. "region_dir/<uuid>.parquet").
#[clap(long, value_name = "PATH")]
pub source: String,
/// Target SST file path in object-store; its parent directory is used as destination region dir.
#[clap(long, value_name = "PATH")]
pub target: String,
/// Verbose output
#[clap(short, long, default_value_t = false)]
pub verbose: bool,
/// Output file path for pprof flamegraph (enables profiling)
#[clap(long, value_name = "FILE")]
pub pprof_file: Option<PathBuf>,
}
impl Command {
pub async fn run(&self) -> Result<()> {
if self.verbose {
common_telemetry::init_default_ut_logging();
}
println!("{}", "Starting objbench...".cyan().bold());
// Build object store from config
let cfg_str = std::fs::read_to_string(&self.config).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("failed to read config {}: {e}", self.config.display()),
}
.build()
})?;
let store_cfg: StorageConfigWrapper = toml::from_str(&cfg_str).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("failed to parse config {}: {e}", self.config.display()),
}
.build()
})?;
let object_store = build_object_store(&store_cfg.storage).await?;
println!("{} Object store initialized", "".green());
// Prepare source identifiers
let (src_region_dir, src_file_id) = split_sst_path(&self.source)?;
println!("{} Source path parsed: {}", "".green(), self.source);
// Load parquet metadata to extract RegionMetadata and file stats
println!("{}", "Loading parquet metadata...".yellow());
let file_size = object_store
.stat(&self.source)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("stat failed: {e}"),
}
.build()
})?
.content_length();
let parquet_meta = load_parquet_metadata(object_store.clone(), &self.source, file_size)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("read parquet metadata failed: {e}"),
}
.build()
})?;
let region_meta = extract_region_metadata(&self.source, &parquet_meta)?;
let num_rows = parquet_meta.file_metadata().num_rows() as u64;
let num_row_groups = parquet_meta.num_row_groups() as u64;
println!(
"{} Metadata loaded - rows: {}, size: {} bytes",
"".green(),
num_rows,
file_size
);
// Build a FileHandle for the source file
let file_meta = FileMeta {
region_id: region_meta.region_id,
file_id: src_file_id,
time_range: Default::default(),
level: 0,
file_size,
available_indexes: Default::default(),
index_file_size: 0,
num_rows,
num_row_groups,
sequence: None,
};
let src_handle = FileHandle::new(file_meta, new_noop_file_purger());
// Build the reader for a single file via ParquetReaderBuilder
println!("{}", "Building reader...".yellow());
let (_src_access_layer, _cache_manager) =
build_access_layer_simple(src_region_dir.clone(), object_store.clone()).await?;
let reader_build_start = Instant::now();
let reader = mito2::sst::parquet::reader::ParquetReaderBuilder::new(
src_region_dir.clone(),
src_handle.clone(),
object_store.clone(),
)
.expected_metadata(Some(region_meta.clone()))
.build()
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("build reader failed: {e}"),
}
.build()
})?;
let reader_build_elapsed = reader_build_start.elapsed();
let total_rows = reader.parquet_metadata().file_metadata().num_rows();
println!("{} Reader built in {:?}", "".green(), reader_build_elapsed);
// Prepare target access layer for writing
println!("{}", "Preparing target access layer...".yellow());
let (tgt_access_layer, tgt_cache_manager) =
build_access_layer_simple(self.target.clone(), object_store.clone()).await?;
// Build write request
let fulltext_index_config = FulltextIndexConfig {
create_on_compaction: Mode::Disable,
..Default::default()
};
let write_opts = WriteOptions::default();
let write_req = SstWriteRequest {
op_type: OperationType::Compact,
metadata: region_meta,
source: Source::Reader(Box::new(reader)),
cache_manager: tgt_cache_manager,
storage: None,
max_sequence: None,
index_options: Default::default(),
inverted_index_config: MitoConfig::default().inverted_index,
fulltext_index_config,
bloom_filter_index_config: MitoConfig::default().bloom_filter_index,
};
// Write SST
println!("{}", "Writing SST...".yellow());
let mut metrics = Metrics::default();
// Start profiling if pprof_file is specified
#[cfg(unix)]
let profiler_guard = if self.pprof_file.is_some() {
println!("{} Starting profiling...", "".yellow());
Some(
pprof::ProfilerGuardBuilder::default()
.frequency(99)
.blocklist(&["libc", "libgcc", "pthread", "vdso"])
.build()
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to start profiler: {e}"),
}
.build()
})?,
)
} else {
None
};
#[cfg(not(unix))]
if self.pprof_file.is_some() {
eprintln!(
"{}: Profiling is not supported on this platform",
"Warning".yellow()
);
}
let write_start = Instant::now();
let infos = tgt_access_layer
.write_sst(write_req, &write_opts, &mut metrics)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("write_sst failed: {e}"),
}
.build()
})?;
let write_elapsed = write_start.elapsed();
// Stop profiling and generate flamegraph if enabled
#[cfg(unix)]
if let (Some(guard), Some(pprof_file)) = (profiler_guard, &self.pprof_file) {
println!("{} Generating flamegraph...", "🔥".yellow());
match guard.report().build() {
Ok(report) => {
let mut flamegraph_data = Vec::new();
if let Err(e) = report.flamegraph(&mut flamegraph_data) {
eprintln!(
"{}: Failed to generate flamegraph: {}",
"Warning".yellow(),
e
);
} else if let Err(e) = std::fs::write(pprof_file, flamegraph_data) {
eprintln!(
"{}: Failed to write flamegraph to {}: {}",
"Warning".yellow(),
pprof_file.display(),
e
);
} else {
println!(
"{} Flamegraph saved to {}",
"".green(),
pprof_file.display().to_string().cyan()
);
}
}
Err(e) => {
eprintln!(
"{}: Failed to generate pprof report: {}",
"Warning".yellow(),
e
);
}
}
}
assert_eq!(infos.len(), 1);
let dst_file_id = infos[0].file_id;
let dst_file_path = format!("{}{}", self.target, dst_file_id.as_parquet(),);
// Report results with ANSI colors
println!("\n{} {}", "Write complete!".green().bold(), "".green());
println!(" {}: {}", "Destination file".bold(), dst_file_path.cyan());
println!(" {}: {}", "Rows".bold(), total_rows.to_string().cyan());
println!(
" {}: {}",
"File size".bold(),
format!("{} bytes", file_size).cyan()
);
println!(
" {}: {:?}",
"Reader build time".bold(),
reader_build_elapsed
);
println!(" {}: {:?}", "Total time".bold(), write_elapsed);
// Print metrics in a formatted way
println!(
" {}: {:?}, sum: {:?}",
"Metrics".bold(),
metrics,
metrics.sum()
);
// Print infos
println!(" {}: {:?}", "Index".bold(), infos[0].index_metadata);
// Cleanup
println!("\n{}", "Cleaning up...".yellow());
object_store.delete(&dst_file_path).await.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to delete dest file {}: {}", dst_file_path, e),
}
.build()
})?;
println!("{} Temporary file deleted", "".green());
println!("\n{}", "Benchmark completed successfully!".green().bold());
Ok(())
}
}
fn split_sst_path(path: &str) -> Result<(String, FileId)> {
let p = Path::new(path);
let file_name = p.file_name().and_then(|s| s.to_str()).ok_or_else(|| {
error::IllegalConfigSnafu {
msg: "invalid source path".to_string(),
}
.build()
})?;
let uuid_str = file_name.strip_suffix(".parquet").ok_or_else(|| {
error::IllegalConfigSnafu {
msg: "expect .parquet file".to_string(),
}
.build()
})?;
let file_id = FileId::parse_str(uuid_str).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("invalid file id: {e}"),
}
.build()
})?;
let parent = p
.parent()
.and_then(|s| s.to_str())
.unwrap_or("")
.to_string();
Ok((parent, file_id))
}
fn extract_region_metadata(
file_path: &str,
meta: &parquet::file::metadata::ParquetMetaData,
) -> Result<RegionMetadataRef> {
use parquet::format::KeyValue;
let kvs: Option<&Vec<KeyValue>> = meta.file_metadata().key_value_metadata();
let Some(kvs) = kvs else {
return Err(error::IllegalConfigSnafu {
msg: format!("{file_path}: missing parquet key_value metadata"),
}
.build());
};
let json = kvs
.iter()
.find(|kv| kv.key == PARQUET_METADATA_KEY)
.and_then(|kv| kv.value.as_ref())
.ok_or_else(|| {
error::IllegalConfigSnafu {
msg: format!("{file_path}: key {PARQUET_METADATA_KEY} not found or empty"),
}
.build()
})?;
let region: RegionMetadata = RegionMetadata::from_json(json).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("invalid region metadata json: {e}"),
}
.build()
})?;
Ok(std::sync::Arc::new(region))
}
async fn build_object_store(sc: &StorageConfig) -> Result<ObjectStore> {
use datanode::config::ObjectStoreConfig::*;
let oss = &sc.store;
match oss {
File(_) => {
use object_store::services::Fs;
let builder = Fs::default().root(&sc.data_home);
Ok(ObjectStore::new(builder)
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("init fs backend failed: {e}"),
}
.build()
})?
.finish())
}
S3(s3) => {
use common_base::secrets::ExposeSecret;
use object_store::services::S3;
use object_store::util;
let root = util::normalize_dir(&s3.root);
let mut builder = S3::default()
.root(&root)
.bucket(&s3.bucket)
.access_key_id(s3.access_key_id.expose_secret())
.secret_access_key(s3.secret_access_key.expose_secret());
if let Some(ep) = &s3.endpoint {
builder = builder.endpoint(ep);
}
if let Some(region) = &s3.region {
builder = builder.region(region);
}
if s3.enable_virtual_host_style {
builder = builder.enable_virtual_host_style();
}
Ok(ObjectStore::new(builder)
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("init s3 backend failed: {e}"),
}
.build()
})?
.finish())
}
Oss(oss) => {
use common_base::secrets::ExposeSecret;
use object_store::services::Oss;
use object_store::util;
let root = util::normalize_dir(&oss.root);
let builder = Oss::default()
.root(&root)
.bucket(&oss.bucket)
.endpoint(&oss.endpoint)
.access_key_id(oss.access_key_id.expose_secret())
.access_key_secret(oss.access_key_secret.expose_secret());
Ok(ObjectStore::new(builder)
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("init oss backend failed: {e}"),
}
.build()
})?
.finish())
}
Azblob(az) => {
use common_base::secrets::ExposeSecret;
use object_store::services::Azblob;
use object_store::util;
let root = util::normalize_dir(&az.root);
let mut builder = Azblob::default()
.root(&root)
.container(&az.container)
.endpoint(&az.endpoint)
.account_name(az.account_name.expose_secret())
.account_key(az.account_key.expose_secret());
if let Some(token) = &az.sas_token {
builder = builder.sas_token(token);
}
Ok(ObjectStore::new(builder)
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("init azblob backend failed: {e}"),
}
.build()
})?
.finish())
}
Gcs(gcs) => {
use common_base::secrets::ExposeSecret;
use object_store::services::Gcs;
use object_store::util;
let root = util::normalize_dir(&gcs.root);
let builder = Gcs::default()
.root(&root)
.bucket(&gcs.bucket)
.scope(&gcs.scope)
.credential_path(gcs.credential_path.expose_secret())
.credential(gcs.credential.expose_secret())
.endpoint(&gcs.endpoint);
Ok(ObjectStore::new(builder)
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("init gcs backend failed: {e}"),
}
.build()
})?
.finish())
}
}
}
async fn build_access_layer_simple(
region_dir: String,
object_store: ObjectStore,
) -> Result<(
std::sync::Arc<mito2::AccessLayer>,
std::sync::Arc<mito2::CacheManager>,
)> {
// Minimal index aux path setup
let mut mito_cfg = MitoConfig::default();
// Use a temporary directory as aux path
let data_home = std::env::temp_dir().join("greptime_objbench");
let _ = std::fs::create_dir_all(&data_home);
let _ = mito_cfg.index.sanitize(
data_home.to_str().unwrap_or("/tmp"),
&mito_cfg.inverted_index,
);
let access_layer = build_access_layer(&region_dir, object_store, &mito_cfg)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("build_access_layer failed: {e}"),
}
.build()
})?;
Ok((
access_layer,
std::sync::Arc::new(mito2::CacheManager::default()),
))
}
fn new_noop_file_purger() -> FilePurgerRef {
#[derive(Debug)]
struct Noop;
impl FilePurger for Noop {
fn send_request(&self, _request: PurgeRequest) {}
}
std::sync::Arc::new(Noop)
}
async fn load_parquet_metadata(
object_store: ObjectStore,
path: &str,
file_size: u64,
) -> std::result::Result<
parquet::file::metadata::ParquetMetaData,
Box<dyn std::error::Error + Send + Sync>,
> {
use parquet::file::metadata::ParquetMetaDataReader;
use parquet::file::FOOTER_SIZE;
let actual_size = if file_size == 0 {
object_store.stat(path).await?.content_length()
} else {
file_size
};
if actual_size < FOOTER_SIZE as u64 {
return Err("file too small".into());
}
let prefetch: u64 = 64 * 1024;
let start = actual_size.saturating_sub(prefetch);
let buffer = object_store
.read_with(path)
.range(start..actual_size)
.await?
.to_vec();
let buffer_len = buffer.len();
let mut footer = [0; 8];
footer.copy_from_slice(&buffer[buffer_len - FOOTER_SIZE..]);
let metadata_len = ParquetMetaDataReader::decode_footer(&footer)? as u64;
if actual_size - (FOOTER_SIZE as u64) < metadata_len {
return Err("invalid footer/metadata length".into());
}
if (metadata_len as usize) <= buffer_len - FOOTER_SIZE {
let metadata_start = buffer_len - metadata_len as usize - FOOTER_SIZE;
let meta = ParquetMetaDataReader::decode_metadata(
&buffer[metadata_start..buffer_len - FOOTER_SIZE],
)?;
Ok(meta)
} else {
let metadata_start = actual_size - metadata_len - FOOTER_SIZE as u64;
let data = object_store
.read_with(path)
.range(metadata_start..(actual_size - FOOTER_SIZE as u64))
.await?
.to_vec();
let meta = ParquetMetaDataReader::decode_metadata(&data)?;
Ok(meta)
}
}
#[cfg(test)]
mod tests {
use super::StorageConfigWrapper;
#[test]
fn test_decode() {
let cfg = std::fs::read_to_string("/home/lei/datanode-bulk.toml").unwrap();
let storage: StorageConfigWrapper = toml::from_str(&cfg).unwrap();
println!("{:?}", storage);
}
}

View File

@@ -126,10 +126,14 @@ impl SubCommand {
struct StartCommand {
#[clap(long)]
node_id: Option<u64>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long)]
rpc_hostname: Option<String>,
/// The address to bind the gRPC server.
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
/// The address advertised to the metasrv, and used for connections from outside the host.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "rpc-hostname")]
rpc_server_addr: Option<String>,
#[clap(long, value_delimiter = ',', num_args = 1..)]
metasrv_addrs: Option<Vec<String>>,
#[clap(short, long)]
@@ -181,18 +185,18 @@ impl StartCommand {
tokio_console_addr: global_options.tokio_console_addr.clone(),
};
if let Some(addr) = &self.rpc_addr {
opts.grpc.addr.clone_from(addr);
if let Some(addr) = &self.rpc_bind_addr {
opts.grpc.bind_addr.clone_from(addr);
} else if let Some(addr) = &opts.rpc_addr {
warn!("Use the deprecated attribute `DatanodeOptions.rpc_addr`, please use `grpc.addr` instead.");
opts.grpc.addr.clone_from(addr);
opts.grpc.bind_addr.clone_from(addr);
}
if let Some(hostname) = &self.rpc_hostname {
opts.grpc.hostname.clone_from(hostname);
} else if let Some(hostname) = &opts.rpc_hostname {
if let Some(server_addr) = &self.rpc_server_addr {
opts.grpc.server_addr.clone_from(server_addr);
} else if let Some(server_addr) = &opts.rpc_hostname {
warn!("Use the deprecated attribute `DatanodeOptions.rpc_hostname`, please use `grpc.hostname` instead.");
opts.grpc.hostname.clone_from(hostname);
opts.grpc.server_addr.clone_from(server_addr);
}
if let Some(runtime_size) = opts.rpc_runtime_size {
@@ -277,7 +281,7 @@ impl StartCommand {
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
let mut plugins = Plugins::new();
plugins::setup_datanode_plugins(&mut plugins, &plugin_opts, &opts)
.await
@@ -357,8 +361,8 @@ mod tests {
rpc_addr = "127.0.0.1:4001"
rpc_hostname = "192.168.0.1"
[grpc]
addr = "127.0.0.1:3001"
hostname = "127.0.0.1"
bind_addr = "127.0.0.1:3001"
server_addr = "127.0.0.1"
runtime_size = 8
"#;
write!(file, "{}", toml_str).unwrap();
@@ -369,8 +373,8 @@ mod tests {
};
let options = cmd.load_options(&Default::default()).unwrap().component;
assert_eq!("127.0.0.1:4001".to_string(), options.grpc.addr);
assert_eq!("192.168.0.1".to_string(), options.grpc.hostname);
assert_eq!("127.0.0.1:4001".to_string(), options.grpc.bind_addr);
assert_eq!("192.168.0.1".to_string(), options.grpc.server_addr);
}
#[test]
@@ -431,7 +435,7 @@ mod tests {
let options = cmd.load_options(&Default::default()).unwrap().component;
assert_eq!("127.0.0.1:3001".to_string(), options.grpc.addr);
assert_eq!("127.0.0.1:3001".to_string(), options.grpc.bind_addr);
assert_eq!(Some(42), options.node_id);
let DatanodeWalConfig::RaftEngine(raft_engine_config) = options.wal else {
@@ -645,7 +649,7 @@ mod tests {
opts.http.addr,
DatanodeOptions::default().component.http.addr
);
assert_eq!(opts.grpc.hostname, "10.103.174.219");
assert_eq!(opts.grpc.server_addr, "10.103.174.219");
},
);
}

View File

@@ -129,11 +129,13 @@ struct StartCommand {
#[clap(long)]
node_id: Option<u64>,
/// Bind address for the gRPC server.
#[clap(long)]
rpc_addr: Option<String>,
/// Hostname for the gRPC server.
#[clap(long)]
rpc_hostname: Option<String>,
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
/// The address advertised to the metasrv, and used for connections from outside the host.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "rpc-hostname")]
rpc_server_addr: Option<String>,
/// Metasrv address list;
#[clap(long, value_delimiter = ',', num_args = 1..)]
metasrv_addrs: Option<Vec<String>>,
@@ -184,12 +186,12 @@ impl StartCommand {
tokio_console_addr: global_options.tokio_console_addr.clone(),
};
if let Some(addr) = &self.rpc_addr {
opts.grpc.addr.clone_from(addr);
if let Some(addr) = &self.rpc_bind_addr {
opts.grpc.bind_addr.clone_from(addr);
}
if let Some(hostname) = &self.rpc_hostname {
opts.grpc.hostname.clone_from(hostname);
if let Some(server_addr) = &self.rpc_server_addr {
opts.grpc.server_addr.clone_from(server_addr);
}
if let Some(node_id) = self.node_id {
@@ -237,7 +239,7 @@ impl StartCommand {
info!("Flownode options: {:#?}", opts);
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
// TODO(discord9): make it not optionale after cluster id is required
let cluster_id = opts.cluster_id.unwrap_or(0);

View File

@@ -136,13 +136,19 @@ impl SubCommand {
#[derive(Debug, Default, Parser)]
pub struct StartCommand {
/// The address to bind the gRPC server.
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
/// The address advertised to the metasrv, and used for connections from outside the host.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "rpc-hostname")]
rpc_server_addr: Option<String>,
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
http_timeout: Option<u64>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
@@ -218,11 +224,15 @@ impl StartCommand {
opts.http.disable_dashboard = disable_dashboard;
}
if let Some(addr) = &self.rpc_addr {
opts.grpc.addr.clone_from(addr);
if let Some(addr) = &self.rpc_bind_addr {
opts.grpc.bind_addr.clone_from(addr);
opts.grpc.tls = tls_opts.clone();
}
if let Some(addr) = &self.rpc_server_addr {
opts.grpc.server_addr.clone_from(addr);
}
if let Some(addr) = &self.mysql_addr {
opts.mysql.enable = true;
opts.mysql.addr.clone_from(addr);
@@ -269,7 +279,7 @@ impl StartCommand {
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
let mut plugins = Plugins::new();
plugins::setup_frontend_plugins(&mut plugins, &plugin_opts, &opts)
.await
@@ -413,7 +423,7 @@ mod tests {
let default_opts = FrontendOptions::default().component;
assert_eq!(opts.grpc.addr, default_opts.grpc.addr);
assert_eq!(opts.grpc.bind_addr, default_opts.grpc.bind_addr);
assert!(opts.mysql.enable);
assert_eq!(opts.mysql.runtime_size, default_opts.mysql.runtime_size);
assert!(opts.postgres.enable);
@@ -604,7 +614,7 @@ mod tests {
assert_eq!(fe_opts.http.addr, "127.0.0.1:14000");
// Should be default value.
assert_eq!(fe_opts.grpc.addr, GrpcOptions::default().addr);
assert_eq!(fe_opts.grpc.bind_addr, GrpcOptions::default().bind_addr);
},
);
}

View File

@@ -42,7 +42,7 @@ pub struct Instance {
}
impl Instance {
fn new(instance: MetasrvInstance, guard: Vec<WorkerGuard>) -> Self {
pub fn new(instance: MetasrvInstance, guard: Vec<WorkerGuard>) -> Self {
Self {
instance,
_guard: guard,
@@ -133,11 +133,15 @@ impl SubCommand {
#[derive(Debug, Default, Parser)]
struct StartCommand {
#[clap(long)]
bind_addr: Option<String>,
#[clap(long)]
server_addr: Option<String>,
#[clap(long, aliases = ["store-addr"], value_delimiter = ',', num_args = 1..)]
/// The address to bind the gRPC server.
#[clap(long, alias = "bind-addr")]
rpc_bind_addr: Option<String>,
/// The communication server address for the frontend and datanode to connect to metasrv.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "server-addr")]
rpc_server_addr: Option<String>,
#[clap(long, alias = "store-addr", value_delimiter = ',', num_args = 1..)]
store_addrs: Option<Vec<String>>,
#[clap(short, long)]
config_file: Option<String>,
@@ -201,11 +205,11 @@ impl StartCommand {
tokio_console_addr: global_options.tokio_console_addr.clone(),
};
if let Some(addr) = &self.bind_addr {
if let Some(addr) = &self.rpc_bind_addr {
opts.bind_addr.clone_from(addr);
}
if let Some(addr) = &self.server_addr {
if let Some(addr) = &self.rpc_server_addr {
opts.server_addr.clone_from(addr);
}
@@ -269,11 +273,13 @@ impl StartCommand {
log_versions(version(), short_version(), APP_NAME);
info!("Metasrv start command: {:#?}", self);
info!("Metasrv options: {:#?}", opts);
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.detect_server_addr();
info!("Metasrv options: {:#?}", opts);
let mut plugins = Plugins::new();
plugins::setup_metasrv_plugins(&mut plugins, &plugin_opts, &opts)
.await
@@ -306,8 +312,8 @@ mod tests {
#[test]
fn test_read_from_cmd() {
let cmd = StartCommand {
bind_addr: Some("127.0.0.1:3002".to_string()),
server_addr: Some("127.0.0.1:3002".to_string()),
rpc_bind_addr: Some("127.0.0.1:3002".to_string()),
rpc_server_addr: Some("127.0.0.1:3002".to_string()),
store_addrs: Some(vec!["127.0.0.1:2380".to_string()]),
selector: Some("LoadBased".to_string()),
..Default::default()
@@ -381,8 +387,8 @@ mod tests {
#[test]
fn test_load_log_options_from_cli() {
let cmd = StartCommand {
bind_addr: Some("127.0.0.1:3002".to_string()),
server_addr: Some("127.0.0.1:3002".to_string()),
rpc_bind_addr: Some("127.0.0.1:3002".to_string()),
rpc_server_addr: Some("127.0.0.1:3002".to_string()),
store_addrs: Some(vec!["127.0.0.1:2380".to_string()]),
selector: Some("LoadBased".to_string()),
..Default::default()

View File

@@ -60,7 +60,8 @@ use frontend::instance::builder::FrontendBuilder;
use frontend::instance::{FrontendInstance, Instance as FeInstance, StandaloneDatanodeManager};
use frontend::server::Services;
use frontend::service_config::{
InfluxdbOptions, MysqlOptions, OpentsdbOptions, PostgresOptions, PromStoreOptions,
InfluxdbOptions, JaegerOptions, MysqlOptions, OpentsdbOptions, PostgresOptions,
PromStoreOptions,
};
use meta_srv::metasrv::{FLOW_ID_SEQ, TABLE_ID_SEQ};
use mito2::config::MitoConfig;
@@ -140,6 +141,7 @@ pub struct StandaloneOptions {
pub postgres: PostgresOptions,
pub opentsdb: OpentsdbOptions,
pub influxdb: InfluxdbOptions,
pub jaeger: JaegerOptions,
pub prom_store: PromStoreOptions,
pub wal: DatanodeWalConfig,
pub storage: StorageConfig,
@@ -169,6 +171,7 @@ impl Default for StandaloneOptions {
postgres: PostgresOptions::default(),
opentsdb: OpentsdbOptions::default(),
influxdb: InfluxdbOptions::default(),
jaeger: JaegerOptions::default(),
prom_store: PromStoreOptions::default(),
wal: DatanodeWalConfig::default(),
storage: StorageConfig::default(),
@@ -217,6 +220,7 @@ impl StandaloneOptions {
postgres: cloned_opts.postgres,
opentsdb: cloned_opts.opentsdb,
influxdb: cloned_opts.influxdb,
jaeger: cloned_opts.jaeger,
prom_store: cloned_opts.prom_store,
meta_client: None,
logging: cloned_opts.logging,
@@ -329,8 +333,8 @@ impl App for Instance {
pub struct StartCommand {
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
@@ -407,9 +411,9 @@ impl StartCommand {
opts.storage.data_home.clone_from(data_home);
}
if let Some(addr) = &self.rpc_addr {
if let Some(addr) = &self.rpc_bind_addr {
// frontend grpc addr conflict with datanode default grpc addr
let datanode_grpc_addr = DatanodeOptions::default().grpc.addr;
let datanode_grpc_addr = DatanodeOptions::default().grpc.bind_addr;
if addr.eq(&datanode_grpc_addr) {
return IllegalConfigSnafu {
msg: format!(
@@ -417,7 +421,7 @@ impl StartCommand {
),
}.fail();
}
opts.grpc.addr.clone_from(addr)
opts.grpc.bind_addr.clone_from(addr)
}
if let Some(addr) = &self.mysql_addr {
@@ -464,7 +468,7 @@ impl StartCommand {
let mut plugins = Plugins::new();
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
let fe_opts = opts.frontend_options();
let dn_opts = opts.datanode_options();
@@ -486,8 +490,8 @@ impl StartCommand {
let metadata_dir = metadata_store_dir(data_home);
let (kv_backend, procedure_manager) = FeInstance::try_build_standalone_components(
metadata_dir,
opts.metadata_store.clone(),
opts.procedure.clone(),
opts.metadata_store,
opts.procedure,
)
.await
.context(StartFrontendSnafu)?;
@@ -907,7 +911,7 @@ mod tests {
assert_eq!("127.0.0.1:4000".to_string(), fe_opts.http.addr);
assert_eq!(Duration::from_secs(33), fe_opts.http.timeout);
assert_eq!(ReadableSize::mb(128), fe_opts.http.body_limit);
assert_eq!("127.0.0.1:4001".to_string(), fe_opts.grpc.addr);
assert_eq!("127.0.0.1:4001".to_string(), fe_opts.grpc.bind_addr);
assert!(fe_opts.mysql.enable);
assert_eq!("127.0.0.1:4002", fe_opts.mysql.addr);
assert_eq!(2, fe_opts.mysql.runtime_size);
@@ -1037,7 +1041,7 @@ mod tests {
assert_eq!(ReadableSize::mb(64), fe_opts.http.body_limit);
// Should be default value.
assert_eq!(fe_opts.grpc.addr, GrpcOptions::default().addr);
assert_eq!(fe_opts.grpc.bind_addr, GrpcOptions::default().bind_addr);
},
);
}

View File

@@ -63,7 +63,7 @@ mod tests {
.args([
"datanode",
"start",
"--rpc-addr=0.0.0.0:4321",
"--rpc-bind-addr=0.0.0.0:4321",
"--node-id=1",
&format!("--data-home={}", data_home.path().display()),
&format!("--wal-dir={}", wal_dir.path().display()),
@@ -80,7 +80,7 @@ mod tests {
"--log-level=off",
"cli",
"attach",
"--grpc-addr=0.0.0.0:4321",
"--grpc-bind-addr=0.0.0.0:4321",
// history commands can sneaky into stdout and mess up our tests, so disable it
"--disable-helper",
]);

View File

@@ -17,9 +17,6 @@ use std::time::Duration;
use cmd::options::GreptimeOptions;
use cmd::standalone::StandaloneOptions;
use common_config::Configurable;
use common_grpc::channel_manager::{
DEFAULT_MAX_GRPC_RECV_MESSAGE_SIZE, DEFAULT_MAX_GRPC_SEND_MESSAGE_SIZE,
};
use common_options::datanode::{ClientOptions, DatanodeClientOptions};
use common_telemetry::logging::{LoggingOptions, SlowQueryOptions, DEFAULT_OTLP_ENDPOINT};
use common_wal::config::raft_engine::RaftEngineConfig;
@@ -91,13 +88,8 @@ fn test_load_datanode_example_config() {
..Default::default()
},
grpc: GrpcOptions::default()
.with_addr("127.0.0.1:3001")
.with_hostname("127.0.0.1:3001"),
rpc_addr: Some("127.0.0.1:3001".to_string()),
rpc_hostname: Some("127.0.0.1".to_string()),
rpc_runtime_size: Some(8),
rpc_max_recv_message_size: Some(DEFAULT_MAX_GRPC_RECV_MESSAGE_SIZE),
rpc_max_send_message_size: Some(DEFAULT_MAX_GRPC_SEND_MESSAGE_SIZE),
.with_bind_addr("127.0.0.1:3001")
.with_server_addr("127.0.0.1:3001"),
..Default::default()
},
..Default::default()
@@ -144,7 +136,9 @@ fn test_load_frontend_example_config() {
remote_write: Some(Default::default()),
..Default::default()
},
grpc: GrpcOptions::default().with_hostname("127.0.0.1:4001"),
grpc: GrpcOptions::default()
.with_bind_addr("127.0.0.1:4001")
.with_server_addr("127.0.0.1:4001"),
http: HttpOptions {
cors_allowed_origins: vec!["https://example.com".to_string()],
..Default::default()

View File

@@ -18,7 +18,7 @@ bytes.workspace = true
common-error.workspace = true
common-macro.workspace = true
futures.workspace = true
paste = "1.0"
paste.workspace = true
pin-project.workspace = true
rand.workspace = true
serde = { version = "1.0", features = ["derive"] }

View File

@@ -12,9 +12,11 @@ common-base.workspace = true
common-error.workspace = true
common-macro.workspace = true
config.workspace = true
humantime-serde.workspace = true
num_cpus.workspace = true
serde.workspace = true
serde_json.workspace = true
serde_with.workspace = true
snafu.workspace = true
sysinfo.workspace = true
toml.workspace = true

View File

@@ -16,6 +16,8 @@ pub mod config;
pub mod error;
pub mod utils;
use std::time::Duration;
use common_base::readable_size::ReadableSize;
pub use config::*;
use serde::{Deserialize, Serialize};
@@ -34,22 +36,27 @@ pub enum Mode {
Distributed,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
#[serde(default)]
pub struct KvBackendConfig {
// Kv file size in bytes
/// The size of the metadata store backend log file.
pub file_size: ReadableSize,
// Kv purge threshold in bytes
/// The threshold of the metadata store size to trigger a purge.
pub purge_threshold: ReadableSize,
/// The interval of the metadata store to trigger a purge.
#[serde(with = "humantime_serde")]
pub purge_interval: Duration,
}
impl Default for KvBackendConfig {
fn default() -> Self {
Self {
// log file size 256MB
file_size: ReadableSize::mb(256),
// purge threshold 4GB
purge_threshold: ReadableSize::gb(4),
// The log file size 64MB
file_size: ReadableSize::mb(64),
// The log purge threshold 256MB
purge_threshold: ReadableSize::mb(256),
// The log purge interval 1m
purge_interval: Duration::from_secs(60),
}
}
}

View File

@@ -35,7 +35,7 @@ orc-rust = { version = "0.5", default-features = false, features = [
"async",
] }
parquet.workspace = true
paste = "1.0"
paste.workspace = true
rand.workspace = true
regex = "1.7"
serde.workspace = true

View File

@@ -12,9 +12,11 @@ default = ["geo"]
geo = ["geohash", "h3o", "s2", "wkt", "geo-types", "dep:geo"]
[dependencies]
ahash = "0.8"
api.workspace = true
arc-swap = "1.0"
async-trait.workspace = true
bincode = "1.3"
common-base.workspace = true
common-catalog.workspace = true
common-error.workspace = true
@@ -32,12 +34,13 @@ geo = { version = "0.29", optional = true }
geo-types = { version = "0.7", optional = true }
geohash = { version = "0.13", optional = true }
h3o = { version = "0.6", optional = true }
hyperloglogplus = "0.4"
jsonb.workspace = true
nalgebra.workspace = true
num = "0.4"
num-traits = "0.2"
once_cell.workspace = true
paste = "1.0"
paste.workspace = true
s2 = { version = "0.0.12", optional = true }
serde.workspace = true
serde_json.workspace = true
@@ -47,6 +50,7 @@ sql.workspace = true
statrs = "0.16"
store-api.workspace = true
table.workspace = true
uddsketch = { git = "https://github.com/GreptimeTeam/timescaledb-toolkit.git", rev = "84828fe8fb494a6a61412a3da96517fc80f7bb20" }
wkt = { version = "0.11", optional = true }
[dev-dependencies]

View File

@@ -0,0 +1,20 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
mod hll;
mod uddsketch_state;
pub(crate) use hll::HllStateType;
pub use hll::{HllState, HLL_MERGE_NAME, HLL_NAME};
pub use uddsketch_state::{UddSketchState, UDDSKETCH_STATE_NAME};

View File

@@ -0,0 +1,319 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use common_query::prelude::*;
use common_telemetry::trace;
use datafusion::arrow::array::ArrayRef;
use datafusion::common::cast::{as_binary_array, as_string_array};
use datafusion::common::not_impl_err;
use datafusion::error::{DataFusionError, Result as DfResult};
use datafusion::logical_expr::function::AccumulatorArgs;
use datafusion::logical_expr::{Accumulator as DfAccumulator, AggregateUDF};
use datafusion::prelude::create_udaf;
use datatypes::arrow::datatypes::DataType;
use hyperloglogplus::{HyperLogLog, HyperLogLogPlus};
use crate::utils::FixedRandomState;
pub const HLL_NAME: &str = "hll";
pub const HLL_MERGE_NAME: &str = "hll_merge";
const DEFAULT_PRECISION: u8 = 14;
pub(crate) type HllStateType = HyperLogLogPlus<String, FixedRandomState>;
pub struct HllState {
hll: HllStateType,
}
impl std::fmt::Debug for HllState {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "HllState<Opaque>")
}
}
impl Default for HllState {
fn default() -> Self {
Self::new()
}
}
impl HllState {
pub fn new() -> Self {
Self {
// Safety: the DEFAULT_PRECISION is fixed and valid
hll: HllStateType::new(DEFAULT_PRECISION, FixedRandomState::new()).unwrap(),
}
}
/// Create a UDF for the `hll` function.
///
/// `hll` accepts a string column and aggregates the
/// values into a HyperLogLog state.
pub fn state_udf_impl() -> AggregateUDF {
create_udaf(
HLL_NAME,
vec![DataType::Utf8],
Arc::new(DataType::Binary),
Volatility::Immutable,
Arc::new(Self::create_accumulator),
Arc::new(vec![DataType::Binary]),
)
}
/// Create a UDF for the `hll_merge` function.
///
/// `hll_merge` accepts a binary column of states generated by `hll`
/// and merges them into a single state.
pub fn merge_udf_impl() -> AggregateUDF {
create_udaf(
HLL_MERGE_NAME,
vec![DataType::Binary],
Arc::new(DataType::Binary),
Volatility::Immutable,
Arc::new(Self::create_merge_accumulator),
Arc::new(vec![DataType::Binary]),
)
}
fn update(&mut self, value: &str) {
self.hll.insert(value);
}
fn merge(&mut self, raw: &[u8]) {
if let Ok(serialized) = bincode::deserialize::<HllStateType>(raw) {
if let Ok(()) = self.hll.merge(&serialized) {
return;
}
}
trace!("Warning: Failed to merge HyperLogLog from {:?}", raw);
}
fn create_accumulator(acc_args: AccumulatorArgs) -> DfResult<Box<dyn DfAccumulator>> {
let data_type = acc_args.exprs[0].data_type(acc_args.schema)?;
match data_type {
DataType::Utf8 => Ok(Box::new(HllState::new())),
other => not_impl_err!("{HLL_NAME} does not support data type: {other}"),
}
}
fn create_merge_accumulator(acc_args: AccumulatorArgs) -> DfResult<Box<dyn DfAccumulator>> {
let data_type = acc_args.exprs[0].data_type(acc_args.schema)?;
match data_type {
DataType::Binary => Ok(Box::new(HllState::new())),
other => not_impl_err!("{HLL_MERGE_NAME} does not support data type: {other}"),
}
}
}
impl DfAccumulator for HllState {
fn update_batch(&mut self, values: &[ArrayRef]) -> DfResult<()> {
let array = &values[0];
match array.data_type() {
DataType::Utf8 => {
let string_array = as_string_array(array)?;
for value in string_array.iter().flatten() {
self.update(value);
}
}
DataType::Binary => {
let binary_array = as_binary_array(array)?;
for v in binary_array.iter().flatten() {
self.merge(v);
}
}
_ => {
return not_impl_err!(
"HLL functions do not support data type: {}",
array.data_type()
)
}
}
Ok(())
}
fn evaluate(&mut self) -> DfResult<ScalarValue> {
Ok(ScalarValue::Binary(Some(
bincode::serialize(&self.hll).map_err(|e| {
DataFusionError::Internal(format!("Failed to serialize HyperLogLog: {}", e))
})?,
)))
}
fn size(&self) -> usize {
std::mem::size_of_val(&self.hll)
}
fn state(&mut self) -> DfResult<Vec<ScalarValue>> {
Ok(vec![ScalarValue::Binary(Some(
bincode::serialize(&self.hll).map_err(|e| {
DataFusionError::Internal(format!("Failed to serialize HyperLogLog: {}", e))
})?,
))])
}
fn merge_batch(&mut self, states: &[ArrayRef]) -> DfResult<()> {
let array = &states[0];
let binary_array = as_binary_array(array)?;
for v in binary_array.iter().flatten() {
self.merge(v);
}
Ok(())
}
}
#[cfg(test)]
mod tests {
use datafusion::arrow::array::{BinaryArray, StringArray};
use super::*;
#[test]
fn test_hll_basic() {
let mut state = HllState::new();
state.update("1");
state.update("2");
state.update("3");
let result = state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let mut hll: HllStateType = bincode::deserialize(&bytes).unwrap();
assert_eq!(hll.count().trunc() as u32, 3);
} else {
panic!("Expected binary scalar value");
}
}
#[test]
fn test_hll_roundtrip() {
let mut state = HllState::new();
state.update("1");
state.update("2");
// Serialize
let serialized = state.evaluate().unwrap();
// Create new state and merge the serialized data
let mut new_state = HllState::new();
if let ScalarValue::Binary(Some(bytes)) = &serialized {
new_state.merge(bytes);
// Verify the merged state matches original
let result = new_state.evaluate().unwrap();
if let ScalarValue::Binary(Some(new_bytes)) = result {
let mut original: HllStateType = bincode::deserialize(bytes).unwrap();
let mut merged: HllStateType = bincode::deserialize(&new_bytes).unwrap();
assert_eq!(original.count(), merged.count());
} else {
panic!("Expected binary scalar value");
}
} else {
panic!("Expected binary scalar value");
}
}
#[test]
fn test_hll_batch_update() {
let mut state = HllState::new();
// Test string values
let str_values = vec!["a", "b", "c", "d", "e", "f", "g", "h", "i"];
let str_array = Arc::new(StringArray::from(str_values)) as ArrayRef;
state.update_batch(&[str_array]).unwrap();
let result = state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let mut hll: HllStateType = bincode::deserialize(&bytes).unwrap();
assert_eq!(hll.count().trunc() as u32, 9);
} else {
panic!("Expected binary scalar value");
}
}
#[test]
fn test_hll_merge_batch() {
let mut state1 = HllState::new();
state1.update("1");
let state1_binary = state1.evaluate().unwrap();
let mut state2 = HllState::new();
state2.update("2");
let state2_binary = state2.evaluate().unwrap();
let mut merged_state = HllState::new();
if let (ScalarValue::Binary(Some(bytes1)), ScalarValue::Binary(Some(bytes2))) =
(&state1_binary, &state2_binary)
{
let binary_array = Arc::new(BinaryArray::from(vec![
bytes1.as_slice(),
bytes2.as_slice(),
])) as ArrayRef;
merged_state.merge_batch(&[binary_array]).unwrap();
let result = merged_state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let mut hll: HllStateType = bincode::deserialize(&bytes).unwrap();
assert_eq!(hll.count().trunc() as u32, 2);
} else {
panic!("Expected binary scalar value");
}
} else {
panic!("Expected binary scalar values");
}
}
#[test]
fn test_hll_merge_function() {
// Create two HLL states with different values
let mut state1 = HllState::new();
state1.update("1");
state1.update("2");
let state1_binary = state1.evaluate().unwrap();
let mut state2 = HllState::new();
state2.update("2");
state2.update("3");
let state2_binary = state2.evaluate().unwrap();
// Create a merge state and merge both states
let mut merge_state = HllState::new();
if let (ScalarValue::Binary(Some(bytes1)), ScalarValue::Binary(Some(bytes2))) =
(&state1_binary, &state2_binary)
{
let binary_array = Arc::new(BinaryArray::from(vec![
bytes1.as_slice(),
bytes2.as_slice(),
])) as ArrayRef;
merge_state.update_batch(&[binary_array]).unwrap();
let result = merge_state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let mut hll: HllStateType = bincode::deserialize(&bytes).unwrap();
// Should have 3 unique values: "1", "2", "3"
assert_eq!(hll.count().trunc() as u32, 3);
} else {
panic!("Expected binary scalar value");
}
} else {
panic!("Expected binary scalar values");
}
}
}

View File

@@ -0,0 +1,307 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use common_query::prelude::*;
use common_telemetry::trace;
use datafusion::common::cast::{as_binary_array, as_primitive_array};
use datafusion::common::not_impl_err;
use datafusion::error::{DataFusionError, Result as DfResult};
use datafusion::logical_expr::function::AccumulatorArgs;
use datafusion::logical_expr::{Accumulator as DfAccumulator, AggregateUDF};
use datafusion::physical_plan::expressions::Literal;
use datafusion::prelude::create_udaf;
use datatypes::arrow::array::ArrayRef;
use datatypes::arrow::datatypes::{DataType, Float64Type};
use uddsketch::{SketchHashKey, UDDSketch};
pub const UDDSKETCH_STATE_NAME: &str = "uddsketch_state";
#[derive(Debug)]
pub struct UddSketchState {
uddsketch: UDDSketch,
}
impl UddSketchState {
pub fn new(bucket_size: u64, error_rate: f64) -> Self {
Self {
uddsketch: UDDSketch::new(bucket_size, error_rate),
}
}
pub fn udf_impl() -> AggregateUDF {
create_udaf(
UDDSKETCH_STATE_NAME,
vec![DataType::Int64, DataType::Float64, DataType::Float64],
Arc::new(DataType::Binary),
Volatility::Immutable,
Arc::new(|args| {
let (bucket_size, error_rate) = downcast_accumulator_args(args)?;
Ok(Box::new(UddSketchState::new(bucket_size, error_rate)))
}),
Arc::new(vec![DataType::Binary]),
)
}
fn update(&mut self, value: f64) {
self.uddsketch.add_value(value);
}
fn merge(&mut self, raw: &[u8]) {
if let Ok(uddsketch) = bincode::deserialize::<UDDSketch>(raw) {
if uddsketch.count() != 0 {
self.uddsketch.merge_sketch(&uddsketch);
}
} else {
trace!("Warning: Failed to deserialize UDDSketch from {:?}", raw);
}
}
}
fn downcast_accumulator_args(args: AccumulatorArgs) -> DfResult<(u64, f64)> {
let bucket_size = match args.exprs[0]
.as_any()
.downcast_ref::<Literal>()
.map(|lit| lit.value())
{
Some(ScalarValue::Int64(Some(value))) => *value as u64,
_ => {
return not_impl_err!(
"{} not supported for bucket size: {}",
UDDSKETCH_STATE_NAME,
&args.exprs[0]
)
}
};
let error_rate = match args.exprs[1]
.as_any()
.downcast_ref::<Literal>()
.map(|lit| lit.value())
{
Some(ScalarValue::Float64(Some(value))) => *value,
_ => {
return not_impl_err!(
"{} not supported for error rate: {}",
UDDSKETCH_STATE_NAME,
&args.exprs[1]
)
}
};
Ok((bucket_size, error_rate))
}
impl DfAccumulator for UddSketchState {
fn update_batch(&mut self, values: &[ArrayRef]) -> DfResult<()> {
let array = &values[2]; // the third column is data value
let f64_array = as_primitive_array::<Float64Type>(array)?;
for v in f64_array.iter().flatten() {
self.update(v);
}
Ok(())
}
fn evaluate(&mut self) -> DfResult<ScalarValue> {
Ok(ScalarValue::Binary(Some(
bincode::serialize(&self.uddsketch).map_err(|e| {
DataFusionError::Internal(format!("Failed to serialize UDDSketch: {}", e))
})?,
)))
}
fn size(&self) -> usize {
// Base size of UDDSketch struct fields
let mut total_size = std::mem::size_of::<f64>() * 3 + // alpha, gamma, values_sum
std::mem::size_of::<u32>() + // compactions
std::mem::size_of::<u64>() * 2; // max_buckets, num_values
// Size of buckets (SketchHashMap)
// Each bucket entry contains:
// - SketchHashKey (enum with i64/Zero/Invalid variants)
// - SketchHashEntry (count: u64, next: SketchHashKey)
let bucket_entry_size = std::mem::size_of::<SketchHashKey>() + // key
std::mem::size_of::<u64>() + // count
std::mem::size_of::<SketchHashKey>(); // next
total_size += self.uddsketch.current_buckets_count() * bucket_entry_size;
total_size
}
fn state(&mut self) -> DfResult<Vec<ScalarValue>> {
Ok(vec![ScalarValue::Binary(Some(
bincode::serialize(&self.uddsketch).map_err(|e| {
DataFusionError::Internal(format!("Failed to serialize UDDSketch: {}", e))
})?,
))])
}
fn merge_batch(&mut self, states: &[ArrayRef]) -> DfResult<()> {
let array = &states[0];
let binary_array = as_binary_array(array)?;
for v in binary_array.iter().flatten() {
self.merge(v);
}
Ok(())
}
}
#[cfg(test)]
mod tests {
use datafusion::arrow::array::{BinaryArray, Float64Array};
use super::*;
#[test]
fn test_uddsketch_state_basic() {
let mut state = UddSketchState::new(10, 0.01);
state.update(1.0);
state.update(2.0);
state.update(3.0);
let result = state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let deserialized: UDDSketch = bincode::deserialize(&bytes).unwrap();
assert_eq!(deserialized.count(), 3);
} else {
panic!("Expected binary scalar value");
}
}
#[test]
fn test_uddsketch_state_roundtrip() {
let mut state = UddSketchState::new(10, 0.01);
state.update(1.0);
state.update(2.0);
// Serialize
let serialized = state.evaluate().unwrap();
// Create new state and merge the serialized data
let mut new_state = UddSketchState::new(10, 0.01);
if let ScalarValue::Binary(Some(bytes)) = &serialized {
new_state.merge(bytes);
// Verify the merged state matches original by comparing deserialized values
let original_sketch: UDDSketch = bincode::deserialize(bytes).unwrap();
let new_result = new_state.evaluate().unwrap();
if let ScalarValue::Binary(Some(new_bytes)) = new_result {
let new_sketch: UDDSketch = bincode::deserialize(&new_bytes).unwrap();
assert_eq!(original_sketch.count(), new_sketch.count());
assert_eq!(original_sketch.sum(), new_sketch.sum());
assert_eq!(original_sketch.mean(), new_sketch.mean());
assert_eq!(original_sketch.max_error(), new_sketch.max_error());
// Compare a few quantiles to ensure statistical equivalence
for q in [0.1, 0.5, 0.9].iter() {
assert!(
(original_sketch.estimate_quantile(*q) - new_sketch.estimate_quantile(*q))
.abs()
< 1e-10,
"Quantile {} mismatch: original={}, new={}",
q,
original_sketch.estimate_quantile(*q),
new_sketch.estimate_quantile(*q)
);
}
} else {
panic!("Expected binary scalar value");
}
} else {
panic!("Expected binary scalar value");
}
}
#[test]
fn test_uddsketch_state_batch_update() {
let mut state = UddSketchState::new(10, 0.01);
let values = vec![1.0f64, 2.0, 3.0];
let array = Arc::new(Float64Array::from(values)) as ArrayRef;
state
.update_batch(&[array.clone(), array.clone(), array])
.unwrap();
let result = state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let deserialized: UDDSketch = bincode::deserialize(&bytes).unwrap();
assert_eq!(deserialized.count(), 3);
} else {
panic!("Expected binary scalar value");
}
}
#[test]
fn test_uddsketch_state_merge_batch() {
let mut state1 = UddSketchState::new(10, 0.01);
state1.update(1.0);
let state1_binary = state1.evaluate().unwrap();
let mut state2 = UddSketchState::new(10, 0.01);
state2.update(2.0);
let state2_binary = state2.evaluate().unwrap();
let mut merged_state = UddSketchState::new(10, 0.01);
if let (ScalarValue::Binary(Some(bytes1)), ScalarValue::Binary(Some(bytes2))) =
(&state1_binary, &state2_binary)
{
let binary_array = Arc::new(BinaryArray::from(vec![
bytes1.as_slice(),
bytes2.as_slice(),
])) as ArrayRef;
merged_state.merge_batch(&[binary_array]).unwrap();
let result = merged_state.evaluate().unwrap();
if let ScalarValue::Binary(Some(bytes)) = result {
let deserialized: UDDSketch = bincode::deserialize(&bytes).unwrap();
assert_eq!(deserialized.count(), 2);
} else {
panic!("Expected binary scalar value");
}
} else {
panic!("Expected binary scalar values");
}
}
#[test]
fn test_uddsketch_state_size() {
let mut state = UddSketchState::new(10, 0.01);
let initial_size = state.size();
// Add some values to create buckets
state.update(1.0);
state.update(2.0);
state.update(3.0);
let size_with_values = state.size();
assert!(
size_with_values > initial_size,
"Size should increase after adding values: initial={}, with_values={}",
initial_size,
size_with_values
);
// Verify size increases with more buckets
state.update(10.0); // This should create a new bucket
assert!(
state.size() > size_with_values,
"Size should increase after adding new bucket: prev={}, new={}",
size_with_values,
state.size()
);
}
}

View File

@@ -22,10 +22,12 @@ use crate::function::{AsyncFunctionRef, FunctionRef};
use crate::scalars::aggregate::{AggregateFunctionMetaRef, AggregateFunctions};
use crate::scalars::date::DateFunction;
use crate::scalars::expression::ExpressionFunction;
use crate::scalars::hll_count::HllCalcFunction;
use crate::scalars::json::JsonFunction;
use crate::scalars::matches::MatchesFunction;
use crate::scalars::math::MathFunction;
use crate::scalars::timestamp::TimestampFunction;
use crate::scalars::uddsketch_calc::UddSketchCalcFunction;
use crate::scalars::vector::VectorFunction;
use crate::system::SystemFunction;
use crate::table::TableFunction;
@@ -105,6 +107,8 @@ pub static FUNCTION_REGISTRY: Lazy<Arc<FunctionRegistry>> = Lazy::new(|| {
TimestampFunction::register(&function_registry);
DateFunction::register(&function_registry);
ExpressionFunction::register(&function_registry);
UddSketchCalcFunction::register(&function_registry);
HllCalcFunction::register(&function_registry);
// Aggregate functions
AggregateFunctions::register(&function_registry);

View File

@@ -21,6 +21,7 @@ pub mod scalars;
mod system;
mod table;
pub mod aggr;
pub mod function;
pub mod function_registry;
pub mod handlers;

View File

@@ -22,7 +22,9 @@ pub mod matches;
pub mod math;
pub mod vector;
pub(crate) mod hll_count;
#[cfg(test)]
pub(crate) mod test;
pub(crate) mod timestamp;
pub(crate) mod uddsketch_calc;
pub mod udf;

View File

@@ -0,0 +1,175 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Implementation of the scalar function `hll_count`.
use std::fmt;
use std::fmt::Display;
use std::sync::Arc;
use common_query::error::{DowncastVectorSnafu, InvalidFuncArgsSnafu, Result};
use common_query::prelude::{Signature, Volatility};
use datatypes::data_type::ConcreteDataType;
use datatypes::prelude::Vector;
use datatypes::scalars::{ScalarVector, ScalarVectorBuilder};
use datatypes::vectors::{BinaryVector, MutableVector, UInt64VectorBuilder, VectorRef};
use hyperloglogplus::HyperLogLog;
use snafu::OptionExt;
use crate::aggr::HllStateType;
use crate::function::{Function, FunctionContext};
use crate::function_registry::FunctionRegistry;
const NAME: &str = "hll_count";
/// HllCalcFunction implements the scalar function `hll_count`.
///
/// It accepts one argument:
/// 1. The serialized HyperLogLogPlus state, as produced by the aggregator (binary).
///
/// For each row, it deserializes the sketch and returns the estimated cardinality.
#[derive(Debug, Default)]
pub struct HllCalcFunction;
impl HllCalcFunction {
pub fn register(registry: &FunctionRegistry) {
registry.register(Arc::new(HllCalcFunction));
}
}
impl Display for HllCalcFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", NAME.to_ascii_uppercase())
}
}
impl Function for HllCalcFunction {
fn name(&self) -> &str {
NAME
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint64_datatype())
}
fn signature(&self) -> Signature {
// Only argument: HyperLogLogPlus state (binary)
Signature::exact(
vec![ConcreteDataType::binary_datatype()],
Volatility::Immutable,
)
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
if columns.len() != 1 {
return InvalidFuncArgsSnafu {
err_msg: format!("hll_count expects 1 argument, got {}", columns.len()),
}
.fail();
}
let hll_vec = columns[0]
.as_any()
.downcast_ref::<BinaryVector>()
.with_context(|| DowncastVectorSnafu {
err_msg: format!("expect BinaryVector, got {}", columns[0].vector_type_name()),
})?;
let len = hll_vec.len();
let mut builder = UInt64VectorBuilder::with_capacity(len);
for i in 0..len {
let hll_opt = hll_vec.get_data(i);
if hll_opt.is_none() {
builder.push_null();
continue;
}
let hll_bytes = hll_opt.unwrap();
// Deserialize the HyperLogLogPlus from its bincode representation
let mut hll: HllStateType = match bincode::deserialize(hll_bytes) {
Ok(h) => h,
Err(e) => {
common_telemetry::trace!("Failed to deserialize HyperLogLogPlus: {}", e);
builder.push_null();
continue;
}
};
builder.push(Some(hll.count().round() as u64));
}
Ok(builder.to_vector())
}
}
#[cfg(test)]
mod tests {
use datatypes::vectors::BinaryVector;
use super::*;
use crate::utils::FixedRandomState;
#[test]
fn test_hll_count_function() {
let function = HllCalcFunction;
assert_eq!("hll_count", function.name());
assert_eq!(
ConcreteDataType::uint64_datatype(),
function
.return_type(&[ConcreteDataType::uint64_datatype()])
.unwrap()
);
// Create a test HLL
let mut hll = HllStateType::new(14, FixedRandomState::new()).unwrap();
for i in 1..=10 {
hll.insert(&i.to_string());
}
let serialized_bytes = bincode::serialize(&hll).unwrap();
let args: Vec<VectorRef> = vec![Arc::new(BinaryVector::from(vec![Some(serialized_bytes)]))];
let result = function.eval(FunctionContext::default(), &args).unwrap();
assert_eq!(result.len(), 1);
// Test cardinality estimate
if let datatypes::value::Value::UInt64(v) = result.get(0) {
assert_eq!(v, 10);
} else {
panic!("Expected uint64 value");
}
}
#[test]
fn test_hll_count_function_errors() {
let function = HllCalcFunction;
// Test with invalid number of arguments
let args: Vec<VectorRef> = vec![];
let result = function.eval(FunctionContext::default(), &args);
assert!(result.is_err());
assert!(result
.unwrap_err()
.to_string()
.contains("hll_count expects 1 argument"));
// Test with invalid binary data
let args: Vec<VectorRef> = vec![Arc::new(BinaryVector::from(vec![Some(vec![1, 2, 3])]))]; // Invalid binary data
let result = function.eval(FunctionContext::default(), &args).unwrap();
assert_eq!(result.len(), 1);
assert!(matches!(result.get(0), datatypes::value::Value::Null));
}
}

View File

@@ -13,7 +13,7 @@
// limitations under the License.
use std::sync::Arc;
mod json_get;
pub mod json_get;
mod json_is;
mod json_path_exists;
mod json_path_match;

View File

@@ -0,0 +1,211 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Implementation of the scalar function `uddsketch_calc`.
use std::fmt;
use std::fmt::Display;
use std::sync::Arc;
use common_query::error::{DowncastVectorSnafu, InvalidFuncArgsSnafu, Result};
use common_query::prelude::{Signature, Volatility};
use datatypes::data_type::ConcreteDataType;
use datatypes::prelude::Vector;
use datatypes::scalars::{ScalarVector, ScalarVectorBuilder};
use datatypes::vectors::{BinaryVector, Float64VectorBuilder, MutableVector, VectorRef};
use snafu::OptionExt;
use uddsketch::UDDSketch;
use crate::function::{Function, FunctionContext};
use crate::function_registry::FunctionRegistry;
const NAME: &str = "uddsketch_calc";
/// UddSketchCalcFunction implements the scalar function `uddsketch_calc`.
///
/// It accepts two arguments:
/// 1. A percentile (as f64) for which to compute the estimated quantile (e.g. 0.95 for p95).
/// 2. The serialized UDDSketch state, as produced by the aggregator (binary).
///
/// For each row, it deserializes the sketch and returns the computed quantile value.
#[derive(Debug, Default)]
pub struct UddSketchCalcFunction;
impl UddSketchCalcFunction {
pub fn register(registry: &FunctionRegistry) {
registry.register(Arc::new(UddSketchCalcFunction));
}
}
impl Display for UddSketchCalcFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", NAME.to_ascii_uppercase())
}
}
impl Function for UddSketchCalcFunction {
fn name(&self) -> &str {
NAME
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::float64_datatype())
}
fn signature(&self) -> Signature {
// First argument: percentile (float64)
// Second argument: UDDSketch state (binary)
Signature::exact(
vec![
ConcreteDataType::float64_datatype(),
ConcreteDataType::binary_datatype(),
],
Volatility::Immutable,
)
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
if columns.len() != 2 {
return InvalidFuncArgsSnafu {
err_msg: format!("uddsketch_calc expects 2 arguments, got {}", columns.len()),
}
.fail();
}
let perc_vec = &columns[0];
let sketch_vec = columns[1]
.as_any()
.downcast_ref::<BinaryVector>()
.with_context(|| DowncastVectorSnafu {
err_msg: format!("expect BinaryVector, got {}", columns[1].vector_type_name()),
})?;
let len = sketch_vec.len();
let mut builder = Float64VectorBuilder::with_capacity(len);
for i in 0..len {
let perc_opt = perc_vec.get(i).as_f64_lossy();
let sketch_opt = sketch_vec.get_data(i);
if sketch_opt.is_none() || perc_opt.is_none() {
builder.push_null();
continue;
}
let sketch_bytes = sketch_opt.unwrap();
let perc = perc_opt.unwrap();
// Deserialize the UDDSketch from its bincode representation
let sketch: UDDSketch = match bincode::deserialize(sketch_bytes) {
Ok(s) => s,
Err(e) => {
common_telemetry::trace!("Failed to deserialize UDDSketch: {}", e);
builder.push_null();
continue;
}
};
// Compute the estimated quantile from the sketch
let result = sketch.estimate_quantile(perc);
builder.push(Some(result));
}
Ok(builder.to_vector())
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use datatypes::vectors::{BinaryVector, Float64Vector};
use super::*;
#[test]
fn test_uddsketch_calc_function() {
let function = UddSketchCalcFunction;
assert_eq!("uddsketch_calc", function.name());
assert_eq!(
ConcreteDataType::float64_datatype(),
function
.return_type(&[ConcreteDataType::float64_datatype()])
.unwrap()
);
// Create a test sketch
let mut sketch = UDDSketch::new(128, 0.01);
sketch.add_value(10.0);
sketch.add_value(20.0);
sketch.add_value(30.0);
sketch.add_value(40.0);
sketch.add_value(50.0);
sketch.add_value(60.0);
sketch.add_value(70.0);
sketch.add_value(80.0);
sketch.add_value(90.0);
sketch.add_value(100.0);
// Get expected values directly from the sketch
let expected_p50 = sketch.estimate_quantile(0.5);
let expected_p90 = sketch.estimate_quantile(0.9);
let expected_p95 = sketch.estimate_quantile(0.95);
let serialized = bincode::serialize(&sketch).unwrap();
let percentiles = vec![0.5, 0.9, 0.95];
let args: Vec<VectorRef> = vec![
Arc::new(Float64Vector::from_vec(percentiles.clone())),
Arc::new(BinaryVector::from(vec![Some(serialized.clone()); 3])),
];
let result = function.eval(FunctionContext::default(), &args).unwrap();
assert_eq!(result.len(), 3);
// Test median (p50)
assert!(
matches!(result.get(0), datatypes::value::Value::Float64(v) if (v - expected_p50).abs() < 1e-10)
);
// Test p90
assert!(
matches!(result.get(1), datatypes::value::Value::Float64(v) if (v - expected_p90).abs() < 1e-10)
);
// Test p95
assert!(
matches!(result.get(2), datatypes::value::Value::Float64(v) if (v - expected_p95).abs() < 1e-10)
);
}
#[test]
fn test_uddsketch_calc_function_errors() {
let function = UddSketchCalcFunction;
// Test with invalid number of arguments
let args: Vec<VectorRef> = vec![Arc::new(Float64Vector::from_vec(vec![0.95]))];
let result = function.eval(FunctionContext::default(), &args);
assert!(result.is_err());
assert!(result
.unwrap_err()
.to_string()
.contains("uddsketch_calc expects 2 arguments"));
// Test with invalid binary data
let args: Vec<VectorRef> = vec![
Arc::new(Float64Vector::from_vec(vec![0.95])),
Arc::new(BinaryVector::from(vec![Some(vec![1, 2, 3])])), // Invalid binary data
];
let result = function.eval(FunctionContext::default(), &args).unwrap();
assert_eq!(result.len(), 1);
assert!(matches!(result.get(0), datatypes::value::Value::Null));
}
}

View File

@@ -20,11 +20,12 @@ pub mod impl_conv;
pub(crate) mod product;
mod scalar_add;
mod scalar_mul;
mod sub;
pub(crate) mod sum;
mod vector_add;
mod vector_div;
mod vector_mul;
mod vector_norm;
mod vector_sub;
use std::sync::Arc;
@@ -48,10 +49,11 @@ impl VectorFunction {
registry.register(Arc::new(scalar_mul::ScalarMulFunction));
// vector calculation
registry.register(Arc::new(vector_add::VectorAddFunction));
registry.register(Arc::new(vector_sub::VectorSubFunction));
registry.register(Arc::new(vector_mul::VectorMulFunction));
registry.register(Arc::new(vector_norm::VectorNormFunction));
registry.register(Arc::new(vector_div::VectorDivFunction));
registry.register(Arc::new(sub::SubFunction));
registry.register(Arc::new(vector_norm::VectorNormFunction));
registry.register(Arc::new(elem_sum::ElemSumFunction));
registry.register(Arc::new(elem_product::ElemProductFunction));
}

View File

@@ -0,0 +1,214 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::borrow::Cow;
use std::fmt::Display;
use common_query::error::InvalidFuncArgsSnafu;
use common_query::prelude::Signature;
use datatypes::prelude::ConcreteDataType;
use datatypes::scalars::ScalarVectorBuilder;
use datatypes::vectors::{BinaryVectorBuilder, MutableVector, VectorRef};
use nalgebra::DVectorView;
use snafu::ensure;
use crate::function::{Function, FunctionContext};
use crate::helper;
use crate::scalars::vector::impl_conv::{as_veclit, as_veclit_if_const, veclit_to_binlit};
const NAME: &str = "vec_add";
/// Adds corresponding elements of two vectors, returns a vector.
///
/// # Example
///
/// ```sql
/// SELECT vec_to_string(vec_add("[1.0, 1.0]", "[1.0, 2.0]")) as result;
///
/// +---------------------------------------------------------------+
/// | vec_to_string(vec_add(Utf8("[1.0, 1.0]"),Utf8("[1.0, 2.0]"))) |
/// +---------------------------------------------------------------+
/// | [2,3] |
/// +---------------------------------------------------------------+
///
#[derive(Debug, Clone, Default)]
pub struct VectorAddFunction;
impl Function for VectorAddFunction {
fn name(&self) -> &str {
NAME
}
fn return_type(
&self,
_input_types: &[ConcreteDataType],
) -> common_query::error::Result<ConcreteDataType> {
Ok(ConcreteDataType::binary_datatype())
}
fn signature(&self) -> Signature {
helper::one_of_sigs2(
vec![
ConcreteDataType::string_datatype(),
ConcreteDataType::binary_datatype(),
],
vec![
ConcreteDataType::string_datatype(),
ConcreteDataType::binary_datatype(),
],
)
}
fn eval(
&self,
_func_ctx: FunctionContext,
columns: &[VectorRef],
) -> common_query::error::Result<VectorRef> {
ensure!(
columns.len() == 2,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect exactly two, have: {}",
columns.len()
)
}
);
let arg0 = &columns[0];
let arg1 = &columns[1];
ensure!(
arg0.len() == arg1.len(),
InvalidFuncArgsSnafu {
err_msg: format!(
"The lengths of the vector are not aligned, args 0: {}, args 1: {}",
arg0.len(),
arg1.len(),
)
}
);
let len = arg0.len();
let mut result = BinaryVectorBuilder::with_capacity(len);
if len == 0 {
return Ok(result.to_vector());
}
let arg0_const = as_veclit_if_const(arg0)?;
let arg1_const = as_veclit_if_const(arg1)?;
for i in 0..len {
let arg0 = match arg0_const.as_ref() {
Some(arg0) => Some(Cow::Borrowed(arg0.as_ref())),
None => as_veclit(arg0.get_ref(i))?,
};
let arg1 = match arg1_const.as_ref() {
Some(arg1) => Some(Cow::Borrowed(arg1.as_ref())),
None => as_veclit(arg1.get_ref(i))?,
};
let (Some(arg0), Some(arg1)) = (arg0, arg1) else {
result.push_null();
continue;
};
let vec0 = DVectorView::from_slice(&arg0, arg0.len());
let vec1 = DVectorView::from_slice(&arg1, arg1.len());
let vec_res = vec0 + vec1;
let veclit = vec_res.as_slice();
let binlit = veclit_to_binlit(veclit);
result.push(Some(&binlit));
}
Ok(result.to_vector())
}
}
impl Display for VectorAddFunction {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", NAME.to_ascii_uppercase())
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use common_query::error::Error;
use datatypes::vectors::StringVector;
use super::*;
#[test]
fn test_sub() {
let func = VectorAddFunction;
let input0 = Arc::new(StringVector::from(vec![
Some("[1.0,2.0,3.0]".to_string()),
Some("[4.0,5.0,6.0]".to_string()),
None,
Some("[2.0,3.0,3.0]".to_string()),
]));
let input1 = Arc::new(StringVector::from(vec![
Some("[1.0,1.0,1.0]".to_string()),
Some("[6.0,5.0,4.0]".to_string()),
Some("[3.0,2.0,2.0]".to_string()),
None,
]));
let result = func
.eval(FunctionContext::default(), &[input0, input1])
.unwrap();
let result = result.as_ref();
assert_eq!(result.len(), 4);
assert_eq!(
result.get_ref(0).as_binary().unwrap(),
Some(veclit_to_binlit(&[2.0, 3.0, 4.0]).as_slice())
);
assert_eq!(
result.get_ref(1).as_binary().unwrap(),
Some(veclit_to_binlit(&[10.0, 10.0, 10.0]).as_slice())
);
assert!(result.get_ref(2).is_null());
assert!(result.get_ref(3).is_null());
}
#[test]
fn test_sub_error() {
let func = VectorAddFunction;
let input0 = Arc::new(StringVector::from(vec![
Some("[1.0,2.0,3.0]".to_string()),
Some("[4.0,5.0,6.0]".to_string()),
None,
Some("[2.0,3.0,3.0]".to_string()),
]));
let input1 = Arc::new(StringVector::from(vec![
Some("[1.0,1.0,1.0]".to_string()),
Some("[6.0,5.0,4.0]".to_string()),
Some("[3.0,2.0,2.0]".to_string()),
]));
let result = func.eval(FunctionContext::default(), &[input0, input1]);
match result {
Err(Error::InvalidFuncArgs { err_msg, .. }) => {
assert_eq!(
err_msg,
"The lengths of the vector are not aligned, args 0: 4, args 1: 3"
)
}
_ => unreachable!(),
}
}
}

View File

@@ -42,19 +42,10 @@ const NAME: &str = "vec_sub";
/// | [0,-1] |
/// +---------------------------------------------------------------+
///
/// -- Negative scalar to simulate subtraction
/// SELECT vec_to_string(vec_sub('[-1.0, -1.0]', '[1.0, 2.0]'));
///
/// +-----------------------------------------------------------------+
/// | vec_to_string(vec_sub(Utf8("[-1.0, -1.0]"),Utf8("[1.0, 2.0]"))) |
/// +-----------------------------------------------------------------+
/// | [-2,-3] |
/// +-----------------------------------------------------------------+
///
#[derive(Debug, Clone, Default)]
pub struct SubFunction;
pub struct VectorSubFunction;
impl Function for SubFunction {
impl Function for VectorSubFunction {
fn name(&self) -> &str {
NAME
}
@@ -142,7 +133,7 @@ impl Function for SubFunction {
}
}
impl Display for SubFunction {
impl Display for VectorSubFunction {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", NAME.to_ascii_uppercase())
}
@@ -159,7 +150,7 @@ mod tests {
#[test]
fn test_sub() {
let func = SubFunction;
let func = VectorSubFunction;
let input0 = Arc::new(StringVector::from(vec![
Some("[1.0,2.0,3.0]".to_string()),
@@ -194,7 +185,7 @@ mod tests {
#[test]
fn test_sub_error() {
let func = SubFunction;
let func = VectorSubFunction;
let input0 = Arc::new(StringVector::from(vec![
Some("[1.0,2.0,3.0]".to_string()),

View File

@@ -12,6 +12,11 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::hash::BuildHasher;
use ahash::RandomState;
use serde::{Deserialize, Serialize};
/// Escapes special characters in the provided pattern string for `LIKE`.
///
/// Specifically, it prefixes the backslash (`\`), percent (`%`), and underscore (`_`)
@@ -32,6 +37,71 @@ pub fn escape_like_pattern(pattern: &str) -> String {
})
.collect::<String>()
}
/// A random state with fixed seeds.
///
/// This is used to ensure that the hash values are consistent across
/// different processes, and easy to serialize and deserialize.
#[derive(Debug)]
pub struct FixedRandomState {
state: RandomState,
}
impl FixedRandomState {
// some random seeds
const RANDOM_SEED_0: u64 = 0x517cc1b727220a95;
const RANDOM_SEED_1: u64 = 0x428a2f98d728ae22;
const RANDOM_SEED_2: u64 = 0x7137449123ef65cd;
const RANDOM_SEED_3: u64 = 0xb5c0fbcfec4d3b2f;
pub fn new() -> Self {
Self {
state: ahash::RandomState::with_seeds(
Self::RANDOM_SEED_0,
Self::RANDOM_SEED_1,
Self::RANDOM_SEED_2,
Self::RANDOM_SEED_3,
),
}
}
}
impl Default for FixedRandomState {
fn default() -> Self {
Self::new()
}
}
impl BuildHasher for FixedRandomState {
type Hasher = ahash::AHasher;
fn build_hasher(&self) -> Self::Hasher {
self.state.build_hasher()
}
fn hash_one<T: std::hash::Hash>(&self, x: T) -> u64 {
self.state.hash_one(x)
}
}
impl Serialize for FixedRandomState {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
serializer.serialize_unit()
}
}
impl<'de> Deserialize<'de> for FixedRandomState {
fn deserialize<D>(_deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
Ok(Self::new())
}
}
#[cfg(test)]
mod tests {
use super::*;

View File

@@ -22,4 +22,4 @@ store-api.workspace = true
table.workspace = true
[dev-dependencies]
paste = "1.0"
paste.workspace = true

View File

@@ -15,13 +15,14 @@
use api::helper::ColumnDataTypeWrapper;
use api::v1::add_column_location::LocationType;
use api::v1::alter_table_expr::Kind;
use api::v1::column_def::as_fulltext_option;
use api::v1::column_def::{as_fulltext_option, as_skipping_index_type};
use api::v1::{
column_def, AddColumnLocation as Location, AlterTableExpr, Analyzer, CreateTableExpr,
DropColumns, ModifyColumnTypes, RenameTable, SemanticType,
SkippingIndexType as PbSkippingIndexType,
};
use common_query::AddColumnLocation;
use datatypes::schema::{ColumnSchema, FulltextOptions, RawSchema};
use datatypes::schema::{ColumnSchema, FulltextOptions, RawSchema, SkippingIndexOptions};
use snafu::{ensure, OptionExt, ResultExt};
use store_api::region_request::{SetRegionOption, UnsetRegionOption};
use table::metadata::TableId;
@@ -31,7 +32,8 @@ use table::requests::{
};
use crate::error::{
InvalidColumnDefSnafu, InvalidSetFulltextOptionRequestSnafu, InvalidSetTableOptionRequestSnafu,
InvalidColumnDefSnafu, InvalidSetFulltextOptionRequestSnafu,
InvalidSetSkippingIndexOptionRequestSnafu, InvalidSetTableOptionRequestSnafu,
InvalidUnsetTableOptionRequestSnafu, MissingAlterIndexOptionSnafu, MissingFieldSnafu,
MissingTimestampColumnSnafu, Result, UnknownLocationTypeSnafu,
};
@@ -137,6 +139,18 @@ pub fn alter_expr_to_request(table_id: TableId, expr: AlterTableExpr) -> Result<
column_name: i.column_name,
},
},
api::v1::set_index::Options::Skipping(s) => AlterKind::SetIndex {
options: SetIndexOptions::Skipping {
column_name: s.column_name,
options: SkippingIndexOptions {
granularity: s.granularity as u32,
index_type: as_skipping_index_type(
PbSkippingIndexType::try_from(s.skipping_index_type)
.context(InvalidSetSkippingIndexOptionRequestSnafu)?,
),
},
},
},
},
None => return MissingAlterIndexOptionSnafu.fail(),
},
@@ -152,6 +166,11 @@ pub fn alter_expr_to_request(table_id: TableId, expr: AlterTableExpr) -> Result<
column_name: i.column_name,
},
},
api::v1::unset_index::Options::Skipping(s) => AlterKind::UnsetIndex {
options: UnsetIndexOptions::Skipping {
column_name: s.column_name,
},
},
},
None => return MissingAlterIndexOptionSnafu.fail(),
},

View File

@@ -140,6 +140,14 @@ pub enum Error {
error: prost::UnknownEnumValue,
},
#[snafu(display("Invalid set skipping index option request"))]
InvalidSetSkippingIndexOptionRequest {
#[snafu(implicit)]
location: Location,
#[snafu(source)]
error: prost::UnknownEnumValue,
},
#[snafu(display("Missing alter index options"))]
MissingAlterIndexOption {
#[snafu(implicit)]
@@ -171,6 +179,7 @@ impl ErrorExt for Error {
Error::InvalidSetTableOptionRequest { .. }
| Error::InvalidUnsetTableOptionRequest { .. }
| Error::InvalidSetFulltextOptionRequest { .. }
| Error::InvalidSetSkippingIndexOptionRequest { .. }
| Error::MissingAlterIndexOption { .. } => StatusCode::InvalidArguments,
}
}

View File

@@ -14,37 +14,12 @@
use api::helper;
use api::v1::column::Values;
use api::v1::{Column, CreateTableExpr};
use common_base::BitVec;
use datatypes::data_type::{ConcreteDataType, DataType};
use datatypes::prelude::VectorRef;
use snafu::{ensure, ResultExt};
use table::metadata::TableId;
use table::table_reference::TableReference;
use crate::error::{CreateVectorSnafu, Result, UnexpectedValuesLengthSnafu};
use crate::util;
use crate::util::ColumnExpr;
/// Try to build create table request from insert data.
pub fn build_create_expr_from_insertion(
catalog_name: &str,
schema_name: &str,
table_id: Option<TableId>,
table_name: &str,
columns: &[Column],
engine: &str,
) -> Result<CreateTableExpr> {
let table_name = TableReference::full(catalog_name, schema_name, table_name);
let column_exprs = ColumnExpr::from_columns(columns);
util::build_create_table_expr(
table_id,
&table_name,
column_exprs,
engine,
"Created on insertion",
)
}
pub(crate) fn add_values_to_builder(
data_type: ConcreteDataType,
@@ -87,276 +62,7 @@ fn is_null(null_mask: &BitVec, idx: usize) -> Option<bool> {
#[cfg(test)]
mod tests {
use std::sync::Arc;
use std::{assert_eq, vec};
use api::helper::ColumnDataTypeWrapper;
use api::v1::column::Values;
use api::v1::column_data_type_extension::TypeExt;
use api::v1::{
Column, ColumnDataType, ColumnDataTypeExtension, Decimal128, DecimalTypeExtension,
IntervalMonthDayNano, SemanticType,
};
use common_base::BitVec;
use common_catalog::consts::MITO_ENGINE;
use common_time::interval::IntervalUnit;
use common_time::timestamp::TimeUnit;
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
use snafu::ResultExt;
use super::*;
use crate::error;
use crate::error::ColumnDataTypeSnafu;
#[inline]
fn build_column_schema(
column_name: &str,
datatype: i32,
nullable: bool,
) -> error::Result<ColumnSchema> {
let datatype_wrapper =
ColumnDataTypeWrapper::try_new(datatype, None).context(ColumnDataTypeSnafu)?;
Ok(ColumnSchema::new(
column_name,
datatype_wrapper.into(),
nullable,
))
}
#[test]
fn test_build_create_table_request() {
let table_id = Some(10);
let table_name = "test_metric";
assert!(
build_create_expr_from_insertion("", "", table_id, table_name, &[], MITO_ENGINE)
.is_err()
);
let insert_batch = mock_insert_batch();
let create_expr = build_create_expr_from_insertion(
"",
"",
table_id,
table_name,
&insert_batch.0,
MITO_ENGINE,
)
.unwrap();
assert_eq!(table_id, create_expr.table_id.map(|x| x.id));
assert_eq!(table_name, create_expr.table_name);
assert_eq!("Created on insertion".to_string(), create_expr.desc);
assert_eq!(
vec![create_expr.column_defs[0].name.clone()],
create_expr.primary_keys
);
let column_defs = create_expr.column_defs;
assert_eq!(column_defs[5].name, create_expr.time_index);
assert_eq!(7, column_defs.len());
assert_eq!(
ConcreteDataType::string_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "host")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::float64_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "cpu")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::float64_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "memory")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::time_datatype(TimeUnit::Millisecond),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "time")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::interval_datatype(IntervalUnit::MonthDayNano),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "interval")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::timestamp_millisecond_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "ts")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
let decimal_column = column_defs.iter().find(|c| c.name == "decimals").unwrap();
assert_eq!(
ConcreteDataType::decimal128_datatype(38, 10),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
decimal_column.data_type,
decimal_column.datatype_extension,
)
.unwrap()
)
);
}
#[test]
fn test_find_new_columns() {
let mut columns = Vec::with_capacity(1);
let cpu_column = build_column_schema("cpu", 10, true).unwrap();
let ts_column = build_column_schema("ts", 15, false)
.unwrap()
.with_time_index(true);
columns.push(cpu_column);
columns.push(ts_column);
let schema = Arc::new(SchemaBuilder::try_from(columns).unwrap().build().unwrap());
assert!(
util::extract_new_columns(&schema, ColumnExpr::from_columns(&[]))
.unwrap()
.is_none()
);
let insert_batch = mock_insert_batch();
let add_columns =
util::extract_new_columns(&schema, ColumnExpr::from_columns(&insert_batch.0))
.unwrap()
.unwrap();
assert_eq!(5, add_columns.add_columns.len());
let host_column = &add_columns.add_columns[0];
assert_eq!(
ConcreteDataType::string_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
host_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let memory_column = &add_columns.add_columns[1];
assert_eq!(
ConcreteDataType::float64_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
memory_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let time_column = &add_columns.add_columns[2];
assert_eq!(
ConcreteDataType::time_datatype(TimeUnit::Millisecond),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
time_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let interval_column = &add_columns.add_columns[3];
assert_eq!(
ConcreteDataType::interval_datatype(IntervalUnit::MonthDayNano),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
interval_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let decimal_column = &add_columns.add_columns[4];
assert_eq!(
ConcreteDataType::decimal128_datatype(38, 10),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
decimal_column.column_def.as_ref().unwrap().data_type,
decimal_column
.column_def
.as_ref()
.unwrap()
.datatype_extension
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
}
#[test]
fn test_is_null() {
@@ -371,127 +77,4 @@ mod tests {
assert_eq!(None, is_null(&null_mask, 16));
assert_eq!(None, is_null(&null_mask, 99));
}
fn mock_insert_batch() -> (Vec<Column>, u32) {
let row_count = 2;
let host_vals = Values {
string_values: vec!["host1".to_string(), "host2".to_string()],
..Default::default()
};
let host_column = Column {
column_name: "host".to_string(),
semantic_type: SemanticType::Tag as i32,
values: Some(host_vals),
null_mask: vec![0],
datatype: ColumnDataType::String as i32,
..Default::default()
};
let cpu_vals = Values {
f64_values: vec![0.31],
..Default::default()
};
let cpu_column = Column {
column_name: "cpu".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(cpu_vals),
null_mask: vec![2],
datatype: ColumnDataType::Float64 as i32,
..Default::default()
};
let mem_vals = Values {
f64_values: vec![0.1],
..Default::default()
};
let mem_column = Column {
column_name: "memory".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(mem_vals),
null_mask: vec![1],
datatype: ColumnDataType::Float64 as i32,
..Default::default()
};
let time_vals = Values {
time_millisecond_values: vec![100, 101],
..Default::default()
};
let time_column = Column {
column_name: "time".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(time_vals),
null_mask: vec![0],
datatype: ColumnDataType::TimeMillisecond as i32,
..Default::default()
};
let interval1 = IntervalMonthDayNano {
months: 1,
days: 2,
nanoseconds: 3,
};
let interval2 = IntervalMonthDayNano {
months: 4,
days: 5,
nanoseconds: 6,
};
let interval_vals = Values {
interval_month_day_nano_values: vec![interval1, interval2],
..Default::default()
};
let interval_column = Column {
column_name: "interval".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(interval_vals),
null_mask: vec![0],
datatype: ColumnDataType::IntervalMonthDayNano as i32,
..Default::default()
};
let ts_vals = Values {
timestamp_millisecond_values: vec![100, 101],
..Default::default()
};
let ts_column = Column {
column_name: "ts".to_string(),
semantic_type: SemanticType::Timestamp as i32,
values: Some(ts_vals),
null_mask: vec![0],
datatype: ColumnDataType::TimestampMillisecond as i32,
..Default::default()
};
let decimal_vals = Values {
decimal128_values: vec![Decimal128 { hi: 0, lo: 123 }, Decimal128 { hi: 0, lo: 456 }],
..Default::default()
};
let decimal_column = Column {
column_name: "decimals".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(decimal_vals),
null_mask: vec![0],
datatype: ColumnDataType::Decimal128 as i32,
datatype_extension: Some(ColumnDataTypeExtension {
type_ext: Some(TypeExt::DecimalType(DecimalTypeExtension {
precision: 38,
scale: 10,
})),
}),
options: None,
};
(
vec![
host_column,
cpu_column,
mem_column,
time_column,
interval_column,
ts_column,
decimal_column,
],
row_count,
)
}
}

View File

@@ -19,4 +19,3 @@ pub mod insert;
pub mod util;
pub use alter::{alter_expr_to_request, create_table_schema};
pub use insert::build_create_expr_from_insertion;

View File

@@ -236,3 +236,414 @@ pub fn extract_new_columns(
}))
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use std::{assert_eq, vec};
use api::helper::ColumnDataTypeWrapper;
use api::v1::column::Values;
use api::v1::column_data_type_extension::TypeExt;
use api::v1::{
Column, ColumnDataType, ColumnDataTypeExtension, Decimal128, DecimalTypeExtension,
IntervalMonthDayNano, SemanticType,
};
use common_catalog::consts::MITO_ENGINE;
use common_time::interval::IntervalUnit;
use common_time::timestamp::TimeUnit;
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
use snafu::ResultExt;
use super::*;
use crate::error;
use crate::error::ColumnDataTypeSnafu;
#[inline]
fn build_column_schema(
column_name: &str,
datatype: i32,
nullable: bool,
) -> error::Result<ColumnSchema> {
let datatype_wrapper =
ColumnDataTypeWrapper::try_new(datatype, None).context(ColumnDataTypeSnafu)?;
Ok(ColumnSchema::new(
column_name,
datatype_wrapper.into(),
nullable,
))
}
fn build_create_expr_from_insertion(
catalog_name: &str,
schema_name: &str,
table_id: Option<TableId>,
table_name: &str,
columns: &[Column],
engine: &str,
) -> Result<CreateTableExpr> {
let table_name = TableReference::full(catalog_name, schema_name, table_name);
let column_exprs = ColumnExpr::from_columns(columns);
build_create_table_expr(
table_id,
&table_name,
column_exprs,
engine,
"Created on insertion",
)
}
#[test]
fn test_build_create_table_request() {
let table_id = Some(10);
let table_name = "test_metric";
assert!(
build_create_expr_from_insertion("", "", table_id, table_name, &[], MITO_ENGINE)
.is_err()
);
let insert_batch = mock_insert_batch();
let create_expr = build_create_expr_from_insertion(
"",
"",
table_id,
table_name,
&insert_batch.0,
MITO_ENGINE,
)
.unwrap();
assert_eq!(table_id, create_expr.table_id.map(|x| x.id));
assert_eq!(table_name, create_expr.table_name);
assert_eq!("Created on insertion".to_string(), create_expr.desc);
assert_eq!(
vec![create_expr.column_defs[0].name.clone()],
create_expr.primary_keys
);
let column_defs = create_expr.column_defs;
assert_eq!(column_defs[5].name, create_expr.time_index);
assert_eq!(7, column_defs.len());
assert_eq!(
ConcreteDataType::string_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "host")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::float64_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "cpu")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::float64_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "memory")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::time_datatype(TimeUnit::Millisecond),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "time")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::interval_datatype(IntervalUnit::MonthDayNano),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "interval")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
assert_eq!(
ConcreteDataType::timestamp_millisecond_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
column_defs
.iter()
.find(|c| c.name == "ts")
.unwrap()
.data_type,
None
)
.unwrap()
)
);
let decimal_column = column_defs.iter().find(|c| c.name == "decimals").unwrap();
assert_eq!(
ConcreteDataType::decimal128_datatype(38, 10),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
decimal_column.data_type,
decimal_column.datatype_extension,
)
.unwrap()
)
);
}
#[test]
fn test_find_new_columns() {
let mut columns = Vec::with_capacity(1);
let cpu_column = build_column_schema("cpu", 10, true).unwrap();
let ts_column = build_column_schema("ts", 15, false)
.unwrap()
.with_time_index(true);
columns.push(cpu_column);
columns.push(ts_column);
let schema = Arc::new(SchemaBuilder::try_from(columns).unwrap().build().unwrap());
assert!(extract_new_columns(&schema, ColumnExpr::from_columns(&[]))
.unwrap()
.is_none());
let insert_batch = mock_insert_batch();
let add_columns = extract_new_columns(&schema, ColumnExpr::from_columns(&insert_batch.0))
.unwrap()
.unwrap();
assert_eq!(5, add_columns.add_columns.len());
let host_column = &add_columns.add_columns[0];
assert_eq!(
ConcreteDataType::string_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
host_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let memory_column = &add_columns.add_columns[1];
assert_eq!(
ConcreteDataType::float64_datatype(),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
memory_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let time_column = &add_columns.add_columns[2];
assert_eq!(
ConcreteDataType::time_datatype(TimeUnit::Millisecond),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
time_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let interval_column = &add_columns.add_columns[3];
assert_eq!(
ConcreteDataType::interval_datatype(IntervalUnit::MonthDayNano),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
interval_column.column_def.as_ref().unwrap().data_type,
None
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
let decimal_column = &add_columns.add_columns[4];
assert_eq!(
ConcreteDataType::decimal128_datatype(38, 10),
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
decimal_column.column_def.as_ref().unwrap().data_type,
decimal_column
.column_def
.as_ref()
.unwrap()
.datatype_extension
)
.unwrap()
)
);
assert!(host_column.add_if_not_exists);
}
fn mock_insert_batch() -> (Vec<Column>, u32) {
let row_count = 2;
let host_vals = Values {
string_values: vec!["host1".to_string(), "host2".to_string()],
..Default::default()
};
let host_column = Column {
column_name: "host".to_string(),
semantic_type: SemanticType::Tag as i32,
values: Some(host_vals),
null_mask: vec![0],
datatype: ColumnDataType::String as i32,
..Default::default()
};
let cpu_vals = Values {
f64_values: vec![0.31],
..Default::default()
};
let cpu_column = Column {
column_name: "cpu".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(cpu_vals),
null_mask: vec![2],
datatype: ColumnDataType::Float64 as i32,
..Default::default()
};
let mem_vals = Values {
f64_values: vec![0.1],
..Default::default()
};
let mem_column = Column {
column_name: "memory".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(mem_vals),
null_mask: vec![1],
datatype: ColumnDataType::Float64 as i32,
..Default::default()
};
let time_vals = Values {
time_millisecond_values: vec![100, 101],
..Default::default()
};
let time_column = Column {
column_name: "time".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(time_vals),
null_mask: vec![0],
datatype: ColumnDataType::TimeMillisecond as i32,
..Default::default()
};
let interval1 = IntervalMonthDayNano {
months: 1,
days: 2,
nanoseconds: 3,
};
let interval2 = IntervalMonthDayNano {
months: 4,
days: 5,
nanoseconds: 6,
};
let interval_vals = Values {
interval_month_day_nano_values: vec![interval1, interval2],
..Default::default()
};
let interval_column = Column {
column_name: "interval".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(interval_vals),
null_mask: vec![0],
datatype: ColumnDataType::IntervalMonthDayNano as i32,
..Default::default()
};
let ts_vals = Values {
timestamp_millisecond_values: vec![100, 101],
..Default::default()
};
let ts_column = Column {
column_name: "ts".to_string(),
semantic_type: SemanticType::Timestamp as i32,
values: Some(ts_vals),
null_mask: vec![0],
datatype: ColumnDataType::TimestampMillisecond as i32,
..Default::default()
};
let decimal_vals = Values {
decimal128_values: vec![Decimal128 { hi: 0, lo: 123 }, Decimal128 { hi: 0, lo: 456 }],
..Default::default()
};
let decimal_column = Column {
column_name: "decimals".to_string(),
semantic_type: SemanticType::Field as i32,
values: Some(decimal_vals),
null_mask: vec![0],
datatype: ColumnDataType::Decimal128 as i32,
datatype_extension: Some(ColumnDataTypeExtension {
type_ext: Some(TypeExt::DecimalType(DecimalTypeExtension {
precision: 38,
scale: 10,
})),
}),
options: None,
};
(
vec![
host_column,
cpu_column,
mem_column,
time_column,
interval_column,
ts_column,
decimal_column,
],
row_count,
)
}
}

View File

@@ -6,7 +6,7 @@ license.workspace = true
[features]
testing = []
pg_kvbackend = ["dep:tokio-postgres", "dep:backon"]
pg_kvbackend = ["dep:tokio-postgres", "dep:backon", "dep:deadpool-postgres", "dep:deadpool"]
[lints]
workspace = true
@@ -36,8 +36,8 @@ common-wal.workspace = true
datafusion-common.workspace = true
datafusion-expr.workspace = true
datatypes.workspace = true
deadpool.workspace = true
deadpool-postgres.workspace = true
deadpool = { workspace = true, optional = true }
deadpool-postgres = { workspace = true, optional = true }
derive_builder.workspace = true
etcd-client.workspace = true
futures.workspace = true

View File

@@ -16,7 +16,6 @@ use std::collections::HashMap;
use std::sync::Arc;
use futures::future::BoxFuture;
use futures::TryStreamExt;
use moka::future::Cache;
use moka::ops::compute::Op;
use table::metadata::TableId;
@@ -54,9 +53,13 @@ fn init_factory(table_flow_manager: TableFlowManagerRef) -> Initializer<TableId,
Box::pin(async move {
table_flow_manager
.flows(table_id)
.map_ok(|(key, value)| (key.flownode_id(), value.peer))
.try_collect::<HashMap<_, _>>()
.await
.map(|flows| {
flows
.into_iter()
.map(|(key, value)| (key.flownode_id(), value.peer))
.collect::<HashMap<_, _>>()
})
// We must cache the `HashSet` even if it's empty,
// to avoid future requests to the remote storage next time;
// If the value is added to the remote storage,

View File

@@ -12,8 +12,10 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::hash::{DefaultHasher, Hash, Hasher};
use std::str::FromStr;
use api::v1::meta::HeartbeatRequest;
use common_error::ext::ErrorExt;
use lazy_static::lazy_static;
use regex::Regex;
@@ -55,12 +57,10 @@ pub trait ClusterInfo {
}
/// The key of [NodeInfo] in the storage. The format is `__meta_cluster_node_info-{cluster_id}-{role}-{node_id}`.
///
/// This key cannot be used to describe the `Metasrv` because the `Metasrv` does not have
/// a `cluster_id`, it serves multiple clusters.
#[derive(Debug, Clone, Eq, Hash, PartialEq, Serialize, Deserialize)]
#[derive(Debug, Clone, Copy, Eq, Hash, PartialEq, Serialize, Deserialize)]
pub struct NodeInfoKey {
/// The cluster id.
// todo(hl): remove cluster_id as it is not assigned anywhere.
pub cluster_id: ClusterId,
/// The role of the node. It can be `[Role::Datanode]` or `[Role::Frontend]`.
pub role: Role,
@@ -69,6 +69,28 @@ pub struct NodeInfoKey {
}
impl NodeInfoKey {
/// Try to create a `NodeInfoKey` from a "good" heartbeat request. "good" as in every needed
/// piece of information is provided and valid.
pub fn new(request: &HeartbeatRequest) -> Option<Self> {
let HeartbeatRequest { header, peer, .. } = request;
let header = header.as_ref()?;
let peer = peer.as_ref()?;
let role = header.role.try_into().ok()?;
let node_id = match role {
// Because the Frontend is stateless, it's too easy to neglect choosing a unique id
// for it when setting up a cluster. So we calculate its id from its address.
Role::Frontend => calculate_node_id(&peer.addr),
_ => peer.id,
};
Some(NodeInfoKey {
cluster_id: header.cluster_id,
role,
node_id,
})
}
pub fn key_prefix_with_cluster_id(cluster_id: u64) -> String {
format!("{}-{}-", CLUSTER_NODE_INFO_PREFIX, cluster_id)
}
@@ -83,6 +105,13 @@ impl NodeInfoKey {
}
}
/// Calculate (by using the DefaultHasher) the node's id from its address.
fn calculate_node_id(addr: &str) -> u64 {
let mut hasher = DefaultHasher::new();
addr.hash(&mut hasher);
hasher.finish()
}
/// The information of a node in the cluster.
#[derive(Debug, Serialize, Deserialize)]
pub struct NodeInfo {
@@ -100,7 +129,7 @@ pub struct NodeInfo {
pub start_time_ms: u64,
}
#[derive(Debug, Clone, Eq, Hash, PartialEq, Serialize, Deserialize)]
#[derive(Debug, Clone, Copy, Eq, Hash, PartialEq, Serialize, Deserialize)]
pub enum Role {
Datanode,
Frontend,
@@ -201,8 +230,8 @@ impl TryFrom<Vec<u8>> for NodeInfoKey {
}
}
impl From<NodeInfoKey> for Vec<u8> {
fn from(key: NodeInfoKey) -> Self {
impl From<&NodeInfoKey> for Vec<u8> {
fn from(key: &NodeInfoKey) -> Self {
format!(
"{}-{}-{}-{}",
CLUSTER_NODE_INFO_PREFIX,
@@ -271,6 +300,7 @@ impl TryFrom<i32> for Role {
mod tests {
use std::assert_matches::assert_matches;
use super::*;
use crate::cluster::Role::{Datanode, Frontend};
use crate::cluster::{DatanodeStatus, NodeInfo, NodeInfoKey, NodeStatus};
use crate::peer::Peer;
@@ -283,7 +313,7 @@ mod tests {
node_id: 2,
};
let key_bytes: Vec<u8> = key.into();
let key_bytes: Vec<u8> = (&key).into();
let new_key: NodeInfoKey = key_bytes.try_into().unwrap();
assert_eq!(1, new_key.cluster_id);
@@ -338,4 +368,26 @@ mod tests {
let prefix = NodeInfoKey::key_prefix_with_role(2, Frontend);
assert_eq!(prefix, "__meta_cluster_node_info-2-1-");
}
#[test]
fn test_calculate_node_id_from_addr() {
// Test empty string
assert_eq!(calculate_node_id(""), calculate_node_id(""));
// Test same addresses return same ids
let addr1 = "127.0.0.1:8080";
let id1 = calculate_node_id(addr1);
let id2 = calculate_node_id(addr1);
assert_eq!(id1, id2);
// Test different addresses return different ids
let addr2 = "127.0.0.1:8081";
let id3 = calculate_node_id(addr2);
assert_ne!(id1, id3);
// Test long address
let long_addr = "very.long.domain.name.example.com:9999";
let id4 = calculate_node_id(long_addr);
assert!(id4 > 0);
}
}

Some files were not shown because too many files have changed in this diff Show More