Compare commits


277 Commits

Author SHA1 Message Date
Ruihang Xia
f3a02effa7 assign partition_ranges
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-04-08 21:36:29 +08:00
Ruihang Xia
52f9fc25ba Revert "feat: keep parallelize_scan unchanged"
This reverts commit 96ba00d175.
2025-04-08 21:16:09 +08:00
evenyag
214a16565a chore: update comment 2025-04-08 21:00:25 +08:00
evenyag
21790a607e feat: use smallvec 2025-04-08 20:54:34 +08:00
evenyag
b33d8c1bad fix: include build merge reader cost to scan cost 2025-04-08 20:52:19 +08:00
evenyag
916e1c2d9e fix: address compiler errors 2025-04-08 20:51:55 +08:00
evenyag
96ba00d175 feat: keep parallelize_scan unchanged 2025-04-08 20:36:48 +08:00
evenyag
7173401732 fix: use series scan in PerSeries distribution 2025-04-08 20:34:08 +08:00
evenyag
17c797a6d0 refactor: remove per series scan from SeqScan 2025-04-08 20:34:06 +08:00
evenyag
c44ba1aa69 feat: parallelize PerSeries 2025-04-08 20:26:50 +08:00
evenyag
843d33f9d0 feat: use series scan when distribution is PerSeries 2025-04-08 20:26:50 +08:00
evenyag
b74e2a7d9b feat: implement scan logic of each partition 2025-04-08 20:26:47 +08:00
evenyag
4a79c1527d chore: add to scanner enum 2025-04-08 20:24:54 +08:00
evenyag
b7a6ff9cc3 chore: basic methods for SeriesScan 2025-04-08 20:24:54 +08:00
Yingwen
609e228852 fix: get root cause of the procedure when converting to pb (#5841) 2025-04-08 08:14:47 +00:00
Ruihang Xia
c16bae32c4 perf: evolve promql execution engine (#5691)
* use the same sort option across every prom plan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tweak plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* wip

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix merge compile

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Revert "wip"

This reverts commit db58884236.

* tweak merge scan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* pass distribution rule

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* reverse sort order

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refine plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more optimizations for plans

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* check logical table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* weird tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add comment

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add test for series_divide

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix scalar calculation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: workaround join partition

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update proto

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-04-08 08:12:15 +00:00
zyy17
ee4fe9d273 refactor: improve performance for Jaeger APIs (#5838)
* refactor: improve jaeger '/api/services' performance by adding the trace services table

* chore: refine some logic

* chore: compatible with v0

* test: add integration test

* chore: expand default limit from 100 to 2000

* test: fix integration test

* refactor: make trace service table configurable

* refactor: use a timestamp (2100-01-01 00:00:00) that is as large as possible

* refactor: use '<trace_table>_services' as trace services table name
2025-04-08 02:28:06 +00:00
Yuhan Wang
6e6e335a81 feat(remote-wal): send flush request when pruning remote wal (#5825)
* feat: update minimum entry id in kvbackend

* fix: persist before delete

* chore: apply comments

* feat: add flush region in wal prune procedure

* fix: cherry-pick error

* chore: fmt

* chore: drop rx to avoid block by response

* chore: update comments

* chore: apply review comments

* test: fix unit test

* feat: add option not to flush region during wal prune

* test: fix unit test

* fix: delete at minimum replay entry id + 1

* fix: cas

* chore: add comments

* chore: apply review comments

* chore: apply review comments

* chore: fix error msg

* chore: apply review comments

* fix: idempotent cas

* refactor: use a one-way sender

* chore: better err msg

* chore: fix unit test

* chore: apply review comments

* chore: apply review comments

* chore: replace send oneway
2025-04-07 14:05:18 +00:00
Weny Xu
981d51785b fix: throw errors instead of ignoring (#5792)
* fix: throw errors instead of ignoring

* fix: fix unit tests

* refactor: remove schema version check

* fix: fix clippy

* chore: remove unused error

* refactor: remove schema version check

* feat: handle multiple results

* feat: introduce consistency guard

* fix: release consistency guard on datanode operation completion

* test: add tests

* chore: remove schema version

* refactor: rename

* test: add more tests

* chore: print all error

* tests: query table after alteration

* log ignored request

* refine fuzz test

* chore: fix clippy and log mailbox message

* chore: close prepared statement after execution

* chore: add comment

* chore: remove log

* chore: rename to `ConsistencyPoison`

* chore: remove unused error

* fix: fix unit tests

* chore: apply suggestions from CR
2025-04-07 13:51:00 +00:00
Weny Xu
cf1eda28aa feat: add region_id to CountdownTaskHandlerExt (#5834) 2025-04-07 09:25:59 +00:00
zyy17
cf1440fc32 refactor: add time range for jaeger get operations API (#5791)
* refactor: add default time range for jaeger get operations API

* refactor: use desc order for timestamp column

* chore: modify http header name
2025-04-07 09:07:31 +00:00
Yingwen
21a209f7ba fix: skip replacing exprs of the DistinctOn node (#5823)
* fix: handle distinct on specially

* chore: update comment
2025-04-07 08:59:40 +00:00
Weny Xu
917510ffd0 feat: introduce poison mechanism for procedure (#5822)
* feat: introduce poison for procedure

* tests: add unit tests

* refactor: minor refactor

* fix: unit tests

* chore: fix unit tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR

* chore: update comments

* chore: introduce `ProcedureStatus::Poisoned`

* chore: upgrade greptime-proto to `2be0f`

* chore: apply suggestions from CR
2025-04-07 08:25:13 +00:00
fys
7b48ef1e97 chore: remove patch.crates-io for rustls (#5832)
* chore: remove patch.crates-io for rustls

* enable default-rustls-ring feature for mysql_sync

* fix: build error

* add comment

* update comment
2025-04-07 07:51:50 +00:00
Weny Xu
ac0f9ab575 refactor: remove backoff config (#5808)
* refactor: remove backoff config

* chore: update config.md

* fix: correct backoff config

* chore: change deadline to 120s
2025-04-07 07:22:22 +00:00
Ning Sun
f2907bb009 refactor!: make pipeline a required parameter when ingesting trace (#5828)
* feat: make pipeline a required header for trace

* test: add test case without pipeline
2025-04-07 06:18:17 +00:00
Ryan Despain
1695919ee7 clear message for an awesome achievement (#5829)
Initially there was what I think was a typo, `s/archive/achieve`, but then I thought some clarification might be nice on this great achievement.
2025-04-07 02:37:19 +00:00
Weny Xu
eab702cc02 feat: implement sync_region for metric engine (#5826)
* feat: implement `sync_region` for metric engine

* chore: apply suggestions from CR

* chore: upgrade proto
2025-04-03 12:46:20 +00:00
Zhenchi
dd63068df6 feat: add matches_term function (#5817)
* feat: add `matches_term` function

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* merge & fix

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix & skip char after boundary mismatch

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-03 09:09:41 +00:00
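A minimal SQL sketch of the `matches_term` function added in #5817; the table, column, and the searched term below are assumptions for illustration, not taken from the PR.

```sql
-- Hedged sketch of matches_term (#5817): app_logs, message and the term are made up.
SELECT ts, message
FROM app_logs
WHERE matches_term(message, 'timeout');
```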
Yuhan Wang
f73b61e767 feat(remote-wal): add remote wal prune procedure (#5714)
* feat: add remote wal prune procedure

* feat: add retry logic and remove rollback

* chore: simplify the logic

* fix: remove REMOTE_WAL_LOCK

* fix: use in-memory kv

* perf: O(n) judgement

* chore: add single write lock

* test: add unit test

* chore: remove unused function

* chore: update comments

* chore: apply comments

* chore: apply comments
2025-04-03 08:11:51 +00:00
Yingwen
2acecd3620 feat: support REPLACE INTO statement (#5820)
* feat: support replace into

* feat: support replace into
2025-04-03 03:22:43 +00:00
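A minimal sketch of the `REPLACE INTO` support added in #5820; the table schema and values are invented for the example, and the exact replace semantics are those of the PR rather than this note.

```sql
-- Illustrative only: table and values are made up; REPLACE INTO form per #5820.
CREATE TABLE IF NOT EXISTS monitor (
  host STRING,
  ts   TIMESTAMP TIME INDEX,
  cpu  DOUBLE,
  PRIMARY KEY (host)
);

REPLACE INTO monitor (host, ts, cpu)
VALUES ('host-1', '2025-04-03 00:00:00', 0.5);
```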
Zhenchi
f797de3497 feat: add backend field to fulltext options (#5806)
* feat: add backend field to fulltext options

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* update proto

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix option conv

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix display

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-02 09:15:54 +00:00
dennis zhuang
d53afa849d fix: interval cast expression can't work in range query, #5805 (#5813)
* fix: interval cast expression can't work in range query, #5805

* fix: nested cast

* test: make vector test stable
2025-04-02 08:46:17 +00:00
discord9
3aebfc1716 test: looser condition (#5816) 2025-04-02 07:38:05 +00:00
Weny Xu
dbb79c9671 feat: introduce CollectLeaderRegionHandler (#5811)
* feat: introduce `CollectLeaderRegionHandler`

* feat: add to default handler group

* fix: correct unit test

* chore: rename
2025-04-02 04:47:00 +00:00
shuiyisong
054056fcbb refactor: remove prom store write dispatch (#5812)
* refactor: remove prom store remote write dispatch pattern

* chore: ref XIX-22
2025-04-02 04:35:28 +00:00
Zhenchi
aa486db8b7 refactor: allow bloom filter search to apply AND conjunction (#5770)
* refactor: change bloom filter search from any to all match

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* place back in list

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* nit

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-04-01 12:50:34 +00:00
Weny Xu
4ef9afd8d8 feat: introduce read preference (#5783)
* feat: introduce read preference

* feat: introduce `RegionQueryHandlerFactory`

* feat: extract ReadPreference from http header

* test: add more tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2025-04-01 09:17:01 +00:00
shuiyisong
f9221e9e66 perf: introduce simd_json for parsing ndjson (#5794)
* perf: introduce simd_json for parsing ndjson

* fix: some tests

* fix: some tests

* fix: es test case

* chore: use `as_bytes_mut()`

* chore: remove unnecessary `to_string`

* chore: add safety comment
2025-04-01 08:17:26 +00:00
Weny Xu
6c26fe9c80 fix: correct error status code (#5802) 2025-04-01 07:34:16 +00:00
fys
33c9fb737c refactor: remove mode option in configuration files (#5809)
* refactor: remove mode option in configuration files

* chore: remove mode in configuration file

* remove mode field in FlownodeOptions

* add comment for test

* update config.md

* remove mode field in standalone options

* fix: ci
2025-04-01 07:14:10 +00:00
Weny Xu
68ce796771 chore: expose modules (#5810) 2025-04-01 05:33:20 +00:00
Weny Xu
d701c18150 feat: introduce CustomizedRegionLeaseRenewer (#5762)
* feat: add manifest_version to `GrantedRegion`

* chore: upgrade proto

* chore: apply review suggestions

* chore: apply suggestions from CR

* feat: introduce `CustomizedRegionLeaseRenewerRef`

* chore: upgrade to `103948`
2025-03-31 13:25:05 +00:00
Weny Xu
d3a60d8821 feat: add limit for the number of running procedures (#5793)
* refactor: remove unused `messages`

* feat: introduce running procedure num limit

* feat: update config

* chore: apply suggestions from CR

* feat: impl `status_code` for `log-store` crate
2025-03-31 06:14:21 +00:00
discord9
5d688c6565 feat(flow): time window expr (#5785)
* feat: time window expr

* chore: comments

* refactor: per review

* chore: partially per review

* chore: per review

* chore: per review use query engine's session
2025-03-31 04:46:37 +00:00
Weny Xu
41aee1f1b7 feat: implement sync_region for mito engine (#5765)
* chore: upgrade proto to `2d52b`

* feat: add `SyncRegion` to `WorkerRequest`

* feat: impl `sync_region` for `Engine` trait

* test: add tests

* chore: fmt code

* chore: upgrade proto

* chore: unify `RegionLeaderState` and `RegionFollowerState`

* chore: check immutable memtable

* chore: fix clippy

* chore: apply suggestions from CR
2025-03-31 03:53:47 +00:00
yihong
c5b55fd8cf fix: close issue #3902 since upstream fixed (#5801)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-30 12:34:52 +00:00
Ruihang Xia
8051dbbc31 fix: typo variadic (#5800)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-29 07:09:36 +00:00
Ruihang Xia
2d3192984d refactor: remove deprecated find_unique method (#5790)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-28 19:32:11 +00:00
shuiyisong
bef45ed0e8 feat(pipeline): support table name suffix templating in pipeline (#5775)
* chore: add table name template in pipeline yaml

* chore: implement apply function and add simple test

* chore: add comment and integration test

* chore: minor update

* fix: typos

* chore: change to table suffix

* chore: update comment and test

* chore: change name to table_suffix
2025-03-28 18:12:46 +00:00
LFC
a9e990768d refactor: skip re-taking arrays in memtable if possible (#5779)
experiment: skip sorting and re-taking arrays if possible when scanning memtable
2025-03-28 09:58:55 +00:00
Weny Xu
7e1ba49d3d refactor: remove useless region follower legacy code (#5795) 2025-03-28 08:10:30 +00:00
Yingwen
737558ef53 fix: support __name__ matcher in label values (#5773) 2025-03-28 02:18:59 +00:00
Yingwen
dbc25dd8da feat: expose scanner metrics to df execution metrics (#5699)
* feat: add metrics list to scanner

* chore: add report metrics method

* feat: use df metrics in PartitionMetrics

* feat: pass execution metrics to scan partition

* refactor: remove PartitionMetricsList

* feat: better debug format for ScanMetricsSet

* feat: do not expose all metrics to execution metrics by default

* refactor: use struct destruction

* feat: add metrics list to scanner

* chore: Add custom Debug for ScanMetricsSet and partition metrics display

* test: update sqlness result
2025-03-27 23:40:39 +00:00
Ruihang Xia
76a58a07e1 feat: simple implementation of DictionaryVector (#5758)
* basic impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl vector op

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* unit tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unwraps

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: enhance DictionaryVector operations and deprecate find_unique method

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix typo

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: remove find_unique test

* chore: remove unused import

* fix test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-03-27 23:19:10 +00:00
Weny Xu
c2ba7fb16c refactor: remove useless region follower legacy code (#5787)
chore: remove region follower procedure
2025-03-27 11:50:29 +00:00
Lei, HUANG
09ef24fd75 refactor: remove useless partition legacy code (#5786)
* refactor: remove useless partition legacy code

* also remove error variants

* fix imports
2025-03-27 11:08:25 +00:00
Weny Xu
9b7b012620 feat: impl show region (#5782)
* fix: fix region follower procedure

* feat: add table related info to region peers table and follower regions

* feat: impl show region

* chore: apply suggestions from CR
2025-03-27 10:41:44 +00:00
fys
898e0bd828 chore: expose some methods (#5784) 2025-03-27 09:00:51 +00:00
shuiyisong
2b4ed43692 chore: accept table options in auto create table from hints (#5776)
chore: accept table options in auto create table from hint
2025-03-27 08:17:27 +00:00
Weny Xu
8f2ae4e136 feat: add AddRegionFollower and RemoveRegionFollower admin fn (#5780) 2025-03-27 06:30:50 +00:00
Weny Xu
0cd219a5d2 refactor: move list_flow_stats to ClusterInfo trait. (#5774)
refactor: minor refactor
2025-03-27 04:20:12 +00:00
fys
2b2ea5bf72 chore: upgrade some dependencies (#5777)
* chore: upgrade some dependencies

* chore: upgrade some dependencies

* fix: cr

* fix: ci

* fix: test

* fix: cargo fmt
2025-03-27 02:48:44 +00:00
discord9
e107bd5529 feat(flow): utils function for recording rule (#5768)
* chore: utils for rr

* chore: one more test

* chore: more test case

* test: even more tests

* chore: per review

* tests: add more&update testcase

* chore: update comment
2025-03-26 08:55:35 +00:00
Weny Xu
a31f0e255b feat: introduce RegionFollowerClient trait (#5771)
* chore: expose AskLeader

* feat: introduce `RegionFollowerClient` trait

* feat: build meta client with region follower client
2025-03-26 08:05:15 +00:00
Lei, HUANG
40b52f3b13 feat(mito): allow skipping wal while creating tables (#5740)
* chore: add Noop Wal option

* remove: WalOptionsAllocator::alloc method

* feat/no-op-wal:
 ### Add Noop WAL Option

 - **`engine.rs`, `opener.rs`, `wal.rs`, `entry_reader.rs`, `handle_write.rs`, `provider.rs`**:
   - Introduced a new `WalOptions::Noop` variant to handle scenarios where no write-ahead logging is required.
   - Implemented `NoopEntryReader` to provide a no-operation entry reader.
   - Updated logic to skip WAL operations for regions with `Noop` option.
   - Added `Provider::Noop` to handle `Noop` operations in the provider logic.

* feat/no-op-wal:
 ### Add `skip_wal` Option to Table Metadata

 - **Enhancements in `table_meta.rs`**:
   - Added a `skip_wal` parameter to the `create_wal_options` function to allow skipping WAL writes.
   - Updated the `create_table_route` function to utilize the `skip_wal` option from `table_info.meta.options`.

 - **Updates in `wal_options_allocator.rs`**:
   - Modified `alloc_batch` to handle the `skip_wal` flag, setting WAL options to `Noop` when true.
   - Added a test case `test_allocator_with_skip_wal` to verify the `skip_wal` functionality.

 - **Changes in `requests.rs`**:
   - Introduced `skip_wal` in `TableOptions` and added parsing logic.
   - Updated `TableOptions` display to include `skip_wal`.

 These changes introduce the ability to skip WAL writes for tables, enhancing flexibility in table metadata management.

* feat/no-op-wal:
 **Add WAL Option Handling and Table Option Validation**

 - **`handle_write.rs`**: Introduced a check for `WalOptions::Noop` in the `RegionWorkerLoop` to skip WAL writing for regions with this option.
 - **`requests.rs`**: Added `SKIP_WAL_KEY` to the list of valid table options for enhanced table configuration validation.

* feat/no-op-wal:
 ### Update WAL Options Allocation

 - **`key.rs`**: Modified the `allocate_region_wal_options` function to include an additional boolean parameter, enhancing the allocation logic.
 - **`wal_options_allocator.rs`**: Simplified the `test_allocator_with_skip_wal` test by removing unnecessary variable declarations and directly using `WalOptionsAllocator::RaftEngine`.

 These changes improve the flexibility and efficiency of WAL options allocation in the system.

* chore: reformat code

* feat/no-op-wal:
 **Enhancement:** Conditional Addition of `SKIP_WAL_KEY` in `requests.rs`

 - Updated `TableOptions` implementation in `requests.rs` to conditionally add `SKIP_WAL_KEY` to `key_vals` only when `self.skip_wal` is true, optimizing the key-value pair generation.

* feat/no-op-wal:
 Update `requests.rs` tests to reflect changes in `skip_wal` option

 - Modified test assertions in `requests.rs` to remove `skip_wal=false` from expected strings.
 - Added a new test case to verify `skip_wal=true` is correctly represented in `TableOptions`.

* feat/no-op-wal: Add Debug Logging and Improve Error Handling for WAL and Table Options

 • Introduced debug logging in wal.rs to skip obsolete regions, enhancing traceability.
 • Improved error handling in requests.rs by replacing warn with error propagation for invalid skip_wal values.
 • Added new test cases for skip_wal functionality, including SQL scripts and expected results, to ensure correct behavior and validation of the changes.
2025-03-26 07:53:52 +00:00
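Based on the `skip_wal` table option described in the commit above (`SKIP_WAL_KEY`, `skip_wal=true`), a hedged sketch of how a table might opt out of WAL writes; the CREATE TABLE shape and option spelling are assumptions drawn from the commit text, not a verified example.

```sql
-- Hedged sketch: per #5740, skip_wal is accepted as a table option and WAL
-- options become Noop when it is true. Table name and schema are made up.
CREATE TABLE metrics_no_wal (
  host STRING,
  ts   TIMESTAMP TIME INDEX,
  val  DOUBLE,
  PRIMARY KEY (host)
) WITH (
  'skip_wal' = 'true'
);
```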
shuiyisong
f13a43647a chore: remove Transformer trait (#5772)
* chore: remove transformer trait

* chore: remove unnecessary generic
2025-03-26 02:53:30 +00:00
Zhenchi
7bcb01d269 feat: utilize blob metadata properties (#5767)
* feat: utilize blob metadata properties

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/puffin/src/puffin_manager/fs_puffin_manager/reader.rs

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-03-26 02:47:20 +00:00
Ruihang Xia
e81213728b feat: add/correct some kafka-related metrics (#5757)
* feat: add/correct some kafka-related metrics

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix dumb issues

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* per-partition produce latency

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-25 19:16:39 +00:00
Yingwen
d88482b996 feat: support explain analyze verbose (#5763)
* Add explain_verbose to QueryContext

* feat: fmt plan by display type

* feat: update proto to use ExplainOptions

* feat: display more info in verbose mode

* chore: fix clippy

* test: add sqlness test

* test: update sqlness result

* chore: update proto version

* chore: Simplify QueryContextBuilder::explain_options using get_or_insert_default
2025-03-25 03:48:36 +00:00
discord9
3b547d9d13 feat(flow): frontend client for handle sql (#5761)
* feat: frontend client for handle sql

* refactor: per review

* chore: revert unnecessary change
2025-03-25 02:26:04 +00:00
Yuhan Wang
278553fc3f docs: rfc for wal purge (#5475)
* docs: add rfc for wal purge

* docs: fix typo

* docs: follow name format

* chore: all in heartbeat

* fix: unneeded sentence in rfc

* chore: apply comments
2025-03-24 12:07:50 +00:00
Yuhan Wang
a36901a653 chore: ut and some fix (#5752)
* chore: ut and some fix

* fix: remove NOWAIT

* refactor: use param for meta lease ttl

* chore: feature gate

* chore: add comments

* chore: apply comments

* fix: advice by claude 3.7 sonnet

* chore: apply comments
2025-03-24 09:05:06 +00:00
discord9
c4ac242c69 fix: properly give placeholder types (#5760)
* fix: properly give placeholder types

* chore: update sqlness
2025-03-24 08:41:32 +00:00
fys
9f9307de73 refactor: make frontend instance clear (#5754)
* refactor: the startup of frontend

* remove unnecessary error type

* fix: cr

* remove unnecessary trait FrontendInstance

* fix: cr

* fix: cr

* adjust the startup order of services
2025-03-24 06:08:02 +00:00
shuiyisong
c77ce958a3 chore: support custom time index selector for identity pipeline (#5750)
* chore: minor refactor

* chore: minor refactor

* chore: support custom ts for identity pipeline

* chore: fix clippy

* chore: minor refactor & update tests

* chore: use ref on identity pipeline param
2025-03-24 04:27:22 +00:00
discord9
5ad2d8b3b8 fix: handle nullable default value (#5747)
* fix: handle nullable default value

* chore: update sqlness
2025-03-24 02:38:26 +00:00
Ruihang Xia
2724c3c142 feat: support regex in simple filter (#5753)
* feat: support regex in simple filter

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/common/recordbatch/src/filter.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-03-24 02:10:42 +00:00
Weny Xu
4eb0771afe feat: introduce install_manifest_to for RegionManifestManager (#5742)
* feat: introduce `install_manifest_changes` for `RegionManifestManager`

* chore: rename function to `install_manifest_to`

* Apply suggestions from code review

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

* chore: add comments

* chore: add comments

* chore: update logic and add comments

* chore: add more check

* Update src/mito2/src/manifest/manager.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

---------

Co-authored-by: jeremyhi <jiachun_feng@proton.me>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-03-21 05:19:23 +00:00
Yohan Wal
a0739a96e4 fix: wrap table name with `` (#5748)
* fix: wrap table name with quotes

* fix: minor fix
2025-03-20 09:38:54 +00:00
Ning Sun
77ccf1eac8 chore: add datanode write rows to grafana dashboard (#5745) 2025-03-20 03:39:40 +00:00
Yohan Wal
1dc4a196bf feat: add mysql election logic (#5694)
* feat: add mysql election

* feat: add mysql election

* chore: fix deps

* chore: fix deps

* fix: duplicate container

* fix: duplicate setup for sqlness

* fix: call once

* fix: do not use NOWAIT for mysql 5.7

* chore: apply comments

* fix: no parallel sqlness for mysql

* chore: comments and minor revert

* chore: apply comments

* chore: apply comments

* chore: add `` to table name

* ci: use 2 metasrv to detect election bugs

* refactor: better election logic

* chore: apply comments

* chore: apply comments

* feat: version check before startup
2025-03-19 11:31:18 +00:00
shuiyisong
2431cd3bdf chore: merge error files under pipeline crate (#5738) 2025-03-19 09:55:51 +00:00
discord9
cd730e0486 fix: mysql prepare limit&offset param (#5734)
* fix: prepare limit&offset param

* test: sqlness

* chore: per review

* chore: per review
2025-03-19 07:49:26 +00:00
zyy17
a19441bed8 refactor: remove trace id from primary key in opentelemetry_traces table (#5733)
* refactor: remove trace id in primary key

* refactor: remove trace id in primary key in v0 model

* refactor: add span id in v1

* fix: integration test
2025-03-19 06:17:58 +00:00
dennis zhuang
162e3b8620 docs: adds news to readme (#5735) 2025-03-19 01:33:46 +00:00
Wenbin
83642dab87 feat: remove duplicated peer definition (#5728)
* remove duplicate peer

* fix
2025-03-18 11:30:25 +00:00
discord9
46070958c9 fix: mysql prepare bool value (#5732) 2025-03-18 10:50:45 +00:00
pikady
eea8b1c730 feat: add vec_kth_elem function (#5674)
* feat: add vec_kth_elem function

Signed-off-by: pikady <2652917633@qq.com>

* code format

Signed-off-by: pikady <2652917633@qq.com>

* add test sql

Signed-off-by: pikady <2652917633@qq.com>

* change indexing from 1-based to 0-based

Signed-off-by: pikady <2652917633@qq.com>

* improve code formatting and correct spelling errors

Signed-off-by: pikady <2652917633@qq.com>

* Update tests/cases/standalone/common/function/vector/vector.sql

I noticed the two lines are identical. Could you clarify the reason for the change? Thanks!

Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: pikady <2652917633@qq.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2025-03-18 07:25:53 +00:00
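A short sketch of `vec_kth_elem` with the 0-based indexing adopted in #5674; the use of `parse_vec` to build a vector literal is an assumption for illustration.

```sql
-- Hedged sketch: 0-based index per #5674; parse_vec usage is an assumption.
SELECT vec_kth_elem(parse_vec('[1.0, 2.0, 3.0]'), 0); -- expected to yield the first element
```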
Ning Sun
1ab4ddab8d feat: update pipeline header name to x-greptime-pipeline-name (#5710)
* feat: update pipeline header name to x-greptime-pipeline-name

* refactor: update string_value_from_header
2025-03-18 02:39:54 +00:00
Ning Sun
9e63018198 feat: disable http timeout (#5721)
* feat: update to disable http timeout by default

* feat: make http timeout default to 0

* test: correct test case

* chore: generate new config doc

* test: correct tests
2025-03-18 01:18:56 +00:00
discord9
594bec8c36 feat: load manifest manually in mito engine (#5725)
* feat: load manifest and some

* chore: per review
2025-03-18 01:18:08 +00:00
localhost
1586732d20 chore: add some method for log query handler (#5685)
* chore: add some method for log query handler

* chore: make clippy happy

* chore: add some method for log query handler

* Update src/frontend/src/instance/logs.rs

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-03-17 18:36:43 +00:00
yihong
16fddd97a7 chore: revert commit "update flate2 version (#5706)" (#5715)
Revert "chore: update flate2 version (#5706)"

This reverts commit a5df3954f3.
2025-03-17 12:16:26 +00:00
Ning Sun
2260782c12 refactor: update jaeger api implementation for new trace modeling (#5655)
* refactor: update jaeger api implementation

* test: add tests for v1 data model

* feat: customize trace table name

* fix: update column requirements to use Column type instead of String

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: lint fix

* refactor: accumulate resource attributes for v1

* fix: add empty check for additional string

* feat: add table option to mark data model version

* fix: do not overwrite all tags

* feat: use table option to mark table data model version and process accordingly

* chore: update comments to reflect query changes

* feat: use header for jaeger table name

* feat: update index for service_name, drop index for span_name

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: zyy17 <zyylsxm@gmail.com>
2025-03-17 07:31:32 +00:00
Sicong Hu
09dacc8e9b feat: add vec_subvector function (#5683)
* feat: add vec_subvector function

* change datatype of arg1 and arg2 from u64 to i64

* add sqlness test

* improve description comments
2025-03-16 10:43:53 +00:00
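A hedged sketch of `vec_subvector` from #5683, whose two range arguments are i64 per the follow-up commit; the literal vector, the bounds, and the half-open-range reading are illustrative assumptions.

```sql
-- Hedged sketch: i64 start/end per #5683; range semantics assumed, not verified.
SELECT vec_subvector(parse_vec('[1.0, 2.0, 3.0, 4.0]'), 1, 3);
```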
Ruihang Xia
dec439db2b chore: bump version to 0.14.0 (#5711)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-16 09:58:19 +00:00
Ning Sun
dc76571166 feat: move default data path from /tmp to current directory (#5719) 2025-03-16 09:57:46 +00:00
shuiyisong
3e17f8c426 chore: use Bytes instead of string in bulk ingestion (#5717)
chore: use bytes instead of string in bulk log ingestion
2025-03-14 09:31:35 +00:00
yihong
a5df3954f3 chore: update flate2 version (#5706)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-14 02:15:27 +00:00
Ruihang Xia
32fd850c20 perf: support in list in simple filter (#5709)
* feat: support in list in simple filter

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-14 01:08:29 +00:00
shuiyisong
2bfdae4f8f feat: add simple extract processor (#5688)
* feat: add simple extract processor

* chore: add test

* chore: add license header

* chore: minor update
2025-03-13 09:19:58 +00:00
shuiyisong
fcb898e9a4 chore: support inverted index in pipeline (#5700)
chore: rebase main
2025-03-13 08:30:29 +00:00
Ning Sun
8fa2fdfc42 feat: make empty parent_span_id null for v1 (#5690) 2025-03-13 07:48:15 +00:00
shuiyisong
4dc1a1d60f chore: support tag in transform (#5701)
chore: support tag in transform to specify tag
2025-03-13 07:27:12 +00:00
Lei, HUANG
e375a18011 fix: conversion from TableMeta to TableMetaBuilder (#5693)
* refactor: use proc macro to generate conversion between TableMeta and TableMetaBuilder

* chore: format

* fix/partition-key-index:
 ### Update `TableMeta` and Add Partition and Alter Table Tests

 - **`metadata.rs`**: Modified `new_meta_builder` method in `TableMeta` to manually remove `value_indices` by setting it to `None` in the `TableMetaBuilder`.
 - **`partition_and_alter.result` & `partition_and_alter.sql`**: Added new test cases for creating, inserting, selecting, altering, and dropping a partitioned table `molestiAe`. These tests verify partitioning on the `sImiLiQUE` column and altering the table with a TTL
 setting.

fix/partition-key-index:
 ### Remove Obsolete TODO Comment in `metadata.rs`

 - Removed an outdated TODO comment regarding the `new_meta_builder` function in `src/table/src/metadata.rs`.

chore: check struct name in derive_meta_builder

refactor: Simplify TableMeta struct name check in macro

refactor: Improve ToMetaBuilder derive macro validation and error handling

refactor: Enforce ToMetaBuilder macro for table::metadata::TableMeta struct

* fix/partition-key-index:
 Update `partition_and_alter.sql` to modify TTL setting

 - Modified the TTL setting for the `molestiAe` table to '1d' in `partition_and_alter.sql`.

* fix: sqlness

* fix/partition-key-index:
 ### Update `TableMeta` and Test File Structure

 - **Enhancement**: Added a note in `metadata.rs` to always use `new_meta_builder` for creating `TableMetaBuilder`.
 - **Refactor**: Renamed test result and SQL files for better organization:
   - `partition_and_alter.result` to `alter/partition_and_alter.result`
   - `partition_and_alter.sql` to `alter/partition_and_alter.sql`

* refactor: Simplify `derive_meta_builder` by initializing fields with `Default::default()`

* fix/partition-key-index:
 ### Commit Summary

 - **Refactor `TableMetaBuilder` Initialization**:
   - Replaced `TableMetaBuilder::default()` with `TableMetaBuilder::empty()` across multiple files for initializing `TableMetaBuilder` instances.
   - Affected files include:
     - `src/catalog/src/system_schema.rs`
     - `src/common/meta/src/key/test_utils.rs`
     - `src/operator/src/req_convert/insert/fill_impure_default.rs`
     - `src/query/src/log_query/planner.rs`
     - `src/query/src/promql/planner.rs`
     - `src/query/src/range_select/plan_rewrite.rs`
     - `src/query/src/sql/show_create_table.rs`
     - `src/table/src/test_util/memtable.rs`
     - `src/table/src/test_util/table_info.rs`

 - **Enhance `TableMetaBuilder`**:
   - Added `custom_constructor` to `TableMeta` and implemented an `empty` method for `TableMetaBuilder`.
   - Modified `TableMetaBuilder` to include a `new_external_table` method with default values.
   - Updated `src/table/src/metadata.rs` to reflect these changes.

 - **Add Testing Feature**:
   - Introduced a conditional compilation for `test_util` in `src/table/src/lib.rs` to include testing utilities when the `testing` feature is enabled.

 - **Update `Cargo.toml`**:
   - Enabled the `testing` feature for the `table` module in `src/common/meta/Cargo.toml`.

 - **Modify `NumbersTable` Initialization**:
   - Replaced `TableMetaBuilder` with direct `TableMeta` struct initialization in `src/table/src/table/numbers.rs`.

 - **Test Result Update**:
   - Updated test results in `tests/cases/standalone/common/alter/partition_and_alter.result` to reflect changes in table meta handling.

* fix: rename default to empty

* docs: add doc for TableMetaBuilder::empty

* chore: Update src/table/src/metadata.rs

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-03-13 06:30:16 +00:00
shuiyisong
e0ff701e51 chore: support application/x-ndjson for log ingest (#5697)
chore: support ndjson content type
2025-03-13 04:29:22 +00:00
Yingwen
25645a3303 feat: expose virtual_host_style config for s3 storage (#5696)
* feat: expose enable_virtual_host_style for s3 storage

* docs: update examples

* test: fix config test
2025-03-12 13:46:56 +00:00
Ruihang Xia
b32ea7d84c feat: add Docker image tag information to step summary in dev-build workflow (#5692)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-12 13:45:19 +00:00
discord9
f164f6eaf3 fix: FlowInfoValue's compatibility (#5695) 2025-03-12 09:02:48 +00:00
Yohan Wal
af1920defc feat: add mysql kvbackend (#5528)
* feat: add mysql kvbackend txn support

* chore: error handling

* chore: follow review comments

* chore: follow review comments

* chore: follow review comments

* revert: mysql QAQ

* revert: revert changes to sqls

This reverts commit cf98c50dd9.

* chore: add comments
2025-03-12 06:52:56 +00:00
Lei, HUANG
7c97fae522 chore: check region wal provider on startup to avoid inconsistency (#5687) 2025-03-11 17:51:18 +00:00
AntiTopQuark
b8070adc3a feat: enhancement information_schema.flows (#5623)
* feat: enhancement information_schema.flows

* feat: enhancement information_schema.flows

* u

* u

* u

* u

* u

* u

* u

* u

* u

* update

* update

* update

* delete unused code

* u

* u

* Update src/flow/src/adapter/worker.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* Update src/common/meta/src/key/flow/flow_state.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* Update src/common/meta/src/key/flow/flow_info.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* Update src/common/meta/src/key/flow/flow_state.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* Update src/common/meta/src/key/flow/flow_info.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* u

* u

* u

* u

* u

* u

* chore: fix sqlness

* chore: update proto

* fix: remove date time

* fix: update result of information_schema test

---------

Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: discord9 <discord9@163.com>
2025-03-11 15:08:10 +00:00
yihong
11bfb17328 feat: support export command export data to s3 (#5585)
* feat: s3 first step

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: finish s3 export

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: drop useless comment

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: forget to create_database and copy_from

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comment use opendal Fs

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* refactor: make the export mess code clean

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-11 08:56:59 +00:00
jeremyhi
1d87bd2d43 feat: alter region follower (#5676)
* feat: add region follower manager

* feat: add region procudure

* refactor: make add, remove follower procedure look nice

* feat: add region follower procedure

* chore: undo some change, possibly made by AI

* feat: on prepare checking

* feat: on update metadata

* feat: on broadcast

* chore: unit test

* feat: add remove follower operation

* feat: add or remove region follower procedure

* chore: ut

* chore: rename

* chore: by comment

* chore: by comment

---------

Co-authored-by: jeremy <jeremy@greptime.local>
2025-03-11 08:44:50 +00:00
jeremyhi
ababeaf538 chore: make memorykv write happily (#5686)
chore: make memorykv write happily
2025-03-11 07:37:14 +00:00
Lin Yihai
2cbf51d0be refactor!: Remove Value::DateTime and ValueRef::DateTime. (#5616)
* refactor: Remove Value::DateTime and ValueRef::DateTime

* fix: don't panic if arrow cast field.

* fix: map `ColumnDataType::Datetime` to `ConcreteDataType::timestamp_microsecond_datatype`

* fix: Map `ValueData::DatetimeValue` correctly.

* refactor: Replace `datetime` with `timestamp_micro_second`
2025-03-11 07:03:27 +00:00
Yingwen
3059b04b19 feat: add a gauge for download tasks (#5681) 2025-03-11 06:55:13 +00:00
Yingwen
352b197be4 feat: add hint for logical region in RegionScanner (#5684)
* feat: add a flag to check logical region

* feat: sets logical region hint in metric engine

* refactor: rename to logical_region
2025-03-11 06:34:39 +00:00
Ning Sun
d0254f9705 feat: update promql-parser to 0.5 for duration literal (#5682) 2025-03-11 06:27:36 +00:00
Ning Sun
8a86903c73 feat: add description for each grafana panel (#5673)
* feat: add description for each grafana panel

* Apply suggestions from code review

Co-authored-by: Yingwen <realevenyag@gmail.com>

* fix: unit of write stall

* feat: add jq script to summary the grafana dashboard

* fix: update description

* ci: add ci step to valid grafana and send summary as comment

* ci: update check

* ci: update ci

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-03-11 06:16:49 +00:00
Weny Xu
0bd322a078 perf(prom): optimize label values query (#5653)
perf: optimize label values query
2025-03-10 13:20:47 +00:00
discord9
3811e3f632 feat: also get index file&expose mito in metrics (#5680)
* feat: download index file too

* feat: expose mito in metrics

* chore: fmt
2025-03-10 13:07:08 +00:00
localhost
c14aa176b5 chore: impl ref and ref_mut for json like (#5679)
* chore: impl ref and ref_mut for json like

* chore: add code source
2025-03-10 10:43:15 +00:00
Lei, HUANG
a922dcd9df refactor(mito): move wal sync task to background (#5677)
chore/move-wal-sync-to-bg:
 ### Refactor Log Store Task Management

 - **Error Handling Enhancements**: Updated error handling for task management in `error.rs` by renaming `StartGcTask` and `StopGcTask` to `StartWalTask` and `StopWalTask`, respectively, and added a `name` field for more descriptive error messages.
 - **Task Management Improvements**: Introduced `SyncWalTaskFunction` in `log_store.rs` to handle periodic synchronization of WAL tasks, replacing the previous atomic-based sync logic.
 - **Backend Adjustments**: Modified `backend.rs` to use the new `StartWalTaskSnafu` for starting tasks, ensuring consistency with the updated error handling approach.
2025-03-10 08:22:35 +00:00
dennis zhuang
530ff53422 feat(promql): supports quantile and count_values (#5652)
* feat(promql): supports quantile

* fix: merge_batch

* chore: sqlness test

* test: unit tests

* feat: implements count_values

* fix: typo

* refactor: planner

* chore: apply review suggestions

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-03-10 06:41:40 +00:00
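The PromQL `quantile` and `count_values` support from #5652 could be exercised through the TQL interface along these lines; the metric name, time range, and the TQL wrapper itself are assumptions about how one would reach these functions from SQL.

```sql
-- Hedged sketch: made-up metric and time range; TQL wrapper assumed.
TQL EVAL (0, 600, '30s') quantile(0.9, cpu_usage);
TQL EVAL (0, 600, '30s') count_values("value", cpu_usage);
```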
Ruihang Xia
73ca39f37e feat: time series distribution in scanner (#5675)
* define distribution

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: SeqScan support per series distribution

* probe distribution

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* reverse sort order

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more strict check

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change null's ordering

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-03-10 05:43:17 +00:00
Yingwen
0acc6b0354 fix: correct stalled count (#5678) 2025-03-10 04:25:38 +00:00
Zhenchi
face361fcb feat: introduce roaring bitmap to optimize sparse value scenarios (#5603)
* feat: introduce roaring bitmap to optimize sparse value scenarios

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix taplo

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-03-10 04:24:08 +00:00
Yingwen
9860bca986 feat: support exact filter on time index column (#5671)
* feat: add predicate group

* feat: pass predicate group

* feat: memtable prune by time filters

* test: test PruneTimeIterator with time filters

* feat: push down returns exact for timestamp simple filters

---------

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-07 21:55:46 +00:00
ZonaHe
3a83c33a48 feat: update dashboard to v0.8.0 (#5666)
Co-authored-by: sunchanglong <sunchanglong@users.noreply.github.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
2025-03-07 19:47:02 +00:00
Ruihang Xia
373bd59b07 fix: update column requirements to use Column type instead of String (#5672)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-07 18:50:15 +00:00
shuiyisong
c8db4b286d fix: use DateTime instead of NaiveDateTime (#5669)
chore: use datetime instead of naivedatetime
2025-03-07 07:41:59 +00:00
Lei, HUANG
56c8c0651f fix: skip schema check to avoid schema mismatch brought by metadata (#5662)
* fix: skip schema check to avoid schema mismatch brought by metadata

* docs: add some comment to remind me add that check back

* test: add sqlness case

* fix/skip-schema-check:
 ### Update CTE Test Cases

 - **Added GRPC Latencies Test**: Introduced a new test case for GRPC latencies in `cte.result` and `cte.sql` under `standalone/common/cte`.
 - **Removed Redundant Test Files**: Deleted `cte.result` and `cte.sql` under `standalone/common/range` as they were duplicates of the new test case.
2025-03-07 05:47:45 +00:00
shuiyisong
448e588fa7 chore: improve /v1/jaeger/api/trace/{trace_id}'s resp (#5663)
* chore: improve jaeger trace api resp

* chore: fix timestamp type

* chore: fix timestamp type

* chore: complete more fields

* chore: change to microseconds

* chore: add empty check & span status code

* chore: minor update

* chore: update test
2025-03-07 04:31:42 +00:00
Yingwen
f4cbf1d776 docs: update cluster dashboard to make opendal panel works (#5661) 2025-03-07 02:49:15 +00:00
discord9
b35eefcf45 perf: rm coalesce batch when target_batch_size > fetch limit (#5658)
* fix: rm coalesce > limit

* fix: only rm one&test: sqlness
2025-03-07 02:45:07 +00:00
yihong
408dd55a2f fix: flaky test in sqlness by fixing random port (#5657)
* fix: flaky test in sqlness by fixing random port

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: typo

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: panic instead of forever loop

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-07 00:41:22 +00:00
Ruihang Xia
e463942a5b fix: recover plan schema after dist analyzer (#5665)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-07 00:29:55 +00:00
discord9
0124a0d156 fix: window sort not apply when other column alias to time index name (#5634)
* fix: other col alias to time index column handle

* test: update sqlness

* chore: per review

* test: more sqlness

* test: mv some to optimizer folder

* fix: resolve alias properly

* fix: also retain old name

* chore: remove wrong comment

* chore: fix sqlness

* test: standalone/dist more projection diff
2025-03-06 08:05:57 +00:00
liyang
e23628a4e0 ci: bump dev-builder image version to 2024-12-25-a71b93dd-20250305072908 (#5651) 2025-03-06 03:33:17 +00:00
Weny Xu
1d637cad51 fix(metric-engine): group DDL requests (#5628)
* fix(metric-engine): group DDL requests

* test: add sqlness tests

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2025-03-05 09:17:47 +00:00
Lei, HUANG
a56030e6a5 refactor: remove cluster id field (#5610)
* chore: resolve conflicts

* chore: merge main

* test: add compatibility test for DatanodeLeaseKey with missing cluster_id

* test: add compatibility test for DatanodeLeaseKey without cluster_id

* refactor/remove-cluster-id:
 - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml` to a new revision.
 - **Remove `cluster_id` Usage**: Removed the `cluster_id` field and its related logic from various files, including `cluster.rs`, `datanode.rs`, `rpc.rs`,
 `adapter.rs`, `client.rs`, `ask_leader.rs`, `heartbeat.rs`, `procedure.rs`, `store.rs`, `handler.rs`, `response_header_handler.rs`, `key.rs`, `datanode.rs`,
 `lease.rs`, `metrics.rs`, `cluster.rs`, `heartbeat.rs`, `procedure.rs`, and `store.rs`.
 - **Refactor Tests**: Updated tests in `client.rs`, `response_header_handler.rs`, `store.rs`, and `service` modules to reflect the removal of `cluster_id`.

* fix: clippy

* refactor/remove-cluster-id:
 **Refactor and Cleanup in Meta Server**

 - **`response_header_handler.rs`**: Removed unused import of `HeartbeatResponse` and cleaned up the test function by eliminating the creation of an unused `HeartbeatResponse` object.
 - **`node_lease.rs`**: Simplified parameter handling in `HttpHandler` implementation by using an underscore for unused parameters.

* refactor/remove-cluster-id:
 ### Remove `TableMetadataAllocatorContext` and Refactor Code

 - **Removed `TableMetadataAllocatorContext`**: Eliminated the `TableMetadataAllocatorContext` struct and its usage across multiple files, including `ddl.rs`, `create_table.rs`, `create_view.rs`, `table_meta.rs`, `test_util.rs`, `create_logical_tables.rs`,
 `drop_table.rs`, and `table_meta_alloc.rs`.
 - **Refactored Function Signatures**: Updated function signatures to remove the `TableMetadataAllocatorContext` parameter in methods like `create`, `create_view`, and `alloc` in `table_meta.rs` and `table_meta_alloc.rs`.
 - **Updated Imports**: Adjusted import statements to reflect the removal of `TableMetadataAllocatorContext` in affected files.

 These changes simplify the codebase by removing an unnecessary context struct and updating related function calls.

* refactor/remove-cluster-id:
 ### Update `datanode.rs` to Modify Key Prefix

 - **File Modified**: `src/common/meta/src/datanode.rs`
 - **Key Changes**:
   - Updated `DatanodeStatKey::prefix_key` and `From<DatanodeStatKey>` to remove the cluster ID from the key prefix.
   - Adjusted comments to reflect the changes in key prefix handling.

* reformat code

* refactor/remove-cluster-id:
 ### Commit Summary

 - **Refactor `Pusher` Initialization**: Removed the `RequestHeader` parameter from the `Pusher::new` method across multiple files, including `handler.rs`, `test_util.rs`, and `heartbeat.rs`. This change simplifies the `Pusher` initialization process by eliminating the unnecessary parameter.
 - **Update Imports**: Adjusted import statements in `handler.rs` and `test_util.rs` to remove unused `RequestHeader` references, ensuring cleaner and more efficient code.

* chore: update proto
2025-03-05 08:22:18 +00:00
liyang
a71b93dd84 fix: unable to install software-properties-common in dev builder (#5643)
* fix: unable to install software-properties-common in dev builder

* test dev builder

* improve dev-build image

* setup qemu action
2025-03-05 07:07:06 +00:00
Ning Sun
37f8341963 feat: opentelemetry trace new data modeling (#5622)
* feat: include trace v1 encoding

* feat: add trace ingestion in inserter

* feat: add partition rules and index for trace_id

* chore: format

* chore: fmt

* fix: issue introduced with merge

* feat: adjust index and add integration test for v1

* refactor: remove comment key

* fix: update default value of skip index granularity

* fix: update default value of skip index granularity

* refactor: rename some functions

* feat: remove skipping index from span_id

* refactor: made span_id part of primary key for potential dedup purpose

* feat: move the special attribute resource_attribute.service.name to top level

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-03-05 04:08:52 +00:00
Ruihang Xia
b90ef10523 refactor: remove or deprecated existing UDAF implementation (#5637)
* expand macro

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove argmin/argmax (wrong impl)

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove mean (unnecessary)

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* documentations

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove scipy_*, diff and polyval

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unused errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-05 01:40:05 +00:00
jeremyhi
c8ffa70ab8 feat: get tables by ids in catalog manager (#5645)
feat: get tables by ids in catalog manager

Co-authored-by: jeremy <jeremy@greptime.local>
2025-03-05 00:48:03 +00:00
Ning Sun
e0065a5159 ci: remove ubuntu 20.04 runners (#5545)
* ci: remove ubuntu 20.04 runners

* chore: update ec2-github-runner action as author suggests

* fix: use latest ubuntu image for fuzz test

* Update action.yml

* Update action.yml

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
Co-authored-by: liyang <daviderli614@gmail.com>
2025-03-05 00:40:29 +00:00
Lei, HUANG
abf1680d14 fix: interval rewrite rule that messes up show create flow function (#5642)
* fix/interval-cast-rewrite:
 ### Enhance Interval Parsing and Casting

 - **`create_parser.rs`**: Added a test case `test_parse_interval_cast` to verify the parsing of interval casts.
 - **`expand_interval.rs`**: Refactored interval casting logic to handle `CastKind` and `format` attributes. Removed the `create_interval` function and integrated its logic directly into the casting process.
 - **`interval.result`**: Updated test results to reflect changes in interval representation, switching from `IntervalMonthDayNano` to `Utf8` format for interval operations.

* reformat code
2025-03-04 11:55:25 +00:00
Ruihang Xia
0e2fd8e2bd feat: rewrite json_encode_path to geo_path using compound type (#5640)
* function impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tune type

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy and suggestions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-04 05:10:12 +00:00
Ruihang Xia
0e097732ca feat: support some IP related functions (#5614)
* feat: support some IP related functions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sort sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* safer shift left

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sort result again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sort result again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update against main

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-04 05:06:25 +00:00
liyang
bb62dc2491 build: use ubuntu-22.04 base image release dev-build image (#5554)
* build: use ubuntu-22.04 release dev-build image

* ci: use ubuntu-22.04 to replace ubuntu-22.04-16-cores
2025-03-04 04:45:55 +00:00
Ruihang Xia
40cf63d3c4 refactor: rename table function to admin function (#5636)
* refactor: rename table function to admin function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-04 03:54:07 +00:00
dennis zhuang
6187fd975f feat: alias for boolean (#5639) 2025-03-04 03:12:10 +00:00
Ruihang Xia
6c90f25299 feat(log-query): implement compound filter and alias expr (#5596)
* refine alias behavior

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* implement compound

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* support gt, lt, and in

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-03 18:52:13 +00:00
Weny Xu
dc24c462dc fix: prevent failover of regions to the same peer (#5632) 2025-03-03 18:41:27 +00:00
shuiyisong
31f29d8a77 chore: support specifying skipping index in pipeline (#5635)
* chore: support setting skipping index in pipeline

* chore: fix typo key

* chore: add test

* chore: fix typo
2025-03-03 18:37:13 +00:00
Lei, HUANG
4a277c21ef fix: properly display CJK characters in table/column comments (#5633)
fix/comment-in-cjk:
 ### Update `OptionMap` Formatting and Add Tests

 - **Enhancements in `OptionMap`**:
   - Changed formatting from `escape_default` to `escape_debug` for better handling of special characters in `src/sql/src/statements/option_map.rs`.
   - Added unit tests to verify the new formatting behavior.

 - **Test Cases for CJK Comments**:
   - Added test cases for tables with comments in CJK (Chinese, Japanese, Korean) characters in `tests/cases/standalone/common/show/show_create.sql` and `show_create.result`.
2025-03-03 12:32:19 +00:00
Weny Xu
ca81fc6a70 fix: refactor region leader state validation (#5626)
* enhance: refactor region leader state validation

* chore: apply suggestions from CR

* chore: add logs
2025-03-03 10:07:25 +00:00
Zhenchi
e714f7df6c fix: out of bound during bloom search (#5625)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-03-03 09:53:14 +00:00
Ruihang Xia
1c04ace4b0 feat: skip printing full config content in sqlness (#5618)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-03-03 09:43:55 +00:00
Weny Xu
95d7ca5382 fix: increase timeout for opening candidate region and log elapsed time (#5627) 2025-03-03 09:16:45 +00:00
yihong
a693583a97 fix: speed up cargo build using shallow clone (#5620)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-03 08:02:12 +00:00
dennis zhuang
87b1408d76 feat: impl topk and bottomk (#5602)
* feat: impl topk and bottomk

* chore: test and project fields

* refactor: prom_topk_bottomk_to_plan

* fix: order

* chore: adds topk plan test

* chore: comment

Co-authored-by: Yingwen <realevenyag@gmail.com>

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-03-03 07:32:24 +00:00
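Similarly, a hedged TQL sketch for the `topk`/`bottomk` support in #5602; the metric name and evaluation window are illustrative assumptions.

```sql
-- Hedged sketch: made-up metric and window.
TQL EVAL (0, 600, '30s') topk(3, cpu_usage);
TQL EVAL (0, 600, '30s') bottomk(3, cpu_usage);
```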
LFC
dee76f0a73 refactor: simplify udf (#5617)
* refactor: simplify udf

* fix tests
2025-03-03 05:52:44 +00:00
yihong
11a4f54c49 fix: update typos rules to fix ci (#5621)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-01 09:21:36 +00:00
Ruihang Xia
d363c8ee3c fix: check physical region before use (#5612)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-28 06:46:48 +00:00
xiaoniaoyouhuajiang
50b521c526 feat: add vec_dim function (#5587)
* feat: add `vec_dim` function

* delete unused imports

* Modified to be implemented correctly

* fix comment

* add order for sqlness test
2025-02-27 15:54:48 +00:00
Ning Sun
c9d70e0e28 refactor: add pipeline concept to OTLP traces and remove OTLP over gRPC (#5605) 2025-02-27 14:01:45 +00:00
Weny Xu
c0c87652c3 chore: bump version to 0.13.0 (#5611)
chore: bump main branch version to 0.13.0
2025-02-27 13:19:59 +00:00
discord9
faaa0affd0 docs: tsbs update (#5608)
chore: tsbs update
2025-02-27 08:14:48 +00:00
Weny Xu
904d560175 feat(promql-planner): introduce vector matching binary operation (#5578)
* feat(promql-planner): support vector matching for binary operation

* test: add sqlness tests
2025-02-27 07:39:19 +00:00
Lei, HUANG
765d1277ee fix(metasrv): clean expired nodes in memory (#5592)
* fix/frontend-node-state: Refactor NodeInfoKey and Context Handling in Meta Server

 • Removed unused cluster_id from NodeInfoKey struct.
 • Updated HeartbeatHandlerGroup to return Context alongside HeartbeatResponse.
 • Added current_node_info to Context for tracking node information.
 • Implemented on_node_disconnect in Context to handle node disconnection events, specifically for Frontend roles.
 • Adjusted register_pusher function to return PusherId directly.
 • Updated tests to accommodate changes in Context structure.

* fix/frontend-node-state: Refactor Heartbeat Handler Context Management

Refactored the HeartbeatHandlerGroup::handle method to use a mutable reference for Context instead of passing it by value. This change simplifies the
context management by eliminating the need to return the context with the response. Updated the Metasrv implementation to align with this new context
handling approach, improving code clarity and reducing unnecessary context cloning.

* revert: clean cluster info on disconnect

* fix/frontend-node-state: Add Frontend Expiry Listener and Update NodeInfoKey Conversion

 • Introduced FrontendExpiryListener to manage the expiration of frontend nodes, including its integration with leadership change notifications.
 • Modified NodeInfoKey conversion to use references, enhancing efficiency and consistency across the codebase.
 • Updated collect_cluster_info_handler and metasrv to incorporate the new listener and conversion changes.
 • Added frontend_expiry module to the project structure for better organization and maintainability.

* chore: add config for node expiry

* add some doc

* fix: clippy

* fix/frontend-node-state:
 ### Refactor Node Expiry Handling
 - **Configuration Update**: Removed `node_expiry_tick` from `metasrv.example.toml` and `MetasrvOptions` in `metasrv.rs`.
 - **Module Renaming**: Renamed `frontend_expiry.rs` to `node_expiry_listener.rs` and updated references in `lib.rs`.
 - **Code Refactoring**: Replaced `FrontendExpiryListener` with `NodeExpiryListener` in `node_expiry_listener.rs` and `metasrv.rs`, removing the tick interval and adjusting logic to use a fixed 60-second interval for node expiry checks.

* fix/frontend-node-state:
 Improve logging in `node_expiry_listener.rs`

 - Enhanced warning message to include peer information when an unrecognized node info key is encountered in `node_expiry_listener.rs`.

* docs: update config docs

* fix/frontend-node-state:
 **Refactor Context Handling in Heartbeat Services**

 - Updated `HeartbeatHandlerGroup` in `handler.rs` to pass `Context` by value instead of by mutable reference, allowing for more flexible context
 management.
 - Modified `Metasrv` implementation in `heartbeat.rs` to clone `Context` when passing to `handle` method, ensuring thread safety and consistency in
 asynchronous operations.
2025-02-27 06:16:36 +00:00
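The node-expiry idea above can be pictured with a small, self-contained sketch; `NodeExpiryListener`, the in-memory map, and the tick-driven eviction below are illustrative stand-ins, not GreptimeDB's actual metasrv types:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Illustrative stand-in for the listener that evicts stale frontend nodes.
struct NodeExpiryListener {
    /// node key -> time of the last heartbeat seen for that node
    last_seen: HashMap<String, Instant>,
    /// nodes silent longer than this are dropped from memory
    expiry: Duration,
}

impl NodeExpiryListener {
    fn record_heartbeat(&mut self, node_key: String) {
        self.last_seen.insert(node_key, Instant::now());
    }

    /// Called on a fixed interval (the commit above settles on 60 seconds).
    fn evict_expired(&mut self) {
        let expiry = self.expiry;
        self.last_seen.retain(|_key, seen| seen.elapsed() < expiry);
    }
}
```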
discord9
ccf42a9d97 fix: flow heartbeat retry (#5600)
* fix: flow heartbeat retry

* fix?: not sure if fixed

* chore: per review
2025-02-27 03:58:21 +00:00
Weny Xu
71e2fb895f feat: introduce prom_round fn (#5604)
* feat: introduce `prom_round` fn

* test: add sqlness tests
2025-02-27 03:30:15 +00:00
Ruihang Xia
c9671fd669 feat(promql): implement subquery (#5606)
* feat: initial implement for promql subquery

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl and test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-27 03:28:04 +00:00
Ruihang Xia
b5efc75aab feat(promql): ignore invalid input in histogram plan (#5607)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-27 03:18:20 +00:00
Weny Xu
c1d18d9980 fix(prom): preserve the order of series in PromQueryResult (#5601)
fix(prom): keep the order of tags
2025-02-26 13:40:09 +00:00
Lei, HUANG
5d9faaaf39 fix(metasrv): reject ddl when metasrv is follower (#5599)
* fix/reject-ddl-in-follower-metasrv:
 Add leader check and logging for gRPC requests in `procedure.rs`

 - Implemented leader verification for `query_procedure_state`, `ddl`, and `procedure_details` gRPC requests in `procedure.rs`.
 - Added logging with `warn` for requests reaching a non-leader node.
 - Introduced `ResponseHeader` and `Error::is_not_leader()` to handle non-leader responses.

* fix/reject-ddl-in-follower-metasrv:
 Improve leader address handling in `heartbeat.rs`

 - Refactor leader address retrieval by renaming `leader` to `leader_addr` for clarity.
 - Update `make_client` function to use a reference to `leader_addr`.
 - Enhance logging to include the leader address in the success message for creating a heartbeat stream.

* fmt

* fix/reject-ddl-in-follower-metasrv:
 **Enhance Leader Check in `procedure.rs`**

 - Updated the leader verification logic in `procedure.rs` to return a failed `MigrateRegionResponse` when the server is not the leader.
 - Added logging to warn when a migrate request is received by a non-leader server.
2025-02-26 08:10:40 +00:00
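The leader gate described above boils down to a check at the top of each gRPC handler; the sketch below is purely illustrative (the function and error shape are assumptions, not metasrv's real `ResponseHeader` API):

```rust
/// Outcome of the illustrative leader check.
enum LeaderCheck {
    IsLeader,
    NotLeader { reason: String },
}

fn check_leader(is_leader: bool, request: &str) -> LeaderCheck {
    if is_leader {
        LeaderCheck::IsLeader
    } else {
        // The real handler logs a warning and returns a response whose header
        // carries an "is not leader" error instead of executing the request.
        LeaderCheck::NotLeader {
            reason: format!("{request} received by a non-leader metasrv node"),
        }
    }
}
```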
ZonaHe
538875abee feat: update dashboard to v0.7.11 (#5597)
Co-authored-by: sunchanglong <sunchanglong@users.noreply.github.com>
2025-02-26 07:57:59 +00:00
jeremyhi
5ed09c4584 fix: all heartbeat channel need to check leader (#5593) 2025-02-25 10:45:30 +00:00
Yingwen
3f6a41eac5 fix: update show create table output for fulltext index (#5591)
* fix: update full index syntax in show create table

* test: update fulltext sqlness result
2025-02-25 09:36:27 +00:00
yihong
ff0dcf12c5 perf: close issue 4974 by do not delete columns when drop logical region about 100 times faster (#5561)
* perf: do not delete columns when drop logical region in drop database

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: make ci happy

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address review comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address some comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: drop stupid comments by copilot

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* chore: minor refactor

* chore: minor refactor

* chore: update greptime-proto

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: WenyXu <wenymedia@gmail.com>
2025-02-25 09:00:49 +00:00
Yingwen
5b1fca825a fix: remove cached and uploaded files on failure (#5590) 2025-02-25 08:51:37 +00:00
Ruihang Xia
7bd108e2be feat: impl hll_state, hll_merge and hll_calc for incremental distinct counting (#5579)
* basic impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update with more test and logs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl merge fn

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename function names

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-24 19:07:37 +00:00
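The three-function split above follows the usual state / merge / calc pattern for incremental aggregation; the sketch below uses a trivial hash set in place of a real HyperLogLog sketch, purely to show the shape of the API:

```rust
use std::collections::HashSet;

/// Placeholder "sketch" state; a real implementation would hold HLL registers.
#[derive(Default, Clone)]
struct DistinctState(HashSet<u64>);

/// Build per-group state from raw values.
fn hll_state(values: impl IntoIterator<Item = u64>) -> DistinctState {
    DistinctState(values.into_iter().collect())
}

/// Combine two partial states (e.g. from different time windows or nodes).
fn hll_merge(mut a: DistinctState, b: &DistinctState) -> DistinctState {
    a.0.extend(b.0.iter().copied());
    a
}

/// Extract the final distinct-count estimate from a state.
fn hll_calc(state: &DistinctState) -> usize {
    state.0.len()
}
```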
Weny Xu
286f225e50 fix: correct inverted_indexed_column_ids behavior (#5586)
* fix: correct `inverted_indexed_column_ids`

* fix: fix unit tests
2025-02-23 07:17:38 +00:00
Ruihang Xia
4f988b5ba9 feat: remove default inverted index for physical table (#5583)
* feat: remove default inverted index for physical table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-22 06:48:05 +00:00
Ruihang Xia
500d0852eb fix: avoid run labeler job concurrently (#5584)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-22 05:18:26 +00:00
Zhenchi
8d05fb3503 feat: unify puffin name passed to stager (#5564)
* feat: purge a given puffin file in staging area

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish log

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* ttl set to 2d

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: expose staging_ttl to index config

* feat: unify puffin name passed to stager

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fallback to remote index

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-02-21 09:27:03 +00:00
Ruihang Xia
d7b6718be0 feat: run sqlness in parallel (#5499)
* define server mode

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* bump sqlness

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* all good

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor: Move config generation logic from Env to ServerMode

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* finalize

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change license header

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename variables

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* override parallelism

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename more variables

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-21 07:05:19 +00:00
Ruihang Xia
6f0783e17e fix: broken link in AUTHOR.md (#5581) 2025-02-21 07:01:41 +00:00
Ruihang Xia
d69e93b91a feat: support to generate json output for explain analyze in http api (#5567)
* impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* integration test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/servers/src/http/hints.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor: with FORMAT option for explain format

* lift some well-known metrics

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
2025-02-21 05:13:09 +00:00
Ruihang Xia
76083892cd feat: support UNNEST (#5580)
* feat: support UNNEST

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy and sqlness

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-21 04:53:56 +00:00
Ruihang Xia
7981c06989 feat: implement uddsketch function to calculate percentile (#5574)
* basic impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update with more test and logs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-20 18:59:20 +00:00
beryl678
97bb1519f8 docs: revise the author list (#5575) 2025-02-20 18:04:23 +00:00
Weny Xu
1d8c9c1843 feat: enable gzip for prometheus query handlers and ignore NaN values in prometheus response (#5576)
* feat: enable gzip for prometheus query handlers and ignore nan values in prometheus response

* Apply suggestions from code review

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-02-20 11:34:32 +00:00
jeremyhi
71007e200c feat: remap flow route address (#5565)
* feat: remap flow peers

* refactor: not stream

* feat: remap flownode addr on FlowRoute and TableFlow

* fix: unit test

* Update src/meta-srv/src/handler/remap_flow_peer_handler.rs

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>

* chore: by comment

* Update src/meta-srv/src/handler/remap_flow_peer_handler.rs

* Update src/common/meta/src/key/flow/table_flow.rs

* Update src/common/meta/src/key/flow/flow_route.rs

* chore: remove duplicate field

---------

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
2025-02-20 08:21:32 +00:00
jeremyhi
a0ff9e751e feat: flow type on creating procedure (#5572)
feat: flow type on creating
2025-02-20 08:12:02 +00:00
LFC
f6f617d667 feat: submit node's cpu cores number to metasrv in heartbeat (#5571)
* feat: submit node's cpu cores number to metasrv in heartbeat

* update greptime-proto dep
2025-02-20 03:55:18 +00:00
Ruihang Xia
e8788088a8 feat(log-query): implement the first part of log query expr (#5548)
* feat(log-query): implement the first part of log query expr

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-19 18:25:41 +00:00
shuiyisong
53b25c04a2 chore: support Loki's structured metadata for ingestion (#5541)
* chore: support loki's structured metadata

* test: update test

* chore: revert some code change

* chore: address CR comment
2025-02-19 16:44:26 +00:00
dennis zhuang
62a8b8b9dc feat(promql): supports sort, sort_desc etc. functions (#5542)
* feat(promql): supports sort, sort_desc etc. functions

* chore: fix toml format and tests

* chore: update deps

Co-authored-by: Weny Xu <wenymedia@gmail.com>

* chore: remove fixme

* fix: cargo lock

* chore: style

---------

Co-authored-by: Weny Xu <wenymedia@gmail.com>
2025-02-19 13:13:49 +00:00
Weny Xu
c8bdeaaa6a fix(promql-planner): update ctx field columns of OR operator (#5556)
* fix(promql-planner): update ctx field columns of OR operator

* test: add sqlness test
2025-02-19 11:18:58 +00:00
Ning Sun
81da18e5df refactor: use global type alias for pipeline input (#5568)
* refactor: use global type alias for pipeline input

* fmt: reformat
2025-02-19 10:41:33 +00:00
Weny Xu
7c65fddb30 fix(promql-planner): correct AND/UNLESS operator behavior (#5557)
* fix(promql-planner): keep field column in left input for AND operator

* test: add sqlness test

* fix: fix unless operator
2025-02-19 09:07:39 +00:00
Zhenchi
421e38c481 feat: allow purging a given puffin file in staging area (#5558)
* feat: purge a given puffin file in staging area

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* polish log

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* ttl set to 2d

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: expose staging_ttl to index config

* fix test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* use `invalidate_entries_if` instead of maintaining map

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* run_pending_tasks after purging

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-02-19 08:58:30 +00:00
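The `invalidate_entries_if` / `run_pending_tasks` pair mentioned above maps roughly onto moka's cache API as sketched below; this is a generic moka example with assumed key/value types, not the stager's actual code, and it assumes moka's `sync` feature plus closure-based invalidation enabled on the builder:

```rust
use moka::sync::Cache; // requires moka's `sync` feature

fn build_cache() -> Cache<String, u64> {
    Cache::builder()
        .max_capacity(10_000)
        // Required by moka so invalidate_entries_if can accept closures.
        .support_invalidation_closures()
        .build()
}

fn purge_puffin_entries(cache: &Cache<String, u64>, puffin_name: &str) {
    // Invalidate every staged entry that belongs to the given puffin file.
    let name = puffin_name.to_string();
    let _ = cache.invalidate_entries_if(move |key, _size| key.starts_with(&name));
    // Invalidation is lazy; run pending maintenance so evictions take effect promptly.
    cache.run_pending_tasks();
}
```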
Weny Xu
aada5c1706 fix(promql-planner): remove le tag in ctx (#5560)
* fix(promql-planner): remove le tag in ctx

* test: add sqlness test

* chore: apply suggestions from CR
2025-02-19 03:51:27 +00:00
yihong
aa8f119bbb chore: format all toml files (#5529)
fix: format some cargo files

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-18 12:09:01 +00:00
ZonaHe
19a6d15849 feat: update dashboard to v0.7.10 (#5562)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2025-02-18 12:06:22 +00:00
liyang
073aaefe65 chore: improve grafana dashboard (#5559) 2025-02-18 11:36:27 +00:00
Yingwen
77223a0f3e fix: window sort support alias time index (#5543)
* fix: use alias expr to check commutativity

* chore: debug sort

* feat: consider alias in window sort optimizer

* test: sqlness test

* test: update sqlness result
2025-02-18 10:35:43 +00:00
Ruihang Xia
4ef038d098 fix: correct promql behavior on nonexistent columns (#5547)
* Revert "fix(promql): ignore filters for non-existent labels (#5519)"

This reverts commit 33a2485f54.

* reimplement

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* state safety

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-17 18:43:50 +00:00
jeremyhi
deb9520970 fix: information_schema.cluster_info be covered by the same id (#5555)
* fix: information_schema.cluster_info be covered by the same id

* chore: by comment
2025-02-17 11:51:02 +00:00
Yingwen
6bba5e0afa feat: collect stager metrics (#5553)
* feat: collect stager metrics

* Apply suggestions from code review

Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/mito2/src/metrics.rs

---------

Co-authored-by: Weny Xu <wenymedia@gmail.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-17 07:09:15 +00:00
Ruihang Xia
f359eeb667 feat(log-query): support specifying exclusive/inclusive for between filter (#5546)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-17 04:40:47 +00:00
liyang
009dbad581 ci: don't push nightly latest image (#5551)
* ci: don't push nightly latest image

* add push release latest image
2025-02-17 04:34:49 +00:00
liyang
a2047b096c ci: use s5cmd upload artifacts (#5550) 2025-02-17 02:57:13 +00:00
Ruihang Xia
6e8b1ba004 feat: drop noneffective regex filter (#5544)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-15 04:20:26 +00:00
Ruihang Xia
7fc935c61c feat!: support alter skipping index (#5538)
* feat: support alter skipping index

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test results

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* cargo fmt

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* finalize

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-14 18:43:21 +00:00
discord9
1e6d2fb1fa feat: add snapshot seqs field to query context (#5477)
* TODO: snapshot read

* feat: RegionEngine get last seq

* feat: query context snapshot

* chore: use new proto

* feat: get_region_seqs in region engine

* chore: typo

* chore: toml

* feat: make snapshots modifiable

* feat: add hint for snapshot read

* chore: some typo

* refactor: remove hint as not used

* fix: use committed seqs

* refactor: remove sequences variant on RegionRequest

* refactor: per review

* chore: rebase solve conflict

* refactor: rm unused key

* chore: per review

* chore: per review
2025-02-14 09:07:48 +00:00
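A rough picture of the "snapshot sequences in the query context" idea above; the struct and field names are hypothetical, not the actual query-context API:

```rust
use std::collections::HashMap;

type RegionId = u64;
type SequenceNumber = u64;

/// Hypothetical shape: the query context carries a per-region sequence so that
/// scans read a consistent snapshot instead of the latest committed data.
#[derive(Default)]
struct QueryContextSnapshot {
    snapshot_seqs: HashMap<RegionId, SequenceNumber>,
}

impl QueryContextSnapshot {
    /// Scans for `region` should not read data newer than the returned sequence.
    fn sequence_for(&self, region: RegionId) -> Option<SequenceNumber> {
        self.snapshot_seqs.get(&region).copied()
    }
}
```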
Ruihang Xia
0d19e8f089 fix: promql join operation won't consider time index (#5535)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
2025-02-14 08:21:05 +00:00
Weny Xu
c56106b883 perf: optimize table alteration speed in metric engine (#5526)
* feat(metric-engine): introduce batch alter request handling

* refactor: minor refactor

* refactor: push down filter to mito

* chore: apply suggestions from CR
2025-02-14 08:11:48 +00:00
Yohan Wal
edb040dea3 refactor: refactor pg kvbackend impl in preparation for other rds kvbackend (#5494)
* refactor: unify rds kvbackend impl

* fix: licence header

* refactor: use unique sql template set

* fix: fix deps

* chore: apply optimization patch

* chore: apply optimization patch(2)

* chore: follow review comments
2025-02-14 08:10:09 +00:00
Ruihang Xia
7bbc87b3c0 feat(promql): add series count metrics (#5534)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
2025-02-14 07:49:28 +00:00
Zhenchi
858dae7b23 feat: add stager notifier to collect metrics (#5530)
* feat: add stager notifier to collect metrics

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* apply prev commit

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* remove dup size

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* add load cost

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-14 07:49:26 +00:00
Weny Xu
33a2485f54 fix(promql): ignore filters for non-existent labels (#5519)
* fix(promql): ignore filters for non-existent labels

* chore: add comments

* test: add sqlness test
2025-02-14 06:40:15 +00:00
zyy17
8ebf454bc1 fix(jaeger): return error when no tracing table (#5539)
fix: return error when no tracing table
2025-02-14 06:20:56 +00:00
Ning Sun
f5b9ade6df chore: add section marker for external dependencies (#5536)
* chore: add section marker for external dependencies

* chore: update cargo.lock

* Update Cargo.toml

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

* chore: update meter-core

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-02-14 06:16:57 +00:00
Ruihang Xia
9c1834accd fix: old typo (#5532)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-14 02:18:43 +00:00
Yingwen
918517d221 feat: window sort supports where on fields and time index (#5527)
* feat: handle filter for window sort

* test: sqlness filter test for window sort

* test: add test on tag column filter

* test: test for filter on ts

* test: update sqlness test
2025-02-14 01:38:15 +00:00
liyang
92d9e81a9f ci: use the repository variable to pass to image-name (#5517)
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-02-13 18:14:49 +00:00
yihong
224b1d15cd chore: use the same version of chrono-tz (#5523)
* fix: use the same version of chrono-tz

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-13 17:23:29 +00:00
Yingwen
b4d5393080 feat: speed up read/write cache and stager eviction (#5531)
* feat: change cache policy for file cache

* feat: file cache run pending task after put

* feat: run pending task in put_dir

* feat: run pending task after stager recovered

* feat: purge recycle bin periodically

* feat: use lru policy for read cache
2025-02-13 17:13:24 +00:00
Weny Xu
73c29bb482 fix(promql): unescape matcher values (#5521)
* fix(promql): unescape matcher values

* test: add sqlness tests

* chore: apply suggestions from CR

* feat: use unescaper
2025-02-13 09:42:25 +00:00
Ning Sun
198ee87675 feat: alias database matcher for promql (#5522)
* feat: provide an alias db matcher for promql

* refactor: rename __db__ to __database__

* chore: fix sqlness test
2025-02-13 08:37:37 +00:00
jeremyhi
02af9dd21a refactor!: remove datetime type (#5506)
* feat remove datetime type

* chore: fix unit test

* chore: add column test

* refactor: move create and alter validation to one place

* chore: minor refactor ut

* refactor: rename expr_factory to expr_helper

* chore: remove unnecessary args
2025-02-13 08:01:16 +00:00
Weny Xu
bb97f1bf16 perf: optimize table creation speed in metric engine (#5503)
* feat(metric-engine): introduce batch create request handling

* chore: remove unused code

* test: add more tests

* chore: remove unused error

* chore: apply suggestions from CR
2025-02-13 07:39:04 +00:00
yihong
fbd5316fdb perf: better performance for LastNonNullIter, close #5229, about 10x faster (#5518)
* fix: better performance for LastNonNullIter, close #5229

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: add Safety comments for the unwrap

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-13 05:14:39 +00:00
Weny Xu
63d5a69a31 fix(query_range): skip data field on errors (#5520)
* fix: skip serializing PrometheusResponse when None

* fix: fix unit test

* chore: clippy
2025-02-13 04:32:24 +00:00
zyy17
954310f917 feat: implement Jaeger query APIs (#5452)
* feat: implement jaeger query api

* test: add some unit tests

* test: add integration tests for jaeger query APIs

* refactor: parse tags from url parameters

* refactor: support to query traces by tags

* refactor: add limit parameter

* refactor: add jaeger query api metrics

* chore: add some comment docs and default limit value

* test: add more unit tests

* docs: add jaeger options in config docs

* refactor: code review

* wip

* refactor: use datafusion's dataframe APIs to query traces

* refactor: code review

* chore: format test cases

* refactor: add check_schema()

* chore: fix clippy errors and rename function name

* refactor: throw error when converting start_time and duration fails

* chore: modify incorrect request type name

* chore: remove unnecessary serde rename

* refactor: add some important comments

* refactor: add SPAN_KIND_PREFIX

* refactor: code review
2025-02-12 23:36:38 +00:00
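The "use datafusion's dataframe APIs to query traces" step above has roughly this shape in generic DataFusion code; the table and column names are assumptions, and the exact DataFrame API varies by DataFusion version:

```rust
use datafusion::error::Result;
use datafusion::prelude::{col, lit, SessionContext};

/// Generic DataFusion sketch: filter a trace table by service name and cap the
/// result size, roughly the shape of a "find traces" query.
async fn find_traces(ctx: &SessionContext, service: &str, limit: usize) -> Result<()> {
    let df = ctx
        .table("my_traces") // hypothetical trace table name
        .await?
        .filter(col("service_name").eq(lit(service)))?
        .limit(0, Some(limit))?;
    df.show().await?;
    Ok(())
}
```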
zyy17
58c6274bf6 fix: use fixed tonistiigi/binfmt:qemu-v7.0.0-28 image version instead of latest version to avoid segmentation fault (#5516)
fix: use fixed tonistiigi/binfmt:qemu-v7.0.0-28 image version instead of latest version to avoid segmentation fault

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-02-12 19:29:49 +00:00
Ning Sun
46947fd1de ci: docbot requires pull_request_target (#5514) 2025-02-12 09:46:04 +00:00
Weny Xu
44fffdec8b refactor: refactor region server request handling (#5504)
* refactor: refactor region server requests handling

* chore: apply suggestions from CR
2025-02-12 08:34:42 +00:00
Ruihang Xia
8026b1d72c feat!: unify all index creation grammars (#5486)
* column options

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle table constraint

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test assertions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change inverted index table constraint usage

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* don't create inverted index for pk on alter table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove remaining pk-as-inverted-index

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* more inverted index magic

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result again

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/sql/src/statements.rs

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

* drop support for index def in table constrain

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-02-12 06:54:09 +00:00
Ruihang Xia
e22aa819be feat: support server-side keep-alive for mysql and pg protocols (#5496)
* feat: support server-side keep-alive for mysql and pg protocols

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update config.md

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update config to use humantime for keep-alive configuration

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: Update socket2 dependency

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-11 19:22:10 +00:00
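For reference, the keep-alive knob described above ultimately applies a TCP-level setting; the sketch below shows the generic humantime + socket2 combination the commit mentions, with the function name and the "parse a string like `10m`" convention being assumptions:

```rust
use std::error::Error;
use std::net::TcpStream;
use std::time::Duration;

use socket2::{SockRef, TcpKeepalive};

/// Parse a humantime-style duration (e.g. "10m") and apply it as the TCP
/// keep-alive time of an accepted connection.
fn apply_keep_alive(stream: &TcpStream, keep_alive: &str) -> Result<(), Box<dyn Error>> {
    let duration: Duration = humantime::parse_duration(keep_alive)?;
    if !duration.is_zero() {
        let keepalive = TcpKeepalive::new().with_time(duration);
        SockRef::from(stream).set_tcp_keepalive(&keepalive)?;
    }
    Ok(())
}
```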
localhost
beb9c0a797 chore: set now as timestamp field default value (#5502)
* chore: set now as timestamp field default value

* chore: import pipeline default value
2025-02-11 17:41:44 +00:00
ZonaHe
5f6f5e980a feat: update dashboard to v0.7.10-rc (#5512)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2025-02-11 11:00:10 +00:00
LFC
ccfa40dc41 ci: run nightly jobs only on greptimedb repo (#5505)
ci: skip nightly ci jobs (#9)

(cherry picked from commit 345b4c30474f47a0477263bfba9894d7b4acda2d)
(cherry picked from commit dcd779cd668802fb1ea12fefb4dc3f83f34e30a2)
2025-02-11 10:57:43 +00:00
Zhenchi
336b941113 feat: change puffin stager eviction policy (#5511)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-11 08:16:27 +00:00
yihong
de3f817596 fix: drop useless clone and for loop second (#5507)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-11 06:23:49 +00:00
ZonaHe
d094f48822 feat: update dashboard to v0.7.9 (#5508)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2025-02-11 06:19:58 +00:00
yihong
342883e922 ci: safe ci using zizmor check (#5491)
* ci: safe ci using zizmor check

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: lines empty

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: delete useless code

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-11 02:38:14 +00:00
Zhenchi
5be81abba3 feat: add metadata method to puffin reader (#5501)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-10 09:14:54 +00:00
Zhenchi
c19ecd7ea2 refactor: change traversal order during index construction (#5498)
* refactor: change traversal order during index construction

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chain

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-02-10 06:31:35 +00:00
Ning Sun
15f4b10065 chore: revert "docs: add TM to logos" (#5495)
* Revert "docs: add TM to logos (#4789)"

This reverts commit caf5f2c7a5.

* chore: transparent
2025-02-10 04:00:59 +00:00
yihong
c100a2d1a6 fix: refactor pgkv using prepare_cache about 10% better (#5497)
fix: refactor pgkv using prepare_cache about 15% better

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-10 03:59:18 +00:00
yihong
ccb1978c98 fix: close issue #5466 by not shortcutting the drop command (#5467)
fix: close issue #5466 by not shortcutting; set it back to READY when it fails

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-10 03:28:34 +00:00
Ning Sun
480b05c590 feat: pipeline dispatcher part 2: execution (#5409)
* fmt: correct format

* test: add negative tests

* feat: Add pipeline dispatching and execution output handling

* refactor: Enhance ingest function to correctly process original data values

custom table names during pipeline execution while optimizing the management of
transformed rows and multiple dispatched pipelines

* refactor: call greptime_identity with intermediate values

* fix: typo

* test: port tests to refactored apis

* refactor: adapt dryrun api call

* refactor: move pipeline execution code to a separated module

* refactor: update otlp pipeline execution path

* fmt: format imports

* fix: compilation

* fix: resolve residual issues

* refactor: address review comments

* chore: use BTreeMap as the pipeline intermediate status; modify the trait accordingly

* refactor: update dispatcher to accept BTreeMap

* refactor: update identity pipeline

* refactor: use new input for pipeline

* chore: wip

* refactor: use updated prepare api

* refactor: improve error and header name

* feat: port flatten to new api

* chore: update pipeline api

* chore: fix transform and some pipeline test

* refactor: reimplement cmcd

* refactor: update csv processor

* fmt: update format

* chore: fix regex and dissect processor

* chore: fix test

* test: add integration test for http pipeline

* refactor: improve regex pipeline

* refactor: improve required field check

* refactor: rename table_part to table_suffix

* fix: resolve merge issue

---------

Co-authored-by: paomian <xpaomian@gmail.com>
2025-02-08 09:01:54 +00:00
Ruihang Xia
0de0fd80b0 feat: move pipelines to the first-class endpoint (#5480)
* feat: move pipelines to the first-class endpoint

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change endpoints

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* prefix path with /

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update integration result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-08 03:46:31 +00:00
Yohan Wal
059cb6fdc3 feat: update topic-region map when create and drop table (#5423)
* feat: update topic-region map

* fix: parse topic correctly

* test: add unit test for raft engine wal

* Update src/common/meta/src/ddl/drop_table.rs

Co-authored-by: Weny Xu <wenymedia@gmail.com>

* test: fix unit tests

* test: fix unit tests

* chore: error handling and tests

* refactor: manage region-topic map in table_metadata_keys

* refactor: use WalOptions instead of String in deletion

* chore: revert unused change

* chore: follow review comments

* Apply suggestions from code review

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

* chore: follow review comments

---------

Co-authored-by: Weny Xu <wenymedia@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-02-07 15:09:37 +00:00
jeremyhi
29218b5fe7 refactor!: unify the option names across all components part2 (#5476)
* refactor: part2, replace old options in doc yaml

* chore: remove deprecated options

* chore: update config.md

* fix: ut
2025-02-07 13:06:50 +00:00
discord9
59e6ec0395 chore: update pprof (#5488)
dep: update pprof
2025-02-07 11:43:40 +00:00
Lei, HUANG
79ee230f2a fix: cross compiling for aarch64 targets and allow customizing page size (#5487) 2025-02-07 11:21:16 +00:00
ozewr
0e4bd59fac build: Update Loki proto (#5484)
* build: mv loki-api to loki-proto

* fmt: fmt toml

* fix: loki-proto using rev

---------

Co-authored-by: wangrui <wangrui@baihai.ai>
2025-02-07 09:09:39 +00:00
Yingwen
6eccadbf73 fix: force recycle region dir after gc duration (#5485) 2025-02-07 08:39:04 +00:00
discord9
f29a1c56e9 fix: unquote flow_name in create flow expr (#5483)
* fix: unquote flow_name in create flow expr

* chore: per review

* fix: compat with older version
2025-02-07 08:26:14 +00:00
shuiyisong
88c3d331a1 refactor: otlp logs insertion (#5479)
* chore: add test for selector overlapping

* refactor: simplify otlp logs insertion

* fix: use layered extracted value array

* fix: wrong len

* chore: minor renaming and update

* chore: rename

* fix: clippy

* fix: typos

* chore: update test

* chore: address CR comment & update meter-deps version
2025-02-07 07:21:20 +00:00
yihong
79acc9911e fix: Delete statement not supported in metric engine close #4649 (#5473)
* fix: Delete statement not supported in metric engine close #4649

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: do not include Truncate address review comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comment again

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-07 06:47:53 +00:00
Yingwen
0a169980b7 fix: lose decimal precision when using decimal type as tag (#5481)
* fix: replicate() of decimal vector loses precision

* test: add sqlness test

* test: drop table
2025-02-06 13:17:05 +00:00
Weny Xu
c80d2a3222 fix: introduce gc task for metadata store (#5461)
* fix: introduce gc task for metadata kvbackend

* refactor: refine KvbackendConfig

* chore: apply suggestions from CR
2025-02-06 12:12:43 +00:00
Ruihang Xia
116bdaf690 refactor: pull column filling logic out of mito worker loop (#5455)
* avoid duplicated req catagorisation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* pull column filling up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fill columns instead of fill column

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add test with metadata

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-06 11:43:28 +00:00
Ruihang Xia
6341fb86c7 feat: write memtable in parallel (#5456)
* feat: write memtable in parallel

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* some comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unwrap

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* unwrap spawn result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* use FuturesUnordered

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-06 09:29:57 +00:00
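A minimal sketch of the "spawn per partition, drain with FuturesUnordered, unwrap the join result" pattern the commit above describes; the partitioning and the work done per task are placeholders, not the memtable write path itself:

```rust
use futures::stream::{FuturesUnordered, StreamExt};

/// Write each partition on its own task and wait for all of them.
async fn write_partitions(partitions: Vec<Vec<u64>>) -> usize {
    let mut tasks: FuturesUnordered<_> = partitions
        .into_iter()
        .map(|rows| tokio::spawn(async move { rows.len() })) // placeholder "write"
        .collect();

    let mut total = 0;
    while let Some(joined) = tasks.next().await {
        // The commit above unwraps the spawn (join) result as well.
        total += joined.expect("memtable write task panicked");
    }
    total
}
```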
Ruihang Xia
fa09e181be perf: optimize time series memtable ingestion (#5451)
* initialize with capacity

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* avoid collect

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* optimize zip

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename variable

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* ignore type checking in the upper level

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change to two-step capacity

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-02-06 09:12:29 +00:00
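The first two optimizations above ("initialize with capacity" and "avoid collect") are generic Rust patterns; a tiny illustrative sketch, unrelated to the actual memtable code:

```rust
/// Copy values into a new buffer whose final size is known up front.
fn build_buffer(values: &[u64]) -> Vec<u64> {
    // Pre-size the vector so pushes never reallocate.
    let mut buf = Vec::with_capacity(values.len());
    // Extend from the iterator directly instead of collecting into an intermediate Vec.
    buf.extend(values.iter().copied());
    buf
}
```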
Zhenchi
ab4663ec2b feat: add vec_add function (#5471)
* feat: add vec_add function

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix unexpected utf8

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-02-06 06:48:50 +00:00
jeremyhi
fac22575aa refactor!: unify the option names across all components (#5457)
* refactor: rename grpc options

* refactor: make the arg clearly

* chore: comments on server_addr

* chore: fix test

* chore: remove the store_addr alias

* refactor: cli option rpc_server_addr

* chore: keep store-addr alias

* chore: by comment
2025-02-06 06:37:14 +00:00
Yingwen
0e249f69cd fix: don't transform Limit in TypeConversionRule, StringNormalizationRule and DistPlannerAnalyzer (#5472)
* fix: do not transform exprs in the limit plan

* chore: keep some logs for debug

* feat: workaround for limit in other rules

* test: add sqlness tests for offset 0

* chore: add fixme
2025-02-05 11:30:24 +00:00
yihong
5d1761f3e5 docs: fix wrong memory perf command (#5470)
* docs: fix wrong memory perf command

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: address comments

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: better format

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* fix: make macos right

Signed-off-by: yihong0618 <zouzou0208@gmail.com>

* docs: add jeprof install info

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2025-02-05 10:45:51 +00:00
959 changed files with 56542 additions and 24887 deletions

View File

@@ -3,3 +3,12 @@ linker = "aarch64-linux-gnu-gcc"
[alias]
sqlness = "run --bin sqlness-runner --"
[unstable.git]
shallow_index = true
shallow_deps = true
[unstable.gitoxide]
fetch = true
checkout = true
list_files = true
internal_use_git2 = false

View File

@@ -41,7 +41,14 @@ runs:
username: ${{ inputs.dockerhub-image-registry-username }}
password: ${{ inputs.dockerhub-image-registry-token }}
- name: Build and push dev-builder-ubuntu image
- name: Set up qemu for multi-platform builds
uses: docker/setup-qemu-action@v3
with:
platforms: linux/amd64,linux/arm64
# The latest version will lead to segmentation fault.
image: tonistiigi/binfmt:qemu-v7.0.0-28
- name: Build and push dev-builder-ubuntu image # Build image for amd64 and arm64 platform.
shell: bash
if: ${{ inputs.build-dev-builder-ubuntu == 'true' }}
run: |
@@ -52,7 +59,7 @@ runs:
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
DEV_BUILDER_IMAGE_TAG=${{ inputs.version }}
- name: Build and push dev-builder-centos image
- name: Build and push dev-builder-centos image # Only build image for amd64 platform.
shell: bash
if: ${{ inputs.build-dev-builder-centos == 'true' }}
run: |
@@ -69,8 +76,7 @@ runs:
run: |
make dev-builder \
BASE_IMAGE=android \
BUILDX_MULTI_PLATFORM_BUILD=amd64 \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
DEV_BUILDER_IMAGE_TAG=${{ inputs.version }} && \
docker push ${{ inputs.dockerhub-image-registry }}/${{ inputs.dockerhub-image-namespace }}/dev-builder-android:${{ inputs.version }}
DEV_BUILDER_IMAGE_TAG=${{ inputs.version }}

View File

@@ -34,8 +34,8 @@ inputs:
required: true
push-latest-tag:
description: Whether to push the latest tag
required: false
default: 'true'
required: true
default: 'false'
runs:
using: composite
steps:
@@ -47,7 +47,11 @@ runs:
password: ${{ inputs.image-registry-password }}
- name: Set up qemu for multi-platform builds
uses: docker/setup-qemu-action@v2
uses: docker/setup-qemu-action@v3
with:
platforms: linux/amd64,linux/arm64
# The latest version will lead to segmentation fault.
image: tonistiigi/binfmt:qemu-v7.0.0-28
- name: Set up buildx
uses: docker/setup-buildx-action@v2

View File

@@ -22,8 +22,8 @@ inputs:
required: true
push-latest-tag:
description: Whether to push the latest tag
required: false
default: 'true'
required: true
default: 'false'
dev-mode:
description: Enable dev mode, only build standard greptime
required: false

View File

@@ -52,7 +52,7 @@ runs:
uses: ./.github/actions/build-greptime-binary
with:
base-image: ubuntu
features: servers/dashboard,pg_kvbackend
features: servers/dashboard,pg_kvbackend,mysql_kvbackend
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-${{ inputs.version }}
version: ${{ inputs.version }}
@@ -70,7 +70,7 @@ runs:
if: ${{ inputs.arch == 'amd64' && inputs.dev-mode == 'false' }} # Builds greptime for centos if the host machine is amd64.
with:
base-image: centos
features: servers/dashboard,pg_kvbackend
features: servers/dashboard,pg_kvbackend,mysql_kvbackend
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-centos-${{ inputs.version }}
version: ${{ inputs.version }}

View File

@@ -47,7 +47,6 @@ runs:
shell: pwsh
run: make test sqlness-test
env:
RUSTUP_WINDOWS_PATH_ADD_BIN: 1 # Workaround for https://github.com/nextest-rs/nextest/issues/1493
RUST_BACKTRACE: 1
SQLNESS_OPTS: "--preserve-state"

View File

@@ -51,8 +51,8 @@ inputs:
required: true
upload-to-s3:
description: Upload to S3
required: false
default: 'true'
required: true
default: 'false'
artifacts-dir:
description: Directory to store artifacts
required: false
@@ -77,13 +77,21 @@ runs:
with:
path: ${{ inputs.artifacts-dir }}
- name: Install s5cmd
shell: bash
run: |
wget https://github.com/peak/s5cmd/releases/download/v2.3.0/s5cmd_2.3.0_Linux-64bit.tar.gz
tar -xzf s5cmd_2.3.0_Linux-64bit.tar.gz
sudo mv s5cmd /usr/local/bin/
sudo chmod +x /usr/local/bin/s5cmd
- name: Release artifacts to cn region
uses: nick-invision/retry@v2
if: ${{ inputs.upload-to-s3 == 'true' }}
env:
AWS_ACCESS_KEY_ID: ${{ inputs.aws-cn-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-cn-secret-access-key }}
AWS_DEFAULT_REGION: ${{ inputs.aws-cn-region }}
AWS_REGION: ${{ inputs.aws-cn-region }}
UPDATE_VERSION_INFO: ${{ inputs.update-version-info }}
with:
max_attempts: ${{ inputs.upload-max-retry-times }}

View File

@@ -8,7 +8,7 @@ inputs:
default: 2
description: "Number of Datanode replicas"
meta-replicas:
default: 1
default: 2
description: "Number of Metasrv replicas"
image-registry:
default: "docker.io"

View File

@@ -56,7 +56,7 @@ runs:
- name: Start EC2 runner
if: startsWith(inputs.runner, 'ec2')
uses: machulav/ec2-github-runner@v2
uses: machulav/ec2-github-runner@v2.3.8
id: start-linux-arm64-ec2-runner
with:
mode: start

View File

@@ -33,7 +33,7 @@ runs:
- name: Stop EC2 runner
if: ${{ inputs.label && inputs.ec2-instance-id }}
uses: machulav/ec2-github-runner@v2
uses: machulav/ec2-github-runner@v2.3.8
with:
mode: stop
label: ${{ inputs.label }}

View File

@@ -33,7 +33,7 @@ function upload_artifacts() {
# ├── greptime-darwin-amd64-v0.2.0.sha256sum
# └── greptime-darwin-amd64-v0.2.0.tar.gz
find "$ARTIFACTS_DIR" -type f \( -name "*.tar.gz" -o -name "*.sha256sum" \) | while IFS= read -r file; do
aws s3 cp \
s5cmd cp \
"$file" "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/$VERSION/$(basename "$file")"
done
}
@@ -45,7 +45,7 @@ function update_version_info() {
if [[ "$VERSION" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "Updating latest-version.txt"
echo "$VERSION" > latest-version.txt
aws s3 cp \
s5cmd cp \
latest-version.txt "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/latest-version.txt"
fi
@@ -53,7 +53,7 @@ function update_version_info() {
if [[ "$VERSION" == *"nightly"* ]]; then
echo "Updating latest-nightly-version.txt"
echo "$VERSION" > latest-nightly-version.txt
aws s3 cp \
s5cmd cp \
latest-nightly-version.txt "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/latest-nightly-version.txt"
fi
fi

View File

@@ -14,9 +14,11 @@ name: Build API docs
jobs:
apidoc:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -12,6 +12,8 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
persist-credentials: false
- name: Set up Rust
uses: actions-rust-lang/setup-rust-toolchain@v1

View File

@@ -16,11 +16,11 @@ on:
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ubuntu-22.04
- ubuntu-22.04-8-cores
- ubuntu-22.04-16-cores
- ubuntu-22.04-32-cores
- ubuntu-22.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
@@ -76,20 +76,14 @@ env:
NIGHTLY_RELEASE_PREFIX: nightly
# Use the different image name to avoid conflict with the release images.
IMAGE_NAME: greptimedb-dev
# The source code will check out in the following path: '${WORKING_DIR}/dev/greptime'.
CHECKOUT_GREPTIMEDB_PATH: dev/greptimedb
permissions:
issues: write
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
@@ -107,6 +101,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Create version
id: create-version
@@ -161,6 +156,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Checkout greptimedb
uses: actions/checkout@v4
@@ -168,6 +164,7 @@ jobs:
repository: ${{ inputs.repository }}
ref: ${{ inputs.commit }}
path: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
persist-credentials: true
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -192,6 +189,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Checkout greptimedb
uses: actions/checkout@v4
@@ -199,6 +197,7 @@ jobs:
repository: ${{ inputs.repository }}
ref: ${{ inputs.commit }}
path: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
persist-credentials: true
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -219,25 +218,33 @@ jobs:
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
build-result: ${{ steps.set-build-result.outputs.build-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ env.IMAGE_NAME }}
image-name: ${{ vars.DEV_BUILD_IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: false # Don't push the latest tag to registry.
dev-mode: true # Only build the standard images.
- name: Echo Docker image tag to step summary
run: |
echo "## Docker Image Tag" >> $GITHUB_STEP_SUMMARY
echo "Image Tag: \`${{ needs.allocate-runners.outputs.version }}\`" >> $GITHUB_STEP_SUMMARY
echo "Full Image Name: \`docker.io/${{ vars.IMAGE_NAMESPACE }}/${{ vars.DEV_BUILD_IMAGE_NAME }}:${{ needs.allocate-runners.outputs.version }}\`" >> $GITHUB_STEP_SUMMARY
echo "Pull Command: \`docker pull docker.io/${{ vars.IMAGE_NAMESPACE }}/${{ vars.DEV_BUILD_IMAGE_NAME }}:${{ needs.allocate-runners.outputs.version }}\`" >> $GITHUB_STEP_SUMMARY
- name: Set build result
id: set-build-result
@@ -251,19 +258,20 @@ jobs:
allocate-runners,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
continue-on-error: true
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: ${{ env.IMAGE_NAME }}
src-image-name: ${{ vars.DEV_BUILD_IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
@@ -273,6 +281,7 @@ jobs:
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-to-s3: false
dev-mode: true # Only build the standard images(exclude centos images).
push-latest-tag: false # Don't push the latest tag to registry.
update-version-info: false # Don't update the version info in S3.
@@ -281,7 +290,7 @@ jobs:
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-amd64-artifacts,
@@ -291,6 +300,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -306,7 +316,7 @@ jobs:
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-arm64-artifacts,
@@ -316,6 +326,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -333,11 +344,17 @@ jobs:
needs: [
release-images-to-dockerhub
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
issues: write
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status

View File

@@ -23,9 +23,11 @@ concurrency:
jobs:
check-typos-and-docs:
name: Check typos and docs
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: crate-ci/typos@master
- name: Check the config docs
run: |
@@ -34,10 +36,12 @@ jobs:
|| (echo "'config/config.md' is not up-to-date, please run 'make config-docs'." && exit 1)
license-header-check:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
name: Check License Header
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: korandoru/hawkeye@v5
check:
@@ -45,10 +49,12 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -66,10 +72,12 @@ jobs:
toml:
name: Toml Check
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: actions-rust-lang/setup-rust-toolchain@v1
- name: Install taplo
run: cargo +stable install taplo-cli --version ^0.9 --locked --force
@@ -81,10 +89,12 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -101,7 +111,7 @@ jobs:
- name: Build greptime binaries
shell: bash
# `cargo gc` will invoke `cargo build` with specified args
run: cargo gc -- --bin greptime --bin sqlness-runner --features pg_kvbackend
run: cargo gc -- --bin greptime --bin sqlness-runner --features "pg_kvbackend,mysql_kvbackend"
- name: Pack greptime binaries
shell: bash
run: |
@@ -139,6 +149,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -192,6 +204,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -234,10 +248,12 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -254,7 +270,7 @@ jobs:
- name: Build greptime bianry
shell: bash
# `cargo gc` will invoke `cargo build` with specified args
run: cargo gc --profile ci -- --bin greptime --features pg_kvbackend
run: cargo gc --profile ci -- --bin greptime --features "pg_kvbackend,mysql_kvbackend"
- name: Pack greptime binary
shell: bash
run: |
@@ -295,6 +311,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- name: Setup Kind
uses: ./.github/actions/setup-kind
- if: matrix.mode.minio
@@ -437,6 +455,8 @@ jobs:
echo "Disk space after:"
df -h
- uses: actions/checkout@v4
with:
persist-credentials: false
- name: Setup Kind
uses: ./.github/actions/setup-kind
- name: Setup Chaos Mesh
@@ -548,7 +568,7 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
mode:
- name: "Basic"
opts: ""
@@ -556,12 +576,17 @@ jobs:
- name: "Remote WAL"
opts: "-w kafka -k 127.0.0.1:9092"
kafka: true
- name: "Pg Kvbackend"
- name: "PostgreSQL KvBackend"
opts: "--setup-pg"
kafka: false
- name: "MySQL Kvbackend"
opts: "--setup-mysql"
kafka: false
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- if: matrix.mode.kafka
name: Setup kafka server
working-directory: tests-integration/fixtures
@@ -585,10 +610,12 @@ jobs:
fmt:
name: Rustfmt
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -600,10 +627,12 @@ jobs:
clippy:
name: Clippy
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -626,6 +655,8 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- name: Merge Conflict Finder
uses: olivernybroe/action-conflict-finder@v4.0
@@ -636,6 +667,8 @@ jobs:
needs: [conflict-check, clippy, fmt]
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -657,7 +690,7 @@ jobs:
working-directory: tests-integration/fixtures
run: docker compose up -d --wait
- name: Run nextest cases
run: cargo nextest run --workspace -F dashboard -F pg_kvbackend
run: cargo nextest run --workspace -F dashboard -F pg_kvbackend -F mysql_kvbackend
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
RUST_BACKTRACE: 1
@@ -674,16 +707,19 @@ jobs:
GT_MINIO_ENDPOINT_URL: http://127.0.0.1:9000
GT_ETCD_ENDPOINTS: http://127.0.0.1:2379
GT_POSTGRES_ENDPOINTS: postgres://greptimedb:admin@127.0.0.1:5432/postgres
GT_MYSQL_ENDPOINTS: mysql://greptimedb:admin@127.0.0.1:3306/mysql
GT_KAFKA_ENDPOINTS: 127.0.0.1:9092
GT_KAFKA_SASL_ENDPOINTS: 127.0.0.1:9093
UNITTEST_LOG_DIR: "__unittest_logs"
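`GT_MYSQL_ENDPOINTS` points the new MySQL-kvbackend tests at a fixture brought up by the `docker compose up -d --wait` step; the compose file itself is not part of this diff. A hypothetical sketch of a matching service (image, service name, and healthcheck are assumptions; only the credentials, database, and port mirror the endpoint string above):

```yaml
# Hypothetical tests-integration/fixtures service for
# mysql://greptimedb:admin@127.0.0.1:3306/mysql
services:
  mysql:
    image: mysql:8
    ports:
      - "3306:3306"
    environment:
      MYSQL_DATABASE: mysql
      MYSQL_USER: greptimedb
      MYSQL_PASSWORD: admin
      MYSQL_ROOT_PASSWORD: admin
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "127.0.0.1"]
      interval: 5s
      retries: 20
```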
coverage:
if: github.event_name == 'merge_group'
runs-on: ubuntu-20.04-8-cores
runs-on: ubuntu-22.04-8-cores
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
@@ -707,7 +743,7 @@ jobs:
working-directory: tests-integration/fixtures
run: docker compose up -d --wait
- name: Run nextest cases
run: cargo llvm-cov nextest --workspace --lcov --output-path lcov.info -F dashboard -F pg_kvbackend
run: cargo llvm-cov nextest --workspace --lcov --output-path lcov.info -F dashboard -F pg_kvbackend -F mysql_kvbackend
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
RUST_BACKTRACE: 1
@@ -723,6 +759,7 @@ jobs:
GT_MINIO_ENDPOINT_URL: http://127.0.0.1:9000
GT_ETCD_ENDPOINTS: http://127.0.0.1:2379
GT_POSTGRES_ENDPOINTS: postgres://greptimedb:admin@127.0.0.1:5432/postgres
GT_MYSQL_ENDPOINTS: mysql://greptimedb:admin@127.0.0.1:3306/mysql
GT_KAFKA_ENDPOINTS: 127.0.0.1:9092
GT_KAFKA_SASL_ENDPOINTS: 127.0.0.1:9093
UNITTEST_LOG_DIR: "__unittest_logs"
@@ -738,7 +775,7 @@ jobs:
# compat:
# name: Compatibility Test
# needs: build
# runs-on: ubuntu-20.04
# runs-on: ubuntu-22.04
# timeout-minutes: 60
# steps:
# - uses: actions/checkout@v4


@@ -3,16 +3,21 @@ on:
pull_request_target:
types: [opened, edited]
permissions:
pull-requests: write
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
docbot:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
pull-requests: write
contents: read
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Maybe Follow Up Docs Issue
working-directory: cyborg


@@ -31,43 +31,47 @@ name: CI
jobs:
typos:
name: Spell Check with Typos
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: crate-ci/typos@master
license-header-check:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
name: Check License Header
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: korandoru/hawkeye@v5
check:
name: Check
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
fmt:
name: Rustfmt
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
clippy:
name: Clippy
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
coverage:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
test:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- run: 'echo "No action required"'
@@ -76,7 +80,7 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
os: [ ubuntu-latest ]
mode:
- name: "Basic"
- name: "Remote WAL"

.github/workflows/grafana.yml (new file)

@@ -0,0 +1,52 @@
name: Check Grafana Panels
on:
pull_request:
branches:
- main
paths:
- 'grafana/**' # Trigger only when files under the grafana/ directory change
jobs:
check-panels:
runs-on: ubuntu-latest
steps:
# Check out the repository
- name: Checkout repository
uses: actions/checkout@v4
# Install jq (required for the script)
- name: Install jq
run: sudo apt-get install -y jq
# Make the check.sh script executable
- name: Make check.sh executable
run: chmod +x grafana/check.sh
# Run the check.sh script
- name: Run check.sh
run: ./grafana/check.sh
# Only run summary.sh for pull_request events (not for merge queues or final pushes)
- name: Check if this is a pull request
id: check-pr
run: |
if [[ "${{ github.event_name }}" == "pull_request" ]]; then
echo "is_pull_request=true" >> $GITHUB_OUTPUT
else
echo "is_pull_request=false" >> $GITHUB_OUTPUT
fi
# Make the summary.sh script executable
- name: Make summary.sh executable
if: steps.check-pr.outputs.is_pull_request == 'true'
run: chmod +x grafana/summary.sh
# Run the summary.sh script and add its output to the GitHub Job Summary
- name: Run summary.sh and add to Job Summary
if: steps.check-pr.outputs.is_pull_request == 'true'
run: |
SUMMARY=$(./grafana/summary.sh)
echo "### Summary of Grafana Panels" >> $GITHUB_STEP_SUMMARY
echo "$SUMMARY" >> $GITHUB_STEP_SUMMARY


@@ -14,11 +14,11 @@ on:
description: The runner used to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ubuntu-22.04
- ubuntu-22.04-8-cores
- ubuntu-22.04-16-cores
- ubuntu-22.04-32-cores
- ubuntu-22.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
@@ -66,18 +66,11 @@ env:
NIGHTLY_RELEASE_PREFIX: nightly
# Use the different image name to avoid conflict with the release images.
# The DockerHub image will be greptime/greptimedb-nightly.
IMAGE_NAME: greptimedb-nightly
permissions:
issues: write
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
@@ -95,6 +88,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Create version
id: create-version
@@ -147,6 +141,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -168,6 +163,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -186,24 +182,25 @@ jobs:
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
nightly-build-result: ${{ steps.set-nightly-build-result.outputs.nightly-build-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ env.IMAGE_NAME }}
image-name: ${{ vars.NIGHTLY_BUILD_IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: true
push-latest-tag: false
- name: Set nightly build result
id: set-nightly-build-result
@@ -217,7 +214,7 @@ jobs:
allocate-runners,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
# When we push to ACR, it's easy to fail due to some unknown network issues.
# However, we don't want to fail the whole workflow because of this.
# The ACR has a daily sync with DockerHub, so don't worry about the image not being updated.
@@ -226,13 +223,14 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: ${{ env.IMAGE_NAME }}
src-image-name: ${{ vars.NIGHTLY_BUILD_IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
@@ -242,15 +240,16 @@ jobs:
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-to-s3: false
dev-mode: false
update-version-info: false # Don't update version info in S3.
push-latest-tag: true
push-latest-tag: false
stop-linux-amd64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-amd64-artifacts,
@@ -260,6 +259,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -275,7 +275,7 @@ jobs:
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-arm64-artifacts,
@@ -285,6 +285,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -302,11 +303,15 @@ jobs:
needs: [
release-images-to-dockerhub
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
issues: write
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status


@@ -9,19 +9,17 @@ concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
permissions:
issues: write
jobs:
sqlness-test:
name: Run sqlness test
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Check install.sh
run: ./.github/scripts/check-install-script.sh
@@ -46,9 +44,14 @@ jobs:
name: Sqlness tests on Windows
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: windows-2022-8-cores
permissions:
issues: write
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- uses: arduino/setup-protoc@v3
with:
@@ -76,6 +79,9 @@ jobs:
steps:
- run: git config --global core.autocrlf false
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- uses: arduino/setup-protoc@v3
with:
@@ -101,7 +107,6 @@ jobs:
CARGO_BUILD_RUSTFLAGS: "-C linker=lld-link"
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
RUSTUP_WINDOWS_PATH_ADD_BIN: 1 # Workaround for https://github.com/nextest-rs/nextest/issues/1493
GT_S3_BUCKET: ${{ vars.AWS_CI_TEST_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.AWS_CI_TEST_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.AWS_CI_TEST_SECRET_ACCESS_KEY }}
@@ -111,9 +116,13 @@ jobs:
cleanbuild-linux-nix:
name: Run clean build on Linux
runs-on: ubuntu-latest
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: cachix/install-nix-action@v27
with:
nix_path: nixpkgs=channel:nixos-24.11
@@ -123,7 +132,7 @@ jobs:
name: Check status
needs: [sqlness-test, sqlness-windows, test-on-windows]
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
check-result: ${{ steps.set-check-result.outputs.check-result }}
steps:
@@ -136,11 +145,14 @@ jobs:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' && always() }} # Not requiring successful dependent jobs, always run.
name: Send notification to Greptime team
needs: [check-status]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status


@@ -29,7 +29,7 @@ jobs:
release-dev-builder-images:
name: Release dev builder images
if: ${{ inputs.release_dev_builder_ubuntu_image || inputs.release_dev_builder_centos_image || inputs.release_dev_builder_android_image }} # Only manually trigger this job.
runs-on: ubuntu-20.04-16-cores
runs-on: ubuntu-latest
outputs:
version: ${{ steps.set-version.outputs.version }}
steps:
@@ -37,6 +37,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Configure build image version
id: set-version
@@ -62,7 +63,7 @@ jobs:
release-dev-builder-images-ecr:
name: Release dev builder images to AWS ECR
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
release-dev-builder-images
]
@@ -85,51 +86,69 @@ jobs:
- name: Push dev-builder-ubuntu image
shell: bash
if: ${{ inputs.release_dev_builder_ubuntu_image }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ECR_IMAGE_REGISTRY: ${{ vars.ECR_IMAGE_REGISTRY }}
ECR_IMAGE_NAMESPACE: ${{ vars.ECR_IMAGE_NAMESPACE }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:latest \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-ubuntu:latest
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-ubuntu:latest \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-ubuntu:latest
- name: Push dev-builder-centos image
shell: bash
if: ${{ inputs.release_dev_builder_centos_image }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ECR_IMAGE_REGISTRY: ${{ vars.ECR_IMAGE_REGISTRY }}
ECR_IMAGE_NAMESPACE: ${{ vars.ECR_IMAGE_NAMESPACE }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:latest \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-centos:latest
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-centos:latest \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-centos:latest
- name: Push dev-builder-android image
shell: bash
if: ${{ inputs.release_dev_builder_android_image }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ECR_IMAGE_REGISTRY: ${{ vars.ECR_IMAGE_REGISTRY }}
ECR_IMAGE_NAMESPACE: ${{ vars.ECR_IMAGE_NAMESPACE }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:latest \
docker://${{ vars.ECR_IMAGE_REGISTRY }}/${{ vars.ECR_IMAGE_NAMESPACE }}/dev-builder-android:latest
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-android:latest \
docker://$ECR_IMAGE_REGISTRY/$ECR_IMAGE_NAMESPACE/dev-builder-android:latest
release-dev-builder-images-cn: # Note: be careful of https://github.com/containers/skopeo/issues/1874; we decided to use the latest stable skopeo container.
name: Release dev builder images to CN region
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
release-dev-builder-images
]
@@ -144,29 +163,41 @@ jobs:
- name: Push dev-builder-ubuntu image
shell: bash
if: ${{ inputs.release_dev_builder_ubuntu_image }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ACR_IMAGE_REGISTRY: ${{ vars.ACR_IMAGE_REGISTRY }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION \
docker://$ACR_IMAGE_REGISTRY/$IMAGE_NAMESPACE/dev-builder-ubuntu:$IMAGE_VERSION
- name: Push dev-builder-centos image
shell: bash
if: ${{ inputs.release_dev_builder_centos_image }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ACR_IMAGE_REGISTRY: ${{ vars.ACR_IMAGE_REGISTRY }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION \
docker://$ACR_IMAGE_REGISTRY/$IMAGE_NAMESPACE/dev-builder-centos:$IMAGE_VERSION
- name: Push dev-builder-android image
shell: bash
if: ${{ inputs.release_dev_builder_android_image }}
env:
IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
ACR_IMAGE_REGISTRY: ${{ vars.ACR_IMAGE_REGISTRY }}
run: |
docker run -v "${DOCKER_CONFIG:-$HOME/.docker}:/root/.docker:ro" \
-e "REGISTRY_AUTH_FILE=/root/.docker/config.json" \
quay.io/skopeo/stable:latest \
copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }} \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ needs.release-dev-builder-images.outputs.version }}
copy -a docker://docker.io/$IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION \
docker://$ACR_IMAGE_REGISTRY/$IMAGE_NAMESPACE/dev-builder-android:$IMAGE_VERSION
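The rewritten push steps in both jobs above move the `${{ vars.* }}` and `${{ needs.* }}` expressions out of the `run:` script and into `env:`, letting the shell expand plain variables instead of having the workflow engine splice values into the script text. This is the usual hardening pattern against expression injection, and it also keeps the long `skopeo copy` invocations readable. A minimal sketch of the pattern on its own (step name and image are placeholders):

```yaml
# Minimal sketch: expressions are bound once under env:, then referenced as
# ordinary shell variables inside run:.
- name: Copy an image (placeholder)
  env:
    IMAGE_VERSION: ${{ needs.release-dev-builder-images.outputs.version }}
    IMAGE_NAMESPACE: ${{ vars.IMAGE_NAMESPACE }}
  run: |
    echo "would copy docker.io/${IMAGE_NAMESPACE}/dev-builder:${IMAGE_VERSION}"
```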


@@ -18,11 +18,11 @@ on:
description: The runner used to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ubuntu-22.04
- ubuntu-22.04-8-cores
- ubuntu-22.04-16-cores
- ubuntu-22.04-32-cores
- ubuntu-22.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
@@ -91,18 +91,13 @@ env:
# The scheduled version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD', like v0.2.0-nightly-20230313;
NIGHTLY_RELEASE_PREFIX: nightly
# Note: The NEXT_RELEASE_VERSION should be modified manually by every formal release.
NEXT_RELEASE_VERSION: v0.12.0
# Permission reference: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
permissions:
issues: write # Allows the action to create issues for cyborg.
contents: write # Allows the action to create a release.
NEXT_RELEASE_VERSION: v0.14.0
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
@@ -122,6 +117,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Check Rust toolchain version
shell: bash
@@ -181,6 +177,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -202,6 +199,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-linux-artifacts
with:
@@ -237,6 +235,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-macos-artifacts
with:
@@ -276,6 +275,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/build-windows-artifacts
with:
@@ -299,22 +299,25 @@ jobs:
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-2004-16-cores
runs-on: ubuntu-latest
outputs:
build-image-result: ${{ steps.set-build-image-result.outputs.build-image-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ vars.GREPTIMEDB_IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: true
- name: Set build image result
id: set-build-image-result
@@ -332,7 +335,7 @@ jobs:
build-windows-artifacts,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
# When we push to ACR, it's easy to fail due to some unknown network issues.
# However, we don't want to fail the whole workflow because of this.
# The ACR has a daily sync with DockerHub, so don't worry about the image not being updated.
@@ -341,13 +344,14 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: greptimedb
src-image-name: ${{ vars.GREPTIMEDB_IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
@@ -358,6 +362,7 @@ jobs:
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
dev-mode: false
upload-to-s3: true
update-version-info: true
push-latest-tag: true
@@ -372,11 +377,12 @@ jobs:
build-windows-artifacts,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Publish GitHub release
uses: ./.github/actions/publish-github-release
@@ -390,7 +396,7 @@ jobs:
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-amd64-artifacts,
@@ -400,6 +406,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -415,7 +422,7 @@ jobs:
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [
allocate-runners,
build-linux-arm64-artifacts,
@@ -425,6 +432,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
@@ -440,9 +448,16 @@ jobs:
name: Bump doc version
if: ${{ github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [allocate-runners]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
# Permission reference: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
permissions:
issues: write # Allows the action to create issues for cyborg.
contents: write # Allows the action to create a release.
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Bump doc version
working-directory: cyborg
@@ -460,11 +475,18 @@ jobs:
build-macos-artifacts,
build-windows-artifacts,
]
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
# Permission reference: https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs
permissions:
issues: write # Allows the action to create issues for cyborg.
contents: write # Allows the action to create a release.
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Report CI status
id: report-ci-status


@@ -4,18 +4,20 @@ on:
- cron: '4 2 * * *'
workflow_dispatch:
permissions:
contents: read
issues: write
pull-requests: write
jobs:
maintenance:
name: Periodic Maintenance
runs-on: ubuntu-latest
permissions:
contents: read
issues: write
pull-requests: write
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Do Maintenance
working-directory: cyborg


@@ -1,18 +1,24 @@
name: "Semantic Pull Request"
on:
pull_request_target:
pull_request:
types:
- opened
- reopened
- edited
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
check:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
with:
persist-credentials: false
- uses: ./.github/actions/setup-cyborg
- name: Check Pull Request
working-directory: cyborg

.gitignore

@@ -54,3 +54,6 @@ tests-fuzz/corpus/
# Nix
.direnv
.envrc
## default data home
greptimedb_data


@@ -3,30 +3,28 @@
## Individual Committers (in alphabetical order)
* [CookiePieWw](https://github.com/CookiePieWw)
* [KKould](https://github.com/KKould)
* [NiwakaDev](https://github.com/NiwakaDev)
* [etolbakov](https://github.com/etolbakov)
* [irenjj](https://github.com/irenjj)
* [tisonkun](https://github.com/tisonkun)
* [KKould](https://github.com/KKould)
* [Lanqing Yang](https://github.com/lyang24)
* [NiwakaDev](https://github.com/NiwakaDev)
* [tisonkun](https://github.com/tisonkun)
## Team Members (in alphabetical order)
* [Breeze-P](https://github.com/Breeze-P)
* [GrepTime](https://github.com/GrepTime)
* [MichaelScofield](https://github.com/MichaelScofield)
* [Wenjie0329](https://github.com/Wenjie0329)
* [WenyXu](https://github.com/WenyXu)
* [ZonaHex](https://github.com/ZonaHex)
* [apdong2022](https://github.com/apdong2022)
* [beryl678](https://github.com/beryl678)
* [Breeze-P](https://github.com/Breeze-P)
* [daviderli614](https://github.com/daviderli614)
* [discord9](https://github.com/discord9)
* [evenyag](https://github.com/evenyag)
* [fengjiachun](https://github.com/fengjiachun)
* [fengys1996](https://github.com/fengys1996)
* [GrepTime](https://github.com/GrepTime)
* [holalengyu](https://github.com/holalengyu)
* [killme2008](https://github.com/killme2008)
* [MichaelScofield](https://github.com/MichaelScofield)
* [nicecui](https://github.com/nicecui)
* [paomian](https://github.com/paomian)
* [shuiyisong](https://github.com/shuiyisong)
@@ -34,11 +32,14 @@
* [sunng87](https://github.com/sunng87)
* [v0y4g3r](https://github.com/v0y4g3r)
* [waynexia](https://github.com/waynexia)
* [Wenjie0329](https://github.com/Wenjie0329)
* [WenyXu](https://github.com/WenyXu)
* [xtang](https://github.com/xtang)
* [zhaoyingnan01](https://github.com/zhaoyingnan01)
* [zhongzc](https://github.com/zhongzc)
* [ZonaHex](https://github.com/ZonaHex)
* [zyy17](https://github.com/zyy17)
## All Contributors
[![All Contributors](https://contrib.rocks/image?repo=GreptimeTeam/greptimedb)](https://github.com/GreptimeTeam/greptimedb/graphs/contributors)
To see the full list of contributors, please visit our [Contributors page](https://github.com/GreptimeTeam/greptimedb/graphs/contributors)

Cargo.lock (generated): diff suppressed because it is too large.


@@ -29,6 +29,7 @@ members = [
"src/common/query",
"src/common/recordbatch",
"src/common/runtime",
"src/common/session",
"src/common/substrait",
"src/common/telemetry",
"src/common/test-util",
@@ -67,7 +68,7 @@ members = [
resolver = "2"
[workspace.package]
version = "0.12.0"
version = "0.14.0"
edition = "2021"
license = "Apache-2.0"
@@ -81,13 +82,14 @@ rust.unknown_lints = "deny"
rust.unexpected_cfgs = { level = "warn", check-cfg = ['cfg(tokio_unstable)'] }
[workspace.dependencies]
# DO_NOT_REMOVE_THIS: BEGIN_OF_EXTERNAL_DEPENDENCIES
# We turn off default-features for some dependencies here so the workspaces which inherit them can
# selectively turn them on if needed, since we can override default-features = true (from false)
# for the inherited dependency but cannot do the reverse (override from true to false).
#
# See for more details: https://github.com/rust-lang/cargo/issues/11329
ahash = { version = "0.8", features = ["compile-time-rng"] }
aquamarine = "0.3"
aquamarine = "0.6"
arrow = { version = "53.0.0", features = ["prettyprint"] }
arrow-array = { version = "53.0.0", default-features = false, features = ["chrono-tz"] }
arrow-flight = "53.0"
@@ -98,18 +100,19 @@ async-trait = "0.1"
# Remember to update axum-extra, axum-macros when updating axum
axum = "0.8"
axum-extra = "0.10"
axum-macros = "0.4"
axum-macros = "0.5"
backon = "1"
base64 = "0.21"
base64 = "0.22"
bigdecimal = "0.4.2"
bitflags = "2.4.1"
bytemuck = "1.12"
bytes = { version = "1.7", features = ["serde"] }
chrono = { version = "0.4", features = ["serde"] }
chrono-tz = "0.10.1"
clap = { version = "4.4", features = ["derive"] }
config = "0.13.0"
crossbeam-utils = "0.8"
dashmap = "5.4"
dashmap = "6.1"
datafusion = { git = "https://github.com/apache/datafusion.git", rev = "2464703c84c400a09cc59277018813f0e797bb4e" }
datafusion-common = { git = "https://github.com/apache/datafusion.git", rev = "2464703c84c400a09cc59277018813f0e797bb4e" }
datafusion-expr = { git = "https://github.com/apache/datafusion.git", rev = "2464703c84c400a09cc59277018813f0e797bb4e" }
@@ -119,31 +122,31 @@ datafusion-physical-expr = { git = "https://github.com/apache/datafusion.git", r
datafusion-physical-plan = { git = "https://github.com/apache/datafusion.git", rev = "2464703c84c400a09cc59277018813f0e797bb4e" }
datafusion-sql = { git = "https://github.com/apache/datafusion.git", rev = "2464703c84c400a09cc59277018813f0e797bb4e" }
datafusion-substrait = { git = "https://github.com/apache/datafusion.git", rev = "2464703c84c400a09cc59277018813f0e797bb4e" }
deadpool = "0.10"
deadpool-postgres = "0.12"
derive_builder = "0.12"
deadpool = "0.12"
deadpool-postgres = "0.14"
derive_builder = "0.20"
dotenv = "0.15"
etcd-client = "0.14"
fst = "0.4.7"
futures = "0.3"
futures-util = "0.3"
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "683e9d10ae7f3dfb8aaabd89082fc600c17e3795" }
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "dd4a1996982534636734674db66e44464b0c0d83" }
hex = "0.4"
http = "1"
humantime = "2.1"
humantime-serde = "1.1"
hyper = "1.1"
hyper-util = "0.1"
itertools = "0.10"
itertools = "0.14"
jsonb = { git = "https://github.com/databendlabs/jsonb.git", rev = "8c8d2fc294a39f3ff08909d60f718639cfba3875", default-features = false }
lazy_static = "1.4"
local-ip-address = "0.6"
loki-api = { git = "https://github.com/shuiyisong/tracing-loki", branch = "chore/prost_version" }
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "a10facb353b41460eeb98578868ebf19c2084fac" }
mockall = "0.11.4"
loki-proto = { git = "https://github.com/GreptimeTeam/loki-proto.git", rev = "1434ecf23a2654025d86188fb5205e7a74b225d3" }
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "5618e779cf2bb4755b499c630fba4c35e91898cb" }
mockall = "0.13"
moka = "0.12"
nalgebra = "0.33"
notify = "6.1"
notify = "8.0"
num_cpus = "1.16"
once_cell = "1.18"
opentelemetry-proto = { version = "0.27", features = [
@@ -158,11 +161,11 @@ parquet = { version = "53.0.0", default-features = false, features = ["arrow", "
paste = "1.0"
pin-project = "1.0"
prometheus = { version = "0.13.3", features = ["process"] }
promql-parser = { version = "0.4.3", features = ["ser"] }
promql-parser = { version = "0.5", features = ["ser"] }
prost = "0.13"
raft-engine = { version = "0.4.1", default-features = false }
rand = "0.8"
ratelimit = "0.9"
rand = "0.9"
ratelimit = "0.10"
regex = "1.8"
regex-automata = "0.4"
reqwest = { version = "0.12", default-features = false, features = [
@@ -174,29 +177,37 @@ reqwest = { version = "0.12", default-features = false, features = [
rskafka = { git = "https://github.com/influxdata/rskafka.git", rev = "75535b5ad9bae4a5dbb582c82e44dfd81ec10105", features = [
"transport-tls",
] }
rstest = "0.21"
rstest = "0.25"
rstest_reuse = "0.7"
rust_decimal = "1.33"
rustc-hash = "2.0"
rustls = { version = "0.23.20", default-features = false } # override by patch, see [patch.crates-io]
# It is worth noting that we should try to avoid using aws-lc-rs until it can be compiled on various platforms.
rustls = { version = "0.23.25", default-features = false }
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["float_roundtrip"] }
serde_with = "3"
shadow-rs = "0.38"
shadow-rs = "1.1"
simd-json = "0.15"
similar-asserts = "1.6.0"
smallvec = { version = "1", features = ["serde"] }
snafu = "0.8"
sysinfo = "0.30"
sqlx = { version = "0.8", features = [
"runtime-tokio-rustls",
"mysql",
"postgres",
"chrono",
] }
sysinfo = "0.33"
# on branch v0.52.x
sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "71dd86058d2af97b9925093d40c4e03360403170", features = [
"visitor",
"serde",
] } # on branch v0.44.x
strum = { version = "0.25", features = ["derive"] }
strum = { version = "0.27", features = ["derive"] }
tempfile = "3"
tokio = { version = "1.40", features = ["full"] }
tokio-postgres = "0.7"
tokio-rustls = { version = "0.26.0", default-features = false } # override by patch, see [patch.crates-io]
tokio-rustls = { version = "0.26.2", default-features = false }
tokio-stream = "0.1"
tokio-util = { version = "0.7", features = ["io-util", "compat"] }
toml = "0.8.8"
@@ -207,6 +218,7 @@ tracing-subscriber = { version = "0.3", features = ["env-filter", "json", "fmt"]
typetag = "0.2"
uuid = { version = "1.7", features = ["serde", "v4", "fast-rng"] }
zstd = "0.13"
# DO_NOT_REMOVE_THIS: END_OF_EXTERNAL_DEPENDENCIES
## workspaces members
api = { path = "src/api" }
@@ -238,6 +250,7 @@ common-procedure-test = { path = "src/common/procedure-test" }
common-query = { path = "src/common/query" }
common-recordbatch = { path = "src/common/recordbatch" }
common-runtime = { path = "src/common/runtime" }
common-session = { path = "src/common/session" }
common-telemetry = { path = "src/common/telemetry" }
common-test-util = { path = "src/common/test-util" }
common-time = { path = "src/common/time" }
@@ -270,20 +283,9 @@ store-api = { path = "src/store-api" }
substrait = { path = "src/common/substrait" }
table = { path = "src/table" }
[patch.crates-io]
# change all rustls dependencies to use our fork to default to `ring` to make it "just work"
hyper-rustls = { git = "https://github.com/GreptimeTeam/hyper-rustls", rev = "a951e03" } # version = "0.27.5" with ring patch
rustls = { git = "https://github.com/GreptimeTeam/rustls", rev = "34fd0c6" } # version = "0.23.20" with ring patch
tokio-rustls = { git = "https://github.com/GreptimeTeam/tokio-rustls", rev = "4604ca6" } # version = "0.26.0" with ring patch
# This is commented, since we are not using aws-lc-sys, if we need to use it, we need to uncomment this line or use a release after this commit, or it wouldn't compile with gcc < 8.1
# see https://github.com/aws/aws-lc-rs/pull/526
# aws-lc-sys = { git ="https://github.com/aws/aws-lc-rs", rev = "556558441e3494af4b156ae95ebc07ebc2fd38aa" }
# Apply a fix for pprof for unaligned pointer access
pprof = { git = "https://github.com/GreptimeTeam/pprof-rs", rev = "1bd1e21" }
[workspace.dependencies.meter-macros]
git = "https://github.com/GreptimeTeam/greptime-meter.git"
rev = "a10facb353b41460eeb98578868ebf19c2084fac"
rev = "5618e779cf2bb4755b499c630fba4c35e91898cb"
[profile.release]
debug = 1


@@ -1,3 +1,6 @@
[target.aarch64-unknown-linux-gnu]
image = "ghcr.io/cross-rs/aarch64-unknown-linux-gnu:0.2.5"
[build]
pre-build = [
"dpkg --add-architecture $CROSS_DEB_ARCH",
@@ -5,3 +8,8 @@ pre-build = [
"curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip && unzip protoc-3.15.8-linux-x86_64.zip -d /usr/",
"chmod a+x /usr/bin/protoc && chmod -R a+rx /usr/include/google",
]
[build.env]
passthrough = [
"JEMALLOC_SYS_WITH_LG_PAGE",
]


@@ -8,7 +8,7 @@ CARGO_BUILD_OPTS := --locked
IMAGE_REGISTRY ?= docker.io
IMAGE_NAMESPACE ?= greptime
IMAGE_TAG ?= latest
DEV_BUILDER_IMAGE_TAG ?= 2024-12-25-9d0fa5d5-20250124085746
DEV_BUILDER_IMAGE_TAG ?= 2024-12-25-a71b93dd-20250305072908
BUILDX_MULTI_PLATFORM_BUILD ?= false
BUILDX_BUILDER_NAME ?= gtbuilder
BASE_IMAGE ?= ubuntu
@@ -60,6 +60,8 @@ ifeq ($(BUILDX_MULTI_PLATFORM_BUILD), all)
BUILDX_MULTI_PLATFORM_BUILD_OPTS := --platform linux/amd64,linux/arm64 --push
else ifeq ($(BUILDX_MULTI_PLATFORM_BUILD), amd64)
BUILDX_MULTI_PLATFORM_BUILD_OPTS := --platform linux/amd64 --push
else ifeq ($(BUILDX_MULTI_PLATFORM_BUILD), arm64)
BUILDX_MULTI_PLATFORM_BUILD_OPTS := --platform linux/arm64 --push
else
BUILDX_MULTI_PLATFORM_BUILD_OPTS := -o type=docker
endif


@@ -6,7 +6,7 @@
</picture>
</p>
<h2 align="center">Unified & Cost-Effective Time Series Database for Metrics, Logs, and Events</h2>
<h2 align="center">Unified & Cost-Effective Observerability Database for Metrics, Logs, and Events</h2>
<div align="center">
<h3 align="center">
@@ -62,15 +62,19 @@
## Introduction
**GreptimeDB** is an open-source unified & cost-effective time-series database for **Metrics**, **Logs**, and **Events** (also **Traces** in plan). You can gain real-time insights from Edge to Cloud at Any Scale.
**GreptimeDB** is an open-source unified & cost-effective observability database for **Metrics**, **Logs**, and **Events** (also **Traces** in plan). You can gain real-time insights from Edge to Cloud at Any Scale.
## News
**[GreptimeDB tops JSONBench's billion-record cold run test!](https://greptime.com/blogs/2025-03-18-jsonbench-greptimedb-performance)**
## Why GreptimeDB
Our core developers have been building time-series data platforms for years. Based on our best practices, GreptimeDB was born to give you:
Our core developers have been building observability data platforms for years. Based on our best practices, GreptimeDB was born to give you:
* **Unified Processing of Metrics, Logs, and Events**
GreptimeDB unifies time series data processing by treating all data - whether metrics, logs, or events - as timestamped events with context. Users can analyze this data using either [SQL](https://docs.greptime.com/user-guide/query-data/sql) or [PromQL](https://docs.greptime.com/user-guide/query-data/promql) and leverage stream processing ([Flow](https://docs.greptime.com/user-guide/flow-computation/overview)) to enable continuous aggregation. [Read more](https://docs.greptime.com/user-guide/concepts/data-model).
GreptimeDB unifies observability data processing by treating all data - whether metrics, logs, or events - as timestamped events with context. Users can analyze this data using either [SQL](https://docs.greptime.com/user-guide/query-data/sql) or [PromQL](https://docs.greptime.com/user-guide/query-data/promql) and leverage stream processing ([Flow](https://docs.greptime.com/user-guide/flow-computation/overview)) to enable continuous aggregation. [Read more](https://docs.greptime.com/user-guide/concepts/data-model).
* **Cloud-native Distributed Database**
@@ -112,11 +116,11 @@ Start a GreptimeDB container with:
```shell
docker run -p 127.0.0.1:4000-4003:4000-4003 \
-v "$(pwd)/greptimedb:/tmp/greptimedb" \
-v "$(pwd)/greptimedb:./greptimedb_data" \
--name greptime --rm \
greptime/greptimedb:latest standalone start \
--http-addr 0.0.0.0:4000 \
--rpc-addr 0.0.0.0:4001 \
--rpc-bind-addr 0.0.0.0:4001 \
--mysql-addr 0.0.0.0:4002 \
--postgres-addr 0.0.0.0:4003
```


@@ -12,7 +12,6 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `standalone` | The running mode of the datanode. It can be `standalone` or `distributed`. |
| `default_timezone` | String | Unset | The default timezone of the server. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
@@ -24,12 +23,12 @@
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `http.cors_allowed_origins` | Array | Unset | Customize allowed origins for HTTP CORS. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.mode` | String | `disable` | TLS mode. |
@@ -40,6 +39,7 @@
| `mysql.enable` | Bool | `true` | Whether to enable. |
| `mysql.addr` | String | `127.0.0.1:4002` | The addr to bind the MySQL server. |
| `mysql.runtime_size` | Integer | `2` | The number of server worker threads. |
| `mysql.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `mysql.tls` | -- | -- | -- |
| `mysql.tls.mode` | String | `disable` | TLS mode, refer to https://www.postgresql.org/docs/current/libpq-ssl.html<br/>- `disable` (default value)<br/>- `prefer`<br/>- `require`<br/>- `verify-ca`<br/>- `verify-full` |
| `mysql.tls.cert_path` | String | Unset | Certificate file path. |
@@ -49,6 +49,7 @@
| `postgres.enable` | Bool | `true` | Whether to enable |
| `postgres.addr` | String | `127.0.0.1:4003` | The addr to bind the PostgreSQL server. |
| `postgres.runtime_size` | Integer | `2` | The number of server worker threads. |
| `postgres.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `postgres.tls` | -- | -- | PostgreSQL server TLS options, see `mysql.tls` section. |
| `postgres.tls.mode` | String | `disable` | TLS mode. |
| `postgres.tls.cert_path` | String | Unset | Certificate file path. |
@@ -58,6 +59,8 @@
| `opentsdb.enable` | Bool | `true` | Whether to enable OpenTSDB put in HTTP API. |
| `influxdb` | -- | -- | InfluxDB protocol options. |
| `influxdb.enable` | Bool | `true` | Whether to enable InfluxDB protocol in HTTP API. |
| `jaeger` | -- | -- | Jaeger protocol options. |
| `jaeger.enable` | Bool | `true` | Whether to enable Jaeger protocol in HTTP API. |
| `prom_store` | -- | -- | Prometheus remote storage options |
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
@@ -65,8 +68,8 @@
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
@@ -82,21 +85,19 @@
| `wal.create_topic_timeout` | String | `30s` | Above which a topic creation operation will be cancelled.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_bytes` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_init` | String | `500ms` | The initial backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_max` | String | `10s` | The maximum backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_base` | Integer | `2` | The exponential backoff rate, i.e. next backoff = base * current backoff.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries during read WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `metadata_store` | -- | -- | Metadata storage options. |
| `metadata_store.file_size` | String | `256MB` | Kv file size in bytes. |
| `metadata_store.purge_threshold` | String | `4GB` | Kv purge threshold. |
| `metadata_store.file_size` | String | `64MB` | The size of the metadata store log file. |
| `metadata_store.purge_threshold` | String | `256MB` | The threshold of the metadata store size to trigger a purge. |
| `metadata_store.purge_interval` | String | `1m` | The interval of the metadata store to trigger a purge. |
| `procedure` | -- | -- | Procedure storage options. |
| `procedure.max_retry_times` | Integer | `3` | Procedure max retry time. |
| `procedure.retry_delay` | String | `500ms` | Initial retry delay of procedures, increases exponentially |
| `procedure.max_running_procedures` | Integer | `128` | Max running procedures.<br/>The maximum number of procedures that can be running at the same time.<br/>If the number of running procedures exceeds this limit, the procedure will be rejected. |
| `flow` | -- | -- | flow engine options. |
| `flow.num_workers` | Integer | `0` | The number of flow worker in flownode.<br/>Not setting(or set to 0) this value will use the number of CPU cores divided by 2. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `/tmp/greptimedb/` | The working home directory. |
| `storage.data_home` | String | `./greptimedb_data/` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | Unset | Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.<br/>A local file directory, defaults to `{data_home}`. An empty string means disabling. |
| `storage.cache_capacity` | String | Unset | The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger. |
@@ -147,6 +148,7 @@
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
@@ -175,7 +177,7 @@
| `region_engine.metric` | -- | -- | Metric engine options. |
| `region_engine.metric.experimental_sparse_primary_key_encoding` | Bool | `false` | Whether to enable the experimental sparse primary key encoding. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.dir` | String | `./greptimedb_data/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.enable_otlp_tracing` | Bool | `false` | Enable OTLP tracing. |
| `logging.otlp_endpoint` | String | `http://localhost:4317` | The OTLP tracing endpoint. |
@@ -216,13 +218,13 @@
| `heartbeat.retry_interval` | String | `3s` | Interval for retrying to send heartbeat messages to the metasrv. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browsers to access HTTP APIs without CORS restrictions |
| `http.cors_allowed_origins` | Array | Unset | Customize allowed origins for HTTP CORS. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1:4001` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:4001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.mode` | String | `disable` | TLS mode. |
@@ -233,6 +235,7 @@
| `mysql.enable` | Bool | `true` | Whether to enable. |
| `mysql.addr` | String | `127.0.0.1:4002` | The addr to bind the MySQL server. |
| `mysql.runtime_size` | Integer | `2` | The number of server worker threads. |
| `mysql.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `mysql.tls` | -- | -- | -- |
| `mysql.tls.mode` | String | `disable` | TLS mode, refer to https://www.postgresql.org/docs/current/libpq-ssl.html<br/>- `disable` (default value)<br/>- `prefer`<br/>- `require`<br/>- `verify-ca`<br/>- `verify-full` |
| `mysql.tls.cert_path` | String | Unset | Certificate file path. |
@@ -242,6 +245,7 @@
| `postgres.enable` | Bool | `true` | Whether to enable |
| `postgres.addr` | String | `127.0.0.1:4003` | The addr to bind the PostgreSQL server. |
| `postgres.runtime_size` | Integer | `2` | The number of server worker threads. |
| `postgres.keep_alive` | String | `0s` | Server-side keep-alive time.<br/>Set to 0 (default) to disable. |
| `postgres.tls` | -- | -- | PostgreSQL server TLS options, see `mysql.tls` section. |
| `postgres.tls.mode` | String | `disable` | TLS mode. |
| `postgres.tls.cert_path` | String | Unset | Certificate file path. |
@@ -251,6 +255,8 @@
| `opentsdb.enable` | Bool | `true` | Whether to enable OpenTSDB put in HTTP API. |
| `influxdb` | -- | -- | InfluxDB protocol options. |
| `influxdb.enable` | Bool | `true` | Whether to enable InfluxDB protocol in HTTP API. |
| `jaeger` | -- | -- | Jaeger protocol options. |
| `jaeger.enable` | Bool | `true` | Whether to enable Jaeger protocol in HTTP API. |
| `prom_store` | -- | -- | Prometheus remote storage options |
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
@@ -269,7 +275,7 @@
| `datanode.client.connect_timeout` | String | `10s` | -- |
| `datanode.client.tcp_nodelay` | Bool | `true` | -- |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.dir` | String | `./greptimedb_data/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.enable_otlp_tracing` | Bool | `false` | Enable OTLP tracing. |
| `logging.otlp_endpoint` | String | `http://localhost:4317` | The OTLP tracing endpoint. |
@@ -298,9 +304,9 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `data_home` | String | `/tmp/metasrv/` | The working home directory. |
| `data_home` | String | `./greptimedb_data/metasrv/` | The working home directory. |
| `bind_addr` | String | `127.0.0.1:3002` | The bind address of metasrv. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for the frontend and datanode to connect to metasrv.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `bind_addr`. |
| `store_addrs` | Array | -- | Store server address default to etcd store.<br/>For postgres store, the format is:<br/>"password=password dbname=postgres user=postgres host=localhost port=5432"<br/>For etcd store, the format is:<br/>"127.0.0.1:2379" |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `backend` | String | `etcd_store` | The datastore for meta server.<br/>Available values:<br/>- `etcd_store` (default value)<br/>- `memory_store`<br/>- `postgres_store` |
@@ -309,6 +315,7 @@
| `selector` | String | `round_robin` | Datanode selector type.<br/>- `round_robin` (default value)<br/>- `lease_based`<br/>- `load_based`<br/>For details, please see "https://docs.greptime.com/developer-guide/metasrv/selector". |
| `use_memory_store` | Bool | `false` | Store data in memory. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running on cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., s3). |
| `node_max_idle_time` | String | `24hours` | Max allowed idle time before removing node info from metasrv memory. |
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. Enabled by default. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
@@ -317,6 +324,7 @@
| `procedure.max_retry_times` | Integer | `12` | Procedure max retry time. |
| `procedure.retry_delay` | String | `500ms` | Initial retry delay of procedures, increases exponentially |
| `procedure.max_metadata_value_size` | String | `1500KiB` | Auto split large values.<br/>GreptimeDB procedure uses etcd as the default metadata storage backend.<br/>The maximum size of any etcd request is 1.5 MiB.<br/>1500KiB = 1536KiB (1.5MiB) - 36KiB (reserved size of key).<br/>Comment out `max_metadata_value_size` to disable splitting large values (no limit). |
| `procedure.max_running_procedures` | Integer | `128` | Max running procedures.<br/>The maximum number of procedures that can be running at the same time.<br/>If the number of running procedures exceeds this limit, the procedure will be rejected. |
| `failure_detector` | -- | -- | -- |
| `failure_detector.threshold` | Float | `8.0` | The threshold value used by the failure detector to determine failure conditions. |
| `failure_detector.min_std_deviation` | String | `100ms` | The minimum standard deviation of the heartbeat intervals, used to calculate acceptable variations. |
@@ -336,12 +344,8 @@
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.<br/>Only accepts strings that match the following regular expression pattern:<br/>[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*<br/>e.g., greptimedb_wal_topic_0, greptimedb_wal_topic_1. |
| `wal.replication_factor` | Integer | `1` | Expected number of replicas of each partition. |
| `wal.create_topic_timeout` | String | `30s` | The timeout above which a topic creation operation will be cancelled. |
| `wal.backoff_init` | String | `500ms` | The initial backoff for kafka clients. |
| `wal.backoff_max` | String | `10s` | The maximum backoff for kafka clients. |
| `wal.backoff_base` | Integer | `2` | Exponential backoff rate, i.e. next backoff = base * current backoff. |
| `wal.backoff_deadline` | String | `5mins` | Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.dir` | String | `./greptimedb_data/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.enable_otlp_tracing` | Bool | `false` | Enable OTLP tracing. |
| `logging.otlp_endpoint` | String | `http://localhost:4317` | The OTLP tracing endpoint. |
@@ -370,25 +374,19 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `standalone` | The running mode of the datanode. It can be `standalone` or `distributed`. |
| `node_id` | Integer | Unset | The datanode identifier and should be unique in the cluster. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum concurrent queries allowed to be executed. Zero means unlimited. |
| `rpc_addr` | String | Unset | Deprecated, use `grpc.addr` instead. |
| `rpc_hostname` | String | Unset | Deprecated, use `grpc.hostname` instead. |
| `rpc_runtime_size` | Integer | Unset | Deprecated, use `grpc.runtime_size` instead. |
| `rpc_max_recv_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_recv_message_size` instead. |
| `rpc_max_send_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_send_message_size` instead. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1:3001` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:3001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
@@ -428,15 +426,11 @@
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_bytes` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_init` | String | `500ms` | The initial backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_max` | String | `10s` | The maximum backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_base` | Integer | `2` | The exponential backoff rate, i.e. next backoff = base * current backoff.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `wal.create_index` | Bool | `true` | Whether to enable WAL index creation.<br/>**It's only used when the provider is `kafka`**. |
| `wal.dump_index_interval` | String | `60s` | The interval for dumping WAL indexes.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries when reading the WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `/tmp/greptimedb/` | The working home directory. |
| `storage.data_home` | String | `./greptimedb_data/` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | Unset | Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.<br/>A local file directory, defaults to `{data_home}`. An empty string means disabling. |
| `storage.cache_capacity` | String | Unset | The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger. |
@@ -487,6 +481,7 @@
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Set it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
@@ -515,7 +510,7 @@
| `region_engine.metric` | -- | -- | Metric engine options. |
| `region_engine.metric.experimental_sparse_primary_key_encoding` | Bool | `false` | Whether to enable the experimental sparse primary key encoding. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.dir` | String | `./greptimedb_data/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.enable_otlp_tracing` | Bool | `false` | Enable OTLP tracing. |
| `logging.otlp_endpoint` | String | `http://localhost:4317` | The OTLP tracing endpoint. |
@@ -544,19 +539,18 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `distributed` | The running mode of the flownode. It can be `standalone` or `distributed`. |
| `node_id` | Integer | Unset | The flownode identifier and should be unique in the cluster. |
| `flow` | -- | -- | flow engine options. |
| `flow.num_workers` | Integer | `0` | The number of flow workers in the flownode.<br/>If not set (or set to 0), the number of CPU cores divided by 2 is used. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:6800` | The address advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.runtime_size` | Integer | `2` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `meta_client` | -- | -- | The metasrv client options. |
| `meta_client.metasrv_addrs` | Array | -- | The addresses of the metasrv. |
@@ -572,7 +566,7 @@
| `heartbeat.interval` | String | `3s` | Interval for sending heartbeat messages to the metasrv. |
| `heartbeat.retry_interval` | String | `3s` | Interval for retrying to send heartbeat messages to the metasrv. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.dir` | String | `./greptimedb_data/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.enable_otlp_tracing` | Bool | `false` | Enable OTLP tracing. |
| `logging.otlp_endpoint` | String | `http://localhost:4317` | The OTLP tracing endpoint. |

View File

@@ -1,6 +1,3 @@
## The running mode of the datanode. It can be `standalone` or `distributed`.
mode = "standalone"
## The datanode identifier and should be unique in the cluster.
## @toml2docs:none-default
node_id = 42
@@ -19,26 +16,6 @@ init_regions_parallelism = 16
## The maximum concurrent queries allowed to be executed. Zero means unlimited.
max_concurrent_queries = 0
## Deprecated, use `grpc.addr` instead.
## @toml2docs:none-default
rpc_addr = "127.0.0.1:3001"
## Deprecated, use `grpc.hostname` instead.
## @toml2docs:none-default
rpc_hostname = "127.0.0.1"
## Deprecated, use `grpc.runtime_size` instead.
## @toml2docs:none-default
rpc_runtime_size = 8
## Deprecated, use `grpc.rpc_max_recv_message_size` instead.
## @toml2docs:none-default
rpc_max_recv_message_size = "512MB"
## Deprecated, use `grpc.rpc_max_send_message_size` instead.
## @toml2docs:none-default
rpc_max_send_message_size = "512MB"
## Enable telemetry to collect anonymous usage data. Enabled by default.
#+ enable_telemetry = true
@@ -47,7 +24,7 @@ rpc_max_send_message_size = "512MB"
## The address to bind the HTTP server.
addr = "127.0.0.1:4000"
## HTTP request timeout. Set to 0 to disable timeout.
timeout = "30s"
timeout = "0s"
## HTTP request body limit.
## The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.
## Set to 0 to disable limit.
@@ -56,10 +33,11 @@ body_limit = "64MB"
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:3001"
## The hostname advertised to the metasrv,
## and used for connections from outside the host
hostname = "127.0.0.1:3001"
bind_addr = "127.0.0.1:3001"
## The address advertised to the metasrv, and used for connections from outside the host.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `grpc.bind_addr`.
server_addr = "127.0.0.1:3001"
## The number of server worker threads.
runtime_size = 8
## The maximum receive message size for gRPC server.
@@ -138,7 +116,7 @@ provider = "raft_engine"
## The directory to store the WAL files.
## **It's only used when the provider is `raft_engine`**.
## @toml2docs:none-default
dir = "/tmp/greptimedb/wal"
dir = "./greptimedb_data/wal"
## The size of the WAL segment file.
## **It's only used when the provider is `raft_engine`**.
@@ -188,22 +166,6 @@ max_batch_bytes = "1MB"
## **It's only used when the provider is `kafka`**.
consumer_wait_timeout = "100ms"
## The initial backoff delay.
## **It's only used when the provider is `kafka`**.
backoff_init = "500ms"
## The maximum backoff delay.
## **It's only used when the provider is `kafka`**.
backoff_max = "10s"
## The exponential backoff rate, i.e. next backoff = base * current backoff.
## **It's only used when the provider is `kafka`**.
backoff_base = 2
## The deadline of retries.
## **It's only used when the provider is `kafka`**.
backoff_deadline = "5mins"
## Whether to enable WAL index creation.
## **It's only used when the provider is `kafka`**.
create_index = true
@@ -250,6 +212,7 @@ overwrite_entry_start_id = false
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# enable_virtual_host_style = false
# Example of using Oss as the storage.
# [storage]
@@ -283,7 +246,7 @@ overwrite_entry_start_id = false
## The data storage options.
[storage]
## The working home directory.
data_home = "/tmp/greptimedb/"
data_home = "./greptimedb_data/"
## The storage type used to store the data.
## - `File`: the data is stored in the local file system.
@@ -516,6 +479,11 @@ aux_path = ""
## The max capacity of the staging directory.
staging_size = "2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Set it to "0s" to disable TTL.
staging_ttl = "7d"
## Cache size for inverted index metadata.
metadata_cache_size = "64MiB"
@@ -631,7 +599,7 @@ experimental_sparse_primary_key_encoding = false
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir = "/tmp/greptimedb/logs"
dir = "./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default

View File

@@ -1,6 +1,3 @@
## The running mode of the flownode. It can be `standalone` or `distributed`.
mode = "distributed"
## The flownode identifier and should be unique in the cluster.
## @toml2docs:none-default
node_id = 14
@@ -14,10 +11,10 @@ node_id = 14
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:6800"
## The hostname advertised to the metasrv,
bind_addr = "127.0.0.1:6800"
## The address advertised to the metasrv,
## and used for connections from outside the host
hostname = "127.0.0.1"
server_addr = "127.0.0.1:6800"
## The number of server worker threads.
runtime_size = 2
## The maximum receive message size for gRPC server.
@@ -30,7 +27,7 @@ max_send_message_size = "512MB"
## The address to bind the HTTP server.
addr = "127.0.0.1:4000"
## HTTP request timeout. Set to 0 to disable timeout.
timeout = "30s"
timeout = "0s"
## HTTP request body limit.
## The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.
## Set to 0 to disable limit.
@@ -76,7 +73,7 @@ retry_interval = "3s"
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir = "/tmp/greptimedb/logs"
dir = "./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default
@@ -121,4 +118,3 @@ sample_ratio = 1.0
## The tokio console address.
## @toml2docs:none-default
#+ tokio_console_addr = "127.0.0.1"

View File

@@ -26,7 +26,7 @@ retry_interval = "3s"
## The address to bind the HTTP server.
addr = "127.0.0.1:4000"
## HTTP request timeout. Set to 0 to disable timeout.
timeout = "30s"
timeout = "0s"
## HTTP request body limit.
## The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.
## Set to 0 to disable limit.
@@ -41,10 +41,11 @@ cors_allowed_origins = ["https://example.com"]
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:4001"
## The hostname advertised to the metasrv,
## and used for connections from outside the host
hostname = "127.0.0.1:4001"
bind_addr = "127.0.0.1:4001"
## The address advertised to the metasrv, and used for connections from outside the host.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `grpc.bind_addr`.
server_addr = "127.0.0.1:4001"
## The number of server worker threads.
runtime_size = 8
@@ -73,6 +74,9 @@ enable = true
addr = "127.0.0.1:4002"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
# MySQL server TLS options.
[mysql.tls]
@@ -104,6 +108,9 @@ enable = true
addr = "127.0.0.1:4003"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
## PostgreSQL server TLS options, see `mysql.tls` section.
[postgres.tls]
@@ -131,6 +138,11 @@ enable = true
## Whether to enable InfluxDB protocol in HTTP API.
enable = true
## Jaeger protocol options.
[jaeger]
## Whether to enable Jaeger protocol in HTTP API.
enable = true
## Prometheus remote storage options
[prom_store]
## Whether to enable Prometheus remote write and read in HTTP API.
@@ -177,7 +189,7 @@ tcp_nodelay = true
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir = "/tmp/greptimedb/logs"
dir = "./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default

View File

@@ -1,10 +1,12 @@
## The working home directory.
data_home = "/tmp/metasrv/"
data_home = "./greptimedb_data/metasrv/"
## The bind address of metasrv.
bind_addr = "127.0.0.1:3002"
## The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost.
## The communication server address for the frontend and datanode to connect to metasrv.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `bind_addr`.
server_addr = "127.0.0.1:3002"
## Store server address default to etcd store.
@@ -48,6 +50,9 @@ use_memory_store = false
## - Using shared storage (e.g., s3).
enable_region_failover = false
## Max allowed idle time before removing node info from metasrv memory.
node_max_idle_time = "24hours"
## Whether to enable greptimedb telemetry. Enabled by default.
#+ enable_telemetry = true
@@ -74,6 +79,11 @@ retry_delay = "500ms"
## Comment out `max_metadata_value_size` to disable splitting large values (no limit).
max_metadata_value_size = "1500KiB"
## Max running procedures.
## The maximum number of procedures that can be running at the same time.
## If the number of running procedures exceeds this limit, the procedure will be rejected.
max_running_procedures = 128
# Failure detectors options.
[failure_detector]
@@ -139,17 +149,6 @@ replication_factor = 1
## The timeout above which a topic creation operation will be cancelled.
create_topic_timeout = "30s"
## The initial backoff for kafka clients.
backoff_init = "500ms"
## The maximum backoff for kafka clients.
backoff_max = "10s"
## Exponential backoff rate, i.e. next backoff = base * current backoff.
backoff_base = 2
## Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate.
backoff_deadline = "5mins"
# The Kafka SASL configuration.
# **It's only used when the provider is `kafka`**.
@@ -172,7 +171,7 @@ backoff_deadline = "5mins"
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir = "/tmp/greptimedb/logs"
dir = "./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default

View File

@@ -1,6 +1,3 @@
## The running mode of the datanode. It can be `standalone` or `distributed`.
mode = "standalone"
## The default timezone of the server.
## @toml2docs:none-default
default_timezone = "UTC"
@@ -34,7 +31,7 @@ max_concurrent_queries = 0
## The address to bind the HTTP server.
addr = "127.0.0.1:4000"
## HTTP request timeout. Set to 0 to disable timeout.
timeout = "30s"
timeout = "0s"
## HTTP request body limit.
## The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.
## Set to 0 to disable limit.
@@ -49,7 +46,7 @@ cors_allowed_origins = ["https://example.com"]
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
addr = "127.0.0.1:4001"
bind_addr = "127.0.0.1:4001"
## The number of server worker threads.
runtime_size = 8
@@ -78,6 +75,9 @@ enable = true
addr = "127.0.0.1:4002"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
# MySQL server TLS options.
[mysql.tls]
@@ -109,6 +109,9 @@ enable = true
addr = "127.0.0.1:4003"
## The number of server worker threads.
runtime_size = 2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive = "0s"
## PostgreSQL server TLS options, see `mysql.tls` section.
[postgres.tls]
@@ -136,6 +139,11 @@ enable = true
## Whether to enable InfluxDB protocol in HTTP API.
enable = true
## Jaeger protocol options.
[jaeger]
## Whether to enable Jaeger protocol in HTTP API.
enable = true
## Prometheus remote storage options
[prom_store]
## Whether to enable Prometheus remote write and read in HTTP API.
@@ -153,17 +161,17 @@ provider = "raft_engine"
## The directory to store the WAL files.
## **It's only used when the provider is `raft_engine`**.
## @toml2docs:none-default
dir = "/tmp/greptimedb/wal"
dir = "./greptimedb_data/wal"
## The size of the WAL segment file.
## **It's only used when the provider is `raft_engine`**.
file_size = "128MB"
## The threshold of the WAL size to trigger a flush.
## The threshold of the WAL size to trigger a purge.
## **It's only used when the provider is `raft_engine`**.
purge_threshold = "1GB"
## The interval to trigger a flush.
## The interval to trigger a purge.
## **It's only used when the provider is `raft_engine`**.
purge_interval = "1m"
@@ -231,22 +239,6 @@ max_batch_bytes = "1MB"
## **It's only used when the provider is `kafka`**.
consumer_wait_timeout = "100ms"
## The initial backoff delay.
## **It's only used when the provider is `kafka`**.
backoff_init = "500ms"
## The maximum backoff delay.
## **It's only used when the provider is `kafka`**.
backoff_max = "10s"
## The exponential backoff rate, i.e. next backoff = base * current backoff.
## **It's only used when the provider is `kafka`**.
backoff_base = 2
## The deadline of retries.
## **It's only used when the provider is `kafka`**.
backoff_deadline = "5mins"
## Ignore missing entries when reading the WAL.
## **It's only used when the provider is `kafka`**.
##
@@ -278,10 +270,12 @@ overwrite_entry_start_id = false
## Metadata storage options.
[metadata_store]
## Kv file size in bytes.
file_size = "256MB"
## Kv purge threshold.
purge_threshold = "4GB"
## The size of the metadata store log file.
file_size = "64MB"
## The threshold of the metadata store size to trigger a purge.
purge_threshold = "256MB"
## The interval of the metadata store to trigger a purge.
purge_interval = "1m"
## Procedure storage options.
[procedure]
@@ -289,6 +283,10 @@ purge_threshold = "4GB"
max_retry_times = 3
## Initial retry delay of procedures, increases exponentially
retry_delay = "500ms"
## Max running procedures.
## The maximum number of procedures that can be running at the same time.
## If the number of running procedures exceeds this limit, the procedure will be rejected.
max_running_procedures = 128
## flow engine options.
[flow]
@@ -305,6 +303,7 @@ retry_delay = "500ms"
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# enable_virtual_host_style = false
# Example of using Oss as the storage.
# [storage]
@@ -338,7 +337,7 @@ retry_delay = "500ms"
## The data storage options.
[storage]
## The working home directory.
data_home = "/tmp/greptimedb/"
data_home = "./greptimedb_data/"
## The storage type used to store the data.
## - `File`: the data is stored in the local file system.
@@ -571,6 +570,11 @@ aux_path = ""
## The max capacity of the staging directory.
staging_size = "2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Set it to "0s" to disable TTL.
staging_ttl = "7d"
## Cache size for inverted index metadata.
metadata_cache_size = "64MiB"
@@ -686,7 +690,7 @@ experimental_sparse_primary_key_encoding = false
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir = "/tmp/greptimedb/logs"
dir = "./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default

View File

@@ -1,4 +1,4 @@
FROM ubuntu:20.04 as builder
FROM ubuntu:22.04 as builder
ARG CARGO_PROFILE
ARG FEATURES

View File

@@ -1,4 +1,4 @@
FROM ubuntu:22.04
FROM ubuntu:latest
# The binary name of GreptimeDB executable.
# Defaults to "greptime", but sometimes in other projects it might be different.

View File

@@ -1,4 +1,4 @@
FROM ubuntu:20.04
FROM ubuntu:22.04
# The root path under which contains all the dependencies to build this Dockerfile.
ARG DOCKER_BUILD_ROOT=.
@@ -41,7 +41,7 @@ RUN mv protoc3/include/* /usr/local/include/
# and the repositories are pulled from trusted sources (still us, of course). Doing so does not violate the intention
# of the Git's addition to the "safe.directory" at the first place (see the commit message here:
# https://github.com/git/git/commit/8959555cee7ec045958f9b6dd62e541affb7e7d9).
# There's also another solution to this, that we add the desired submodules to the safe directory, instead of using
# wildcard here. However, that requires the git's config files and the submodules all owned by the very same user.
# It's troublesome to do this since the dev build runs in Docker, which is under user "root"; while outside the Docker,
# it can be a different user that have prepared the submodules.

View File

@@ -1,51 +0,0 @@
# Use the legacy glibc 2.28.
FROM ubuntu:18.10
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Use old-releases.ubuntu.com to avoid 404s: https://help.ubuntu.com/community/EOLUpgrades.
RUN echo "deb http://old-releases.ubuntu.com/ubuntu/ cosmic main restricted universe multiverse\n\
deb http://old-releases.ubuntu.com/ubuntu/ cosmic-updates main restricted universe multiverse\n\
deb http://old-releases.ubuntu.com/ubuntu/ cosmic-security main restricted universe multiverse" > /etc/apt/sources.list
# Install dependencies.
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
libssl-dev \
tzdata \
curl \
ca-certificates \
git \
build-essential \
unzip \
pkg-config
# Install protoc.
ENV PROTOC_VERSION=29.3
RUN if [ "$(uname -m)" = "x86_64" ]; then \
PROTOC_ZIP=protoc-${PROTOC_VERSION}-linux-x86_64.zip; \
elif [ "$(uname -m)" = "aarch64" ]; then \
PROTOC_ZIP=protoc-${PROTOC_VERSION}-linux-aarch_64.zip; \
else \
echo "Unsupported architecture"; exit 1; \
fi && \
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/${PROTOC_ZIP} && \
unzip -o ${PROTOC_ZIP} -d /usr/local bin/protoc && \
unzip -o ${PROTOC_ZIP} -d /usr/local 'include/*' && \
rm -f ${PROTOC_ZIP}
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Install Rust toolchains.
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install cargo-binstall with a specific version to adapt the current rust toolchain.
# Note: if we use the latest version, we may encounter the following `use of unstable library feature 'io_error_downcast'` error.
RUN cargo install cargo-binstall --version 1.6.6 --locked
# Install nextest.
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -0,0 +1,66 @@
FROM ubuntu:20.04
# The root path under which contains all the dependencies to build this Dockerfile.
ARG DOCKER_BUILD_ROOT=.
ENV LANG en_US.utf8
WORKDIR /greptimedb
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common
# Install dependencies.
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
libssl-dev \
tzdata \
curl \
unzip \
ca-certificates \
git \
build-essential \
pkg-config
ARG TARGETPLATFORM
RUN echo "target platform: $TARGETPLATFORM"
ARG PROTOBUF_VERSION=29.3
# Install protobuf, because the one in the apt is too old (v3.12).
RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOBUF_VERSION}/protoc-${PROTOBUF_VERSION}-linux-aarch_64.zip && \
unzip protoc-${PROTOBUF_VERSION}-linux-aarch_64.zip -d protoc3; \
elif [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOBUF_VERSION}/protoc-${PROTOBUF_VERSION}-linux-x86_64.zip && \
unzip protoc-${PROTOBUF_VERSION}-linux-x86_64.zip -d protoc3; \
fi
RUN mv protoc3/bin/* /usr/local/bin/
RUN mv protoc3/include/* /usr/local/include/
# Silence all `safe.directory` warnings, to avoid the "detect dubious repository" error when building with submodules.
# Disabling the safe directory check here won't pose extra security issues, because in our usage for this dev build
# image, we use it solely on our own environment (that github action's VM, or ECS created dynamically by ourselves),
# and the repositories are pulled from trusted sources (still us, of course). Doing so does not violate the intention
# of the Git's addition to the "safe.directory" at the first place (see the commit message here:
# https://github.com/git/git/commit/8959555cee7ec045958f9b6dd62e541affb7e7d9).
# There's also another solution to this, that we add the desired submodules to the safe directory, instead of using
# wildcard here. However, that requires the git's config files and the submodules all owned by the very same user.
# It's troublesome to do this since the dev build runs in Docker, which is under user "root"; while outside the Docker,
# it can be a different user that have prepared the submodules.
RUN git config --global --add safe.directory '*'
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Install Rust toolchains.
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install cargo-binstall with a specific version to adapt the current rust toolchain.
# Note: if we use the latest version, we may encounter the following `use of unstable library feature 'io_error_downcast'` error.
# Compiling from source takes too long, so we use the precompiled binary instead.
COPY $DOCKER_BUILD_ROOT/docker/dev-builder/binstall/pull_binstall.sh /usr/local/bin/pull_binstall.sh
RUN chmod +x /usr/local/bin/pull_binstall.sh && /usr/local/bin/pull_binstall.sh
# Install nextest.
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -25,7 +25,7 @@ services:
- --initial-cluster-state=new
- *etcd_initial_cluster_token
volumes:
- /tmp/greptimedb-cluster-docker-compose/etcd0:/var/lib/etcd
- ./greptimedb-cluster-docker-compose/etcd0:/var/lib/etcd
healthcheck:
test: [ "CMD", "etcdctl", "--endpoints=http://etcd0:2379", "endpoint", "health" ]
interval: 5s
@@ -43,8 +43,8 @@ services:
command:
- metasrv
- start
- --bind-addr=0.0.0.0:3002
- --server-addr=metasrv:3002
- --rpc-bind-addr=0.0.0.0:3002
- --rpc-server-addr=metasrv:3002
- --store-addrs=etcd0:2379
- --http-addr=0.0.0.0:3000
healthcheck:
@@ -68,12 +68,13 @@ services:
- datanode
- start
- --node-id=0
- --rpc-addr=0.0.0.0:3001
- --rpc-hostname=datanode0:3001
- --data-home=/greptimedb_data
- --rpc-bind-addr=0.0.0.0:3001
- --rpc-server-addr=datanode0:3001
- --metasrv-addrs=metasrv:3002
- --http-addr=0.0.0.0:5000
volumes:
- /tmp/greptimedb-cluster-docker-compose/datanode0:/tmp/greptimedb
- ./greptimedb-cluster-docker-compose/datanode0:/greptimedb_data
healthcheck:
test: [ "CMD", "curl", "-fv", "http://datanode0:5000/health" ]
interval: 5s
@@ -98,7 +99,7 @@ services:
- start
- --metasrv-addrs=metasrv:3002
- --http-addr=0.0.0.0:4000
- --rpc-addr=0.0.0.0:4001
- --rpc-bind-addr=0.0.0.0:4001
- --mysql-addr=0.0.0.0:4002
- --postgres-addr=0.0.0.0:4003
healthcheck:
@@ -123,8 +124,8 @@ services:
- start
- --node-id=0
- --metasrv-addrs=metasrv:3002
- --rpc-addr=0.0.0.0:4004
- --rpc-hostname=flownode0:4004
- --rpc-bind-addr=0.0.0.0:4004
- --rpc-server-addr=flownode0:4004
- --http-addr=0.0.0.0:4005
depends_on:
frontend0:

View File

@@ -0,0 +1,40 @@
# TSBS benchmark - v0.12.0
## Environment
### Amazon EC2
| | |
|---------|-------------------------|
| Machine | c5d.2xlarge |
| CPU | 8 core |
| Memory | 16GB |
| Disk | 100GB (GP3) |
| OS | Ubuntu Server 24.04 LTS |
## Write performance
| Environment | Ingest rate (rows/s) |
|-----------------|----------------------|
| EC2 c5d.2xlarge | 326839.28 |
## Query performance
| Query type | EC2 c5d.2xlarge (ms) |
|-----------------------|----------------------|
| cpu-max-all-1 | 12.46 |
| cpu-max-all-8 | 24.20 |
| double-groupby-1 | 673.08 |
| double-groupby-5 | 963.99 |
| double-groupby-all | 1330.05 |
| groupby-orderby-limit | 952.46 |
| high-cpu-1 | 5.08 |
| high-cpu-all | 4638.57 |
| lastpoint | 591.02 |
| single-groupby-1-1-1 | 4.06 |
| single-groupby-1-1-12 | 4.73 |
| single-groupby-1-8-1 | 8.23 |
| single-groupby-5-1-1 | 4.61 |
| single-groupby-5-1-12 | 5.61 |
| single-groupby-5-8-1 | 9.74 |

View File

@@ -4,6 +4,16 @@ This crate provides an easy approach to dump memory profiling info.
## Prerequisites
### jemalloc
jeprof is already compiled in the target directory of GreptimeDB. You can find the binary and use it.
```
# find jeprof binary
find . -name 'jeprof'
# add executable permission
chmod +x <path_to_jeprof>
```
The path is usually under `./target/${PROFILE}/build/tikv-jemalloc-sys-${HASH}/out/build/bin/jeprof`.
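As a rough end-to-end sketch (the paths and the dump file name are placeholders; `flamegraph.pl` and the HTTP dump API are covered later in this document):
```bash
# Placeholder paths: substitute the jeprof binary found above and a heap dump
# obtained via the HTTP API described below.
JEPROF=./target/debug/build/tikv-jemalloc-sys-<HASH>/out/build/bin/jeprof
chmod +x "$JEPROF"
# Collapse the dump into folded stacks, then render a flamegraph.
"$JEPROF" --collapsed ./target/debug/greptime mem.prof | ./flamegraph.pl > mem-flamegraph.svg
```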
The default version of jemalloc installed from the package manager may not have the `--collapsed` option.
You may need to check whether the `jeprof` version is >= `5.3.0` if you want to install it from the package manager.
```bash
# for macOS
brew install jemalloc
@@ -23,7 +33,11 @@ curl https://raw.githubusercontent.com/brendangregg/FlameGraph/master/flamegraph
Start GreptimeDB instance with environment variables:
```bash
# for Linux
MALLOC_CONF=prof:true ./target/debug/greptime standalone start
# for macOS
_RJEM_MALLOC_CONF=prof:true ./target/debug/greptime standalone start
```
Dump memory profiling data through HTTP API:

View File

@@ -3,7 +3,7 @@
This document introduces how to write fuzz tests in GreptimeDB.
## What is a fuzz test
A fuzz test is a tool that leverages deterministic random generation to assist in finding bugs. The goal of fuzz tests is to identify inputs generated by the fuzzer that cause system panics, crashes, or unexpected behaviors. We use [cargo-fuzz](https://github.com/rust-fuzz/cargo-fuzz) to run our fuzz test targets.
## Why we need them
- Find bugs by leveraging random generation
@@ -13,7 +13,7 @@ Fuzz test is tool that leverage deterministic random generation to assist in fin
All fuzz test-related resources are located in the `/tests-fuzz` directory.
There are two types of resources: (1) fundamental components and (2) test targets.
### Fundamental components
They are located in the `/tests-fuzz/src` directory. The fundamental components define how to generate SQLs (including dialects for different protocols) and validate execution results (e.g., column attribute validation), etc.
### Test targets
@@ -21,25 +21,25 @@ They are located in the `/tests-fuzz/targets` directory, with each file represen
Figure 1 illustrates the fundamental components of the fuzz test provide the ability to generate random SQLs. It utilizes a Random Number Generator (Rng) to generate the Intermediate Representation (IR), then employs a DialectTranslator to produce specified dialects for different protocols. Finally, the fuzz tests send the generated SQL via the specified protocol and verify that the execution results meet expectations.
```
Rng
 |
 v
ExprGenerator
 |
 v
Intermediate representation (IR)
 |
 +----------------------+----------------------+
 |                      |                      |
 v                      v                      v
MySQLTranslator   PostgreSQLTranslator   OtherDialectTranslator
 |                      |                      |
 v                      v                      v
SQL(MySQL Dialect)    .....                  .....
 |
 v
@@ -133,4 +133,4 @@ fuzz_target!(|input: FuzzInput| {
cargo fuzz run <fuzz-target> --fuzz-dir tests-fuzz
```
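For orientation, a minimal, hedged sketch of what a cargo-fuzz target roughly looks like; the `FuzzInput` fields here are illustrative, and the real targets under `/tests-fuzz/targets` additionally build SQL from the IR, execute it over a protocol, and validate the result (assumes the `libfuzzer-sys` crate plus `arbitrary` with its derive feature):
```rust
#![no_main]

use arbitrary::Arbitrary;
use libfuzzer_sys::fuzz_target;

// Illustrative input type; real targets carry richer generation state.
#[derive(Debug, Arbitrary)]
struct FuzzInput {
    seed: u64,
    tables: u8,
}

fuzz_target!(|input: FuzzInput| {
    // Seed a deterministic RNG, generate an IR, translate it to a SQL dialect,
    // then execute and validate it (all omitted in this sketch).
    let _ = (input.seed, input.tables);
});
```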
For more details, please refer to this [document](/tests-fuzz/README.md).

Binary file not shown. (Before: 36 KiB, After: 25 KiB)

docs/logo-text-padding.png (Executable file → Normal file)
Binary file not shown. (Before: 25 KiB, After: 21 KiB)

View File

@@ -0,0 +1,77 @@
---
Feature Name: Remote WAL Purge
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/5474
Date: 2025-02-06
Author: "Yuhan Wang <profsyb@gmail.com>"
---
# Summary
This RFC proposes a method for purging remote WAL in the database.
# Motivation
Currently, only local WAL entries are purged when flushing, while remote WAL entries are never purged.
# Details
```mermaid
sequenceDiagram
Region0->>Kafka: Last entry id of the topic in use
Region0->>WALPruner: Heartbeat with last entry id
WALPruner->>+WALPruner: Time Loop
WALPruner->>+ProcedureManager: Submit purge procedure
ProcedureManager->>Region0: Flush request
ProcedureManager->>Kafka: Prune WAL entries
Region0->>Region0: Flush
```
## Steps
### Before purge
Before purging remote WAL, metasrv needs to know:
1. `last_entry_id` of each region.
2. `kafka_topic_last_entry_id`, which is the last entry id of the topic in use. It can be lazily updated and is only needed when a region has an empty memtable.
3. Kafka topics that each region uses.
The states are maintained through:
1. Heartbeat: the datanode sends `last_entry_id` to metasrv in its heartbeat. For regions with an empty memtable, `last_entry_id` should equal `kafka_topic_last_entry_id`.
2. Metasrv maintains a topic-region map to know which region uses which topic.
`kafka_topic_last_entry_id` will be maintained by the region itself. Region will update the value after `k` heartbeats if the memtable is empty.
### Purge procedure
We can better handle locks by utilizing the existing procedure framework. It's quite similar to the region migration procedure.
After a period of time, metasrv will submit a purge procedure to ProcedureManager. The purge will apply to all topics.
The procedure is divided into following stages:
1. Preparation:
- Retrieve the `last_entry_id` of each region from the kvbackend.
- Choose regions that have a relatively small `last_entry_id` as candidate regions, which means we need to send a flush request to these regions.
2. Communication:
- Send flush requests to candidate regions.
3. Purge:
- Choose the proper entry id to delete up to for each topic; it should be the smallest `last_entry_id - 1` among all regions (see the sketch after this list).
- Delete legacy entries in Kafka.
- Store the `last_purged_entry_id` in kvbackend. It should be locked to prevent other regions from replaying the purged entries.
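A minimal sketch of the entry-id selection in the purge stage, assuming the per-region `last_entry_id` values for a topic have already been read from the kvbackend (the helper name and types are illustrative):
```rust
use std::collections::HashMap;

/// Illustrative helper: given the `last_entry_id` reported by every region
/// using a topic, return the highest entry id that is safe to prune, i.e. the
/// smallest `last_entry_id - 1` among all regions.
fn prunable_entry_id(last_entry_ids: &HashMap<u64, u64>) -> Option<u64> {
    last_entry_ids
        .values()
        .map(|id| id.saturating_sub(1))
        .min()
}

fn main() {
    // Region id -> last entry id, for three regions sharing one Kafka topic.
    let ids = HashMap::from([(1u64, 120u64), (2, 95), (3, 300)]);
    // The smallest last_entry_id is 95, so entries up to 94 are prunable.
    assert_eq!(prunable_entry_id(&ids), Some(94));
}
```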
### After purge
After purge, there may be some regions that have `last_entry_id` smaller than the entry we just deleted. It's legal since we only delete the entries that are not needed anymore.
When restarting a region, it should query the `last_purged_entry_id` from metasrv and replay from `min(last_entry_id, last_purged_entry_id)`.
### Error handling
No persisted states are needed since all states are maintained in kvbackend.
Retry when retrieving metadata from the kvbackend fails.
# Alternatives
Purge time can depend on the size of the WAL entries instead of a fixed period of time, which may be more efficient.

grafana/check.sh Executable file
View File

@@ -0,0 +1,19 @@
#!/usr/bin/env bash
BASEDIR=$(dirname "$0")
# Use jq to check for panels with empty or missing descriptions
invalid_panels=$(cat $BASEDIR/greptimedb-cluster.json | jq -r '
.panels[]
| select((.type == "stats" or .type == "timeseries") and (.description == "" or .description == null))
')
# Check if any invalid panels were found
if [[ -n "$invalid_panels" ]]; then
echo "Error: The following panels have empty or missing descriptions:"
echo "$invalid_panels"
exit 1
else
echo "All panels with type 'stats' or 'timeseries' have valid descriptions."
exit 0
fi

File diff suppressed because it is too large.

View File

@@ -384,8 +384,8 @@
"rowHeight": 0.9,
"showValue": "auto",
"tooltip": {
"mode": "none",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -483,8 +483,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "10.2.3",
@@ -578,8 +578,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"pluginVersion": "10.2.3",
@@ -601,7 +601,7 @@
"type": "timeseries"
},
{
"collapsed": true,
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
@@ -684,8 +684,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -878,8 +878,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1124,8 +1124,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1223,8 +1223,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1322,8 +1322,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1456,8 +1456,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1573,8 +1573,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1673,8 +1673,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1773,8 +1773,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -1890,8 +1890,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2002,8 +2002,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2120,8 +2120,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2233,8 +2233,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2334,8 +2334,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2435,8 +2435,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2548,8 +2548,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2661,8 +2661,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2788,8 +2788,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2889,8 +2889,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -2990,8 +2990,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3091,8 +3091,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3191,8 +3191,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3302,8 +3302,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3432,8 +3432,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3543,8 +3543,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3657,8 +3657,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3808,8 +3808,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -3909,8 +3909,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -4011,8 +4011,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [
@@ -4113,8 +4113,8 @@
"showLegend": true
},
"tooltip": {
"mode": "single",
"sort": "none"
"mode": "multi",
"sort": "desc"
}
},
"targets": [

grafana/summary.sh Executable file
View File

@@ -0,0 +1,11 @@
#!/usr/bin/env bash
BASEDIR=$(dirname "$0")
echo '| Title | Description | Expressions |
|---|---|---|'
cat $BASEDIR/greptimedb-cluster.json | jq -r '
.panels |
map(select(.type == "stat" or .type == "timeseries")) |
.[] | "| \(.title) | \(.description | gsub("\n"; "<br>")) | \(.targets | map(.expr // .rawSql | "`\(.|gsub("\n"; "<br>"))`") | join("<br>")) |"
'
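A possible invocation (run from the repository root; the output file name is just an example, not part of this change):
```bash
# Print the panel summary as a Markdown table; redirect it wherever needed.
bash grafana/summary.sh > /tmp/greptimedb-cluster-panels.md
```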

View File

@@ -15,13 +15,10 @@ common-macro.workspace = true
common-time.workspace = true
datatypes.workspace = true
greptime-proto.workspace = true
paste = "1.0"
paste.workspace = true
prost.workspace = true
serde_json.workspace = true
snafu.workspace = true
[build-dependencies]
tonic-build = "0.11"
[dev-dependencies]
paste = "1.0"

View File

@@ -19,9 +19,7 @@ use common_decimal::decimal128::{DECIMAL128_DEFAULT_SCALE, DECIMAL128_MAX_PRECIS
use common_decimal::Decimal128;
use common_time::time::Time;
use common_time::timestamp::TimeUnit;
use common_time::{
Date, DateTime, IntervalDayTime, IntervalMonthDayNano, IntervalYearMonth, Timestamp,
};
use common_time::{Date, IntervalDayTime, IntervalMonthDayNano, IntervalYearMonth, Timestamp};
use datatypes::prelude::{ConcreteDataType, ValueRef};
use datatypes::scalars::ScalarVector;
use datatypes::types::{
@@ -29,8 +27,8 @@ use datatypes::types::{
};
use datatypes::value::{OrderedF32, OrderedF64, Value};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Decimal128Vector, Float32Vector,
Float64Vector, Int32Vector, Int64Vector, IntervalDayTimeVector, IntervalMonthDayNanoVector,
BinaryVector, BooleanVector, DateVector, Decimal128Vector, Float32Vector, Float64Vector,
Int32Vector, Int64Vector, IntervalDayTimeVector, IntervalMonthDayNanoVector,
IntervalYearMonthVector, PrimitiveVector, StringVector, TimeMicrosecondVector,
TimeMillisecondVector, TimeNanosecondVector, TimeSecondVector, TimestampMicrosecondVector,
TimestampMillisecondVector, TimestampNanosecondVector, TimestampSecondVector, UInt32Vector,
@@ -118,7 +116,7 @@ impl From<ColumnDataTypeWrapper> for ConcreteDataType {
ColumnDataType::Json => ConcreteDataType::json_datatype(),
ColumnDataType::String => ConcreteDataType::string_datatype(),
ColumnDataType::Date => ConcreteDataType::date_datatype(),
ColumnDataType::Datetime => ConcreteDataType::datetime_datatype(),
ColumnDataType::Datetime => ConcreteDataType::timestamp_microsecond_datatype(),
ColumnDataType::TimestampSecond => ConcreteDataType::timestamp_second_datatype(),
ColumnDataType::TimestampMillisecond => {
ConcreteDataType::timestamp_millisecond_datatype()
@@ -271,7 +269,6 @@ impl TryFrom<ConcreteDataType> for ColumnDataTypeWrapper {
ConcreteDataType::Binary(_) => ColumnDataType::Binary,
ConcreteDataType::String(_) => ColumnDataType::String,
ConcreteDataType::Date(_) => ColumnDataType::Date,
ConcreteDataType::DateTime(_) => ColumnDataType::Datetime,
ConcreteDataType::Timestamp(t) => match t {
TimestampType::Second(_) => ColumnDataType::TimestampSecond,
TimestampType::Millisecond(_) => ColumnDataType::TimestampMillisecond,
@@ -476,7 +473,6 @@ pub fn push_vals(column: &mut Column, origin_count: usize, vector: VectorRef) {
Value::String(val) => values.string_values.push(val.as_utf8().to_string()),
Value::Binary(val) => values.binary_values.push(val.to_vec()),
Value::Date(val) => values.date_values.push(val.val()),
Value::DateTime(val) => values.datetime_values.push(val.val()),
Value::Timestamp(val) => match val.unit() {
TimeUnit::Second => values.timestamp_second_values.push(val.value()),
TimeUnit::Millisecond => values.timestamp_millisecond_values.push(val.value()),
@@ -577,12 +573,11 @@ pub fn pb_value_to_value_ref<'a>(
ValueData::BinaryValue(bytes) => ValueRef::Binary(bytes.as_slice()),
ValueData::StringValue(string) => ValueRef::String(string.as_str()),
ValueData::DateValue(d) => ValueRef::Date(Date::from(*d)),
ValueData::DatetimeValue(d) => ValueRef::DateTime(DateTime::new(*d)),
ValueData::TimestampSecondValue(t) => ValueRef::Timestamp(Timestamp::new_second(*t)),
ValueData::TimestampMillisecondValue(t) => {
ValueRef::Timestamp(Timestamp::new_millisecond(*t))
}
ValueData::TimestampMicrosecondValue(t) => {
ValueData::DatetimeValue(t) | ValueData::TimestampMicrosecondValue(t) => {
ValueRef::Timestamp(Timestamp::new_microsecond(*t))
}
ValueData::TimestampNanosecondValue(t) => {
@@ -651,7 +646,6 @@ pub fn pb_values_to_vector_ref(data_type: &ConcreteDataType, values: Values) ->
ConcreteDataType::Binary(_) => Arc::new(BinaryVector::from(values.binary_values)),
ConcreteDataType::String(_) => Arc::new(StringVector::from_vec(values.string_values)),
ConcreteDataType::Date(_) => Arc::new(DateVector::from_vec(values.date_values)),
ConcreteDataType::DateTime(_) => Arc::new(DateTimeVector::from_vec(values.datetime_values)),
ConcreteDataType::Timestamp(unit) => match unit {
TimestampType::Second(_) => Arc::new(TimestampSecondVector::from_vec(
values.timestamp_second_values,
@@ -787,11 +781,6 @@ pub fn pb_values_to_values(data_type: &ConcreteDataType, values: Values) -> Vec<
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::DateTime(_) => values
.datetime_values
.into_iter()
.map(|v| Value::DateTime(v.into()))
.collect(),
ConcreteDataType::Date(_) => values
.date_values
.into_iter()
@@ -947,9 +936,6 @@ pub fn to_proto_value(value: Value) -> Option<v1::Value> {
Value::Date(v) => v1::Value {
value_data: Some(ValueData::DateValue(v.val())),
},
Value::DateTime(v) => v1::Value {
value_data: Some(ValueData::DatetimeValue(v.val())),
},
Value::Timestamp(v) => match v.unit() {
TimeUnit::Second => v1::Value {
value_data: Some(ValueData::TimestampSecondValue(v.value())),
@@ -1066,7 +1052,6 @@ pub fn value_to_grpc_value(value: Value) -> GrpcValue {
Value::String(v) => Some(ValueData::StringValue(v.as_utf8().to_string())),
Value::Binary(v) => Some(ValueData::BinaryValue(v.to_vec())),
Value::Date(v) => Some(ValueData::DateValue(v.val())),
Value::DateTime(v) => Some(ValueData::DatetimeValue(v.val())),
Value::Timestamp(v) => Some(match v.unit() {
TimeUnit::Second => ValueData::TimestampSecondValue(v.value()),
TimeUnit::Millisecond => ValueData::TimestampMillisecondValue(v.value()),
@@ -1248,7 +1233,7 @@ mod tests {
ColumnDataTypeWrapper::date_datatype().into()
);
assert_eq!(
ConcreteDataType::datetime_datatype(),
ConcreteDataType::timestamp_microsecond_datatype(),
ColumnDataTypeWrapper::datetime_datatype().into()
);
assert_eq!(
@@ -1339,10 +1324,6 @@ mod tests {
ColumnDataTypeWrapper::date_datatype(),
ConcreteDataType::date_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper::datetime_datatype(),
ConcreteDataType::datetime_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper::timestamp_millisecond_datatype(),
ConcreteDataType::timestamp_millisecond_datatype()
@@ -1830,17 +1811,6 @@ mod tests {
]
);
test_convert_values!(
datetime,
vec![1.into(), 2.into(), 3.into()],
datetime,
vec![
Value::DateTime(1.into()),
Value::DateTime(2.into()),
Value::DateTime(3.into())
]
);
#[test]
fn test_vectors_to_rows_for_different_types() {
let boolean_vec = BooleanVector::from_vec(vec![true, false, true]);
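As a quick illustration of the conversion change above (a sketch only; the `ValueData` import path is assumed from how this crate uses it elsewhere): with `DateTime` removed, a protobuf `DatetimeValue` is now decoded exactly like a microsecond timestamp.

use common_time::Timestamp;
use greptime_proto::v1::value::ValueData; // assumed path

// Mirrors the match arm above: Datetime and TimestampMicrosecond payloads
// both become microsecond timestamps.
fn decode_micros(value: &ValueData) -> Option<Timestamp> {
    match value {
        ValueData::DatetimeValue(t) | ValueData::TimestampMicrosecondValue(t) => {
            Some(Timestamp::new_microsecond(*t))
        }
        _ => None,
    }
}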

View File

@@ -15,10 +15,13 @@
use std::collections::HashMap;
use datatypes::schema::{
ColumnDefaultConstraint, ColumnSchema, FulltextAnalyzer, FulltextOptions, COMMENT_KEY,
FULLTEXT_KEY, INVERTED_INDEX_KEY, SKIPPING_INDEX_KEY,
ColumnDefaultConstraint, ColumnSchema, FulltextAnalyzer, FulltextBackend, FulltextOptions,
SkippingIndexOptions, SkippingIndexType, COMMENT_KEY, FULLTEXT_KEY, INVERTED_INDEX_KEY,
SKIPPING_INDEX_KEY,
};
use greptime_proto::v1::{
Analyzer, FulltextBackend as PbFulltextBackend, SkippingIndexType as PbSkippingIndexType,
};
use greptime_proto::v1::Analyzer;
use snafu::ResultExt;
use crate::error::{self, Result};
@@ -103,6 +106,13 @@ pub fn contains_fulltext(options: &Option<ColumnOptions>) -> bool {
.is_some_and(|o| o.options.contains_key(FULLTEXT_GRPC_KEY))
}
/// Checks if the `ColumnOptions` contains skipping index options.
pub fn contains_skipping(options: &Option<ColumnOptions>) -> bool {
options
.as_ref()
.is_some_and(|o| o.options.contains_key(SKIPPING_INDEX_GRPC_KEY))
}
/// Tries to construct a `ColumnOptions` from the given `FulltextOptions`.
pub fn options_from_fulltext(fulltext: &FulltextOptions) -> Result<Option<ColumnOptions>> {
let mut options = ColumnOptions::default();
@@ -113,19 +123,55 @@ pub fn options_from_fulltext(fulltext: &FulltextOptions) -> Result<Option<Column
Ok((!options.options.is_empty()).then_some(options))
}
/// Tries to construct a `ColumnOptions` from the given `SkippingIndexOptions`.
pub fn options_from_skipping(skipping: &SkippingIndexOptions) -> Result<Option<ColumnOptions>> {
let mut options = ColumnOptions::default();
let v = serde_json::to_string(skipping).context(error::SerializeJsonSnafu)?;
options
.options
.insert(SKIPPING_INDEX_GRPC_KEY.to_string(), v);
Ok((!options.options.is_empty()).then_some(options))
}
/// Tries to construct a `ColumnOptions` for inverted index.
pub fn options_from_inverted() -> ColumnOptions {
let mut options = ColumnOptions::default();
options
.options
.insert(INVERTED_INDEX_GRPC_KEY.to_string(), "true".to_string());
options
}
/// Tries to construct a `FulltextAnalyzer` from the given analyzer.
pub fn as_fulltext_option(analyzer: Analyzer) -> FulltextAnalyzer {
pub fn as_fulltext_option_analyzer(analyzer: Analyzer) -> FulltextAnalyzer {
match analyzer {
Analyzer::English => FulltextAnalyzer::English,
Analyzer::Chinese => FulltextAnalyzer::Chinese,
}
}
/// Tries to construct a `FulltextBackend` from the given backend.
pub fn as_fulltext_option_backend(backend: PbFulltextBackend) -> FulltextBackend {
match backend {
PbFulltextBackend::Bloom => FulltextBackend::Bloom,
PbFulltextBackend::Tantivy => FulltextBackend::Tantivy,
}
}
/// Tries to construct a `SkippingIndexType` from the given skipping index type.
pub fn as_skipping_index_type(skipping_index_type: PbSkippingIndexType) -> SkippingIndexType {
match skipping_index_type {
PbSkippingIndexType::BloomFilter => SkippingIndexType::BloomFilter,
}
}
#[cfg(test)]
mod tests {
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::FulltextAnalyzer;
use datatypes::schema::{FulltextAnalyzer, FulltextBackend};
use super::*;
use crate::v1::ColumnDataType;
@@ -184,13 +230,14 @@ mod tests {
enable: true,
analyzer: FulltextAnalyzer::English,
case_sensitive: false,
backend: FulltextBackend::Bloom,
})
.unwrap();
schema.set_inverted_index(true);
let options = options_from_column_schema(&schema).unwrap();
assert_eq!(
options.options.get(FULLTEXT_GRPC_KEY).unwrap(),
"{\"enable\":true,\"analyzer\":\"English\",\"case-sensitive\":false}"
"{\"enable\":true,\"analyzer\":\"English\",\"case-sensitive\":false,\"backend\":\"bloom\"}"
);
assert_eq!(
options.options.get(INVERTED_INDEX_GRPC_KEY).unwrap(),
@@ -204,11 +251,12 @@ mod tests {
enable: true,
analyzer: FulltextAnalyzer::English,
case_sensitive: false,
backend: FulltextBackend::Bloom,
};
let options = options_from_fulltext(&fulltext).unwrap().unwrap();
assert_eq!(
options.options.get(FULLTEXT_GRPC_KEY).unwrap(),
"{\"enable\":true,\"analyzer\":\"English\",\"case-sensitive\":false}"
"{\"enable\":true,\"analyzer\":\"English\",\"case-sensitive\":false,\"backend\":\"bloom\"}"
);
}
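A minimal usage sketch of the renamed conversion helpers above, assuming it sits next to them (as the test module's `use super::*;` does); the enum variants come straight from this diff.

use datatypes::schema::{FulltextAnalyzer, FulltextBackend, SkippingIndexType};
use greptime_proto::v1::{
    Analyzer, FulltextBackend as PbFulltextBackend, SkippingIndexType as PbSkippingIndexType,
};

fn conversion_examples() {
    // Each protobuf enum maps one-to-one onto its datatypes counterpart.
    assert!(matches!(
        as_fulltext_option_analyzer(Analyzer::English),
        FulltextAnalyzer::English
    ));
    assert!(matches!(
        as_fulltext_option_backend(PbFulltextBackend::Bloom),
        FulltextBackend::Bloom
    ));
    assert!(matches!(
        as_skipping_index_type(PbSkippingIndexType::BloomFilter),
        SkippingIndexType::BloomFilter
    ));
}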

View File

@@ -15,7 +15,7 @@ api.workspace = true
arrow.workspace = true
arrow-schema.workspace = true
async-stream.workspace = true
async-trait = "0.1"
async-trait.workspace = true
bytes.workspace = true
common-catalog.workspace = true
common-error.workspace = true
@@ -31,7 +31,7 @@ common-version.workspace = true
dashmap.workspace = true
datafusion.workspace = true
datatypes.workspace = true
futures = "0.3"
futures.workspace = true
futures-util.workspace = true
humantime.workspace = true
itertools.workspace = true
@@ -39,7 +39,7 @@ lazy_static.workspace = true
meta-client.workspace = true
moka = { workspace = true, features = ["future", "sync"] }
partition.workspace = true
paste = "1.0"
paste.workspace = true
prometheus.workspace = true
rustc-hash.workspace = true
serde_json.workspace = true
@@ -49,7 +49,7 @@ sql.workspace = true
store-api.workspace = true
table.workspace = true
tokio.workspace = true
tokio-stream = "0.1"
tokio-stream.workspace = true
[dev-dependencies]
cache.workspace = true

View File

@@ -38,6 +38,7 @@ use partition::manager::{PartitionRuleManager, PartitionRuleManagerRef};
use session::context::{Channel, QueryContext};
use snafu::prelude::*;
use table::dist_table::DistTable;
use table::metadata::TableId;
use table::table::numbers::{NumbersTable, NUMBERS_TABLE_NAME};
use table::table_name::TableName;
use table::TableRef;
@@ -286,6 +287,28 @@ impl CatalogManager for KvBackendCatalogManager {
return Ok(None);
}
async fn tables_by_ids(
&self,
catalog: &str,
schema: &str,
table_ids: &[TableId],
) -> Result<Vec<TableRef>> {
let table_info_values = self
.table_metadata_manager
.table_info_manager()
.batch_get(table_ids)
.await
.context(TableMetadataManagerSnafu)?;
let tables = table_info_values
.into_values()
.filter(|t| t.table_info.catalog_name == catalog && t.table_info.schema_name == schema)
.map(build_table)
.collect::<Result<Vec<_>>>()?;
Ok(tables)
}
fn tables<'a>(
&'a self,
catalog: &'a str,

View File

@@ -87,6 +87,14 @@ pub trait CatalogManager: Send + Sync {
query_ctx: Option<&QueryContext>,
) -> Result<Option<TableRef>>;
/// Returns the tables by table ids.
async fn tables_by_ids(
&self,
catalog: &str,
schema: &str,
table_ids: &[TableId],
) -> Result<Vec<TableRef>>;
/// Returns all tables with a stream by catalog and schema.
fn tables<'a>(
&'a self,
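A hedged usage sketch of the new `tables_by_ids` method; the function name, the catalog/schema literals, and the `catalog::` import paths are illustrative assumptions, not part of this diff.

use catalog::error::Result; // assumed re-export of the crate's Result
use catalog::CatalogManager; // assumed public path of the trait above
use table::metadata::TableId;

// Resolve a set of table ids to fully qualified table names.
async fn table_names_by_ids(
    manager: &dyn CatalogManager,
    table_ids: &[TableId],
) -> Result<Vec<String>> {
    let tables = manager
        .tables_by_ids("greptime", "public", table_ids)
        .await?;
    Ok(tables
        .into_iter()
        .map(|t| t.table_info().full_table_name())
        .collect())
}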

View File

@@ -14,7 +14,7 @@
use std::any::Any;
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, RwLock, Weak};
use async_stream::{stream, try_stream};
@@ -28,6 +28,7 @@ use common_meta::kv_backend::memory::MemoryKvBackend;
use futures_util::stream::BoxStream;
use session::context::QueryContext;
use snafu::OptionExt;
use table::metadata::TableId;
use table::TableRef;
use crate::error::{CatalogNotFoundSnafu, Result, SchemaNotFoundSnafu, TableExistsSnafu};
@@ -143,6 +144,33 @@ impl CatalogManager for MemoryCatalogManager {
Ok(result)
}
async fn tables_by_ids(
&self,
catalog: &str,
schema: &str,
table_ids: &[TableId],
) -> Result<Vec<TableRef>> {
let catalogs = self.catalogs.read().unwrap();
let schemas = catalogs.get(catalog).context(CatalogNotFoundSnafu {
catalog_name: catalog,
})?;
let tables = schemas
.get(schema)
.context(SchemaNotFoundSnafu { catalog, schema })?;
let filter_ids: HashSet<_> = table_ids.iter().collect();
// It is very inefficient, but we do not need to optimize it since it will not be called in `MemoryCatalogManager`.
let tables = tables
.values()
.filter(|t| filter_ids.contains(&t.table_info().table_id()))
.cloned()
.collect::<Vec<_>>();
Ok(tables)
}
fn tables<'a>(
&'a self,
catalog: &'a str,

View File

@@ -77,7 +77,7 @@ trait SystemSchemaProviderInner {
fn system_table(&self, name: &str) -> Option<SystemTableRef>;
fn table_info(catalog_name: String, table: &SystemTableRef) -> TableInfoRef {
let table_meta = TableMetaBuilder::default()
let table_meta = TableMetaBuilder::empty()
.schema(table.schema())
.primary_key_indices(vec![])
.next_column_id(0)

View File

@@ -19,7 +19,7 @@ mod information_memory_table;
pub mod key_column_usage;
mod partitions;
mod procedure_info;
mod region_peers;
pub mod region_peers;
mod region_statistics;
mod runtime_metrics;
pub mod schemata;

View File

@@ -56,6 +56,8 @@ pub const TABLE_CATALOG: &str = "table_catalog";
pub const TABLE_SCHEMA: &str = "table_schema";
pub const TABLE_NAME: &str = "table_name";
pub const COLUMN_NAME: &str = "column_name";
pub const REGION_ID: &str = "region_id";
pub const PEER_ID: &str = "peer_id";
const ORDINAL_POSITION: &str = "ordinal_position";
const CHARACTER_MAXIMUM_LENGTH: &str = "character_maximum_length";
const CHARACTER_OCTET_LENGTH: &str = "character_octet_length";
@@ -365,10 +367,6 @@ impl InformationSchemaColumnsBuilder {
self.numeric_scales.push(None);
match &column_schema.data_type {
ConcreteDataType::DateTime(datetime_type) => {
self.datetime_precisions
.push(Some(datetime_type.precision() as i64));
}
ConcreteDataType::Timestamp(ts_type) => {
self.datetime_precisions
.push(Some(ts_type.precision() as i64));

View File

@@ -28,16 +28,19 @@ use datafusion::physical_plan::streaming::PartitionStream as DfPartitionStream;
use datatypes::prelude::ConcreteDataType as CDT;
use datatypes::scalars::ScalarVectorBuilder;
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::timestamp::TimestampMillisecond;
use datatypes::value::Value;
use datatypes::vectors::{
Int64VectorBuilder, StringVectorBuilder, UInt32VectorBuilder, UInt64VectorBuilder, VectorRef,
Int64VectorBuilder, StringVectorBuilder, TimestampMillisecondVectorBuilder,
UInt32VectorBuilder, UInt64VectorBuilder, VectorRef,
};
use futures::TryStreamExt;
use snafu::{OptionExt, ResultExt};
use store_api::storage::{ScanRequest, TableId};
use crate::error::{
CreateRecordBatchSnafu, FlowInfoNotFoundSnafu, InternalSnafu, JsonSnafu, ListFlowsSnafu, Result,
CreateRecordBatchSnafu, FlowInfoNotFoundSnafu, InternalSnafu, JsonSnafu, ListFlowsSnafu,
Result, UpgradeWeakCatalogManagerRefSnafu,
};
use crate::information_schema::{Predicates, FLOWS};
use crate::system_schema::information_schema::InformationTable;
@@ -59,6 +62,10 @@ pub const SOURCE_TABLE_IDS: &str = "source_table_ids";
pub const SINK_TABLE_NAME: &str = "sink_table_name";
pub const FLOWNODE_IDS: &str = "flownode_ids";
pub const OPTIONS: &str = "options";
pub const CREATED_TIME: &str = "created_time";
pub const UPDATED_TIME: &str = "updated_time";
pub const LAST_EXECUTION_TIME: &str = "last_execution_time";
pub const SOURCE_TABLE_NAMES: &str = "source_table_names";
/// The `information_schema.flows` table provides information about flows in databases.
#[derive(Debug)]
@@ -99,6 +106,14 @@ impl InformationSchemaFlows {
(SINK_TABLE_NAME, CDT::string_datatype(), false),
(FLOWNODE_IDS, CDT::string_datatype(), true),
(OPTIONS, CDT::string_datatype(), true),
(CREATED_TIME, CDT::timestamp_millisecond_datatype(), false),
(UPDATED_TIME, CDT::timestamp_millisecond_datatype(), false),
(
LAST_EXECUTION_TIME,
CDT::timestamp_millisecond_datatype(),
true,
),
(SOURCE_TABLE_NAMES, CDT::string_datatype(), true),
]
.into_iter()
.map(|(name, ty, nullable)| ColumnSchema::new(name, ty, nullable))
@@ -170,6 +185,10 @@ struct InformationSchemaFlowsBuilder {
sink_table_names: StringVectorBuilder,
flownode_id_groups: StringVectorBuilder,
option_groups: StringVectorBuilder,
created_time: TimestampMillisecondVectorBuilder,
updated_time: TimestampMillisecondVectorBuilder,
last_execution_time: TimestampMillisecondVectorBuilder,
source_table_names: StringVectorBuilder,
}
impl InformationSchemaFlowsBuilder {
@@ -196,6 +215,10 @@ impl InformationSchemaFlowsBuilder {
sink_table_names: StringVectorBuilder::with_capacity(INIT_CAPACITY),
flownode_id_groups: StringVectorBuilder::with_capacity(INIT_CAPACITY),
option_groups: StringVectorBuilder::with_capacity(INIT_CAPACITY),
created_time: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY),
updated_time: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY),
last_execution_time: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY),
source_table_names: StringVectorBuilder::with_capacity(INIT_CAPACITY),
}
}
@@ -235,13 +258,14 @@ impl InformationSchemaFlowsBuilder {
catalog_name: catalog_name.to_string(),
flow_name: flow_name.to_string(),
})?;
self.add_flow(&predicates, flow_id.flow_id(), flow_info, &flow_stat)?;
self.add_flow(&predicates, flow_id.flow_id(), flow_info, &flow_stat)
.await?;
}
self.finish()
}
fn add_flow(
async fn add_flow(
&mut self,
predicates: &Predicates,
flow_id: FlowId,
@@ -290,6 +314,36 @@ impl InformationSchemaFlowsBuilder {
input: format!("{:?}", flow_info.options()),
},
)?));
self.created_time
.push(Some(flow_info.created_time().timestamp_millis().into()));
self.updated_time
.push(Some(flow_info.updated_time().timestamp_millis().into()));
self.last_execution_time
.push(flow_stat.as_ref().and_then(|state| {
state
.last_exec_time_map
.get(&flow_id)
.map(|v| TimestampMillisecond::new(*v))
}));
let mut source_table_names = vec![];
let catalog_name = self.catalog_name.clone();
let catalog_manager = self
.catalog_manager
.upgrade()
.context(UpgradeWeakCatalogManagerRefSnafu)?;
for schema_name in catalog_manager.schema_names(&catalog_name, None).await? {
source_table_names.extend(
catalog_manager
.tables_by_ids(&catalog_name, &schema_name, flow_info.source_table_ids())
.await?
.into_iter()
.map(|table| table.table_info().full_table_name()),
);
}
let source_table_names = source_table_names.join(",");
self.source_table_names.push(Some(&source_table_names));
Ok(())
}
@@ -307,6 +361,10 @@ impl InformationSchemaFlowsBuilder {
Arc::new(self.sink_table_names.finish()),
Arc::new(self.flownode_id_groups.finish()),
Arc::new(self.option_groups.finish()),
Arc::new(self.created_time.finish()),
Arc::new(self.updated_time.finish()),
Arc::new(self.last_execution_time.finish()),
Arc::new(self.source_table_names.finish()),
];
RecordBatch::new(self.schema.clone(), columns).context(CreateRecordBatchSnafu)
}
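The three new timestamp columns are filled through the usual scalar-vector builders; a small sketch of that pattern under the builder API shown in this diff (`with_capacity`, `push`, `finish`).

use datatypes::scalars::ScalarVectorBuilder;
use datatypes::timestamp::TimestampMillisecond;
use datatypes::vectors::{TimestampMillisecondVector, TimestampMillisecondVectorBuilder};

// Wrap millisecond epochs as TimestampMillisecond values; None keeps the slot null,
// which is how `last_execution_time` stays optional above.
fn build_created_times(epochs_ms: &[Option<i64>]) -> TimestampMillisecondVector {
    let mut builder = TimestampMillisecondVectorBuilder::with_capacity(epochs_ms.len());
    for ms in epochs_ms.iter().copied() {
        builder.push(ms.map(TimestampMillisecond::new));
    }
    builder.finish()
}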

View File

@@ -20,7 +20,7 @@ use datatypes::vectors::{Int64Vector, StringVector, VectorRef};
use super::table_names::*;
use crate::system_schema::utils::tables::{
bigint_column, datetime_column, string_column, string_columns,
bigint_column, string_column, string_columns, timestamp_micro_column,
};
const NO_VALUE: &str = "NO";
@@ -163,17 +163,17 @@ pub(super) fn get_schema_columns(table_name: &str) -> (SchemaRef, Vec<VectorRef>
string_column("EVENT_BODY"),
string_column("EVENT_DEFINITION"),
string_column("EVENT_TYPE"),
datetime_column("EXECUTE_AT"),
timestamp_micro_column("EXECUTE_AT"),
bigint_column("INTERVAL_VALUE"),
string_column("INTERVAL_FIELD"),
string_column("SQL_MODE"),
datetime_column("STARTS"),
datetime_column("ENDS"),
timestamp_micro_column("STARTS"),
timestamp_micro_column("ENDS"),
string_column("STATUS"),
string_column("ON_COMPLETION"),
datetime_column("CREATED"),
datetime_column("LAST_ALTERED"),
datetime_column("LAST_EXECUTED"),
timestamp_micro_column("CREATED"),
timestamp_micro_column("LAST_ALTERED"),
timestamp_micro_column("LAST_EXECUTED"),
string_column("EVENT_COMMENT"),
bigint_column("ORIGINATOR"),
string_column("CHARACTER_SET_CLIENT"),
@@ -204,10 +204,10 @@ pub(super) fn get_schema_columns(table_name: &str) -> (SchemaRef, Vec<VectorRef>
bigint_column("INITIAL_SIZE"),
bigint_column("MAXIMUM_SIZE"),
bigint_column("AUTOEXTEND_SIZE"),
datetime_column("CREATION_TIME"),
datetime_column("LAST_UPDATE_TIME"),
datetime_column("LAST_ACCESS_TIME"),
datetime_column("RECOVER_TIME"),
timestamp_micro_column("CREATION_TIME"),
timestamp_micro_column("LAST_UPDATE_TIME"),
timestamp_micro_column("LAST_ACCESS_TIME"),
timestamp_micro_column("RECOVER_TIME"),
bigint_column("TRANSACTION_COUNTER"),
string_column("VERSION"),
string_column("ROW_FORMAT"),
@@ -217,9 +217,9 @@ pub(super) fn get_schema_columns(table_name: &str) -> (SchemaRef, Vec<VectorRef>
bigint_column("MAX_DATA_LENGTH"),
bigint_column("INDEX_LENGTH"),
bigint_column("DATA_FREE"),
datetime_column("CREATE_TIME"),
datetime_column("UPDATE_TIME"),
datetime_column("CHECK_TIME"),
timestamp_micro_column("CREATE_TIME"),
timestamp_micro_column("UPDATE_TIME"),
timestamp_micro_column("CHECK_TIME"),
string_column("CHECKSUM"),
string_column("STATUS"),
string_column("EXTRA"),
@@ -330,8 +330,8 @@ pub(super) fn get_schema_columns(table_name: &str) -> (SchemaRef, Vec<VectorRef>
string_column("SQL_DATA_ACCESS"),
string_column("SQL_PATH"),
string_column("SECURITY_TYPE"),
datetime_column("CREATED"),
datetime_column("LAST_ALTERED"),
timestamp_micro_column("CREATED"),
timestamp_micro_column("LAST_ALTERED"),
string_column("SQL_MODE"),
string_column("ROUTINE_COMMENT"),
string_column("DEFINER"),
@@ -383,7 +383,7 @@ pub(super) fn get_schema_columns(table_name: &str) -> (SchemaRef, Vec<VectorRef>
string_column("ACTION_REFERENCE_NEW_TABLE"),
string_column("ACTION_REFERENCE_OLD_ROW"),
string_column("ACTION_REFERENCE_NEW_ROW"),
datetime_column("CREATED"),
timestamp_micro_column("CREATED"),
string_column("SQL_MODE"),
string_column("DEFINER"),
string_column("CHARACTER_SET_CLIENT"),

View File

@@ -228,12 +228,6 @@ impl InformationSchemaKeyColumnUsageBuilder {
let keys = &table_info.meta.primary_key_indices;
let schema = table.schema();
// For compatibility, use primary key columns as inverted index columns.
let pk_as_inverted_index = !schema
.column_schemas()
.iter()
.any(|c| c.has_inverted_index_key());
for (idx, column) in schema.column_schemas().iter().enumerate() {
let mut constraints = vec![];
if column.is_time_index() {
@@ -251,10 +245,6 @@ impl InformationSchemaKeyColumnUsageBuilder {
// TODO(dimbtp): foreign key constraint not supported yet
if keys.contains(&idx) {
constraints.push(PRI_CONSTRAINT_NAME);
if pk_as_inverted_index {
constraints.push(INVERTED_INDEX_CONSTRAINT_NAME);
}
}
if column.is_inverted_indexed() {
constraints.push(INVERTED_INDEX_CONSTRAINT_NAME);

View File

@@ -20,17 +20,18 @@ use common_catalog::consts::INFORMATION_SCHEMA_PARTITIONS_TABLE_ID;
use common_error::ext::BoxedError;
use common_recordbatch::adapter::RecordBatchStreamAdapter;
use common_recordbatch::{RecordBatch, SendableRecordBatchStream};
use common_time::datetime::DateTime;
use datafusion::execution::TaskContext;
use datafusion::physical_plan::stream::RecordBatchStreamAdapter as DfRecordBatchStreamAdapter;
use datafusion::physical_plan::streaming::PartitionStream as DfPartitionStream;
use datafusion::physical_plan::SendableRecordBatchStream as DfSendableRecordBatchStream;
use datatypes::prelude::{ConcreteDataType, ScalarVectorBuilder, VectorRef};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::timestamp::TimestampMicrosecond;
use datatypes::value::Value;
use datatypes::vectors::{
ConstantVector, DateTimeVector, DateTimeVectorBuilder, Int64Vector, Int64VectorBuilder,
MutableVector, StringVector, StringVectorBuilder, UInt64VectorBuilder,
ConstantVector, Int64Vector, Int64VectorBuilder, MutableVector, StringVector,
StringVectorBuilder, TimestampMicrosecondVector, TimestampMicrosecondVectorBuilder,
UInt64VectorBuilder,
};
use futures::{StreamExt, TryStreamExt};
use partition::manager::PartitionInfo;
@@ -127,9 +128,21 @@ impl InformationSchemaPartitions {
ColumnSchema::new("max_data_length", ConcreteDataType::int64_datatype(), true),
ColumnSchema::new("index_length", ConcreteDataType::int64_datatype(), true),
ColumnSchema::new("data_free", ConcreteDataType::int64_datatype(), true),
ColumnSchema::new("create_time", ConcreteDataType::datetime_datatype(), true),
ColumnSchema::new("update_time", ConcreteDataType::datetime_datatype(), true),
ColumnSchema::new("check_time", ConcreteDataType::datetime_datatype(), true),
ColumnSchema::new(
"create_time",
ConcreteDataType::timestamp_microsecond_datatype(),
true,
),
ColumnSchema::new(
"update_time",
ConcreteDataType::timestamp_microsecond_datatype(),
true,
),
ColumnSchema::new(
"check_time",
ConcreteDataType::timestamp_microsecond_datatype(),
true,
),
ColumnSchema::new("checksum", ConcreteDataType::int64_datatype(), true),
ColumnSchema::new(
"partition_comment",
@@ -200,7 +213,7 @@ struct InformationSchemaPartitionsBuilder {
partition_names: StringVectorBuilder,
partition_ordinal_positions: Int64VectorBuilder,
partition_expressions: StringVectorBuilder,
create_times: DateTimeVectorBuilder,
create_times: TimestampMicrosecondVectorBuilder,
partition_ids: UInt64VectorBuilder,
}
@@ -220,7 +233,7 @@ impl InformationSchemaPartitionsBuilder {
partition_names: StringVectorBuilder::with_capacity(INIT_CAPACITY),
partition_ordinal_positions: Int64VectorBuilder::with_capacity(INIT_CAPACITY),
partition_expressions: StringVectorBuilder::with_capacity(INIT_CAPACITY),
create_times: DateTimeVectorBuilder::with_capacity(INIT_CAPACITY),
create_times: TimestampMicrosecondVectorBuilder::with_capacity(INIT_CAPACITY),
partition_ids: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
}
}
@@ -324,7 +337,7 @@ impl InformationSchemaPartitionsBuilder {
};
self.partition_expressions.push(expressions.as_deref());
self.create_times.push(Some(DateTime::from(
self.create_times.push(Some(TimestampMicrosecond::from(
table_info.meta.created_on.timestamp_millis(),
)));
self.partition_ids.push(Some(partition.id.as_u64()));
@@ -342,8 +355,8 @@ impl InformationSchemaPartitionsBuilder {
Arc::new(Int64Vector::from(vec![None])),
rows_num,
));
let null_datetime_vector = Arc::new(ConstantVector::new(
Arc::new(DateTimeVector::from(vec![None])),
let null_timestampmicrosecond_vector = Arc::new(ConstantVector::new(
Arc::new(TimestampMicrosecondVector::from(vec![None])),
rows_num,
));
let partition_methods = Arc::new(ConstantVector::new(
@@ -373,8 +386,8 @@ impl InformationSchemaPartitionsBuilder {
null_i64_vector.clone(),
Arc::new(self.create_times.finish()),
// TODO(dennis): supports update_time
null_datetime_vector.clone(),
null_datetime_vector,
null_timestampmicrosecond_vector.clone(),
null_timestampmicrosecond_vector,
null_i64_vector,
null_string_vector.clone(),
null_string_vector.clone(),

View File

@@ -21,6 +21,7 @@ use common_error::ext::BoxedError;
use common_meta::rpc::router::RegionRoute;
use common_recordbatch::adapter::RecordBatchStreamAdapter;
use common_recordbatch::{RecordBatch, SendableRecordBatchStream};
use datafusion::common::HashMap;
use datafusion::execution::TaskContext;
use datafusion::physical_plan::stream::RecordBatchStreamAdapter as DfRecordBatchStreamAdapter;
use datafusion::physical_plan::streaming::PartitionStream as DfPartitionStream;
@@ -43,16 +44,22 @@ use crate::kvbackend::KvBackendCatalogManager;
use crate::system_schema::information_schema::{InformationTable, Predicates};
use crate::CatalogManager;
const REGION_ID: &str = "region_id";
const PEER_ID: &str = "peer_id";
pub const TABLE_CATALOG: &str = "table_catalog";
pub const TABLE_SCHEMA: &str = "table_schema";
pub const TABLE_NAME: &str = "table_name";
pub const REGION_ID: &str = "region_id";
pub const PEER_ID: &str = "peer_id";
const PEER_ADDR: &str = "peer_addr";
const IS_LEADER: &str = "is_leader";
pub const IS_LEADER: &str = "is_leader";
const STATUS: &str = "status";
const DOWN_SECONDS: &str = "down_seconds";
const INIT_CAPACITY: usize = 42;
/// The `REGION_PEERS` table provides information about the region distribution and routes, including the following fields:
///
/// - `table_catalog`: the table catalog name
/// - `table_schema`: the table schema name
/// - `table_name`: the table name
/// - `region_id`: the region id
/// - `peer_id`: the region storage datanode peer id
/// - `peer_addr`: the region storage datanode gRPC peer address
@@ -77,6 +84,9 @@ impl InformationSchemaRegionPeers {
pub(crate) fn schema() -> SchemaRef {
Arc::new(Schema::new(vec![
ColumnSchema::new(TABLE_CATALOG, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(TABLE_SCHEMA, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(TABLE_NAME, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(REGION_ID, ConcreteDataType::uint64_datatype(), false),
ColumnSchema::new(PEER_ID, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(PEER_ADDR, ConcreteDataType::string_datatype(), true),
@@ -134,6 +144,9 @@ struct InformationSchemaRegionPeersBuilder {
catalog_name: String,
catalog_manager: Weak<dyn CatalogManager>,
table_catalogs: StringVectorBuilder,
table_schemas: StringVectorBuilder,
table_names: StringVectorBuilder,
region_ids: UInt64VectorBuilder,
peer_ids: UInt64VectorBuilder,
peer_addrs: StringVectorBuilder,
@@ -152,6 +165,9 @@ impl InformationSchemaRegionPeersBuilder {
schema,
catalog_name,
catalog_manager,
table_catalogs: StringVectorBuilder::with_capacity(INIT_CAPACITY),
table_schemas: StringVectorBuilder::with_capacity(INIT_CAPACITY),
table_names: StringVectorBuilder::with_capacity(INIT_CAPACITY),
region_ids: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
peer_ids: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
peer_addrs: StringVectorBuilder::with_capacity(INIT_CAPACITY),
@@ -177,24 +193,28 @@ impl InformationSchemaRegionPeersBuilder {
let predicates = Predicates::from_scan_request(&request);
for schema_name in catalog_manager.schema_names(&catalog_name, None).await? {
let table_id_stream = catalog_manager
let table_stream = catalog_manager
.tables(&catalog_name, &schema_name, None)
.try_filter_map(|t| async move {
let table_info = t.table_info();
if table_info.table_type == TableType::Temporary {
Ok(None)
} else {
Ok(Some(table_info.ident.table_id))
Ok(Some((
table_info.ident.table_id,
table_info.name.to_string(),
)))
}
});
const BATCH_SIZE: usize = 128;
// Split table ids into chunks
let mut table_id_chunks = pin!(table_id_stream.ready_chunks(BATCH_SIZE));
// Split tables into chunks
let mut table_chunks = pin!(table_stream.ready_chunks(BATCH_SIZE));
while let Some(table_ids) = table_id_chunks.next().await {
let table_ids = table_ids.into_iter().collect::<Result<Vec<_>>>()?;
while let Some(tables) = table_chunks.next().await {
let tables = tables.into_iter().collect::<Result<HashMap<_, _>>>()?;
let table_ids = tables.keys().cloned().collect::<Vec<_>>();
let table_routes = if let Some(partition_manager) = &partition_manager {
partition_manager
@@ -206,7 +226,16 @@ impl InformationSchemaRegionPeersBuilder {
};
for (table_id, routes) in table_routes {
self.add_region_peers(&predicates, table_id, &routes);
// Safety: table_id is guaranteed to be in the map
let table_name = tables.get(&table_id).unwrap();
self.add_region_peers(
&catalog_name,
&schema_name,
table_name,
&predicates,
table_id,
&routes,
);
}
}
}
@@ -216,6 +245,9 @@ impl InformationSchemaRegionPeersBuilder {
fn add_region_peers(
&mut self,
table_catalog: &str,
table_schema: &str,
table_name: &str,
predicates: &Predicates,
table_id: TableId,
routes: &[RegionRoute],
@@ -231,13 +263,20 @@ impl InformationSchemaRegionPeersBuilder {
Some("ALIVE".to_string())
};
let row = [(REGION_ID, &Value::from(region_id))];
let row = [
(TABLE_CATALOG, &Value::from(table_catalog)),
(TABLE_SCHEMA, &Value::from(table_schema)),
(TABLE_NAME, &Value::from(table_name)),
(REGION_ID, &Value::from(region_id)),
];
if !predicates.eval(&row) {
return;
}
// TODO(dennis): adds followers.
self.table_catalogs.push(Some(table_catalog));
self.table_schemas.push(Some(table_schema));
self.table_names.push(Some(table_name));
self.region_ids.push(Some(region_id));
self.peer_ids.push(peer_id);
self.peer_addrs.push(peer_addr.as_deref());
@@ -245,11 +284,26 @@ impl InformationSchemaRegionPeersBuilder {
self.statuses.push(state.as_deref());
self.down_seconds
.push(route.leader_down_millis().map(|m| m / 1000));
for follower in &route.follower_peers {
self.table_catalogs.push(Some(table_catalog));
self.table_schemas.push(Some(table_schema));
self.table_names.push(Some(table_name));
self.region_ids.push(Some(region_id));
self.peer_ids.push(Some(follower.id));
self.peer_addrs.push(Some(follower.addr.as_str()));
self.is_leaders.push(Some("No"));
self.statuses.push(None);
self.down_seconds.push(None);
}
}
}
fn finish(&mut self) -> Result<RecordBatch> {
let columns: Vec<VectorRef> = vec![
Arc::new(self.table_catalogs.finish()),
Arc::new(self.table_schemas.finish()),
Arc::new(self.table_names.finish()),
Arc::new(self.region_ids.finish()),
Arc::new(self.peer_ids.finish()),
Arc::new(self.peer_addrs.finish()),

View File

@@ -30,7 +30,8 @@ use datatypes::prelude::{ConcreteDataType, ScalarVectorBuilder, VectorRef};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::value::Value;
use datatypes::vectors::{
DateTimeVectorBuilder, StringVectorBuilder, UInt32VectorBuilder, UInt64VectorBuilder,
StringVectorBuilder, TimestampMicrosecondVectorBuilder, UInt32VectorBuilder,
UInt64VectorBuilder,
};
use futures::TryStreamExt;
use snafu::{OptionExt, ResultExt};
@@ -105,9 +106,21 @@ impl InformationSchemaTables {
ColumnSchema::new(TABLE_ROWS, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(DATA_FREE, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(AUTO_INCREMENT, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(CREATE_TIME, ConcreteDataType::datetime_datatype(), true),
ColumnSchema::new(UPDATE_TIME, ConcreteDataType::datetime_datatype(), true),
ColumnSchema::new(CHECK_TIME, ConcreteDataType::datetime_datatype(), true),
ColumnSchema::new(
CREATE_TIME,
ConcreteDataType::timestamp_microsecond_datatype(),
true,
),
ColumnSchema::new(
UPDATE_TIME,
ConcreteDataType::timestamp_microsecond_datatype(),
true,
),
ColumnSchema::new(
CHECK_TIME,
ConcreteDataType::timestamp_microsecond_datatype(),
true,
),
ColumnSchema::new(TABLE_COLLATION, ConcreteDataType::string_datatype(), true),
ColumnSchema::new(CHECKSUM, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(CREATE_OPTIONS, ConcreteDataType::string_datatype(), true),
@@ -182,9 +195,9 @@ struct InformationSchemaTablesBuilder {
max_index_length: UInt64VectorBuilder,
data_free: UInt64VectorBuilder,
auto_increment: UInt64VectorBuilder,
create_time: DateTimeVectorBuilder,
update_time: DateTimeVectorBuilder,
check_time: DateTimeVectorBuilder,
create_time: TimestampMicrosecondVectorBuilder,
update_time: TimestampMicrosecondVectorBuilder,
check_time: TimestampMicrosecondVectorBuilder,
table_collation: StringVectorBuilder,
checksum: UInt64VectorBuilder,
create_options: StringVectorBuilder,
@@ -219,9 +232,9 @@ impl InformationSchemaTablesBuilder {
max_index_length: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
data_free: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
auto_increment: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
create_time: DateTimeVectorBuilder::with_capacity(INIT_CAPACITY),
update_time: DateTimeVectorBuilder::with_capacity(INIT_CAPACITY),
check_time: DateTimeVectorBuilder::with_capacity(INIT_CAPACITY),
create_time: TimestampMicrosecondVectorBuilder::with_capacity(INIT_CAPACITY),
update_time: TimestampMicrosecondVectorBuilder::with_capacity(INIT_CAPACITY),
check_time: TimestampMicrosecondVectorBuilder::with_capacity(INIT_CAPACITY),
table_collation: StringVectorBuilder::with_capacity(INIT_CAPACITY),
checksum: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
create_options: StringVectorBuilder::with_capacity(INIT_CAPACITY),

View File

@@ -51,10 +51,10 @@ pub fn bigint_column(name: &str) -> ColumnSchema {
)
}
pub fn datetime_column(name: &str) -> ColumnSchema {
pub fn timestamp_micro_column(name: &str) -> ColumnSchema {
ColumnSchema::new(
str::to_lowercase(name),
ConcreteDataType::datetime_datatype(),
ConcreteDataType::timestamp_microsecond_datatype(),
false,
)
}

View File

@@ -6,6 +6,7 @@ license.workspace = true
[features]
pg_kvbackend = ["common-meta/pg_kvbackend"]
mysql_kvbackend = ["common-meta/mysql_kvbackend"]
[lints]
workspace = true
@@ -43,6 +44,10 @@ futures.workspace = true
humantime.workspace = true
meta-client.workspace = true
nu-ansi-term = "0.46"
opendal = { version = "0.51.1", features = [
"services-fs",
"services-s3",
] }
query.workspace = true
rand.workspace = true
reqwest.workspace = true

View File

@@ -23,11 +23,14 @@ use common_error::ext::BoxedError;
use common_meta::key::{TableMetadataManager, TableMetadataManagerRef};
use common_meta::kv_backend::etcd::EtcdStore;
use common_meta::kv_backend::memory::MemoryKvBackend;
#[cfg(feature = "mysql_kvbackend")]
use common_meta::kv_backend::rds::MySqlStore;
#[cfg(feature = "pg_kvbackend")]
use common_meta::kv_backend::postgres::PgStore;
use common_meta::kv_backend::rds::PgStore;
use common_meta::peer::Peer;
use common_meta::rpc::router::{Region, RegionRoute};
use common_telemetry::info;
use common_wal::options::WalOptions;
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, RawSchema};
use rand::Rng;
@@ -62,6 +65,9 @@ pub struct BenchTableMetadataCommand {
#[cfg(feature = "pg_kvbackend")]
#[clap(long)]
postgres_addr: Option<String>,
#[cfg(feature = "mysql_kvbackend")]
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
count: u32,
}
@@ -85,6 +91,16 @@ impl BenchTableMetadataCommand {
kv_backend
};
#[cfg(feature = "mysql_kvbackend")]
let kv_backend = if let Some(mysql_addr) = &self.mysql_addr {
info!("Using mysql as kv backend");
MySqlStore::with_url(mysql_addr, "greptime_metakv", 128)
.await
.unwrap()
} else {
kv_backend
};
let table_metadata_manager = Arc::new(TableMetadataManager::new(kv_backend));
let tool = BenchTableMetadata {
@@ -161,7 +177,7 @@ fn create_table_info(table_id: TableId, table_name: TableName) -> RawTableInfo {
fn create_region_routes(regions: Vec<RegionNumber>) -> Vec<RegionRoute> {
let mut region_routes = Vec::with_capacity(100);
let mut rng = rand::thread_rng();
let mut rng = rand::rng();
for region_id in regions.into_iter().map(u64::from) {
region_routes.push(RegionRoute {
@@ -172,7 +188,7 @@ fn create_region_routes(regions: Vec<RegionNumber>) -> Vec<RegionRoute> {
attrs: BTreeMap::new(),
},
leader_peer: Some(Peer {
id: rng.gen_range(0..10),
id: rng.random_range(0..10),
addr: String::new(),
}),
follower_peers: vec![],
@@ -184,7 +200,7 @@ fn create_region_routes(regions: Vec<RegionNumber>) -> Vec<RegionRoute> {
region_routes
}
fn create_region_wal_options(regions: Vec<RegionNumber>) -> HashMap<RegionNumber, String> {
fn create_region_wal_options(regions: Vec<RegionNumber>) -> HashMap<RegionNumber, WalOptions> {
// TODO(niebayes): construct region wal options for benchmark.
let _ = regions;
HashMap::default()
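For reference, a small sketch of the rand 0.9 calls the bench code now relies on: `rand::rng()` replaces `thread_rng()` and `random_range` replaces `gen_range`.

use rand::Rng;

// Pick a random datanode peer id in [0, 10), as the route generator above does.
fn random_peer_id() -> u64 {
    let mut rng = rand::rng();
    rng.random_range(0..10)
}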

View File

@@ -49,7 +49,12 @@ impl TableMetadataBencher {
let regions: Vec<_> = (0..64).collect();
let region_routes = create_region_routes(regions.clone());
let region_wal_options = create_region_wal_options(regions);
let region_wal_options = create_region_wal_options(regions)
.into_iter()
.map(|(region_id, wal_options)| {
(region_id, serde_json::to_string(&wal_options).unwrap())
})
.collect();
let start = Instant::now();
@@ -109,9 +114,17 @@ impl TableMetadataBencher {
let table_info = table_info.unwrap();
let table_route = table_route.unwrap();
let table_id = table_info.table_info.ident.table_id;
let regions: Vec<_> = (0..64).collect();
let region_wal_options = create_region_wal_options(regions);
let _ = self
.table_metadata_manager
.delete_table_metadata(table_id, &table_info.table_name(), &table_route)
.delete_table_metadata(
table_id,
&table_info.table_name(),
&table_route,
&region_wal_options,
)
.await;
start.elapsed()
},

View File

@@ -276,6 +276,24 @@ pub enum Error {
#[snafu(implicit)]
location: Location,
},
#[snafu(display("OpenDAL operator failed"))]
OpenDal {
#[snafu(implicit)]
location: Location,
#[snafu(source)]
error: opendal::Error,
},
#[snafu(display("S3 config need be set"))]
S3ConfigNotSet {
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Output directory not set"))]
OutputDirNotSet {
#[snafu(implicit)]
location: Location,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -319,6 +337,9 @@ impl ErrorExt for Error {
| Error::BuildClient { .. } => StatusCode::Unexpected,
Error::Other { source, .. } => source.status_code(),
Error::OpenDal { .. } => StatusCode::Internal,
Error::S3ConfigNotSet { .. } => StatusCode::InvalidArguments,
Error::OutputDirNotSet { .. } => StatusCode::InvalidArguments,
Error::BuildRuntime { source, .. } => source.status_code(),

View File

@@ -21,15 +21,18 @@ use async_trait::async_trait;
use clap::{Parser, ValueEnum};
use common_error::ext::BoxedError;
use common_telemetry::{debug, error, info};
use opendal::layers::LoggingLayer;
use opendal::{services, Operator};
use serde_json::Value;
use snafu::{OptionExt, ResultExt};
use tokio::fs::File;
use tokio::io::{AsyncWriteExt, BufWriter};
use tokio::sync::Semaphore;
use tokio::time::Instant;
use crate::database::{parse_proxy_opts, DatabaseClient};
use crate::error::{EmptyResultSnafu, Error, FileIoSnafu, Result, SchemaNotFoundSnafu};
use crate::error::{
EmptyResultSnafu, Error, OpenDalSnafu, OutputDirNotSetSnafu, Result, S3ConfigNotSetSnafu,
SchemaNotFoundSnafu,
};
use crate::{database, Tool};
type TableReference = (String, String, String);
@@ -52,8 +55,9 @@ pub struct ExportCommand {
addr: String,
/// Directory to put the exported data. E.g.: /tmp/greptimedb-export
/// for local export.
#[clap(long)]
output_dir: String,
output_dir: Option<String>,
/// The name of the catalog to export.
#[clap(long, default_value = "greptime-*")]
@@ -101,10 +105,51 @@ pub struct ExportCommand {
/// Disable proxy server, if set, will not use any proxy.
#[clap(long)]
no_proxy: bool,
/// Whether to export data to S3.
#[clap(long)]
s3: bool,
/// The s3 bucket name
/// if s3 is set, this is required
#[clap(long)]
s3_bucket: Option<String>,
/// The s3 endpoint
/// if s3 is set, this is required
#[clap(long)]
s3_endpoint: Option<String>,
/// The s3 access key
/// if s3 is set, this is required
#[clap(long)]
s3_access_key: Option<String>,
/// The s3 secret key
/// if s3 is set, this is required
#[clap(long)]
s3_secret_key: Option<String>,
/// The s3 region
/// if s3 is set, this is required
#[clap(long)]
s3_region: Option<String>,
}
impl ExportCommand {
pub async fn build(&self) -> std::result::Result<Box<dyn Tool>, BoxedError> {
if self.s3
&& (self.s3_bucket.is_none()
|| self.s3_endpoint.is_none()
|| self.s3_access_key.is_none()
|| self.s3_secret_key.is_none()
|| self.s3_region.is_none())
{
return Err(BoxedError::new(S3ConfigNotSetSnafu {}.build()));
}
if !self.s3 && self.output_dir.is_none() {
return Err(BoxedError::new(OutputDirNotSetSnafu {}.build()));
}
let (catalog, schema) =
database::split_database(&self.database).map_err(BoxedError::new)?;
let proxy = parse_proxy_opts(self.proxy.clone(), self.no_proxy)?;
@@ -126,24 +171,43 @@ impl ExportCommand {
target: self.target.clone(),
start_time: self.start_time.clone(),
end_time: self.end_time.clone(),
s3: self.s3,
s3_bucket: self.s3_bucket.clone(),
s3_endpoint: self.s3_endpoint.clone(),
s3_access_key: self.s3_access_key.clone(),
s3_secret_key: self.s3_secret_key.clone(),
s3_region: self.s3_region.clone(),
}))
}
}
#[derive(Clone)]
pub struct Export {
catalog: String,
schema: Option<String>,
database_client: DatabaseClient,
output_dir: String,
output_dir: Option<String>,
parallelism: usize,
target: ExportTarget,
start_time: Option<String>,
end_time: Option<String>,
s3: bool,
s3_bucket: Option<String>,
s3_endpoint: Option<String>,
s3_access_key: Option<String>,
s3_secret_key: Option<String>,
s3_region: Option<String>,
}
impl Export {
fn catalog_path(&self) -> PathBuf {
PathBuf::from(&self.output_dir).join(&self.catalog)
if self.s3 {
PathBuf::from(&self.catalog)
} else if let Some(dir) = &self.output_dir {
PathBuf::from(dir).join(&self.catalog)
} else {
unreachable!("catalog_path: output_dir must be set when not using s3")
}
}
async fn get_db_names(&self) -> Result<Vec<String>> {
@@ -300,19 +364,23 @@ impl Export {
let timer = Instant::now();
let db_names = self.get_db_names().await?;
let db_count = db_names.len();
let operator = self.build_operator().await?;
for schema in db_names {
let db_dir = self.catalog_path().join(format!("{schema}/"));
tokio::fs::create_dir_all(&db_dir)
.await
.context(FileIoSnafu)?;
let file = db_dir.join("create_database.sql");
let mut file = File::create(file).await.context(FileIoSnafu)?;
let create_database = self
.show_create("DATABASE", &self.catalog, &schema, None)
.await?;
file.write_all(create_database.as_bytes())
.await
.context(FileIoSnafu)?;
let file_path = self.get_file_path(&schema, "create_database.sql");
self.write_to_storage(&operator, &file_path, create_database.into_bytes())
.await?;
info!(
"Exported {}.{} database creation SQL to {}",
self.catalog,
schema,
self.format_output_path(&file_path)
);
}
let elapsed = timer.elapsed();
@@ -326,149 +394,267 @@ impl Export {
let semaphore = Arc::new(Semaphore::new(self.parallelism));
let db_names = self.get_db_names().await?;
let db_count = db_names.len();
let operator = Arc::new(self.build_operator().await?);
let mut tasks = Vec::with_capacity(db_names.len());
for schema in db_names {
let semaphore_moved = semaphore.clone();
let export_self = self.clone();
let operator = operator.clone();
tasks.push(async move {
let _permit = semaphore_moved.acquire().await.unwrap();
let (metric_physical_tables, remaining_tables, views) =
self.get_table_list(&self.catalog, &schema).await?;
let table_count =
metric_physical_tables.len() + remaining_tables.len() + views.len();
let db_dir = self.catalog_path().join(format!("{schema}/"));
tokio::fs::create_dir_all(&db_dir)
.await
.context(FileIoSnafu)?;
let file = db_dir.join("create_tables.sql");
let mut file = File::create(file).await.context(FileIoSnafu)?;
for (c, s, t) in metric_physical_tables.into_iter().chain(remaining_tables) {
let create_table = self.show_create("TABLE", &c, &s, Some(&t)).await?;
file.write_all(create_table.as_bytes())
.await
.context(FileIoSnafu)?;
}
for (c, s, v) in views {
let create_view = self.show_create("VIEW", &c, &s, Some(&v)).await?;
file.write_all(create_view.as_bytes())
.await
.context(FileIoSnafu)?;
let (metric_physical_tables, remaining_tables, views) = export_self
.get_table_list(&export_self.catalog, &schema)
.await?;
// Create directory if needed for file system storage
if !export_self.s3 {
let db_dir = format!("{}/{}/", export_self.catalog, schema);
operator.create_dir(&db_dir).await.context(OpenDalSnafu)?;
}
let file_path = export_self.get_file_path(&schema, "create_tables.sql");
let mut content = Vec::new();
// Add table creation SQL
for (c, s, t) in metric_physical_tables.iter().chain(&remaining_tables) {
let create_table = export_self.show_create("TABLE", c, s, Some(t)).await?;
content.extend_from_slice(create_table.as_bytes());
}
// Add view creation SQL
for (c, s, v) in &views {
let create_view = export_self.show_create("VIEW", c, s, Some(v)).await?;
content.extend_from_slice(create_view.as_bytes());
}
// Write to storage
export_self
.write_to_storage(&operator, &file_path, content)
.await?;
info!(
"Finished exporting {}.{schema} with {table_count} table schemas to path: {}",
self.catalog,
db_dir.to_string_lossy()
"Finished exporting {}.{schema} with {} table schemas to path: {}",
export_self.catalog,
metric_physical_tables.len() + remaining_tables.len() + views.len(),
export_self.format_output_path(&file_path)
);
Ok::<(), Error>(())
});
}
let success = futures::future::join_all(tasks)
.await
.into_iter()
.filter(|r| match r {
Ok(_) => true,
Err(e) => {
error!(e; "export schema job failed");
false
}
})
.count();
let success = self.execute_tasks(tasks).await;
let elapsed = timer.elapsed();
info!("Success {success}/{db_count} jobs, cost: {elapsed:?}");
Ok(())
}
async fn build_operator(&self) -> Result<Operator> {
if self.s3 {
self.build_s3_operator().await
} else {
self.build_fs_operator().await
}
}
async fn build_s3_operator(&self) -> Result<Operator> {
let mut builder = services::S3::default().root("").bucket(
self.s3_bucket
.as_ref()
.expect("s3_bucket must be provided when s3 is enabled"),
);
if let Some(endpoint) = self.s3_endpoint.as_ref() {
builder = builder.endpoint(endpoint);
}
if let Some(region) = self.s3_region.as_ref() {
builder = builder.region(region);
}
if let Some(key_id) = self.s3_access_key.as_ref() {
builder = builder.access_key_id(key_id);
}
if let Some(secret_key) = self.s3_secret_key.as_ref() {
builder = builder.secret_access_key(secret_key);
}
let op = Operator::new(builder)
.context(OpenDalSnafu)?
.layer(LoggingLayer::default())
.finish();
Ok(op)
}
async fn build_fs_operator(&self) -> Result<Operator> {
let root = self
.output_dir
.as_ref()
.context(OutputDirNotSetSnafu)?
.clone();
let op = Operator::new(services::Fs::default().root(&root))
.context(OpenDalSnafu)?
.layer(LoggingLayer::default())
.finish();
Ok(op)
}
async fn export_database_data(&self) -> Result<()> {
let timer = Instant::now();
let semaphore = Arc::new(Semaphore::new(self.parallelism));
let db_names = self.get_db_names().await?;
let db_count = db_names.len();
let mut tasks = Vec::with_capacity(db_count);
let operator = Arc::new(self.build_operator().await?);
let with_options = build_with_options(&self.start_time, &self.end_time);
for schema in db_names {
let semaphore_moved = semaphore.clone();
let export_self = self.clone();
let with_options_clone = with_options.clone();
let operator = operator.clone();
tasks.push(async move {
let _permit = semaphore_moved.acquire().await.unwrap();
let db_dir = self.catalog_path().join(format!("{schema}/"));
tokio::fs::create_dir_all(&db_dir)
.await
.context(FileIoSnafu)?;
let with_options = match (&self.start_time, &self.end_time) {
(Some(start_time), Some(end_time)) => {
format!(
"WITH (FORMAT='parquet', start_time='{}', end_time='{}')",
start_time, end_time
)
}
(Some(start_time), None) => {
format!("WITH (FORMAT='parquet', start_time='{}')", start_time)
}
(None, Some(end_time)) => {
format!("WITH (FORMAT='parquet', end_time='{}')", end_time)
}
(None, None) => "WITH (FORMAT='parquet')".to_string(),
};
// Create directory if not using S3
if !export_self.s3 {
let db_dir = format!("{}/{}/", export_self.catalog, schema);
operator.create_dir(&db_dir).await.context(OpenDalSnafu)?;
}
let (path, connection_part) = export_self.get_storage_params(&schema);
// Execute COPY DATABASE TO command
let sql = format!(
r#"COPY DATABASE "{}"."{}" TO '{}' {};"#,
self.catalog,
schema,
db_dir.to_str().unwrap(),
with_options
r#"COPY DATABASE "{}"."{}" TO '{}' WITH ({}){};"#,
export_self.catalog, schema, path, with_options_clone, connection_part
);
info!("Executing sql: {sql}");
export_self.database_client.sql_in_public(&sql).await?;
info!(
"Finished exporting {}.{} data to {}",
export_self.catalog, schema, path
);
info!("Executing sql: {sql}");
// Create copy_from.sql file
let copy_database_from_sql = format!(
r#"COPY DATABASE "{}"."{}" FROM '{}' WITH ({}){};"#,
export_self.catalog, schema, path, with_options_clone, connection_part
);
self.database_client.sql_in_public(&sql).await?;
let copy_from_path = export_self.get_file_path(&schema, "copy_from.sql");
export_self
.write_to_storage(
&operator,
&copy_from_path,
copy_database_from_sql.into_bytes(),
)
.await?;
info!(
"Finished exporting {}.{schema} data into path: {}",
self.catalog,
db_dir.to_string_lossy()
);
// The export copy from sql
let copy_from_file = db_dir.join("copy_from.sql");
let mut writer =
BufWriter::new(File::create(copy_from_file).await.context(FileIoSnafu)?);
let copy_database_from_sql = format!(
r#"COPY DATABASE "{}"."{}" FROM '{}' WITH (FORMAT='parquet');"#,
self.catalog,
"Finished exporting {}.{} copy_from.sql to {}",
export_self.catalog,
schema,
db_dir.to_str().unwrap()
export_self.format_output_path(&copy_from_path)
);
writer
.write(copy_database_from_sql.as_bytes())
.await
.context(FileIoSnafu)?;
writer.flush().await.context(FileIoSnafu)?;
info!("Finished exporting {}.{schema} copy_from.sql", self.catalog);
Ok::<(), Error>(())
})
});
}
let success = futures::future::join_all(tasks)
let success = self.execute_tasks(tasks).await;
let elapsed = timer.elapsed();
info!("Success {success}/{db_count} jobs, costs: {elapsed:?}");
Ok(())
}
fn get_file_path(&self, schema: &str, file_name: &str) -> String {
format!("{}/{}/{}", self.catalog, schema, file_name)
}
fn format_output_path(&self, file_path: &str) -> String {
if self.s3 {
format!(
"s3://{}/{}",
self.s3_bucket.as_ref().unwrap_or(&String::new()),
file_path
)
} else {
format!(
"{}/{}",
self.output_dir.as_ref().unwrap_or(&String::new()),
file_path
)
}
}
async fn write_to_storage(
&self,
op: &Operator,
file_path: &str,
content: Vec<u8>,
) -> Result<()> {
op.write(file_path, content).await.context(OpenDalSnafu)
}
fn get_storage_params(&self, schema: &str) -> (String, String) {
if self.s3 {
let s3_path = format!(
"s3://{}/{}/{}/",
// Safety: s3_bucket is required when s3 is enabled
self.s3_bucket.as_ref().unwrap(),
self.catalog,
schema
);
// endpoint is optional
let endpoint_option = if let Some(endpoint) = self.s3_endpoint.as_ref() {
format!(", ENDPOINT='{}'", endpoint)
} else {
String::new()
};
// Safety: All s3 options are required
let connection_options = format!(
"ACCESS_KEY_ID='{}', SECRET_ACCESS_KEY='{}', REGION='{}'{}",
self.s3_access_key.as_ref().unwrap(),
self.s3_secret_key.as_ref().unwrap(),
self.s3_region.as_ref().unwrap(),
endpoint_option
);
(s3_path, format!(" CONNECTION ({})", connection_options))
} else {
(
self.catalog_path()
.join(format!("{schema}/"))
.to_string_lossy()
.to_string(),
String::new(),
)
}
}
async fn execute_tasks(
&self,
tasks: Vec<impl std::future::Future<Output = Result<()>>>,
) -> usize {
futures::future::join_all(tasks)
.await
.into_iter()
.filter(|r| match r {
Ok(_) => true,
Err(e) => {
error!(e; "export database job failed");
error!(e; "export job failed");
false
}
})
.count();
let elapsed = timer.elapsed();
info!("Success {success}/{db_count} jobs, costs: {elapsed:?}");
Ok(())
.count()
}
}
@@ -493,3 +679,15 @@ impl Tool for Export {
}
}
}
/// Builds the WITH options string for SQL commands, assuming consistent syntax across S3 and local exports.
fn build_with_options(start_time: &Option<String>, end_time: &Option<String>) -> String {
let mut options = vec!["format = 'parquet'".to_string()];
if let Some(start) = start_time {
options.push(format!("start_time = '{}'", start));
}
if let Some(end) = end_time {
options.push(format!("end_time = '{}'", end));
}
options.join(", ")
}
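A minimal sketch of the OpenDAL pattern introduced above, assuming a local output directory; `write_to_storage` drives both the Fs and S3 backends through the same `Operator::write` call.

use opendal::layers::LoggingLayer;
use opendal::{services, Operator};

// Build a filesystem-backed operator rooted at `root` and write a single file.
async fn write_sql_file(root: &str, path: &str, content: Vec<u8>) -> opendal::Result<()> {
    let op = Operator::new(services::Fs::default().root(root))?
        .layer(LoggingLayer::default())
        .finish();
    op.write(path, content).await?;
    Ok(())
}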

View File

@@ -16,7 +16,6 @@
mod client;
pub mod client_manager;
#[cfg(feature = "testing")]
mod database;
pub mod error;
pub mod flow;
@@ -34,7 +33,6 @@ pub use common_recordbatch::{RecordBatches, SendableRecordBatchStream};
use snafu::OptionExt;
pub use self::client::Client;
#[cfg(feature = "testing")]
pub use self::database::Database;
pub use self::error::{Error, Result};
use crate::error::{IllegalDatabaseResponseSnafu, ServerSnafu};

View File

@@ -13,7 +13,7 @@
// limitations under the License.
use enum_dispatch::enum_dispatch;
use rand::seq::SliceRandom;
use rand::seq::IndexedRandom;
#[enum_dispatch]
pub trait LoadBalance {
@@ -37,7 +37,7 @@ pub struct Random;
impl LoadBalance for Random {
fn get_peer<'a>(&self, peers: &'a [String]) -> Option<&'a String> {
peers.choose(&mut rand::thread_rng())
peers.choose(&mut rand::rng())
}
}
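The same rand 0.9 migration shows up here: slice selection moves from `SliceRandom` to `IndexedRandom`. A compilable sketch under that assumption:

use rand::seq::IndexedRandom;

// `choose` now comes from the IndexedRandom trait, and rand::rng() replaces thread_rng().
fn pick_peer(peers: &[String]) -> Option<&String> {
    peers.choose(&mut rand::rng())
}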

View File

@@ -30,7 +30,7 @@ use datanode::datanode::{Datanode, DatanodeBuilder};
use datanode::service::DatanodeServiceBuilder;
use meta_client::{MetaClientOptions, MetaClientType};
use servers::Mode;
use snafu::{OptionExt, ResultExt};
use snafu::{ensure, OptionExt, ResultExt};
use tracing_appender::non_blocking::WorkerGuard;
use crate::error::{
@@ -126,10 +126,14 @@ impl SubCommand {
struct StartCommand {
#[clap(long)]
node_id: Option<u64>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long)]
rpc_hostname: Option<String>,
/// The address to bind the gRPC server.
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
/// The address advertised to the metasrv, and used for connections from outside the host.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "rpc-hostname")]
rpc_server_addr: Option<String>,
#[clap(long, value_delimiter = ',', num_args = 1..)]
metasrv_addrs: Option<Vec<String>>,
#[clap(short, long)]
@@ -181,18 +185,18 @@ impl StartCommand {
tokio_console_addr: global_options.tokio_console_addr.clone(),
};
if let Some(addr) = &self.rpc_addr {
opts.grpc.addr.clone_from(addr);
if let Some(addr) = &self.rpc_bind_addr {
opts.grpc.bind_addr.clone_from(addr);
} else if let Some(addr) = &opts.rpc_addr {
warn!("Use the deprecated attribute `DatanodeOptions.rpc_addr`, please use `grpc.addr` instead.");
opts.grpc.addr.clone_from(addr);
opts.grpc.bind_addr.clone_from(addr);
}
if let Some(hostname) = &self.rpc_hostname {
opts.grpc.hostname.clone_from(hostname);
} else if let Some(hostname) = &opts.rpc_hostname {
if let Some(server_addr) = &self.rpc_server_addr {
opts.grpc.server_addr.clone_from(server_addr);
} else if let Some(server_addr) = &opts.rpc_hostname {
warn!("Use the deprecated attribute `DatanodeOptions.rpc_hostname`, please use `grpc.hostname` instead.");
opts.grpc.hostname.clone_from(hostname);
opts.grpc.server_addr.clone_from(server_addr);
}
if let Some(runtime_size) = opts.rpc_runtime_size {
@@ -219,15 +223,14 @@ impl StartCommand {
.get_or_insert_with(MetaClientOptions::default)
.metasrv_addrs
.clone_from(metasrv_addrs);
opts.mode = Mode::Distributed;
}
if let (Mode::Distributed, None) = (&opts.mode, &opts.node_id) {
return MissingConfigSnafu {
msg: "Missing node id option",
ensure!(
opts.node_id.is_some(),
MissingConfigSnafu {
msg: "Missing node id option"
}
.fail();
}
);
if let Some(data_home) = &self.data_home {
opts.storage.data_home.clone_from(data_home);
@@ -277,13 +280,12 @@ impl StartCommand {
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
let mut plugins = Plugins::new();
plugins::setup_datanode_plugins(&mut plugins, &plugin_opts, &opts)
.await
.context(StartDatanodeSnafu)?;
let cluster_id = 0; // TODO(hl): read from config
let member_id = opts
.node_id
.context(MissingConfigSnafu { msg: "'node_id'" })?;
@@ -293,9 +295,9 @@ impl StartCommand {
})?;
let meta_client = meta_client::create_meta_client(
cluster_id,
MetaClientType::Datanode { member_id },
meta_config,
None,
)
.await
.context(MetaClientInitSnafu)?;
@@ -311,7 +313,7 @@ impl StartCommand {
.build(),
);
let mut datanode = DatanodeBuilder::new(opts.clone(), plugins)
let mut datanode = DatanodeBuilder::new(opts.clone(), plugins, Mode::Distributed)
.with_meta_client(meta_client)
.with_kv_backend(meta_backend)
.with_cache_registry(layered_cache_registry)
@@ -333,6 +335,7 @@ impl StartCommand {
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use std::io::Write;
use std::time::Duration;
@@ -340,7 +343,6 @@ mod tests {
use common_test_util::temp_dir::create_named_temp_file;
use datanode::config::{FileConfig, GcsConfig, ObjectStoreConfig, S3Config};
use servers::heartbeat_options::HeartbeatOptions;
use servers::Mode;
use super::*;
use crate::options::GlobalOptions;
@@ -357,8 +359,8 @@ mod tests {
rpc_addr = "127.0.0.1:4001"
rpc_hostname = "192.168.0.1"
[grpc]
addr = "127.0.0.1:3001"
hostname = "127.0.0.1"
bind_addr = "127.0.0.1:3001"
server_addr = "127.0.0.1"
runtime_size = 8
"#;
write!(file, "{}", toml_str).unwrap();
@@ -369,8 +371,8 @@ mod tests {
};
let options = cmd.load_options(&Default::default()).unwrap().component;
assert_eq!("127.0.0.1:4001".to_string(), options.grpc.addr);
assert_eq!("192.168.0.1".to_string(), options.grpc.hostname);
assert_eq!("127.0.0.1:4001".to_string(), options.grpc.bind_addr);
assert_eq!("192.168.0.1".to_string(), options.grpc.server_addr);
}
#[test]
@@ -406,7 +408,7 @@ mod tests {
sync_write = false
[storage]
data_home = "/tmp/greptimedb/"
data_home = "./greptimedb_data/"
type = "File"
[[storage.providers]]
@@ -420,7 +422,7 @@ mod tests {
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
"#;
write!(file, "{}", toml_str).unwrap();
@@ -431,7 +433,7 @@ mod tests {
let options = cmd.load_options(&Default::default()).unwrap().component;
assert_eq!("127.0.0.1:3001".to_string(), options.grpc.addr);
assert_eq!("127.0.0.1:3001".to_string(), options.grpc.bind_addr);
assert_eq!(Some(42), options.node_id);
let DatanodeWalConfig::RaftEngine(raft_engine_config) = options.wal else {
@@ -467,7 +469,7 @@ mod tests {
assert_eq!(10000, ddl_timeout.as_millis());
assert_eq!(3000, timeout.as_millis());
assert!(tcp_nodelay);
assert_eq!("/tmp/greptimedb/", options.storage.data_home);
assert_eq!("./greptimedb_data/", options.storage.data_home);
assert!(matches!(
&options.storage.store,
ObjectStoreConfig::File(FileConfig { .. })
@@ -483,27 +485,14 @@ mod tests {
));
assert_eq!("debug", options.logging.level.unwrap());
assert_eq!("/tmp/greptimedb/test/logs".to_string(), options.logging.dir);
assert_eq!(
"./greptimedb_data/test/logs".to_string(),
options.logging.dir
);
}
#[test]
fn test_try_from_cmd() {
let opt = StartCommand::default()
.load_options(&GlobalOptions::default())
.unwrap()
.component;
assert_eq!(Mode::Standalone, opt.mode);
let opt = (StartCommand {
node_id: Some(42),
metasrv_addrs: Some(vec!["127.0.0.1:3002".to_string()]),
..Default::default()
})
.load_options(&GlobalOptions::default())
.unwrap()
.component;
assert_eq!(Mode::Distributed, opt.mode);
assert!((StartCommand {
metasrv_addrs: Some(vec!["127.0.0.1:3002".to_string()]),
..Default::default()
@@ -522,11 +511,23 @@ mod tests {
#[test]
fn test_load_log_options_from_cli() {
let cmd = StartCommand::default();
let mut cmd = StartCommand::default();
let result = cmd.load_options(&GlobalOptions {
log_dir: Some("./greptimedb_data/test/logs".to_string()),
log_level: Some("debug".to_string()),
#[cfg(feature = "tokio-console")]
tokio_console_addr: None,
});
// Missing node_id.
assert_matches!(result, Err(crate::error::Error::MissingConfig { .. }));
cmd.node_id = Some(42);
let options = cmd
.load_options(&GlobalOptions {
log_dir: Some("/tmp/greptimedb/test/logs".to_string()),
log_dir: Some("./greptimedb_data/test/logs".to_string()),
log_level: Some("debug".to_string()),
#[cfg(feature = "tokio-console")]
@@ -536,7 +537,7 @@ mod tests {
.component;
let logging_opt = options.logging;
assert_eq!("/tmp/greptimedb/test/logs", logging_opt.dir);
assert_eq!("./greptimedb_data/test/logs", logging_opt.dir);
assert_eq!("debug", logging_opt.level.as_ref().unwrap());
}
@@ -565,11 +566,11 @@ mod tests {
[storage]
type = "File"
data_home = "/tmp/greptimedb/"
data_home = "./greptimedb_data/"
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
"#;
write!(file, "{}", toml_str).unwrap();
@@ -645,7 +646,7 @@ mod tests {
opts.http.addr,
DatanodeOptions::default().component.http.addr
);
assert_eq!(opts.grpc.hostname, "10.103.174.219");
assert_eq!(opts.grpc.server_addr, "10.103.174.219");
},
);
}
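
A minimal, standalone sketch of the flag-rename pattern used in this file: the new clap field keeps the old spelling working via `alias`, so scripts that still pass `--rpc-addr`/`--rpc-hostname` continue to parse. The struct name here is illustrative, not the crate's real command type.

    use clap::Parser;

    #[derive(Debug, Default, Parser)]
    struct RenamedFlags {
        /// The address to bind the gRPC server.
        #[clap(long, alias = "rpc-addr")]
        rpc_bind_addr: Option<String>,
        /// The address advertised to the metasrv.
        #[clap(long, alias = "rpc-hostname")]
        rpc_server_addr: Option<String>,
    }

    fn main() {
        // Both the new and the deprecated spellings fill the same field.
        let new = RenamedFlags::parse_from(["app", "--rpc-bind-addr", "0.0.0.0:4001"]);
        let old = RenamedFlags::parse_from(["app", "--rpc-addr", "0.0.0.0:4001"]);
        assert_eq!(new.rpc_bind_addr, old.rpc_bind_addr);
    }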

View File

@@ -100,6 +100,13 @@ pub enum Error {
source: flow::Error,
},
#[snafu(display("Servers error"))]
Servers {
#[snafu(implicit)]
location: Location,
source: servers::error::Error,
},
#[snafu(display("Failed to start frontend"))]
StartFrontend {
#[snafu(implicit)]
@@ -365,6 +372,7 @@ impl ErrorExt for Error {
Error::ShutdownFrontend { source, .. } => source.status_code(),
Error::StartMetaServer { source, .. } => source.status_code(),
Error::ShutdownMetaServer { source, .. } => source.status_code(),
Error::Servers { source, .. } => source.status_code(),
Error::BuildMetaServer { source, .. } => source.status_code(),
Error::UnsupportedSelectorType { source, .. } => source.status_code(),
Error::BuildCli { source, .. } => source.status_code(),

View File

@@ -34,8 +34,7 @@ use common_telemetry::logging::TracingOptions;
use common_version::{short_version, version};
use flow::{FlownodeBuilder, FlownodeInstance, FrontendInvoker};
use meta_client::{MetaClientOptions, MetaClientType};
use servers::Mode;
use snafu::{OptionExt, ResultExt};
use snafu::{ensure, OptionExt, ResultExt};
use tracing_appender::non_blocking::WorkerGuard;
use crate::error::{
@@ -129,11 +128,13 @@ struct StartCommand {
#[clap(long)]
node_id: Option<u64>,
/// Bind address for the gRPC server.
#[clap(long)]
rpc_addr: Option<String>,
/// Hostname for the gRPC server.
#[clap(long)]
rpc_hostname: Option<String>,
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
/// The address advertised to the metasrv, and used for connections from outside the host.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "rpc-hostname")]
rpc_server_addr: Option<String>,
/// Metasrv address list;
#[clap(long, value_delimiter = ',', num_args = 1..)]
metasrv_addrs: Option<Vec<String>>,
@@ -184,12 +185,12 @@ impl StartCommand {
tokio_console_addr: global_options.tokio_console_addr.clone(),
};
if let Some(addr) = &self.rpc_addr {
opts.grpc.addr.clone_from(addr);
if let Some(addr) = &self.rpc_bind_addr {
opts.grpc.bind_addr.clone_from(addr);
}
if let Some(hostname) = &self.rpc_hostname {
opts.grpc.hostname.clone_from(hostname);
if let Some(server_addr) = &self.rpc_server_addr {
opts.grpc.server_addr.clone_from(server_addr);
}
if let Some(node_id) = self.node_id {
@@ -201,7 +202,6 @@ impl StartCommand {
.get_or_insert_with(MetaClientOptions::default)
.metasrv_addrs
.clone_from(metasrv_addrs);
opts.mode = Mode::Distributed;
}
if let Some(http_addr) = &self.http_addr {
@@ -212,12 +212,12 @@ impl StartCommand {
opts.http.timeout = Duration::from_secs(http_timeout);
}
if let (Mode::Distributed, None) = (&opts.mode, &opts.node_id) {
return MissingConfigSnafu {
msg: "Missing node id option",
ensure!(
opts.node_id.is_some(),
MissingConfigSnafu {
msg: "Missing node id option"
}
.fail();
}
);
Ok(())
}
@@ -237,10 +237,7 @@ impl StartCommand {
info!("Flownode options: {:#?}", opts);
let mut opts = opts.component;
opts.grpc.detect_hostname();
// TODO(discord9): make it not optionale after cluster id is required
let cluster_id = opts.cluster_id.unwrap_or(0);
opts.grpc.detect_server_addr();
let member_id = opts
.node_id
@@ -251,9 +248,9 @@ impl StartCommand {
})?;
let meta_client = meta_client::create_meta_client(
cluster_id,
MetaClientType::Flownode { member_id },
meta_config,
None,
)
.await
.context(MetaClientInitSnafu)?;
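
A minimal, standalone sketch of the `snafu::ensure!` guard used for the node-id checks above: it replaces the manual `return MissingConfigSnafu { .. }.fail()` branch with a single macro call. The error enum here is illustrative, not the crate's real one.

    use snafu::{ensure, Snafu};

    #[derive(Debug, Snafu)]
    enum Error {
        #[snafu(display("Missing config: {msg}"))]
        MissingConfig { msg: String },
    }

    fn check_node_id(node_id: Option<u64>) -> Result<(), Error> {
        ensure!(
            node_id.is_some(),
            MissingConfigSnafu {
                msg: "Missing node id option"
            }
        );
        Ok(())
    }

    fn main() {
        assert!(check_node_id(Some(42)).is_ok());
        assert!(check_node_id(None).is_err());
    }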

View File

@@ -32,28 +32,25 @@ use common_telemetry::info;
use common_telemetry::logging::TracingOptions;
use common_time::timezone::set_default_timezone;
use common_version::{short_version, version};
use frontend::frontend::Frontend;
use frontend::heartbeat::HeartbeatTask;
use frontend::instance::builder::FrontendBuilder;
use frontend::instance::{FrontendInstance, Instance as FeInstance};
use frontend::server::Services;
use meta_client::{MetaClientOptions, MetaClientType};
use query::stats::StatementStatistics;
use servers::export_metrics::ExportMetricsTask;
use servers::tls::{TlsMode, TlsOption};
use snafu::{OptionExt, ResultExt};
use tracing_appender::non_blocking::WorkerGuard;
use crate::error::{
self, InitTimezoneSnafu, LoadLayeredConfigSnafu, MetaClientInitSnafu, MissingConfigSnafu,
Result, StartFrontendSnafu,
};
use crate::error::{self, Result};
use crate::options::{GlobalOptions, GreptimeOptions};
use crate::{log_versions, App};
type FrontendOptions = GreptimeOptions<frontend::frontend::FrontendOptions>;
pub struct Instance {
frontend: FeInstance,
frontend: Frontend,
// Keep the logging guard to prevent the worker from being dropped.
_guard: Vec<WorkerGuard>,
}
@@ -61,20 +58,17 @@ pub struct Instance {
pub const APP_NAME: &str = "greptime-frontend";
impl Instance {
pub fn new(frontend: FeInstance, guard: Vec<WorkerGuard>) -> Self {
Self {
frontend,
_guard: guard,
}
pub fn new(frontend: Frontend, _guard: Vec<WorkerGuard>) -> Self {
Self { frontend, _guard }
}
pub fn mut_inner(&mut self) -> &mut FeInstance {
&mut self.frontend
}
pub fn inner(&self) -> &FeInstance {
pub fn inner(&self) -> &Frontend {
&self.frontend
}
pub fn mut_inner(&mut self) -> &mut Frontend {
&mut self.frontend
}
}
#[async_trait]
@@ -84,11 +78,15 @@ impl App for Instance {
}
async fn start(&mut self) -> Result<()> {
plugins::start_frontend_plugins(self.frontend.plugins().clone())
let plugins = self.frontend.instance.plugins().clone();
plugins::start_frontend_plugins(plugins)
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
self.frontend.start().await.context(StartFrontendSnafu)
self.frontend
.start()
.await
.context(error::StartFrontendSnafu)
}
async fn stop(&self) -> Result<()> {
@@ -136,13 +134,19 @@ impl SubCommand {
#[derive(Debug, Default, Parser)]
pub struct StartCommand {
/// The address to bind the gRPC server.
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
/// The address advertised to the metasrv, and used for connections from outside the host.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "rpc-hostname")]
rpc_server_addr: Option<String>,
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
http_timeout: Option<u64>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
@@ -172,7 +176,7 @@ impl StartCommand {
self.config_file.as_deref(),
self.env_prefix.as_ref(),
)
.context(LoadLayeredConfigSnafu)?;
.context(error::LoadLayeredConfigSnafu)?;
self.merge_with_cli_options(global_options, &mut opts)?;
@@ -218,11 +222,15 @@ impl StartCommand {
opts.http.disable_dashboard = disable_dashboard;
}
if let Some(addr) = &self.rpc_addr {
opts.grpc.addr.clone_from(addr);
if let Some(addr) = &self.rpc_bind_addr {
opts.grpc.bind_addr.clone_from(addr);
opts.grpc.tls = tls_opts.clone();
}
if let Some(addr) = &self.rpc_server_addr {
opts.grpc.server_addr.clone_from(addr);
}
if let Some(addr) = &self.mysql_addr {
opts.mysql.enable = true;
opts.mysql.addr.clone_from(addr);
@@ -269,30 +277,32 @@ impl StartCommand {
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
let mut plugins = Plugins::new();
plugins::setup_frontend_plugins(&mut plugins, &plugin_opts, &opts)
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
set_default_timezone(opts.default_timezone.as_deref()).context(InitTimezoneSnafu)?;
set_default_timezone(opts.default_timezone.as_deref()).context(error::InitTimezoneSnafu)?;
let meta_client_options = opts.meta_client.as_ref().context(MissingConfigSnafu {
msg: "'meta_client'",
})?;
let meta_client_options = opts
.meta_client
.as_ref()
.context(error::MissingConfigSnafu {
msg: "'meta_client'",
})?;
let cache_max_capacity = meta_client_options.metadata_cache_max_capacity;
let cache_ttl = meta_client_options.metadata_cache_ttl;
let cache_tti = meta_client_options.metadata_cache_tti;
let cluster_id = 0; // (TODO: jeremy): It is currently a reserved field and has not been enabled.
let meta_client = meta_client::create_meta_client(
cluster_id,
MetaClientType::Frontend,
meta_client_options,
Some(&plugins),
)
.await
.context(MetaClientInitSnafu)?;
.context(error::MetaClientInitSnafu)?;
// TODO(discord9): add helper function to ease the creation of cache registry&such
let cached_meta_backend =
@@ -339,6 +349,7 @@ impl StartCommand {
opts.heartbeat.clone(),
Arc::new(executor),
);
let heartbeat_task = Some(heartbeat_task);
// frontend to datanode need not timeout.
// Some queries are expected to take long time.
@@ -350,7 +361,7 @@ impl StartCommand {
};
let client = NodeClients::new(channel_config);
let mut instance = FrontendBuilder::new(
let instance = FrontendBuilder::new(
opts.clone(),
cached_meta_backend.clone(),
layered_cache_registry.clone(),
@@ -361,20 +372,27 @@ impl StartCommand {
)
.with_plugin(plugins.clone())
.with_local_cache_invalidator(layered_cache_registry)
.with_heartbeat_task(heartbeat_task)
.try_build()
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
let instance = Arc::new(instance);
let servers = Services::new(opts, Arc::new(instance.clone()), plugins)
let export_metrics_task = ExportMetricsTask::try_new(&opts.export_metrics, Some(&plugins))
.context(error::ServersSnafu)?;
let servers = Services::new(opts, instance.clone(), plugins)
.build()
.await
.context(StartFrontendSnafu)?;
instance
.build_servers(servers)
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
Ok(Instance::new(instance, guard))
let frontend = Frontend {
instance,
servers,
heartbeat_task,
export_metrics_task,
};
Ok(Instance::new(frontend, guard))
}
}
@@ -413,7 +431,7 @@ mod tests {
let default_opts = FrontendOptions::default().component;
assert_eq!(opts.grpc.addr, default_opts.grpc.addr);
assert_eq!(opts.grpc.bind_addr, default_opts.grpc.bind_addr);
assert!(opts.mysql.enable);
assert_eq!(opts.mysql.runtime_size, default_opts.mysql.runtime_size);
assert!(opts.postgres.enable);
@@ -434,7 +452,7 @@ mod tests {
[http]
addr = "127.0.0.1:4000"
timeout = "30s"
timeout = "0s"
body_limit = "2GB"
[opentsdb]
@@ -442,7 +460,7 @@ mod tests {
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
"#;
write!(file, "{}", toml_str).unwrap();
@@ -455,12 +473,15 @@ mod tests {
let fe_opts = command.load_options(&Default::default()).unwrap().component;
assert_eq!("127.0.0.1:4000".to_string(), fe_opts.http.addr);
assert_eq!(Duration::from_secs(30), fe_opts.http.timeout);
assert_eq!(Duration::from_secs(0), fe_opts.http.timeout);
assert_eq!(ReadableSize::gb(2), fe_opts.http.body_limit);
assert_eq!("debug", fe_opts.logging.level.as_ref().unwrap());
assert_eq!("/tmp/greptimedb/test/logs".to_string(), fe_opts.logging.dir);
assert_eq!(
"./greptimedb_data/test/logs".to_string(),
fe_opts.logging.dir
);
assert!(!fe_opts.opentsdb.enable);
}
@@ -499,7 +520,7 @@ mod tests {
let options = cmd
.load_options(&GlobalOptions {
log_dir: Some("/tmp/greptimedb/test/logs".to_string()),
log_dir: Some("./greptimedb_data/test/logs".to_string()),
log_level: Some("debug".to_string()),
#[cfg(feature = "tokio-console")]
@@ -509,7 +530,7 @@ mod tests {
.component;
let logging_opt = options.logging;
assert_eq!("/tmp/greptimedb/test/logs", logging_opt.dir);
assert_eq!("./greptimedb_data/test/logs", logging_opt.dir);
assert_eq!("debug", logging_opt.level.as_ref().unwrap());
}
@@ -604,7 +625,7 @@ mod tests {
assert_eq!(fe_opts.http.addr, "127.0.0.1:14000");
// Should be default value.
assert_eq!(fe_opts.grpc.addr, GrpcOptions::default().addr);
assert_eq!(fe_opts.grpc.bind_addr, GrpcOptions::default().bind_addr);
},
);
}

View File

@@ -42,7 +42,7 @@ pub struct Instance {
}
impl Instance {
fn new(instance: MetasrvInstance, guard: Vec<WorkerGuard>) -> Self {
pub fn new(instance: MetasrvInstance, guard: Vec<WorkerGuard>) -> Self {
Self {
instance,
_guard: guard,
@@ -133,11 +133,15 @@ impl SubCommand {
#[derive(Debug, Default, Parser)]
struct StartCommand {
#[clap(long)]
bind_addr: Option<String>,
#[clap(long)]
server_addr: Option<String>,
#[clap(long, aliases = ["store-addr"], value_delimiter = ',', num_args = 1..)]
/// The address to bind the gRPC server.
#[clap(long, alias = "bind-addr")]
rpc_bind_addr: Option<String>,
/// The communication server address for the frontend and datanode to connect to metasrv.
/// If left empty or unset, the server will automatically use the IP address of the first network interface
/// on the host, with the same port number as the one specified in `rpc_bind_addr`.
#[clap(long, alias = "server-addr")]
rpc_server_addr: Option<String>,
#[clap(long, alias = "store-addr", value_delimiter = ',', num_args = 1..)]
store_addrs: Option<Vec<String>>,
#[clap(short, long)]
config_file: Option<String>,
@@ -201,11 +205,11 @@ impl StartCommand {
tokio_console_addr: global_options.tokio_console_addr.clone(),
};
if let Some(addr) = &self.bind_addr {
if let Some(addr) = &self.rpc_bind_addr {
opts.bind_addr.clone_from(addr);
}
if let Some(addr) = &self.server_addr {
if let Some(addr) = &self.rpc_server_addr {
opts.server_addr.clone_from(addr);
}
@@ -269,11 +273,13 @@ impl StartCommand {
log_versions(version(), short_version(), APP_NAME);
info!("Metasrv start command: {:#?}", self);
info!("Metasrv options: {:#?}", opts);
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.detect_server_addr();
info!("Metasrv options: {:#?}", opts);
let mut plugins = Plugins::new();
plugins::setup_metasrv_plugins(&mut plugins, &plugin_opts, &opts)
.await
@@ -306,8 +312,8 @@ mod tests {
#[test]
fn test_read_from_cmd() {
let cmd = StartCommand {
bind_addr: Some("127.0.0.1:3002".to_string()),
server_addr: Some("127.0.0.1:3002".to_string()),
rpc_bind_addr: Some("127.0.0.1:3002".to_string()),
rpc_server_addr: Some("127.0.0.1:3002".to_string()),
store_addrs: Some(vec!["127.0.0.1:2380".to_string()]),
selector: Some("LoadBased".to_string()),
..Default::default()
@@ -331,7 +337,7 @@ mod tests {
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
[failure_detector]
threshold = 8.0
@@ -352,7 +358,10 @@ mod tests {
assert_eq!(vec!["127.0.0.1:2379".to_string()], options.store_addrs);
assert_eq!(SelectorType::LeaseBased, options.selector);
assert_eq!("debug", options.logging.level.as_ref().unwrap());
assert_eq!("/tmp/greptimedb/test/logs".to_string(), options.logging.dir);
assert_eq!(
"./greptimedb_data/test/logs".to_string(),
options.logging.dir
);
assert_eq!(8.0, options.failure_detector.threshold);
assert_eq!(
100.0,
@@ -381,8 +390,8 @@ mod tests {
#[test]
fn test_load_log_options_from_cli() {
let cmd = StartCommand {
bind_addr: Some("127.0.0.1:3002".to_string()),
server_addr: Some("127.0.0.1:3002".to_string()),
rpc_bind_addr: Some("127.0.0.1:3002".to_string()),
rpc_server_addr: Some("127.0.0.1:3002".to_string()),
store_addrs: Some(vec!["127.0.0.1:2380".to_string()]),
selector: Some("LoadBased".to_string()),
..Default::default()
@@ -390,7 +399,7 @@ mod tests {
let options = cmd
.load_options(&GlobalOptions {
log_dir: Some("/tmp/greptimedb/test/logs".to_string()),
log_dir: Some("./greptimedb_data/test/logs".to_string()),
log_level: Some("debug".to_string()),
#[cfg(feature = "tokio-console")]
@@ -400,7 +409,7 @@ mod tests {
.component;
let logging_opt = options.logging;
assert_eq!("/tmp/greptimedb/test/logs", logging_opt.dir);
assert_eq!("./greptimedb_data/test/logs", logging_opt.dir);
assert_eq!("debug", logging_opt.level.as_ref().unwrap());
}
@@ -418,7 +427,7 @@ mod tests {
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
"#;
write!(file, "{}", toml_str).unwrap();

View File

@@ -42,6 +42,7 @@ use common_meta::kv_backend::KvBackendRef;
use common_meta::node_manager::NodeManagerRef;
use common_meta::peer::Peer;
use common_meta::region_keeper::MemoryRegionKeeper;
use common_meta::region_registry::LeaderRegionRegistry;
use common_meta::sequence::SequenceBuilder;
use common_meta::wal_options_allocator::{build_wal_options_allocator, WalOptionsAllocatorRef};
use common_procedure::{ProcedureInfo, ProcedureManagerRef};
@@ -55,18 +56,19 @@ use datanode::datanode::{Datanode, DatanodeBuilder};
use datanode::region_server::RegionServer;
use file_engine::config::EngineConfig as FileEngineConfig;
use flow::{FlowConfig, FlowWorkerManager, FlownodeBuilder, FlownodeOptions, FrontendInvoker};
use frontend::frontend::FrontendOptions;
use frontend::frontend::{Frontend, FrontendOptions};
use frontend::instance::builder::FrontendBuilder;
use frontend::instance::{FrontendInstance, Instance as FeInstance, StandaloneDatanodeManager};
use frontend::instance::{Instance as FeInstance, StandaloneDatanodeManager};
use frontend::server::Services;
use frontend::service_config::{
InfluxdbOptions, MysqlOptions, OpentsdbOptions, PostgresOptions, PromStoreOptions,
InfluxdbOptions, JaegerOptions, MysqlOptions, OpentsdbOptions, PostgresOptions,
PromStoreOptions,
};
use meta_srv::metasrv::{FLOW_ID_SEQ, TABLE_ID_SEQ};
use mito2::config::MitoConfig;
use query::stats::StatementStatistics;
use serde::{Deserialize, Serialize};
use servers::export_metrics::ExportMetricsOption;
use servers::export_metrics::{ExportMetricsOption, ExportMetricsTask};
use servers::grpc::GrpcOptions;
use servers::http::HttpOptions;
use servers::tls::{TlsMode, TlsOption};
@@ -75,15 +77,9 @@ use snafu::ResultExt;
use tokio::sync::{broadcast, RwLock};
use tracing_appender::non_blocking::WorkerGuard;
use crate::error::{
BuildCacheRegistrySnafu, BuildWalOptionsAllocatorSnafu, CreateDirSnafu, IllegalConfigSnafu,
InitDdlManagerSnafu, InitMetadataSnafu, InitTimezoneSnafu, LoadLayeredConfigSnafu, OtherSnafu,
Result, ShutdownDatanodeSnafu, ShutdownFlownodeSnafu, ShutdownFrontendSnafu,
StartDatanodeSnafu, StartFlownodeSnafu, StartFrontendSnafu, StartProcedureManagerSnafu,
StartWalOptionsAllocatorSnafu, StopProcedureManagerSnafu,
};
use crate::error::Result;
use crate::options::{GlobalOptions, GreptimeOptions};
use crate::{log_versions, App};
use crate::{error, log_versions, App};
pub const APP_NAME: &str = "greptime-standalone";
@@ -131,7 +127,6 @@ impl SubCommand {
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
#[serde(default)]
pub struct StandaloneOptions {
pub mode: Mode,
pub enable_telemetry: bool,
pub default_timezone: Option<String>,
pub http: HttpOptions,
@@ -140,6 +135,7 @@ pub struct StandaloneOptions {
pub postgres: PostgresOptions,
pub opentsdb: OpentsdbOptions,
pub influxdb: InfluxdbOptions,
pub jaeger: JaegerOptions,
pub prom_store: PromStoreOptions,
pub wal: DatanodeWalConfig,
pub storage: StorageConfig,
@@ -160,7 +156,6 @@ pub struct StandaloneOptions {
impl Default for StandaloneOptions {
fn default() -> Self {
Self {
mode: Mode::Standalone,
enable_telemetry: true,
default_timezone: None,
http: HttpOptions::default(),
@@ -169,6 +164,7 @@ impl Default for StandaloneOptions {
postgres: PostgresOptions::default(),
opentsdb: OpentsdbOptions::default(),
influxdb: InfluxdbOptions::default(),
jaeger: JaegerOptions::default(),
prom_store: PromStoreOptions::default(),
wal: DatanodeWalConfig::default(),
storage: StorageConfig::default(),
@@ -217,6 +213,7 @@ impl StandaloneOptions {
postgres: cloned_opts.postgres,
opentsdb: cloned_opts.opentsdb,
influxdb: cloned_opts.influxdb,
jaeger: cloned_opts.jaeger,
prom_store: cloned_opts.prom_store,
meta_client: None,
logging: cloned_opts.logging,
@@ -239,7 +236,6 @@ impl StandaloneOptions {
grpc: cloned_opts.grpc,
init_regions_in_background: cloned_opts.init_regions_in_background,
init_regions_parallelism: cloned_opts.init_regions_parallelism,
mode: Mode::Standalone,
..Default::default()
}
}
@@ -247,13 +243,12 @@ impl StandaloneOptions {
pub struct Instance {
datanode: Datanode,
frontend: FeInstance,
frontend: Frontend,
// TODO(discord9): wrapped it in flownode instance instead
flow_worker_manager: Arc<FlowWorkerManager>,
flow_shutdown: broadcast::Sender<()>,
procedure_manager: ProcedureManagerRef,
wal_options_allocator: WalOptionsAllocatorRef,
// Keep the logging guard to prevent the worker from being dropped.
_guard: Vec<WorkerGuard>,
}
@@ -277,21 +272,26 @@ impl App for Instance {
self.procedure_manager
.start()
.await
.context(StartProcedureManagerSnafu)?;
.context(error::StartProcedureManagerSnafu)?;
self.wal_options_allocator
.start()
.await
.context(StartWalOptionsAllocatorSnafu)?;
.context(error::StartWalOptionsAllocatorSnafu)?;
plugins::start_frontend_plugins(self.frontend.plugins().clone())
plugins::start_frontend_plugins(self.frontend.instance.plugins().clone())
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
self.frontend
.start()
.await
.context(error::StartFrontendSnafu)?;
self.frontend.start().await.context(StartFrontendSnafu)?;
self.flow_worker_manager
.clone()
.run_background(Some(self.flow_shutdown.subscribe()));
Ok(())
}
@@ -299,17 +299,18 @@ impl App for Instance {
self.frontend
.shutdown()
.await
.context(ShutdownFrontendSnafu)?;
.context(error::ShutdownFrontendSnafu)?;
self.procedure_manager
.stop()
.await
.context(StopProcedureManagerSnafu)?;
.context(error::StopProcedureManagerSnafu)?;
self.datanode
.shutdown()
.await
.context(ShutdownDatanodeSnafu)?;
.context(error::ShutdownDatanodeSnafu)?;
self.flow_shutdown
.send(())
.map_err(|_e| {
@@ -318,7 +319,8 @@ impl App for Instance {
}
.build()
})
.context(ShutdownFlownodeSnafu)?;
.context(error::ShutdownFlownodeSnafu)?;
info!("Datanode instance stopped.");
Ok(())
@@ -329,8 +331,8 @@ impl App for Instance {
pub struct StartCommand {
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long, alias = "rpc-addr")]
rpc_bind_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
@@ -364,7 +366,7 @@ impl StartCommand {
self.config_file.as_deref(),
self.env_prefix.as_ref(),
)
.context(LoadLayeredConfigSnafu)?;
.context(error::LoadLayeredConfigSnafu)?;
self.merge_with_cli_options(global_options, &mut opts.component)?;
@@ -377,9 +379,6 @@ impl StartCommand {
global_options: &GlobalOptions,
opts: &mut StandaloneOptions,
) -> Result<()> {
// Should always be standalone mode.
opts.mode = Mode::Standalone;
if let Some(dir) = &global_options.log_dir {
opts.logging.dir.clone_from(dir);
}
@@ -407,17 +406,17 @@ impl StartCommand {
opts.storage.data_home.clone_from(data_home);
}
if let Some(addr) = &self.rpc_addr {
if let Some(addr) = &self.rpc_bind_addr {
// frontend grpc addr conflict with datanode default grpc addr
let datanode_grpc_addr = DatanodeOptions::default().grpc.addr;
let datanode_grpc_addr = DatanodeOptions::default().grpc.bind_addr;
if addr.eq(&datanode_grpc_addr) {
return IllegalConfigSnafu {
return error::IllegalConfigSnafu {
msg: format!(
"gRPC listen address conflicts with datanode reserved gRPC addr: {datanode_grpc_addr}",
),
}.fail();
}
opts.grpc.addr.clone_from(addr)
opts.grpc.bind_addr.clone_from(addr)
}
if let Some(addr) = &self.mysql_addr {
@@ -464,33 +463,34 @@ impl StartCommand {
let mut plugins = Plugins::new();
let plugin_opts = opts.plugins;
let mut opts = opts.component;
opts.grpc.detect_hostname();
opts.grpc.detect_server_addr();
let fe_opts = opts.frontend_options();
let dn_opts = opts.datanode_options();
plugins::setup_frontend_plugins(&mut plugins, &plugin_opts, &fe_opts)
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
plugins::setup_datanode_plugins(&mut plugins, &plugin_opts, &dn_opts)
.await
.context(StartDatanodeSnafu)?;
.context(error::StartDatanodeSnafu)?;
set_default_timezone(fe_opts.default_timezone.as_deref()).context(InitTimezoneSnafu)?;
set_default_timezone(fe_opts.default_timezone.as_deref())
.context(error::InitTimezoneSnafu)?;
let data_home = &dn_opts.storage.data_home;
// Ensure the data_home directory exists.
fs::create_dir_all(path::Path::new(data_home))
.context(CreateDirSnafu { dir: data_home })?;
.context(error::CreateDirSnafu { dir: data_home })?;
let metadata_dir = metadata_store_dir(data_home);
let (kv_backend, procedure_manager) = FeInstance::try_build_standalone_components(
metadata_dir,
opts.metadata_store.clone(),
opts.procedure.clone(),
opts.metadata_store,
opts.procedure,
)
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
// Builds cache registry
let layered_cache_builder = LayeredCacheRegistryBuilder::default();
@@ -499,16 +499,16 @@ impl StartCommand {
with_default_composite_cache_registry(
layered_cache_builder.add_cache_registry(fundamental_cache_registry),
)
.context(BuildCacheRegistrySnafu)?
.context(error::BuildCacheRegistrySnafu)?
.build(),
);
let datanode = DatanodeBuilder::new(dn_opts, plugins.clone())
let datanode = DatanodeBuilder::new(dn_opts, plugins.clone(), Mode::Standalone)
.with_kv_backend(kv_backend.clone())
.with_cache_registry(layered_cache_registry.clone())
.build()
.await
.context(StartDatanodeSnafu)?;
.context(error::StartDatanodeSnafu)?;
let information_extension = Arc::new(StandaloneInformationExtension::new(
datanode.region_server(),
@@ -541,7 +541,7 @@ impl StartCommand {
.build()
.await
.map_err(BoxedError::new)
.context(OtherSnafu)?,
.context(error::OtherSnafu)?,
);
// set the ref to query for the local flow state
@@ -572,7 +572,7 @@ impl StartCommand {
let kafka_options = opts.wal.clone().into();
let wal_options_allocator = build_wal_options_allocator(&kafka_options, kv_backend.clone())
.await
.context(BuildWalOptionsAllocatorSnafu)?;
.context(error::BuildWalOptionsAllocatorSnafu)?;
let wal_options_allocator = Arc::new(wal_options_allocator);
let table_meta_allocator = Arc::new(TableMetadataAllocator::new(
table_id_sequence,
@@ -593,8 +593,8 @@ impl StartCommand {
)
.await?;
let mut frontend = FrontendBuilder::new(
fe_opts,
let fe_instance = FrontendBuilder::new(
fe_opts.clone(),
kv_backend.clone(),
layered_cache_registry.clone(),
catalog_manager.clone(),
@@ -605,7 +605,8 @@ impl StartCommand {
.with_plugin(plugins.clone())
.try_build()
.await
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
let fe_instance = Arc::new(fe_instance);
let flow_worker_manager = flownode.flow_worker_manager();
// flow server need to be able to use frontend to write insert requests back
@@ -618,18 +619,25 @@ impl StartCommand {
node_manager,
)
.await
.context(StartFlownodeSnafu)?;
.context(error::StartFlownodeSnafu)?;
flow_worker_manager.set_frontend_invoker(invoker).await;
let (tx, _rx) = broadcast::channel(1);
let servers = Services::new(opts, Arc::new(frontend.clone()), plugins)
let export_metrics_task = ExportMetricsTask::try_new(&opts.export_metrics, Some(&plugins))
.context(error::ServersSnafu)?;
let servers = Services::new(opts, fe_instance.clone(), plugins)
.build()
.await
.context(StartFrontendSnafu)?;
frontend
.build_servers(servers)
.context(StartFrontendSnafu)?;
.context(error::StartFrontendSnafu)?;
let frontend = Frontend {
instance: fe_instance,
servers,
heartbeat_task: None,
export_metrics_task,
};
Ok(Instance {
datanode,
@@ -657,6 +665,7 @@ impl StartCommand {
node_manager,
cache_invalidator,
memory_region_keeper: Arc::new(MemoryRegionKeeper::default()),
leader_region_registry: Arc::new(LeaderRegionRegistry::default()),
table_metadata_manager,
table_metadata_allocator,
flow_metadata_manager,
@@ -666,7 +675,7 @@ impl StartCommand {
procedure_manager,
true,
)
.context(InitDdlManagerSnafu)?,
.context(error::InitDdlManagerSnafu)?,
);
Ok(procedure_executor)
@@ -680,7 +689,7 @@ impl StartCommand {
table_metadata_manager
.init()
.await
.context(InitMetadataSnafu)?;
.context(error::InitMetadataSnafu)?;
Ok(table_metadata_manager)
}
@@ -774,6 +783,7 @@ impl InformationExtension for StandaloneInformationExtension {
manifest_size: region_stat.manifest_size,
sst_size: region_stat.sst_size,
index_size: region_stat.index_size,
region_manifest: region_stat.manifest.into(),
}
})
.collect::<Vec<_>>();
@@ -848,7 +858,7 @@ mod tests {
[wal]
provider = "raft_engine"
dir = "/tmp/greptimedb/test/wal"
dir = "./greptimedb_data/test/wal"
file_size = "1GB"
purge_threshold = "50GB"
purge_interval = "10m"
@@ -856,7 +866,7 @@ mod tests {
sync_write = false
[storage]
data_home = "/tmp/greptimedb/"
data_home = "./greptimedb_data/"
type = "File"
[[storage.providers]]
@@ -888,7 +898,7 @@ mod tests {
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
"#;
write!(file, "{}", toml_str).unwrap();
let cmd = StartCommand {
@@ -907,7 +917,7 @@ mod tests {
assert_eq!("127.0.0.1:4000".to_string(), fe_opts.http.addr);
assert_eq!(Duration::from_secs(33), fe_opts.http.timeout);
assert_eq!(ReadableSize::mb(128), fe_opts.http.body_limit);
assert_eq!("127.0.0.1:4001".to_string(), fe_opts.grpc.addr);
assert_eq!("127.0.0.1:4001".to_string(), fe_opts.grpc.bind_addr);
assert!(fe_opts.mysql.enable);
assert_eq!("127.0.0.1:4002", fe_opts.mysql.addr);
assert_eq!(2, fe_opts.mysql.runtime_size);
@@ -918,7 +928,10 @@ mod tests {
let DatanodeWalConfig::RaftEngine(raft_engine_config) = dn_opts.wal else {
unreachable!()
};
assert_eq!("/tmp/greptimedb/test/wal", raft_engine_config.dir.unwrap());
assert_eq!(
"./greptimedb_data/test/wal",
raft_engine_config.dir.unwrap()
);
assert!(matches!(
&dn_opts.storage.store,
@@ -942,7 +955,7 @@ mod tests {
}
assert_eq!("debug", logging_opts.level.as_ref().unwrap());
assert_eq!("/tmp/greptimedb/test/logs".to_string(), logging_opts.dir);
assert_eq!("./greptimedb_data/test/logs".to_string(), logging_opts.dir);
}
#[test]
@@ -954,7 +967,7 @@ mod tests {
let opts = cmd
.load_options(&GlobalOptions {
log_dir: Some("/tmp/greptimedb/test/logs".to_string()),
log_dir: Some("./greptimedb_data/test/logs".to_string()),
log_level: Some("debug".to_string()),
#[cfg(feature = "tokio-console")]
@@ -963,7 +976,7 @@ mod tests {
.unwrap()
.component;
assert_eq!("/tmp/greptimedb/test/logs", opts.logging.dir);
assert_eq!("./greptimedb_data/test/logs", opts.logging.dir);
assert_eq!("debug", opts.logging.level.unwrap());
}
@@ -1037,7 +1050,7 @@ mod tests {
assert_eq!(ReadableSize::mb(64), fe_opts.http.body_limit);
// Should be default value.
assert_eq!(fe_opts.grpc.addr, GrpcOptions::default().addr);
assert_eq!(fe_opts.grpc.bind_addr, GrpcOptions::default().bind_addr);
},
);
}
@@ -1047,7 +1060,6 @@ mod tests {
let options =
StandaloneOptions::load_layered_options(None, "GREPTIMEDB_STANDALONE").unwrap();
let default_options = StandaloneOptions::default();
assert_eq!(options.mode, default_options.mode);
assert_eq!(options.enable_telemetry, default_options.enable_telemetry);
assert_eq!(options.http, default_options.http);
assert_eq!(options.grpc, default_options.grpc);

View File

@@ -63,7 +63,7 @@ mod tests {
.args([
"datanode",
"start",
"--rpc-addr=0.0.0.0:4321",
"--rpc-bind-addr=0.0.0.0:4321",
"--node-id=1",
&format!("--data-home={}", data_home.path().display()),
&format!("--wal-dir={}", wal_dir.path().display()),
@@ -80,7 +80,7 @@ mod tests {
"--log-level=off",
"cli",
"attach",
"--grpc-addr=0.0.0.0:4321",
"--grpc-bind-addr=0.0.0.0:4321",
// history commands can sneaky into stdout and mess up our tests, so disable it
"--disable-helper",
]);

View File

@@ -17,9 +17,6 @@ use std::time::Duration;
use cmd::options::GreptimeOptions;
use cmd::standalone::StandaloneOptions;
use common_config::Configurable;
use common_grpc::channel_manager::{
DEFAULT_MAX_GRPC_RECV_MESSAGE_SIZE, DEFAULT_MAX_GRPC_SEND_MESSAGE_SIZE,
};
use common_options::datanode::{ClientOptions, DatanodeClientOptions};
use common_telemetry::logging::{LoggingOptions, SlowQueryOptions, DEFAULT_OTLP_ENDPOINT};
use common_wal::config::raft_engine::RaftEngineConfig;
@@ -59,13 +56,13 @@ fn test_load_datanode_example_config() {
metadata_cache_tti: Duration::from_secs(300),
}),
wal: DatanodeWalConfig::RaftEngine(RaftEngineConfig {
dir: Some("/tmp/greptimedb/wal".to_string()),
dir: Some("./greptimedb_data/wal".to_string()),
sync_period: Some(Duration::from_secs(10)),
recovery_parallelism: 2,
..Default::default()
}),
storage: StorageConfig {
data_home: "/tmp/greptimedb/".to_string(),
data_home: "./greptimedb_data/".to_string(),
..Default::default()
},
region_engine: vec![
@@ -91,13 +88,8 @@ fn test_load_datanode_example_config() {
..Default::default()
},
grpc: GrpcOptions::default()
.with_addr("127.0.0.1:3001")
.with_hostname("127.0.0.1:3001"),
rpc_addr: Some("127.0.0.1:3001".to_string()),
rpc_hostname: Some("127.0.0.1".to_string()),
rpc_runtime_size: Some(8),
rpc_max_recv_message_size: Some(DEFAULT_MAX_GRPC_RECV_MESSAGE_SIZE),
rpc_max_send_message_size: Some(DEFAULT_MAX_GRPC_SEND_MESSAGE_SIZE),
.with_bind_addr("127.0.0.1:3001")
.with_server_addr("127.0.0.1:3001"),
..Default::default()
},
..Default::default()
@@ -144,7 +136,9 @@ fn test_load_frontend_example_config() {
remote_write: Some(Default::default()),
..Default::default()
},
grpc: GrpcOptions::default().with_hostname("127.0.0.1:4001"),
grpc: GrpcOptions::default()
.with_bind_addr("127.0.0.1:4001")
.with_server_addr("127.0.0.1:4001"),
http: HttpOptions {
cors_allowed_origins: vec!["https://example.com".to_string()],
..Default::default()
@@ -165,17 +159,17 @@ fn test_load_metasrv_example_config() {
let expected = GreptimeOptions::<MetasrvOptions> {
component: MetasrvOptions {
selector: SelectorType::default(),
data_home: "/tmp/metasrv/".to_string(),
data_home: "./greptimedb_data/metasrv/".to_string(),
server_addr: "127.0.0.1:3002".to_string(),
logging: LoggingOptions {
dir: "/tmp/greptimedb/logs".to_string(),
dir: "./greptimedb_data/logs".to_string(),
level: Some("info".to_string()),
otlp_endpoint: Some(DEFAULT_OTLP_ENDPOINT.to_string()),
tracing_sample_ratio: Some(Default::default()),
slow_query: SlowQueryOptions {
enable: false,
threshold: Some(Duration::from_secs(10)),
sample_ratio: Some(1.0),
threshold: None,
sample_ratio: None,
},
..Default::default()
},
@@ -208,7 +202,7 @@ fn test_load_standalone_example_config() {
component: StandaloneOptions {
default_timezone: Some("UTC".to_string()),
wal: DatanodeWalConfig::RaftEngine(RaftEngineConfig {
dir: Some("/tmp/greptimedb/wal".to_string()),
dir: Some("./greptimedb_data/wal".to_string()),
sync_period: Some(Duration::from_secs(10)),
recovery_parallelism: 2,
..Default::default()
@@ -225,7 +219,7 @@ fn test_load_standalone_example_config() {
}),
],
storage: StorageConfig {
data_home: "/tmp/greptimedb/".to_string(),
data_home: "./greptimedb_data/".to_string(),
..Default::default()
},
logging: LoggingOptions {

View File

@@ -18,7 +18,7 @@ bytes.workspace = true
common-error.workspace = true
common-macro.workspace = true
futures.workspace = true
paste = "1.0"
paste.workspace = true
pin-project.workspace = true
rand.workspace = true
serde = { version = "1.0", features = ["derive"] }

View File

@@ -130,3 +130,19 @@ pub const SEMANTIC_TYPE_TIME_INDEX: &str = "TIMESTAMP";
pub fn is_readonly_schema(schema: &str) -> bool {
matches!(schema, INFORMATION_SCHEMA_NAME)
}
// ---- special table and fields ----
pub const TRACE_ID_COLUMN: &str = "trace_id";
pub const SPAN_ID_COLUMN: &str = "span_id";
pub const SPAN_NAME_COLUMN: &str = "span_name";
pub const SERVICE_NAME_COLUMN: &str = "service_name";
pub const PARENT_SPAN_ID_COLUMN: &str = "parent_span_id";
pub const TRACE_TABLE_NAME: &str = "opentelemetry_traces";
pub const TRACE_TABLE_NAME_SESSION_KEY: &str = "trace_table_name";
// ---- End of special table and fields ----
/// Generate the trace services table name from the trace table name by adding `_services` suffix.
pub fn trace_services_table_name(trace_table_name: &str) -> String {
format!("{}_services", trace_table_name)
}
// ---- End of special table and fields ----
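
A small usage sketch of the helper above: the services table name is derived from the configured trace table name by appending the `_services` suffix.

    fn main() {
        assert_eq!(
            trace_services_table_name(TRACE_TABLE_NAME),
            "opentelemetry_traces_services"
        );
        assert_eq!(trace_services_table_name("my_traces"), "my_traces_services");
    }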

View File

@@ -12,9 +12,11 @@ common-base.workspace = true
common-error.workspace = true
common-macro.workspace = true
config.workspace = true
humantime-serde.workspace = true
num_cpus.workspace = true
serde.workspace = true
serde_json.workspace = true
serde_with.workspace = true
snafu.workspace = true
sysinfo.workspace = true
toml.workspace = true

View File

@@ -161,7 +161,7 @@ mod tests {
[wal]
provider = "raft_engine"
dir = "/tmp/greptimedb/wal"
dir = "./greptimedb_data/wal"
file_size = "1GB"
purge_threshold = "50GB"
purge_interval = "10m"
@@ -170,7 +170,7 @@ mod tests {
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"
dir = "./greptimedb_data/test/logs"
"#;
write!(file, "{}", toml_str).unwrap();
@@ -246,7 +246,7 @@ mod tests {
let DatanodeWalConfig::RaftEngine(raft_engine_config) = opts.wal else {
unreachable!()
};
assert_eq!(raft_engine_config.dir.unwrap(), "/tmp/greptimedb/wal");
assert_eq!(raft_engine_config.dir.unwrap(), "./greptimedb_data/wal");
// Should be default values.
assert_eq!(opts.node_id, None);

View File

@@ -16,6 +16,8 @@ pub mod config;
pub mod error;
pub mod utils;
use std::time::Duration;
use common_base::readable_size::ReadableSize;
pub use config::*;
use serde::{Deserialize, Serialize};
@@ -34,22 +36,27 @@ pub enum Mode {
Distributed,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
#[serde(default)]
pub struct KvBackendConfig {
// Kv file size in bytes
/// The size of the metadata store backend log file.
pub file_size: ReadableSize,
// Kv purge threshold in bytes
/// The threshold of the metadata store size to trigger a purge.
pub purge_threshold: ReadableSize,
/// The interval of the metadata store to trigger a purge.
#[serde(with = "humantime_serde")]
pub purge_interval: Duration,
}
impl Default for KvBackendConfig {
fn default() -> Self {
Self {
// log file size 256MB
file_size: ReadableSize::mb(256),
// purge threshold 4GB
purge_threshold: ReadableSize::gb(4),
// The log file size 64MB
file_size: ReadableSize::mb(64),
// The log purge threshold 256MB
purge_threshold: ReadableSize::mb(256),
// The log purge interval 1m
purge_interval: Duration::from_secs(60),
}
}
}
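
A minimal, standalone sketch of the `humantime_serde` annotation added to `purge_interval` above: the duration can be written as a human-readable string in config files. The `ReadableSize` fields are left out; only the duration part is shown, with illustrative names.

    use std::time::Duration;

    use serde::Deserialize;

    #[derive(Debug, Deserialize)]
    #[serde(default)]
    struct PurgeConfig {
        #[serde(with = "humantime_serde")]
        purge_interval: Duration,
    }

    impl Default for PurgeConfig {
        fn default() -> Self {
            Self {
                purge_interval: Duration::from_secs(60),
            }
        }
    }

    fn main() {
        let cfg: PurgeConfig = toml::from_str(r#"purge_interval = "2m""#).unwrap();
        assert_eq!(cfg.purge_interval, Duration::from_secs(120));

        // Omitted keys fall back to the default (1 minute).
        let default_cfg: PurgeConfig = toml::from_str("").unwrap();
        assert_eq!(default_cfg.purge_interval, Duration::from_secs(60));
    }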

View File

@@ -35,7 +35,7 @@ orc-rust = { version = "0.5", default-features = false, features = [
"async",
] }
parquet.workspace = true
paste = "1.0"
paste.workspace = true
rand.workspace = true
regex = "1.7"
serde.workspace = true

View File

@@ -12,9 +12,12 @@ default = ["geo"]
geo = ["geohash", "h3o", "s2", "wkt", "geo-types", "dep:geo"]
[dependencies]
ahash = "0.8"
api.workspace = true
arc-swap = "1.0"
async-trait.workspace = true
bincode = "1.3"
chrono.workspace = true
common-base.workspace = true
common-catalog.workspace = true
common-error.workspace = true
@@ -26,18 +29,22 @@ common-telemetry.workspace = true
common-time.workspace = true
common-version.workspace = true
datafusion.workspace = true
datafusion-common.workspace = true
datafusion-expr.workspace = true
datatypes.workspace = true
derive_more = { version = "1", default-features = false, features = ["display"] }
geo = { version = "0.29", optional = true }
geo-types = { version = "0.7", optional = true }
geohash = { version = "0.13", optional = true }
h3o = { version = "0.6", optional = true }
hyperloglogplus = "0.4"
jsonb.workspace = true
memchr = "2.7"
nalgebra.workspace = true
num = "0.4"
num-traits = "0.2"
once_cell.workspace = true
paste = "1.0"
paste.workspace = true
s2 = { version = "0.0.12", optional = true }
serde.workspace = true
serde_json.workspace = true
@@ -47,6 +54,7 @@ sql.workspace = true
statrs = "0.16"
store-api.workspace = true
table.workspace = true
uddsketch = { git = "https://github.com/GreptimeTeam/timescaledb-toolkit.git", rev = "84828fe8fb494a6a61412a3da96517fc80f7bb20" }
wkt = { version = "0.11", optional = true }
[dev-dependencies]

View File

@@ -12,26 +12,32 @@
// See the License for the specific language governing permissions and
// limitations under the License.
mod add_region_follower;
mod flush_compact_region;
mod flush_compact_table;
mod migrate_region;
mod remove_region_follower;
use std::sync::Arc;
use add_region_follower::AddRegionFollowerFunction;
use flush_compact_region::{CompactRegionFunction, FlushRegionFunction};
use flush_compact_table::{CompactTableFunction, FlushTableFunction};
use migrate_region::MigrateRegionFunction;
use remove_region_follower::RemoveRegionFollowerFunction;
use crate::flush_flow::FlushFlowFunction;
use crate::function_registry::FunctionRegistry;
/// Table functions
pub(crate) struct TableFunction;
pub(crate) struct AdminFunction;
impl TableFunction {
impl AdminFunction {
/// Register all table functions to [`FunctionRegistry`].
pub fn register(registry: &FunctionRegistry) {
registry.register_async(Arc::new(MigrateRegionFunction));
registry.register_async(Arc::new(AddRegionFollowerFunction));
registry.register_async(Arc::new(RemoveRegionFollowerFunction));
registry.register_async(Arc::new(FlushRegionFunction));
registry.register_async(Arc::new(CompactRegionFunction));
registry.register_async(Arc::new(FlushTableFunction));

View File

@@ -0,0 +1,129 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_macro::admin_fn;
use common_meta::rpc::procedure::AddRegionFollowerRequest;
use common_query::error::{
InvalidFuncArgsSnafu, MissingProcedureServiceHandlerSnafu, Result,
UnsupportedInputDataTypeSnafu,
};
use common_query::prelude::{Signature, TypeSignature, Volatility};
use datatypes::prelude::ConcreteDataType;
use datatypes::value::{Value, ValueRef};
use session::context::QueryContextRef;
use snafu::ensure;
use crate::handlers::ProcedureServiceHandlerRef;
use crate::helper::cast_u64;
/// A function to add a follower to a region.
/// Only available in cluster mode.
///
/// - `add_region_follower(region_id, peer_id)`.
///
/// The parameters:
/// - `region_id`: the region id
/// - `peer_id`: the peer id
#[admin_fn(
name = AddRegionFollowerFunction,
display_name = add_region_follower,
sig_fn = signature,
ret = uint64
)]
pub(crate) async fn add_region_follower(
procedure_service_handler: &ProcedureServiceHandlerRef,
_ctx: &QueryContextRef,
params: &[ValueRef<'_>],
) -> Result<Value> {
ensure!(
params.len() == 2,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect exactly 2, have: {}",
params.len()
),
}
);
let Some(region_id) = cast_u64(&params[0])? else {
return UnsupportedInputDataTypeSnafu {
function: "add_region_follower",
datatypes: params.iter().map(|v| v.data_type()).collect::<Vec<_>>(),
}
.fail();
};
let Some(peer_id) = cast_u64(&params[1])? else {
return UnsupportedInputDataTypeSnafu {
function: "add_region_follower",
datatypes: params.iter().map(|v| v.data_type()).collect::<Vec<_>>(),
}
.fail();
};
procedure_service_handler
.add_region_follower(AddRegionFollowerRequest { region_id, peer_id })
.await?;
Ok(Value::from(0u64))
}
fn signature() -> Signature {
Signature::one_of(
vec![
// add_region_follower(region_id, peer)
TypeSignature::Uniform(2, ConcreteDataType::numerics()),
],
Volatility::Immutable,
)
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use common_query::prelude::TypeSignature;
use datatypes::vectors::{UInt64Vector, VectorRef};
use super::*;
use crate::function::{AsyncFunction, FunctionContext};
#[test]
fn test_add_region_follower_misc() {
let f = AddRegionFollowerFunction;
assert_eq!("add_region_follower", f.name());
assert_eq!(
ConcreteDataType::uint64_datatype(),
f.return_type(&[]).unwrap()
);
assert!(matches!(f.signature(),
Signature {
type_signature: TypeSignature::OneOf(sigs),
volatility: Volatility::Immutable
} if sigs.len() == 1));
}
#[tokio::test]
async fn test_add_region_follower() {
let f = AddRegionFollowerFunction;
let args = vec![1, 1];
let args = args
.into_iter()
.map(|arg| Arc::new(UInt64Vector::from_slice([arg])) as _)
.collect::<Vec<_>>();
let result = f.eval(FunctionContext::mock(), &args).await.unwrap();
let expect: VectorRef = Arc::new(UInt64Vector::from_slice([0u64]));
assert_eq!(result, expect);
}
}

View File

@@ -0,0 +1,129 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_macro::admin_fn;
use common_meta::rpc::procedure::RemoveRegionFollowerRequest;
use common_query::error::{
InvalidFuncArgsSnafu, MissingProcedureServiceHandlerSnafu, Result,
UnsupportedInputDataTypeSnafu,
};
use common_query::prelude::{Signature, TypeSignature, Volatility};
use datatypes::prelude::ConcreteDataType;
use datatypes::value::{Value, ValueRef};
use session::context::QueryContextRef;
use snafu::ensure;
use crate::handlers::ProcedureServiceHandlerRef;
use crate::helper::cast_u64;
/// A function to remove a follower from a region.
/// Only available in cluster mode.
///
/// - `remove_region_follower(region_id, peer_id)`.
///
/// The parameters:
/// - `region_id`: the region id
/// - `peer_id`: the peer id
#[admin_fn(
name = RemoveRegionFollowerFunction,
display_name = remove_region_follower,
sig_fn = signature,
ret = uint64
)]
pub(crate) async fn remove_region_follower(
procedure_service_handler: &ProcedureServiceHandlerRef,
_ctx: &QueryContextRef,
params: &[ValueRef<'_>],
) -> Result<Value> {
ensure!(
params.len() == 2,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect exactly 2, have: {}",
params.len()
),
}
);
let Some(region_id) = cast_u64(&params[0])? else {
return UnsupportedInputDataTypeSnafu {
function: "add_region_follower",
datatypes: params.iter().map(|v| v.data_type()).collect::<Vec<_>>(),
}
.fail();
};
let Some(peer_id) = cast_u64(&params[1])? else {
return UnsupportedInputDataTypeSnafu {
function: "add_region_follower",
datatypes: params.iter().map(|v| v.data_type()).collect::<Vec<_>>(),
}
.fail();
};
procedure_service_handler
.remove_region_follower(RemoveRegionFollowerRequest { region_id, peer_id })
.await?;
Ok(Value::from(0u64))
}
fn signature() -> Signature {
Signature::one_of(
vec![
// remove_region_follower(region_id, peer_id)
TypeSignature::Uniform(2, ConcreteDataType::numerics()),
],
Volatility::Immutable,
)
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use common_query::prelude::TypeSignature;
use datatypes::vectors::{UInt64Vector, VectorRef};
use super::*;
use crate::function::{AsyncFunction, FunctionContext};
#[test]
fn test_remove_region_follower_misc() {
let f = RemoveRegionFollowerFunction;
assert_eq!("remove_region_follower", f.name());
assert_eq!(
ConcreteDataType::uint64_datatype(),
f.return_type(&[]).unwrap()
);
assert!(matches!(f.signature(),
Signature {
type_signature: TypeSignature::OneOf(sigs),
volatility: Volatility::Immutable
} if sigs.len() == 1));
}
#[tokio::test]
async fn test_remove_region_follower() {
let f = RemoveRegionFollowerFunction;
let args = vec![1, 1];
let args = args
.into_iter()
.map(|arg| Arc::new(UInt64Vector::from_slice([arg])) as _)
.collect::<Vec<_>>();
let result = f.eval(FunctionContext::mock(), &args).await.unwrap();
let expect: VectorRef = Arc::new(UInt64Vector::from_slice([0u64]));
assert_eq!(result, expect);
}
}

View File

@@ -0,0 +1,22 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
mod geo_path;
mod hll;
mod uddsketch_state;
pub use geo_path::{GeoPathAccumulator, GEO_PATH_NAME};
pub(crate) use hll::HllStateType;
pub use hll::{HllState, HLL_MERGE_NAME, HLL_NAME};
pub use uddsketch_state::{UddSketchState, UDDSKETCH_STATE_NAME};

View File

@@ -0,0 +1,433 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use datafusion::arrow::array::{Array, ArrayRef};
use datafusion::common::cast::as_primitive_array;
use datafusion::error::{DataFusionError, Result as DfResult};
use datafusion::logical_expr::{Accumulator as DfAccumulator, AggregateUDF, Volatility};
use datafusion::prelude::create_udaf;
use datafusion_common::cast::{as_list_array, as_struct_array};
use datafusion_common::utils::SingleRowListArrayBuilder;
use datafusion_common::ScalarValue;
use datatypes::arrow::array::{Float64Array, Int64Array, ListArray, StructArray};
use datatypes::arrow::datatypes::{
DataType, Field, Float64Type, Int64Type, TimeUnit, TimestampNanosecondType,
};
use datatypes::compute::{self, sort_to_indices};
pub const GEO_PATH_NAME: &str = "geo_path";
const LATITUDE_FIELD: &str = "lat";
const LONGITUDE_FIELD: &str = "lng";
const TIMESTAMP_FIELD: &str = "timestamp";
const DEFAULT_LIST_FIELD_NAME: &str = "item";
#[derive(Debug, Default)]
pub struct GeoPathAccumulator {
lat: Vec<Option<f64>>,
lng: Vec<Option<f64>>,
timestamp: Vec<Option<i64>>,
}
impl GeoPathAccumulator {
pub fn new() -> Self {
Self::default()
}
pub fn udf_impl() -> AggregateUDF {
create_udaf(
GEO_PATH_NAME,
// Input types: lat, lng, timestamp
vec![
DataType::Float64,
DataType::Float64,
DataType::Timestamp(TimeUnit::Nanosecond, None),
],
// Output type: list of points {[lat], [lng]}
Arc::new(DataType::Struct(
vec![
Field::new(
LATITUDE_FIELD,
DataType::List(Arc::new(Field::new(
DEFAULT_LIST_FIELD_NAME,
DataType::Float64,
true,
))),
false,
),
Field::new(
LONGITUDE_FIELD,
DataType::List(Arc::new(Field::new(
DEFAULT_LIST_FIELD_NAME,
DataType::Float64,
true,
))),
false,
),
]
.into(),
)),
Volatility::Immutable,
// Create the accumulator
Arc::new(|_| Ok(Box::new(GeoPathAccumulator::new()))),
// Intermediate state types
Arc::new(vec![DataType::Struct(
vec![
Field::new(
LATITUDE_FIELD,
DataType::List(Arc::new(Field::new(
DEFAULT_LIST_FIELD_NAME,
DataType::Float64,
true,
))),
false,
),
Field::new(
LONGITUDE_FIELD,
DataType::List(Arc::new(Field::new(
DEFAULT_LIST_FIELD_NAME,
DataType::Float64,
true,
))),
false,
),
Field::new(
TIMESTAMP_FIELD,
DataType::List(Arc::new(Field::new(
DEFAULT_LIST_FIELD_NAME,
DataType::Int64,
true,
))),
false,
),
]
.into(),
)]),
)
}
}
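
A short, hedged sketch of how the aggregate built by `udf_impl` above could be wired into a DataFusion `SessionContext`; the table and column names are illustrative and the surrounding runtime setup is omitted.

    use datafusion::prelude::SessionContext;

    async fn register_geo_path(ctx: &SessionContext) -> datafusion::error::Result<()> {
        // `udf_impl` builds the AggregateUDF described above.
        ctx.register_udaf(GeoPathAccumulator::udf_impl());
        // The aggregate is then callable from SQL, e.g.:
        // ctx.sql("SELECT geo_path(lat, lng, ts) FROM points").await?;
        Ok(())
    }
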
impl DfAccumulator for GeoPathAccumulator {
fn update_batch(&mut self, values: &[ArrayRef]) -> datafusion::error::Result<()> {
if values.len() != 3 {
return Err(DataFusionError::Internal(format!(
"Expected 3 columns for geo_path, got {}",
values.len()
)));
}
let lat_array = as_primitive_array::<Float64Type>(&values[0])?;
let lng_array = as_primitive_array::<Float64Type>(&values[1])?;
let ts_array = as_primitive_array::<TimestampNanosecondType>(&values[2])?;
let size = lat_array.len();
        self.lat.reserve(size);
        self.lng.reserve(size);
        self.timestamp.reserve(size);
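        // Push one entry per input row, preserving nulls, so the three buffers
        // stay aligned by row index.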
for idx in 0..size {
self.lat.push(if lat_array.is_null(idx) {
None
} else {
Some(lat_array.value(idx))
});
self.lng.push(if lng_array.is_null(idx) {
None
} else {
Some(lng_array.value(idx))
});
self.timestamp.push(if ts_array.is_null(idx) {
None
} else {
Some(ts_array.value(idx))
});
}
Ok(())
}
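
    // Sort the buffered rows by timestamp and emit the final value: a struct
    // scalar holding the latitude list and the longitude list (timestamps are
    // used only for ordering and are not part of the output).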
fn evaluate(&mut self) -> DfResult<ScalarValue> {
let unordered_lng_array = Float64Array::from(self.lng.clone());
let unordered_lat_array = Float64Array::from(self.lat.clone());
let ts_array = Int64Array::from(self.timestamp.clone());
let ordered_indices = sort_to_indices(&ts_array, None, None)?;
let lat_array = compute::take(&unordered_lat_array, &ordered_indices, None)?;
let lng_array = compute::take(&unordered_lng_array, &ordered_indices, None)?;
let lat_list = Arc::new(SingleRowListArrayBuilder::new(lat_array).build_list_array());
let lng_list = Arc::new(SingleRowListArrayBuilder::new(lng_array).build_list_array());
let result = ScalarValue::Struct(Arc::new(StructArray::new(
vec![
Field::new(
LATITUDE_FIELD,
DataType::List(Arc::new(Field::new("item", DataType::Float64, true))),
false,
),
Field::new(
LONGITUDE_FIELD,
DataType::List(Arc::new(Field::new("item", DataType::Float64, true))),
false,
),
]
.into(),
vec![lat_list, lng_list],
None,
)));
Ok(result)
}
fn size(&self) -> usize {
// Base size of GeoPathAccumulator struct fields
let mut total_size = std::mem::size_of::<Self>();
// Size of vectors (approximation)
total_size += self.lat.capacity() * std::mem::size_of::<Option<f64>>();
total_size += self.lng.capacity() * std::mem::size_of::<Option<f64>>();
total_size += self.timestamp.capacity() * std::mem::size_of::<Option<i64>>();
total_size
}
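
    // Serialize the in-progress buffers into a single struct scalar with three
    // list fields (lat, lng, timestamp) so that partial states can be merged.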
    fn state(&mut self) -> DfResult<Vec<ScalarValue>> {
let lat_array = Arc::new(ListArray::from_iter_primitive::<Float64Type, _, _>(vec![
Some(self.lat.clone()),
]));
let lng_array = Arc::new(ListArray::from_iter_primitive::<Float64Type, _, _>(vec![
Some(self.lng.clone()),
]));
let ts_array = Arc::new(ListArray::from_iter_primitive::<Int64Type, _, _>(vec![
Some(self.timestamp.clone()),
]));
let state_struct = StructArray::new(
vec![
Field::new(
LATITUDE_FIELD,
DataType::List(Arc::new(Field::new("item", DataType::Float64, true))),
false,
),
Field::new(
LONGITUDE_FIELD,
DataType::List(Arc::new(Field::new("item", DataType::Float64, true))),
false,
),
Field::new(
TIMESTAMP_FIELD,
DataType::List(Arc::new(Field::new("item", DataType::Int64, true))),
false,
),
]
.into(),
vec![lat_array, lng_array, ts_array],
None,
);
Ok(vec![ScalarValue::Struct(Arc::new(state_struct))])
}
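
    // Merge partial states produced by `state()`: unpack the struct's three list
    // columns and append their values to this accumulator's buffers.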
    fn merge_batch(&mut self, states: &[ArrayRef]) -> DfResult<()> {
if states.len() != 1 {
return Err(DataFusionError::Internal(format!(
"Expected 1 states for geo_path, got {}",
states.len()
)));
}
for state in states {
let state = as_struct_array(state)?;
let lat_list = as_list_array(state.column(0))?.value(0);
let lat_array = as_primitive_array::<Float64Type>(&lat_list)?;
let lng_list = as_list_array(state.column(1))?.value(0);
let lng_array = as_primitive_array::<Float64Type>(&lng_list)?;
let ts_list = as_list_array(state.column(2))?.value(0);
let ts_array = as_primitive_array::<Int64Type>(&ts_list)?;
self.lat.extend(lat_array);
self.lng.extend(lng_array);
self.timestamp.extend(ts_array);
}
Ok(())
}
}

#[cfg(test)]
mod tests {
use datafusion::arrow::array::{Float64Array, TimestampNanosecondArray};
use datafusion::scalar::ScalarValue;
use super::*;
#[test]
fn test_geo_path_basic() {
let mut accumulator = GeoPathAccumulator::new();
// Create test data
let lat_array = Arc::new(Float64Array::from(vec![1.0, 2.0, 3.0]));
let lng_array = Arc::new(Float64Array::from(vec![4.0, 5.0, 6.0]));
let ts_array = Arc::new(TimestampNanosecondArray::from(vec![100, 200, 300]));
// Update batch
accumulator
.update_batch(&[lat_array, lng_array, ts_array])
.unwrap();
// Evaluate
let result = accumulator.evaluate().unwrap();
if let ScalarValue::Struct(struct_array) = result {
// Verify structure
let fields = struct_array.fields().clone();
assert_eq!(fields.len(), 2);
assert_eq!(fields[0].name(), LATITUDE_FIELD);
assert_eq!(fields[1].name(), LONGITUDE_FIELD);
// Verify data
let columns = struct_array.columns();
assert_eq!(columns.len(), 2);
// Check latitude values
let lat_list = as_list_array(&columns[0]).unwrap().value(0);
let lat_array = as_primitive_array::<Float64Type>(&lat_list).unwrap();
assert_eq!(lat_array.len(), 3);
assert_eq!(lat_array.value(0), 1.0);
assert_eq!(lat_array.value(1), 2.0);
assert_eq!(lat_array.value(2), 3.0);
// Check longitude values
let lng_list = as_list_array(&columns[1]).unwrap().value(0);
let lng_array = as_primitive_array::<Float64Type>(&lng_list).unwrap();
assert_eq!(lng_array.len(), 3);
assert_eq!(lng_array.value(0), 4.0);
assert_eq!(lng_array.value(1), 5.0);
assert_eq!(lng_array.value(2), 6.0);
} else {
panic!("Expected Struct scalar value");
}
}
#[test]
fn test_geo_path_sort_by_timestamp() {
let mut accumulator = GeoPathAccumulator::new();
// Create test data with unordered timestamps
let lat_array = Arc::new(Float64Array::from(vec![1.0, 2.0, 3.0]));
let lng_array = Arc::new(Float64Array::from(vec![4.0, 5.0, 6.0]));
let ts_array = Arc::new(TimestampNanosecondArray::from(vec![300, 100, 200]));
// Update batch
accumulator
.update_batch(&[lat_array, lng_array, ts_array])
.unwrap();
// Evaluate
let result = accumulator.evaluate().unwrap();
if let ScalarValue::Struct(struct_array) = result {
// Extract arrays
let columns = struct_array.columns();
// Check latitude values
let lat_list = as_list_array(&columns[0]).unwrap().value(0);
let lat_array = as_primitive_array::<Float64Type>(&lat_list).unwrap();
assert_eq!(lat_array.len(), 3);
assert_eq!(lat_array.value(0), 2.0); // timestamp 100
assert_eq!(lat_array.value(1), 3.0); // timestamp 200
assert_eq!(lat_array.value(2), 1.0); // timestamp 300
// Check longitude values (should be sorted by timestamp)
let lng_list = as_list_array(&columns[1]).unwrap().value(0);
let lng_array = as_primitive_array::<Float64Type>(&lng_list).unwrap();
assert_eq!(lng_array.len(), 3);
assert_eq!(lng_array.value(0), 5.0); // timestamp 100
assert_eq!(lng_array.value(1), 6.0); // timestamp 200
assert_eq!(lng_array.value(2), 4.0); // timestamp 300
} else {
panic!("Expected Struct scalar value");
}
}
#[test]
fn test_geo_path_merge() {
let mut accumulator1 = GeoPathAccumulator::new();
let mut accumulator2 = GeoPathAccumulator::new();
// Create test data for first accumulator
let lat_array1 = Arc::new(Float64Array::from(vec![1.0]));
let lng_array1 = Arc::new(Float64Array::from(vec![4.0]));
let ts_array1 = Arc::new(TimestampNanosecondArray::from(vec![100]));
// Create test data for second accumulator
let lat_array2 = Arc::new(Float64Array::from(vec![2.0]));
let lng_array2 = Arc::new(Float64Array::from(vec![5.0]));
let ts_array2 = Arc::new(TimestampNanosecondArray::from(vec![200]));
// Update batches
accumulator1
.update_batch(&[lat_array1, lng_array1, ts_array1])
.unwrap();
accumulator2
.update_batch(&[lat_array2, lng_array2, ts_array2])
.unwrap();
// Get states
let state1 = accumulator1.state().unwrap();
let state2 = accumulator2.state().unwrap();
// Create a merged accumulator
let mut merged = GeoPathAccumulator::new();
// Extract the struct arrays from the states
let state_array1 = match &state1[0] {
ScalarValue::Struct(array) => array.clone(),
_ => panic!("Expected Struct scalar value"),
};
let state_array2 = match &state2[0] {
ScalarValue::Struct(array) => array.clone(),
_ => panic!("Expected Struct scalar value"),
};
// Merge state arrays
merged.merge_batch(&[state_array1]).unwrap();
merged.merge_batch(&[state_array2]).unwrap();
// Evaluate merged result
let result = merged.evaluate().unwrap();
if let ScalarValue::Struct(struct_array) = result {
// Extract arrays
let columns = struct_array.columns();
// Check latitude values
let lat_list = as_list_array(&columns[0]).unwrap().value(0);
let lat_array = as_primitive_array::<Float64Type>(&lat_list).unwrap();
assert_eq!(lat_array.len(), 2);
assert_eq!(lat_array.value(0), 1.0); // timestamp 100
assert_eq!(lat_array.value(1), 2.0); // timestamp 200
// Check longitude values (should be sorted by timestamp)
let lng_list = as_list_array(&columns[1]).unwrap().value(0);
let lng_array = as_primitive_array::<Float64Type>(&lng_list).unwrap();
assert_eq!(lng_array.len(), 2);
assert_eq!(lng_array.value(0), 4.0); // timestamp 100
assert_eq!(lng_array.value(1), 5.0); // timestamp 200
} else {
panic!("Expected Struct scalar value");
}
}
}

Some files were not shown because too many files have changed in this diff.