Compare commits

..

84 Commits

Author SHA1 Message Date
discord9
13582c9efb bytes trace
Signed-off-by: discord9 <discord9@163.com>
2025-11-04 11:19:07 +08:00
liyang
5d0ef376de fix: initializer container does not work (#7152)
* fix: initializer does not work

Signed-off-by: liyang <daviderli614@gmail.com>

* use one version of the operator

Signed-off-by: liyang <daviderli614@gmail.com>

---------

Signed-off-by: liyang <daviderli614@gmail.com>
2025-10-29 18:11:55 +00:00
shuiyisong
11c0381fc1 chore: set default catalog using build env (#7156)
* chore: update reference to const

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use option_env to set default catalog

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use const_format

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: update reference in cli

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: introduce a build.rs to set default catalog

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: remove unused feature gate

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2025-10-29 18:10:58 +00:00
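
The commit above moves the default catalog name into the build environment (`option_env`, `const_format`, and a `build.rs`). Below is a minimal, hedged sketch of that general pattern; the variable name `GREPTIMEDB_DEFAULT_CATALOG` and the `"greptime"` fallback are assumptions for illustration, not values taken from the PR.

```rust
// build.rs (sketch): forward a build-time environment variable to rustc so the
// crate can read it at compile time. The variable name is assumed.
fn main() {
    println!("cargo:rerun-if-env-changed=GREPTIMEDB_DEFAULT_CATALOG");
    if let Ok(catalog) = std::env::var("GREPTIMEDB_DEFAULT_CATALOG") {
        println!("cargo:rustc-env=GREPTIMEDB_DEFAULT_CATALOG={catalog}");
    }
}

// Elsewhere in the crate (sketch): resolve the constant at compile time, falling
// back to a default when the build env does not set one. The actual change uses
// `const_format` for concatenation; a plain const match is enough to show the idea.
pub const DEFAULT_CATALOG_NAME: &str = match option_env!("GREPTIMEDB_DEFAULT_CATALOG") {
    Some(name) => name,
    None => "greptime",
};
```

With this sketch, running a build such as `GREPTIMEDB_DEFAULT_CATALOG=my_catalog cargo build` bakes the override into the binary; an unset variable leaves the fallback in place.
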
LFC
e8b7b0ad16 fix: memtable value push result was ignored (#7136)
* fix: memtable value push result was ignored

Signed-off-by: luofucong <luofc@foxmail.com>

* chore: apply suggestion

Co-authored-by: Yingwen <realevenyag@gmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-10-29 13:44:36 +00:00
Weny Xu
6efffa427d fix: missing flamegraph feature in pprof dependency (#7158)
fix: fix pprof deps

Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-29 11:41:21 +00:00
Ruihang Xia
6576e3555d fix: cache estimate methods (#7157)
* fix: cache estimate methods

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* revert page value change

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Apply suggestion from @evenyag

Co-authored-by: Yingwen <realevenyag@gmail.com>

* update test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-10-29 09:57:28 +00:00
Lei, HUANG
f0afd675e3 feat: objbench sub command for datanode (#7114)
* feat/objbench-subcmd:
 ### Add Object Storage Benchmark Tool and Update Dependencies

 - **`Cargo.lock` & `Cargo.toml`**: Added dependencies for `colored`, `parquet`, and `pprof` to support new features.
 - **`datanode.rs`**: Introduced `ObjbenchCommand` for benchmarking object storage, including command-line options for configuration and execution. Added `StorageConfig` and `StorageConfigWrapper` for storage engine configuration.
 - **`datanode.rs`**: Implemented a stub for `build_object_store` function to initialize object storage.

 These changes introduce a new subcommand for object storage benchmarking and update dependencies to support additional functionality.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* init

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: code style and clippy

* feat/objbench-subcmd:
 Improve error handling in `objbench.rs`

 - Enhanced error handling in `parse_config` and `parse_file_dir_components` functions by replacing `unwrap` with `OptionExt` and `context` for better error messages.
 - Updated `build_access_layer_simple` and `build_cache_manager` functions to use `map_err` for more descriptive error handling.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: rebase main

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-10-29 05:26:29 +00:00
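
For readers unfamiliar with what an object-storage benchmark subcommand looks like, here is a self-contained sketch of the shape described above: a few clap options plus timed write/read loops. The real `ObjbenchCommand` builds an actual object store from `StorageConfig`; the in-memory map, field names, and defaults below are stand-ins (assumes the `clap` crate with its derive feature).

```rust
// A hedged, runnable sketch of an `objbench`-style benchmark loop.
use clap::Parser;
use std::collections::HashMap;
use std::time::Instant;

#[derive(Parser, Debug)]
struct ObjbenchArgs {
    /// Number of objects to write and read back.
    #[arg(long, default_value_t = 1000)]
    count: usize,
    /// Size of each object in bytes.
    #[arg(long, default_value_t = 4096)]
    object_size: usize,
}

fn main() {
    // Parse from a fixed argument list so the example runs as-is.
    let args = ObjbenchArgs::parse_from(["objbench", "--count", "100"]);
    let payload = vec![0u8; args.object_size];
    // An in-memory map stands in for the object store built from StorageConfig.
    let mut store: HashMap<String, Vec<u8>> = HashMap::new();

    let start = Instant::now();
    for i in 0..args.count {
        store.insert(format!("bench/object-{i}"), payload.clone());
    }
    let write = start.elapsed();

    let start = Instant::now();
    for i in 0..args.count {
        let _ = store.get(&format!("bench/object-{i}")).expect("written above");
    }
    let read = start.elapsed();

    println!("wrote {} objects in {write:?}, read them back in {read:?}", args.count);
}
```
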
discord9
37bc2e6b07 feat: gc worker heartbeat instruction (#7118)
* again
* false by default
* test: config api
* refactor: per code review
* less info!
* even less info!!
* docs: gc regions instr
* refactor: grp by region id
* per code review
* per review
* error handling?
* test: fix
* todos
* aft rebase fix
* after refactor

Signed-off-by: discord9 <discord9@163.com>
2025-10-29 02:59:36 +00:00
Ning Sun
a9d1d33138 feat: update datafusion-pg-catalog for better dbeaver support (#7143)
* chore: update datafusion-pg-catalog to 0.12.1

* feat: import more udfs
2025-10-28 18:42:03 +00:00
discord9
22d9eb6930 feat: part sort provide dyn filter (#7140)
* feat: part sort provide dyn filter

Signed-off-by: discord9 <discord9@163.com>

* fix: reset_state reset dynamic filter

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2025-10-28 02:44:29 +00:00
shuiyisong
da976e534d refactor: add test feature gate to numbers table (#7148)
* refactor: add test feature gate to numbers table

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: add debug_assertions

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* refactor: extract numbers table provider

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: address CR issues

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2025-10-27 10:16:00 +00:00
discord9
f2bc92b9e6 refactor: use generic for heartbeat instruction handler (#7149)
* refactor: use generic

Signed-off-by: discord9 <discord9@163.com>

* w

Signed-off-by: discord9 <discord9@163.com>

* per review

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2025-10-27 09:09:48 +00:00
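
The refactor above replaces per-variant handling with a generic handler. A toy illustration of that shape, using made-up trait and instruction names rather than the crate's real ones:

```rust
// A generic instruction-handler trait: each instruction type gets its own handler,
// and dispatch is resolved by the type parameter instead of a large match.
trait InstructionHandler<I> {
    fn handle(&self, instruction: I) -> String;
}

struct OpenRegionHandler;
struct CloseRegionHandler;

struct OpenRegion(u64);
struct CloseRegion(u64);

impl InstructionHandler<OpenRegion> for OpenRegionHandler {
    fn handle(&self, OpenRegion(id): OpenRegion) -> String {
        format!("opened region {id}")
    }
}

impl InstructionHandler<CloseRegion> for CloseRegionHandler {
    fn handle(&self, CloseRegion(id): CloseRegion) -> String {
        format!("closed region {id}")
    }
}

/// Generic dispatch: any handler implementing the trait for `I` can be plugged in.
fn dispatch<I, H: InstructionHandler<I>>(handler: &H, instruction: I) -> String {
    handler.handle(instruction)
}

fn main() {
    println!("{}", dispatch(&OpenRegionHandler, OpenRegion(42)));
    println!("{}", dispatch(&CloseRegionHandler, CloseRegion(42)));
}
```
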
Weny Xu
785f9d7fd7 fix: add delays in reconcile tests for async cache invalidation (#7147)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-27 08:07:51 +00:00
shuiyisong
a20ac4f9e5 feat: prefix option for timestamp index and value column (#7125)
* refactor: use GREPTIME_TIMESTAMP const

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* feat: add config for default ts col name

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* refactor: replace GREPTIME_TIMESTAMP with function get

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: update config doc

* fix: test

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: remove opts on flownode and metasrv

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: add validation for ts column name

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use get_or_init to avoid test error

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: fmt

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: update docs

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: using empty string to disable prefix

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: update comment

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: address CR issues

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2025-10-27 08:00:03 +00:00
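
The commit above makes the default timestamp column name configurable and uses `get_or_init` to avoid races in tests. A minimal sketch of that pattern with `std::sync::OnceLock`; the function names are illustrative, and the `greptime_timestamp` fallback mirrors the neighboring commits rather than the PR's exact code.

```rust
use std::sync::OnceLock;

static DEFAULT_TIMESTAMP_COLUMN: OnceLock<String> = OnceLock::new();

/// Sets the default column name once at startup; later calls are ignored.
pub fn init_default_timestamp_column(name: impl Into<String>) {
    let _ = DEFAULT_TIMESTAMP_COLUMN.set(name.into());
}

/// Returns the configured name, falling back to "greptime_timestamp".
pub fn default_timestamp_column() -> &'static str {
    DEFAULT_TIMESTAMP_COLUMN
        .get_or_init(|| "greptime_timestamp".to_string())
        .as_str()
}

fn main() {
    init_default_timestamp_column("ts");
    assert_eq!(default_timestamp_column(), "ts");
}
```
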
zyy17
0a3961927d refactor!: add an opentelemetry_traces_operations table to aggregate (service_name, span_name, span_kind) to improve query performance (#7144)
refactor: add a `*_operations` table to aggregate `(service_name, span_name, span_kind)` to improve query performance

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2025-10-27 03:36:22 +00:00
LFC
d7ed6a69ab feat: merge json datatype (#7142)
* feat: merge json datatype

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2025-10-27 03:30:52 +00:00
discord9
68247fc9b1 fix: count_state use stat to eval&predicate w/out region (#7116)
* fix: count_state use stat to eval

Signed-off-by: discord9 <discord9@163.com>

* cleanup

Signed-off-by: discord9 <discord9@163.com>

* fix: use predicate without region

Signed-off-by: discord9 <discord9@163.com>

* test: diverge standalone/dist impl

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2025-10-27 02:14:45 +00:00
Lei, HUANG
e386a366d0 feat: add HTTP endpoint to control prof.gdump feature (#6999)
* feat/gdump:
 ### Add Support for Jemalloc Gdump Flag

 - **`jemalloc.rs`**: Introduced `PROF_GDUMP` constant and added functions `set_gdump_active` and `is_gdump_active` to manage the gdump flag.
 - **`error.rs`**: Added error handling for reading and updating the jemalloc gdump flag with `ReadGdump` and `UpdateGdump` errors.
 - **`lib.rs`**: Exposed `is_gdump_active` and `set_gdump_active` functions for non-Windows platforms.
 - **`http.rs`**: Added HTTP routes for checking and toggling the jemalloc gdump flag status.
 - **`mem_prof.rs`**: Implemented handlers `gdump_toggle_handler` and `gdump_status_handler` for managing gdump flag via HTTP requests.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* Update docs/how-to/how-to-profile-memory.md

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

* fix: typo in docs

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2025-10-27 01:41:19 +00:00
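
A small stand-in sketch of the toggle pair behind the new HTTP routes. The real `set_gdump_active`/`is_gdump_active` manage jemalloc's `prof.gdump` flag; an `AtomicBool` stands in here so the example stays self-contained and runnable.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for the jemalloc prof.gdump flag referenced by the PROF_GDUMP constant.
static PROF_GDUMP: AtomicBool = AtomicBool::new(false);

/// Enables or disables gdump; returns the previous state.
pub fn set_gdump_active(active: bool) -> bool {
    PROF_GDUMP.swap(active, Ordering::SeqCst)
}

pub fn is_gdump_active() -> bool {
    PROF_GDUMP.load(Ordering::SeqCst)
}

fn main() {
    // A POST to the toggle route would call set_gdump_active and report the
    // previous/new state; the GET status route would call is_gdump_active.
    assert!(!set_gdump_active(true));
    assert!(is_gdump_active());
}
```
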
dennis zhuang
d8563ba56d feat: adds regex_extract function and more type tests (#7107)
* feat: adds format, regex_extract function and more type tests

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: forgot functions

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: forgot null type

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* test: forgot date type

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* feat: remove format function

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* test: update results after upgrading datafusion

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2025-10-25 08:41:49 +00:00
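
As a rough illustration of what a `regex_extract(text, pattern, group)` scalar computes, here is a direct implementation with the `regex` crate; the actual UDF is registered through DataFusion, and its exact signature and null/error semantics may differ.

```rust
use regex::Regex;

/// Returns the text captured by `group` of `pattern`, or None when there is no match.
fn regex_extract(text: &str, pattern: &str, group: usize) -> Option<String> {
    let re = Regex::new(pattern).ok()?;
    re.captures(text)
        .and_then(|caps| caps.get(group))
        .map(|m| m.as_str().to_string())
}

fn main() {
    assert_eq!(
        regex_extract("error code=1234", r"code=(\d+)", 1).as_deref(),
        Some("1234")
    );
    assert_eq!(regex_extract("no digits here", r"code=(\d+)", 1), None);
}
```
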
Weny Xu
7da2f5ed12 refactor: refactor instruction handler and adds support for batch region downgrade operations (#7130)
* refactor: refactor instruction handler

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: support batch downgrade region instructions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix compat

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix clippy

Signed-off-by: WenyXu <wenymedia@gmail.com>

* add tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: add comments

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-24 09:11:42 +00:00
Yingwen
4c70b4c31d feat: store estimated series num in file meta (#7126)
* feat: add num_series to FileMeta

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add SeriesEstimator to collect num_series

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: set num_series in compactor

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: print num_series in Debug for FileMeta

Signed-off-by: evenyag <realevenyag@gmail.com>

* style: fmt code

Signed-off-by: evenyag <realevenyag@gmail.com>

* style: fix clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: increase series count when next ts <= last

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: add tests for SeriesEstimator

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add num_series to ssts_manifest table

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: update sqlness tests

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix metric engine list entry test

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-10-24 05:53:48 +00:00
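
The `SeriesEstimator` above counts series from timestamps alone: rows are written sorted by series and then by time, so a timestamp that fails to increase marks the start of a new series ("increase series count when next ts <= last"). A hedged, self-contained sketch of that rule; the struct and field names are illustrative, not the PR's API.

```rust
#[derive(Default)]
pub struct SeriesEstimator {
    last_ts: Option<i64>,
    num_series: u64,
}

impl SeriesEstimator {
    /// Feeds the timestamps of one batch, in storage order.
    pub fn update(&mut self, timestamps: &[i64]) {
        for &ts in timestamps {
            match self.last_ts {
                // A non-increasing timestamp means the previous series ended.
                Some(last) if ts <= last => self.num_series += 1,
                // The first row starts the first series.
                None => self.num_series += 1,
                _ => {}
            }
            self.last_ts = Some(ts);
        }
    }

    pub fn num_series(&self) -> u64 {
        self.num_series
    }
}

fn main() {
    let mut est = SeriesEstimator::default();
    // Two series: timestamps [1, 2, 3] followed by a reset to [1, 2].
    est.update(&[1, 2, 3, 1, 2]);
    assert_eq!(est.num_series(), 2);
}
```
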
Ning Sun
b78ee1743c feat: add a missing pg_catalog function current_database (#7138)
feat: add a missing function current_database
2025-10-24 03:36:07 +00:00
LFC
6ad23bc9b4 refactor: convert to postgres values directly from arrow (#7131)
* refactor: convert to pg values directly from arrow

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2025-10-24 03:28:04 +00:00
Sicong Hu
03a29c6591 fix: correct test_index_build_type_compact (#7137)
Signed-off-by: SNC123 <sinhco@outlook.com>
2025-10-24 03:24:13 +00:00
zyy17
a0e6bcbeb3 feat: add cpu_usage_millicores and memory_usage_bytes in information_schema.cluster_info table. (#7051)
* refactor: add `hostname` in cluster_info table

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* chore: update information schema result

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* feat: enable zstd for bulk memtable encoded parts (#7045)

feat: enable zstd in bulk memtable

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: add `get_total_cpu_millicores()` / `get_total_cpu_cores()` / `get_total_memory_bytes()` / `get_total_memory_readable()` in common-stat

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* feat: add `cpu_usage_millicores` and `memory_usage_bytes` in `information_schema.cluster_info` table

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* fix: compile warning and integration test failed

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* fix: integration test failed

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* refactor: add `ResourceStat`

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* refactor: apply code review comments

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* chore: update greptime-proto

Signed-off-by: zyy17 <zyylsxm@gmail.com>

---------

Signed-off-by: zyy17 <zyylsxm@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-10-24 03:12:45 +00:00
LFC
b53a0b86fb feat: create table with new json datatype (#7128)
* feat: create table with new json datatype

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2025-10-24 02:16:49 +00:00
LFC
2f637a262e chore: update datafusion to 50 (#7076)
* chore: update datafusion to 50

Signed-off-by: luofucong <luofc@foxmail.com>

* fix ci

Signed-off-by: luofucong <luofc@foxmail.com>

* fix: update datafusion_pg_catalog import

* chore: fix toml format

* chore: fix toml format again

* fix nextest

Signed-off-by: luofucong <luofc@foxmail.com>

* fix sqlness

Signed-off-by: luofucong <luofc@foxmail.com>

* chore: switch datafusion-orc to upstream tag

* fix sqlness

Signed-off-by: luofucong <luofc@foxmail.com>

* resolve PR comments

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
2025-10-23 07:18:36 +00:00
Yingwen
f388dbdbb8 fix: fix index and tag filtering for flat format (#7121)
* perf: only decode primary keys in the batch

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: don't push none to creator

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: implement method to filter __table_id for sparse encoding

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: filter table id for sparse encoding separately

The __table_id isn't present in the projection, so we have to filter it
manually.

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: decode tags for sparse encoding when building bloom filter

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: support inverted index for tags under sparse encoding

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: skip tag columns in fulltext index

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix warnings

Signed-off-by: evenyag <realevenyag@gmail.com>

* style: fix clippy

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix list index metadata test

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: decode primary key columns to filter

When primary key columns are not in projection but in filters, we need
to decode them in compute_filter_mask_flat

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: reuse filter method

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: only use dictionary for string type in compat

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: safe to get column by creator's column id

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-10-23 06:43:46 +00:00
jeremyhi
136b9eef7a feat: pr review reminder frequency (#7129)
* feat: run at 9:00 am on monday, wednesday, friday

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: remove unused method

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
2025-10-23 06:22:02 +00:00
fys
e8f39cbc4f fix: unit test about trigger parser (#7132)
* fix: unit test about trigger parser

* fix: cargo clippy
2025-10-23 03:47:25 +00:00
jeremyhi
62b51c6736 feat: writer mem limiter for http and grpc service (#7092)
* feat: writer mem limiter for http and grpc service

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* fix: docs

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* feat: add metrics for limiter

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* chore: refactor try_acquire

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

* chore: make size human readable

Signed-off-by: jeremyhi <fengjiachun@gmail.com>

---------

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>
2025-10-22 09:30:36 +00:00
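
A minimal sketch of a write-memory limiter with the `try_acquire` shape this PR describes: reserve an estimated number of bytes before handling a request, reject when the limit would be exceeded, and release the reservation when the permit is dropped. The real limiter also exports metrics and reports sizes human-readably; the names below are assumptions.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

pub struct WriteMemLimiter {
    limit: usize,
    in_flight: AtomicUsize,
}

/// Releases the reserved bytes when dropped.
pub struct Permit<'a> {
    limiter: &'a WriteMemLimiter,
    size: usize,
}

impl WriteMemLimiter {
    pub fn new(limit: usize) -> Self {
        Self { limit, in_flight: AtomicUsize::new(0) }
    }

    /// Reserves `size` bytes for an in-flight request, or returns None when the
    /// limit would be exceeded (the caller should then reject the request).
    pub fn try_acquire(&self, size: usize) -> Option<Permit<'_>> {
        let mut current = self.in_flight.load(Ordering::Relaxed);
        loop {
            if current + size > self.limit {
                return None;
            }
            match self.in_flight.compare_exchange_weak(
                current,
                current + size,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(Permit { limiter: self, size }),
                Err(actual) => current = actual,
            }
        }
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        self.limiter.in_flight.fetch_sub(self.size, Ordering::AcqRel);
    }
}

fn main() {
    let limiter = WriteMemLimiter::new(10);
    let p1 = limiter.try_acquire(8).expect("8 bytes fit under the 10-byte limit");
    assert!(limiter.try_acquire(4).is_none()); // 8 + 4 would exceed the limit
    drop(p1); // releasing the permit frees the reserved bytes
    assert!(limiter.try_acquire(4).is_some());
}
```
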
discord9
a9a3e0b121 fix: prom ql logical plan use column index not name (#7109)
* feat: use index not col name

Signed-off-by: discord9 <discord9@163.com>

* fix: use name without qualifier&output schema fix

Signed-off-by: discord9 <discord9@163.com>

* proto

Signed-off-by: discord9 <discord9@163.com>

* refactor: resolve column name/index

Signed-off-by: discord9 <discord9@163.com>

* pcr

Signed-off-by: discord9 <discord9@163.com>

* chore: update proto

Signed-off-by: discord9 <discord9@163.com>

* chore: update proto

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2025-10-22 09:04:09 +00:00
Ning Sun
41ce100624 feat: update pgwire to 0.34 for a critical issue on accepting connection (#7127)
feat: update pgwire to 0.34
2025-10-22 07:25:04 +00:00
Weny Xu
328ec56b63 feat: introduce OpenRegions and CloseRegions instructions to support batch region operations (#7122)
* feat: introduce `OpenRegions` and `CloseRegions` instructions to support batch region operations

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: merge instructions

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-22 03:43:47 +00:00
Ning Sun
bfa00df9f2 fix: list inner type for json and valueref, refactor type to ref for struct/list (#7113)
* refactor: use arc for struct type

* fix: inner type of list value and ref
2025-10-21 12:46:18 +00:00
jeremyhi
2e7b3951fb feat: 14 days PRs review reminder (#7123) 2025-10-21 08:53:38 +00:00
Yingwen
1054c63503 test: run engine unit tests for flat format (#7119)
* test: support flat in basic_test

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: support flat in alter_test

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: support flat for append_mode_test

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update bump_committed_sequence_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update close_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update compaction_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update create_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update edit_region_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update merge_mode_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update parallel_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update projection_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update prune_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update row_selector_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update scan_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update drop_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update filter_deleted_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update sync_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update set_role_state_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update staging_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update truncate_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update catchup_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update flush_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update open_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: update batch_open_test to test both formats

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: fix all flat format tests

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-10-21 08:24:12 +00:00
Sicong Hu
a1af4dce0c feat: implement three build types for async index build (#7029)
* feat: impl four types index build

Signed-off-by: SNC123 <sinhco@outlook.com>

* test: add tests for four types index build

Signed-off-by: SNC123 <sinhco@outlook.com>

* test: add sqlness test for manual index build

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: add region request support and correct sqlness

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: update cargo.toml for proto and resolve conflicts

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: rebase

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: clippy

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: toml fmt and correct sqlness

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: correct sqlness result

Signed-off-by: SNC123 <sinhco@outlook.com>

* refactor: extract manual build logic

Signed-off-by: SNC123 <sinhco@outlook.com>

* apply suggestions

Signed-off-by: SNC123 <sinhco@outlook.com>

* feat: abort index build process

Signed-off-by: SNC123 <sinhco@outlook.com>

* clippy

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: wrap `should_abort_index`

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: clippy

Signed-off-by: SNC123 <sinhco@outlook.com>

---------

Signed-off-by: SNC123 <sinhco@outlook.com>
2025-10-21 02:48:28 +00:00
jeremyhi
27268cf424 chore: pr review reminder (#7120)
* chore: pr review reminder

* chore: for test

* chore: vars

* fix: gracefully handle missing webhook URL

* test: allow workflow to run in fork for testing

* test: add environment variable logging

* chore: minor change

* feat: filter draft pr
2025-10-21 02:44:04 +00:00
Zhenchi
938d757523 feat: expose SST index metadata via information schema (#7044)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-10-20 11:59:16 +00:00
LFC
855eb54ded refactor: convert to mysql values directly from arrow (#7096)
Signed-off-by: luofucong <luofc@foxmail.com>
2025-10-20 11:09:24 +00:00
Weny Xu
3119464ff9 feat: introduce the Noop WAL provider for datanode (#7105)
* feat: introduce noop log store

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: update config example

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add noop wal tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-20 06:13:27 +00:00
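
A toy illustration of the "noop" WAL idea: every append succeeds immediately and recovery replays nothing, which is useful when durability is delegated elsewhere. The trait below is a stand-in, not GreptimeDB's actual `LogStore` trait.

```rust
trait SimpleLogStore {
    fn append(&self, entry: &[u8]) -> std::io::Result<u64>;
    fn read_from(&self, offset: u64) -> Vec<Vec<u8>>;
}

struct NoopLogStore;

impl SimpleLogStore for NoopLogStore {
    fn append(&self, _entry: &[u8]) -> std::io::Result<u64> {
        Ok(0) // pretend the entry is durable; nothing is written
    }
    fn read_from(&self, _offset: u64) -> Vec<Vec<u8>> {
        Vec::new() // nothing to replay on recovery
    }
}

fn main() {
    let store = NoopLogStore;
    store.append(b"row").unwrap();
    assert!(store.read_from(0).is_empty());
}
```
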
fys
20b5b9bee4 chore: remove unused deps (#7108) 2025-10-17 11:53:19 +00:00
Zhenchi
7b396bb290 feat(mito2): expose puffin index metadata (#7042)
* Add encode/decode helpers for IndexTarget

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Use IndexTarget encode for puffin index blob keys

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Normalize puffin index blobs to use IndexTarget keys

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(mito2): expose puffin index metadata

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* target json polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix header

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* add index path

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address copilot comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* reuse cached index metadata

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* parallelism for reading index meta

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-10-17 06:22:07 +00:00
LFC
21532abf94 feat: new create table syntax for new json datatype (#7103)
* feat: new create table syntax for new json datatype

Signed-off-by: luofucong <luofc@foxmail.com>

* refactor: extract consts

* refactor: remove unused error variant

* fix tests

Signed-off-by: luofucong <luofc@foxmail.com>

* fix sqlness

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
2025-10-17 05:22:29 +00:00
fys
331c64c6fd feat(trigger): support "for" and "keep_firing_for" (#7087)
* feat: support for and keep_firing_for options in create trigger

* upgrade greptime-proto
2025-10-17 04:31:56 +00:00
Zhenchi
82e4600d1b feat: add index cache eviction support (#7064)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-10-17 03:30:02 +00:00
dennis zhuang
8a2371a05c feat: supports large string (#7097)
* feat: supports large string

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: add doc for extract_string_vector_values

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: refactor by cr comments

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: changes by cr comments

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* refactor: extract_string_vector_values

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* feat: remove large string type and refactor string vector

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: revert some changes

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* feat: adds large string type

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: impl default for StringSizeType

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: tests and test compatibility

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* test: update sqlness tests

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: remove panic

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-17 01:46:11 +00:00
zyy17
cf1b8392af refactor!: unify the API of getting total cpu and memory (#7049)
* refactor: add `get_total_cpu_millicores()` / `get_total_cpu_cores()` / `get_total_memory_bytes()` / `get_total_memory_readable()` in common-stat

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* tests: update sqlness test cases

Signed-off-by: zyy17 <zyylsxm@gmail.com>

---------

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2025-10-16 12:41:34 +00:00
Ning Sun
2e6ea1167f refactor: update valueref coerce function name based on its semantics (#7098) 2025-10-16 09:11:40 +00:00
Lei, HUANG
50386fda97 chore: pub route_prometheus function (#7101)
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-10-16 08:11:06 +00:00
Weny Xu
873555feb2 fix: fix build warnings (#7099)
Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-16 07:57:16 +00:00
zyy17
6ab4672866 refactor: add peer_hostname field in information_schema.cluster_info table (#7050)
* refactor: add `hostname` in cluster_info table

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* chore: update information schema result

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* chore: apply code review comments

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* chore: update greptime-proto

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* chore: add the compatibility for old proto

Signed-off-by: zyy17 <zyylsxm@gmail.com>

---------

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2025-10-16 06:02:47 +00:00
discord9
ac65ede033 feat: memtable seq range read (#6950)
* feat: seq range memtable read

Signed-off-by: discord9 <discord9@163.com>

* test: from&range

Signed-off-by: discord9 <discord9@163.com>

* wt

Signed-off-by: discord9 <discord9@163.com>

* after rebase fix

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review

Signed-off-by: discord9 <discord9@163.com>

* docs: better naming & emphasis

Signed-off-by: discord9 <discord9@163.com>

* refactor: use filter method

Signed-off-by: discord9 <discord9@163.com>

* tests: unwrap

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2025-10-16 05:30:56 +00:00
Lei, HUANG
552c502620 feat: manual compaction parallelism (#7086)
* feat/manual-compaction-parallelism:
 ### Add Parallelism Support to Compaction Requests

 - **`Cargo.lock` & `Cargo.toml`**: Updated `greptime-proto` dependency to a new revision.
 - **`flush_compact_table.rs`**: Enhanced `parse_compact_params` to support a new `parallelism` parameter, allowing users to
 specify the level of parallelism for table compaction.
 - **`handle_compaction.rs`**: Integrated `parallelism` into the compaction scheduling process, defaulting to 1 if not
 specified.
 - **`request.rs` & `region_request.rs`**: Modified `CompactRequest` to include `parallelism`, with logic to handle unspecified values.
 - **`requests.rs`**: Updated `CompactTableRequest` structure to include an optional `parallelism` field.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/manual-compaction-parallelism:
 ### Commit Message

 Enhance Compaction Request Handling

 - **`flush_compact_table.rs`**:
   - Renamed `parse_compact_params` to `parse_compact_request`.
   - Introduced `DEFAULT_COMPACTION_PARALLELISM` constant.
   - Updated parsing logic to handle keyword arguments for `strict_window` and `regular` compaction types, including `parallelism` and `window`.
   - Modified tests to reflect changes in parsing logic and default parallelism handling.

 - **`request.rs`**:
   - Updated `parallelism` handling in `RegionRequestBody::Compact` to use the new default value.

 - **`requests.rs`**:
   - Changed `CompactTableRequest` to use a non-optional `parallelism` field with a default value of 1.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/manual-compaction-parallelism:
 ### Update `flush_compact_table.rs` Parameter Validation

 - Modified parameter validation in `flush_compact_table.rs` to restrict the maximum number of parameters from 4 to 3 in the `parse_compact_request` function.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* feat/manual-compaction-parallelism:
 Update `greptime-proto` dependency

 - Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-10-16 03:47:01 +00:00
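
The commit above teaches `parse_compact_request` to accept keyword arguments such as `parallelism` and `window`, defaulting parallelism to 1. A hedged sketch of that kind of `key=value` parsing; the real function, its option set, and its error types differ.

```rust
const DEFAULT_COMPACTION_PARALLELISM: u32 = 1;

#[derive(Debug, PartialEq)]
struct CompactArgs {
    parallelism: u32,
    window: Option<i64>,
}

fn parse_compact_args(args: &[&str]) -> Result<CompactArgs, String> {
    let mut parsed = CompactArgs { parallelism: DEFAULT_COMPACTION_PARALLELISM, window: None };
    for arg in args {
        let (key, value) = arg
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got {arg}"))?;
        match key.trim() {
            "parallelism" => {
                parsed.parallelism = value.trim().parse().map_err(|e| format!("{e}"))?
            }
            "window" => parsed.window = Some(value.trim().parse().map_err(|e| format!("{e}"))?),
            other => return Err(format!("unknown compact option: {other}")),
        }
    }
    Ok(parsed)
}

fn main() {
    let args = parse_compact_args(&["parallelism=4", "window=3600"]).unwrap();
    assert_eq!(args.parallelism, 4);
    assert_eq!(args.window, Some(3600));
    // Unspecified parallelism falls back to the default of 1.
    assert_eq!(parse_compact_args(&[]).unwrap().parallelism, 1);
}
```
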
discord9
9aca7c97d7 fix: part cols not in projection (#7090)
* fix: part cols not in projection

Signed-off-by: discord9 <discord9@163.com>

* test: table scan with projection

Signed-off-by: discord9 <discord9@163.com>

* Update src/query/src/dist_plan/analyzer.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>
Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-10-16 03:44:30 +00:00
Ning Sun
145c1024d1 feat: add Value::Json value type (#7083)
* feat: struct value

Signed-off-by: Ning Sun <sunning@greptime.com>

* feat: update for proto module

* feat: wip struct type

* feat: implement more vector operations

* feat: make datatype and api

* feat: resolve some compilation issues

* feat: resolve all compilation issues

* chore: format update

* test: resolve tests

* test: test and refactor value-to-pb

* feat: add more tests and fix for value types

* chore: remove dbg

* feat: test and fix iterator

* fix: resolve struct_type issue

* feat: pgwire 0.33 update

* refactor: use vec for struct items

* feat: conversion from json to value

* feat: add decode function

* fix: lint issue

* feat: update how we encode raw data

* feat: add conversion to fully structured StructValue

* refactor: take owned value in all encode/decode functions

* feat: add pg serialization of structvalue

* chore: toml format

* refactor: adopt new and try_new from struct value

* chore: cleanup residual issues

* docs: docs up

* fix lint issue

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* chore: address review comment especially collection capacity

* refactor: remove unneeded processed keys collection

* feat: Value::Json type

* chore: add some work in progress changes

* feat: adopt new json type

* refactor: limit scope json conversion functions

* fix: self review update

* test: provide tests for value::json

* test: add tests for api/helper

* switch proto to main branch

* fix: implement is_null for ValueRef::Json

---------

Signed-off-by: Ning Sun <sunning@greptime.com>
Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>
2025-10-15 20:13:12 +00:00
Alan Tang
8073e552df feat: add updated_on to tablemeta with a default of created_on (#7072)
* feat: add updated_on to tablemeta with a default of created_on

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

* feat: support the update_on on alter procedure

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

* feat: add updated_on into information_schema.tables

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

* fix: make sqlness happy

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

* test: add test case for tablemeta update

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

* fix: fix failing test for ALTER TABLE

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

* feat: use created_on as default for updated_on when missing

Signed-off-by: Alan Tang <jmtangcs@gmail.com>

---------

Signed-off-by: Alan Tang <jmtangcs@gmail.com>
2025-10-15 11:12:27 +00:00
Ruihang Xia
aa98033e85 feat(parser): ALTER TABLE ... REPARTITION ... (#7082)
* initial impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sqlness tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tidy up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-10-15 03:54:36 +00:00
Ning Sun
9606a6fda8 feat: conversion between struct, value and json (#7052)
* feat: struct value

Signed-off-by: Ning Sun <sunning@greptime.com>

* feat: update for proto module

* feat: wip struct type

* feat: implement more vector operations

* feat: make datatype and api

* feat: resolve some compilation issues

* feat: resolve all compilation issues

* chore: format update

* test: resolve tests

* test: test and refactor value-to-pb

* feat: add more tests and fix for value types

* chore: remove dbg

* feat: test and fix iterator

* fix: resolve struct_type issue

* feat: pgwire 0.33 update

* refactor: use vec for struct items

* feat: conversion from json to value

* feat: add decode function

* fix: lint issue

* feat: update how we encode raw data

* feat: add conversion to fully structured StructValue

* refactor: take owned value in all encode/decode functions

* feat: add pg serialization of structvalue

* chore: toml format

* refactor: adopt new and try_new from struct value

* chore: cleanup residual issues

* docs: docs up

* fix lint issue

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* Apply suggestion from @MichaelScofield

Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>

* chore: address review comment especially collection capacity

* refactor: remove unneeded processed keys collection

---------

Signed-off-by: Ning Sun <sunning@greptime.com>
Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>
2025-10-14 07:22:37 +00:00
github-actions[bot]
9cc0bcb449 ci: update dev-builder image tag (#7073)
Signed-off-by: greptimedb-ci <greptimedb-ci@greptime.com>
Co-authored-by: greptimedb-ci <greptimedb-ci@greptime.com>
2025-10-14 06:53:02 +00:00
Ning Sun
5ad1eac924 refactor: remove unused grpc-expr module and pb conversions (#7085)
* refactor: remove unused grpc-expr module and pb conversions

* chore: remove unused snafu
2025-10-14 04:11:48 +00:00
shuiyisong
a027b824a2 chore: add information extension to the plugins in standalone (#7079)
chore: add information extension to the plugins

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2025-10-14 02:30:42 +00:00
Lei, HUANG
44d46a6702 fix: correct impl Clear for &[u8] (#7081)
* fix: correct impl Clear for &[u8]

* fix: clippy

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
2025-10-13 11:59:26 +00:00
Yingwen
a9c342b0f7 feat: support setting sst_format in table options (#7068)
* feat: add FormatType to support multi format in the future

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add sst_format to RegionOptions

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: sets the sst_format based on RegionOptions

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: add sst_format to mito table options

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fix RegionManifest deserialization without sst_format

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: remove Parquet suffix from FormatType

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: prefer RegionOptions::sst_format in compactor/memtable builder

Signed-off-by: evenyag <realevenyag@gmail.com>

* refactor: rename enable_experimental_flat_format to
default_experimental_flat_format

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: update config.md

Signed-off-by: evenyag <realevenyag@gmail.com>

* style: fmt

Signed-off-by: evenyag <realevenyag@gmail.com>

* test: update manifest test

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: fix compiler errors, handle sst_format in remap_manifest

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-10-13 08:38:37 +00:00
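
A small sketch of how an `sst_format` option with a `FormatType` enum and a backward-compatible default can be modeled with serde, matching the "deserialization without sst_format" concern above. The variant names are assumptions (the PR mentions removing a `Parquet` suffix and an experimental flat format), and the sketch assumes the `serde` derive feature plus `serde_json`.

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum FormatType {
    /// Primary-key encoded format (assumed to be the default).
    #[default]
    PrimaryKey,
    /// Experimental flat format.
    Flat,
}

#[derive(Debug, Deserialize)]
pub struct RegionOptionsSketch {
    /// Missing in old manifests/options, so fall back to the default format.
    #[serde(default)]
    pub sst_format: FormatType,
}

fn main() {
    let opts: RegionOptionsSketch = serde_json::from_str("{}").unwrap();
    assert_eq!(opts.sst_format, FormatType::PrimaryKey);

    let opts: RegionOptionsSketch =
        serde_json::from_str(r#"{"sst_format":"flat"}"#).unwrap();
    assert_eq!(opts.sst_format, FormatType::Flat);
}
```
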
Ruihang Xia
1a73b485fe feat: apply region partition expr to region scan (#7067)
* handle null in partition expr

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* apply region partition expr on scanning

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tidy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix gt/gteq

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-10-13 07:38:19 +00:00
Ruihang Xia
ab46127414 feat: remap SST files for partition change (#7071)
* initial impl

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update expr

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* move error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* immutable file meta

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tidy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* reduce state

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* inherit manifest

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* simplify test cases

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix rebase error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* log new exprs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-10-13 03:13:14 +00:00
LFC
8fe17d43d5 chore: update rust to nightly 2025-10-01 (#7069)
* chore: update rust to nightly 2025-10-01

Signed-off-by: luofucong <luofc@foxmail.com>

* chore: nix update

---------

Signed-off-by: luofucong <luofc@foxmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
2025-10-11 07:30:52 +00:00
Weny Xu
40e9ce90a7 refactor: restructure sqlness to support multiple envs and extract common utils (#7066)
* refactor: restructure sqlness to support multiple envs and extract common utils

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore(ci): update sqlness cmd

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: add comments

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: error fmt

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: only reconnect mysql and pg client

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>
2025-10-11 06:34:17 +00:00
discord9
ba034c5a9e feat: explain custom statement (#7058)
* feat: explain tql cte

Signed-off-by: discord9 <discord9@163.com>

* chore: unused

Signed-off-by: discord9 <discord9@163.com>

* fix: analyze format

Signed-off-by: discord9 <discord9@163.com>

* Update src/sql/src/statements/statement.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>
Signed-off-by: discord9 <discord9@163.com>

* test: sqlness

Signed-off-by: discord9 <discord9@163.com>

* pcr

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-10-11 06:27:51 +00:00
Ruihang Xia
e46ce7c6da feat: divide subtasks from old/new partition rules (#7003)
* feat: divide subtasks from old/new partition rules

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change copyright year

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* simplify filter

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* naming

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/partition/src/subtask.rs

Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2025-10-11 06:17:25 +00:00
dennis zhuang
57d84b9de5 feat: supports value aliasing in TQL (#7041)
* feat: supports value aliasing in TQL

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* fix: invalid checking

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: remove invalid checking

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* test: add explain test

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* chore: improve parser

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

* test: add explain TQL-CTE

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>

---------

Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
2025-10-11 02:49:09 +00:00
Ning Sun
749a5ab165 feat: struct value and vector (#7033)
* feat: struct value

Signed-off-by: Ning Sun <sunning@greptime.com>

* feat: update for proto module

* feat: wip struct type

* feat: implement more vector operations

* feat: make datatype and api

* feat: resolve some compilation issues

* feat: resolve all compilation issues

* chore: format update

* test: resolve tests

* test: test and refactor value-to-pb

* feat: add more tests and fix for value types

* chore: remove dbg

* feat: test and fix iterator

* fix: resolve struct_type issue

* refactor: use vec for struct items

* chore: update proto to main branch

* refactor: address some of review issues

* refactor: update for further review

* Add validation on new methods

* feat: update struct/list json serialization

* refactor: reimplement get in struct_vector

* refactor: struct vector functions

* refactor: fix lint issue

* refactor: address review comments

---------

Signed-off-by: Ning Sun <sunning@greptime.com>
2025-10-10 21:49:51 +00:00
LFC
3738440753 feat: align influxdb line timestamp with table time index (#7057)
* feat: align influxdb line timestamp with table time index

Signed-off-by: luofucong <luofc@foxmail.com>

* fix ci

Signed-off-by: luofucong <luofc@foxmail.com>

---------

Signed-off-by: luofucong <luofc@foxmail.com>
2025-10-10 07:37:52 +00:00
Ning Sun
aa84642afc refactor!: remove pb_value to json conversion, keep json output consistent (#7063)
* refactor: remove pb_value to json

* chore: remove unused module
2025-10-10 07:09:20 +00:00
Ning Sun
af213be403 refactor: remove duplicated valueref to json (#7062) 2025-10-10 07:08:26 +00:00
Sicong Hu
779865d389 feat: introduce IndexBuildTask for async index build (#6927)
* feat: add framework for asynchronous index building

Signed-off-by: SNC123 <sinhco@outlook.com>

* test: add unit tests for IndexBuildTask

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: clippy,format,fix-udeps

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: correct write cache logic in IndexBuildTask

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: clippy, resolve conflicts

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: resolve conflicts

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: apply review suggestions

Signed-off-by: SNC123 <sinhco@outlook.com>

* chore: resolve conflicts

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: clean up index files in aborted case

Signed-off-by: SNC123 <sinhco@outlook.com>

* refactor: move manifest update logic into IndexBuildTask

Signed-off-by: SNC123 <sinhco@outlook.com>

* fix: enhance check file logic and error handling

Signed-off-by: SNC123 <sinhco@outlook.com>

---------

Signed-off-by: SNC123 <sinhco@outlook.com>
2025-10-10 03:29:32 +00:00
Yingwen
47c1ef672a fix: support dictionary in regex match (#7055)
* fix: support dictionary in regex match

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: get key from keys buffer directly

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-10-10 03:03:34 +00:00
shyam
591b9f3e81 fix: show proper error msg, when executing non-admin functions as admin functions (#7061)
Signed-off-by: Shyamnatesan <shyamnatesan21@gmail.com>
2025-10-10 01:25:49 +00:00
LFC
979c8be51b feat: able to pass external service for sqlness test (#7032)
feat: able to pass an external service instead of creating one inside for the sqlness test

Signed-off-by: luofucong <luofc@foxmail.com>
2025-10-09 07:02:19 +00:00
Yingwen
45b1458254 fix: only skips auto convert when encoding is sparse (#7056)
* fix: only skips auto convert when encoding is sparse

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: address comment and add tests

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
2025-10-09 06:50:48 +00:00
fys
4cdcf2ef39 chore: add trigger querier factory trait (#7053)
feat: add trigger-querier-factory-ent
2025-10-09 02:16:50 +00:00
shuiyisong
b24a55cea4 chore: rename the default ts column name to greptime_timestamp for influxdb line protocol (#7046)
* chore: rename influxdb ts column name to greptime_timestamp

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: tests

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
2025-10-09 02:14:11 +00:00
Ning Sun
1aa4f346a0 fix: build_grpc_server visibility (#7054) 2025-10-06 03:16:48 +00:00
711 changed files with 36450 additions and 8239 deletions


@@ -7,6 +7,8 @@ KUBERNETES_VERSION="${KUBERNETES_VERSION:-v1.32.0}"
 ENABLE_STANDALONE_MODE="${ENABLE_STANDALONE_MODE:-true}"
 DEFAULT_INSTALL_NAMESPACE=${DEFAULT_INSTALL_NAMESPACE:-default}
 GREPTIMEDB_IMAGE_TAG=${GREPTIMEDB_IMAGE_TAG:-latest}
+GREPTIMEDB_OPERATOR_IMAGE_TAG=${GREPTIMEDB_OPERATOR_IMAGE_TAG:-v0.5.1}
+GREPTIMEDB_INITIALIZER_IMAGE_TAG="${GREPTIMEDB_OPERATOR_IMAGE_TAG}"
 GREPTIME_CHART="https://greptimeteam.github.io/helm-charts/"
 ETCD_CHART="oci://registry-1.docker.io/bitnamicharts/etcd"
 ETCD_CHART_VERSION="${ETCD_CHART_VERSION:-12.0.8}"
@@ -58,7 +60,7 @@ function deploy_greptimedb_operator() {
   # Use the latest chart and image.
   helm upgrade --install greptimedb-operator greptime/greptimedb-operator \
     --create-namespace \
-    --set image.tag=latest \
+    --set image.tag="$GREPTIMEDB_OPERATOR_IMAGE_TAG" \
     -n "$DEFAULT_INSTALL_NAMESPACE"
   # Wait for greptimedb-operator to be ready.
@@ -78,6 +80,7 @@ function deploy_greptimedb_cluster() {
   helm upgrade --install "$cluster_name" greptime/greptimedb-cluster \
     --create-namespace \
     --set image.tag="$GREPTIMEDB_IMAGE_TAG" \
+    --set initializer.tag="$GREPTIMEDB_INITIALIZER_IMAGE_TAG" \
     --set meta.backendStorage.etcd.endpoints="etcd.$install_namespace:2379" \
     --set meta.backendStorage.etcd.storeKeyPrefix="$cluster_name" \
     -n "$install_namespace"
@@ -115,6 +118,7 @@ function deploy_greptimedb_cluster_with_s3_storage() {
   helm upgrade --install "$cluster_name" greptime/greptimedb-cluster -n "$install_namespace" \
     --create-namespace \
     --set image.tag="$GREPTIMEDB_IMAGE_TAG" \
+    --set initializer.tag="$GREPTIMEDB_INITIALIZER_IMAGE_TAG" \
     --set meta.backendStorage.etcd.endpoints="etcd.$install_namespace:2379" \
     --set meta.backendStorage.etcd.storeKeyPrefix="$cluster_name" \
     --set objectStorage.s3.bucket="$AWS_CI_TEST_BUCKET" \

507
.github/scripts/package-lock.json generated vendored Normal file

@@ -0,0 +1,507 @@
[package-lock.json for "greptimedb-github-scripts" v1.0.0 (lockfileVersion 3): pins "@octokit/rest" ^21.0.0 and "axios" ^1.7.0 together with their transitive dependencies; the 507 generated lines are not reproduced here.]
"license": "MIT",
"dependencies": {
"call-bind-apply-helpers": "^1.0.1",
"es-errors": "^1.3.0",
"gopd": "^1.2.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-define-property": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.1.tgz",
"integrity": "sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-errors": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz",
"integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-object-atoms": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/es-object-atoms/-/es-object-atoms-1.1.1.tgz",
"integrity": "sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-set-tostringtag": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/es-set-tostringtag/-/es-set-tostringtag-2.1.0.tgz",
"integrity": "sha512-j6vWzfrGVfyXxge+O0x5sh6cvxAog0a/4Rdd2K36zCMV5eJ+/+tOAngRO8cODMNWbVRdVlmGZQL2YS3yR8bIUA==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0",
"get-intrinsic": "^1.2.6",
"has-tostringtag": "^1.0.2",
"hasown": "^2.0.2"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/fast-content-type-parse": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/fast-content-type-parse/-/fast-content-type-parse-2.0.1.tgz",
"integrity": "sha512-nGqtvLrj5w0naR6tDPfB4cUmYCqouzyQiz6C5y/LtcDllJdrcc6WaWW6iXyIIOErTa/XRybj28aasdn4LkVk6Q==",
"funding": [
{
"type": "github",
"url": "https://github.com/sponsors/fastify"
},
{
"type": "opencollective",
"url": "https://opencollective.com/fastify"
}
],
"license": "MIT"
},
"node_modules/follow-redirects": {
"version": "1.15.11",
"resolved": "https://registry.npmjs.org/follow-redirects/-/follow-redirects-1.15.11.tgz",
"integrity": "sha512-deG2P0JfjrTxl50XGCDyfI97ZGVCxIpfKYmfyrQ54n5FO/0gfIES8C/Psl6kWVDolizcaaxZJnTS0QSMxvnsBQ==",
"funding": [
{
"type": "individual",
"url": "https://github.com/sponsors/RubenVerborgh"
}
],
"license": "MIT",
"engines": {
"node": ">=4.0"
},
"peerDependenciesMeta": {
"debug": {
"optional": true
}
}
},
"node_modules/form-data": {
"version": "4.0.4",
"resolved": "https://registry.npmjs.org/form-data/-/form-data-4.0.4.tgz",
"integrity": "sha512-KrGhL9Q4zjj0kiUt5OO4Mr/A/jlI2jDYs5eHBpYHPcBEVSiipAvn2Ko2HnPe20rmcuuvMHNdZFp+4IlGTMF0Ow==",
"license": "MIT",
"dependencies": {
"asynckit": "^0.4.0",
"combined-stream": "^1.0.8",
"es-set-tostringtag": "^2.1.0",
"hasown": "^2.0.2",
"mime-types": "^2.1.12"
},
"engines": {
"node": ">= 6"
}
},
"node_modules/function-bind": {
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz",
"integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==",
"license": "MIT",
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/get-intrinsic": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.3.0.tgz",
"integrity": "sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ==",
"license": "MIT",
"dependencies": {
"call-bind-apply-helpers": "^1.0.2",
"es-define-property": "^1.0.1",
"es-errors": "^1.3.0",
"es-object-atoms": "^1.1.1",
"function-bind": "^1.1.2",
"get-proto": "^1.0.1",
"gopd": "^1.2.0",
"has-symbols": "^1.1.0",
"hasown": "^2.0.2",
"math-intrinsics": "^1.1.0"
},
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/get-proto": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/get-proto/-/get-proto-1.0.1.tgz",
"integrity": "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==",
"license": "MIT",
"dependencies": {
"dunder-proto": "^1.0.1",
"es-object-atoms": "^1.0.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/gopd": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/gopd/-/gopd-1.2.0.tgz",
"integrity": "sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/has-symbols": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.1.0.tgz",
"integrity": "sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/has-tostringtag": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/has-tostringtag/-/has-tostringtag-1.0.2.tgz",
"integrity": "sha512-NqADB8VjPFLM2V0VvHUewwwsw0ZWBaIdgo+ieHtK3hasLz4qeCRjYcqfB6AQrBggRKppKF8L52/VqdVsO47Dlw==",
"license": "MIT",
"dependencies": {
"has-symbols": "^1.0.3"
},
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/hasown": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz",
"integrity": "sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==",
"license": "MIT",
"dependencies": {
"function-bind": "^1.1.2"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/math-intrinsics": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
"integrity": "sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
}
},
"node_modules/mime-db": {
"version": "1.52.0",
"resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz",
"integrity": "sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg==",
"license": "MIT",
"engines": {
"node": ">= 0.6"
}
},
"node_modules/mime-types": {
"version": "2.1.35",
"resolved": "https://registry.npmjs.org/mime-types/-/mime-types-2.1.35.tgz",
"integrity": "sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw==",
"license": "MIT",
"dependencies": {
"mime-db": "1.52.0"
},
"engines": {
"node": ">= 0.6"
}
},
"node_modules/proxy-from-env": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/proxy-from-env/-/proxy-from-env-1.1.0.tgz",
"integrity": "sha512-D+zkORCbA9f1tdWRK0RaCR3GPv50cMxcrz4X8k5LTSUD1Dkw47mKJEZQNunItRTkWwgtaUSo1RVFRIG9ZXiFYg==",
"license": "MIT"
},
"node_modules/universal-user-agent": {
"version": "7.0.3",
"resolved": "https://registry.npmjs.org/universal-user-agent/-/universal-user-agent-7.0.3.tgz",
"integrity": "sha512-TmnEAEAsBJVZM/AADELsK76llnwcf9vMKuPz8JflO1frO8Lchitr0fNaN9d+Ap0BjKtqWqd/J17qeDnXh8CL2A==",
"license": "ISC"
}
}
}

10
.github/scripts/package.json vendored Normal file
View File

@@ -0,0 +1,10 @@
{
"name": "greptimedb-github-scripts",
"version": "1.0.0",
"type": "module",
"description": "GitHub automation scripts for GreptimeDB",
"dependencies": {
"@octokit/rest": "^21.0.0",
"axios": "^1.7.0"
}
}

152
.github/scripts/pr-review-reminder.js vendored Normal file
View File

@@ -0,0 +1,152 @@
// Daily PR Review Reminder Script
// Fetches open PRs from GreptimeDB repository and sends Slack notifications
// to PR owners and assigned reviewers to keep review process moving.
(async () => {
const { Octokit } = await import("@octokit/rest");
const { default: axios } = await import('axios');
// Configuration
const GITHUB_TOKEN = process.env.GITHUB_TOKEN;
const SLACK_WEBHOOK_URL = process.env.SLACK_PR_REVIEW_WEBHOOK_URL;
const REPO_OWNER = "GreptimeTeam";
const REPO_NAME = "greptimedb";
const GITHUB_TO_SLACK = JSON.parse(process.env.GITHUBID_SLACKID_MAPPING || '{}');
// Debug: Print environment variable status
console.log("=== Environment Variables Debug ===");
console.log(`GITHUB_TOKEN: ${GITHUB_TOKEN ? 'Set ✓' : 'NOT SET ✗'}`);
console.log(`SLACK_PR_REVIEW_WEBHOOK_URL: ${SLACK_WEBHOOK_URL ? 'Set ✓' : 'NOT SET ✗'}`);
console.log(`GITHUBID_SLACKID_MAPPING: ${process.env.GITHUBID_SLACKID_MAPPING ? `Set ✓ (${Object.keys(GITHUB_TO_SLACK).length} mappings)` : 'NOT SET ✗'}`);
console.log("===================================\n");
const octokit = new Octokit({
auth: GITHUB_TOKEN
});
// Fetch all open PRs from the repository
async function fetchOpenPRs() {
try {
const prs = await octokit.pulls.list({
owner: REPO_OWNER,
repo: REPO_NAME,
state: "open",
per_page: 100,
sort: "created",
direction: "asc"
});
return prs.data.filter((pr) => !pr.draft);
} catch (error) {
console.error("Error fetching PRs:", error);
return [];
}
}
// Convert GitHub username to Slack mention or fallback to GitHub username
function toSlackMention(githubUser) {
const slackUserId = GITHUB_TO_SLACK[githubUser];
return slackUserId ? `<@${slackUserId}>` : `@${githubUser}`;
}
// Calculate days since PR was opened
function getDaysOpen(createdAt) {
const created = new Date(createdAt);
const now = new Date();
const diffMs = now - created;
const days = Math.floor(diffMs / (1000 * 60 * 60 * 24));
return days;
}
// Build Slack notification message from PR list
function buildSlackMessage(prs) {
if (prs.length === 0) {
return "*🎉 Great job! No pending PRs for review.*";
}
// Separate PRs by age threshold (14 days)
const criticalPRs = [];
const recentPRs = [];
prs.forEach(pr => {
const daysOpen = getDaysOpen(pr.created_at);
if (daysOpen >= 14) {
criticalPRs.push(pr);
} else {
recentPRs.push(pr);
}
});
const lines = [
`*🔍 Daily PR Review Reminder 🔍*`,
`Found *${criticalPRs.length}* critical PR(s) (14+ days old)\n`
];
// Show critical PRs (14+ days) in detail
if (criticalPRs.length > 0) {
criticalPRs.forEach((pr, index) => {
const owner = toSlackMention(pr.user.login);
const reviewers = pr.requested_reviewers || [];
const reviewerMentions = reviewers.map(r => toSlackMention(r.login)).join(", ");
const daysOpen = getDaysOpen(pr.created_at);
const prInfo = `${index + 1}. <${pr.html_url}|#${pr.number}: ${pr.title}>`;
const ageInfo = ` 🔴 Opened *${daysOpen}* day(s) ago`;
const ownerInfo = ` 👤 Owner: ${owner}`;
const reviewerInfo = reviewers.length > 0
? ` 👁️ Reviewers: ${reviewerMentions}`
: ` 👁️ Reviewers: _Not assigned yet_`;
lines.push(prInfo);
lines.push(ageInfo);
lines.push(ownerInfo);
lines.push(reviewerInfo);
lines.push(""); // Empty line between PRs
});
}
lines.push("_Let's keep the code review process moving! 🚀_");
return lines.join("\n");
}
// Send notification to Slack webhook
async function sendSlackNotification(message) {
if (!SLACK_WEBHOOK_URL) {
console.log("⚠️ SLACK_PR_REVIEW_WEBHOOK_URL not configured. Message preview:");
console.log("=".repeat(60));
console.log(message);
console.log("=".repeat(60));
return;
}
try {
const response = await axios.post(SLACK_WEBHOOK_URL, {
text: message
});
if (response.status !== 200) {
throw new Error(`Slack API returned status ${response.status}`);
}
console.log("Slack notification sent successfully.");
} catch (error) {
console.error("Error sending Slack notification:", error);
throw error;
}
}
// Main execution flow
async function run() {
console.log(`Fetching open PRs from ${REPO_OWNER}/${REPO_NAME}...`);
const prs = await fetchOpenPRs();
console.log(`Found ${prs.length} open PR(s).`);
const message = buildSlackMessage(prs);
console.log("Sending Slack notification...");
await sendSlackNotification(message);
}
run().catch(error => {
console.error("Script execution failed:", error);
process.exit(1);
});
})();
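
To try the reminder script outside of CI, the three environment variables read at the top of the file are all it needs; a minimal local-run sketch with placeholder values (the token, webhook URL, and Slack user ID below are hypothetical, not real credentials):

```bash
# Run from .github/scripts after `npm ci` has installed @octokit/rest and axios.
# Every value below is a placeholder.
export GITHUB_TOKEN="ghp_XXXXXXXXXXXXXXXX"
export SLACK_PR_REVIEW_WEBHOOK_URL="https://hooks.slack.com/services/T000/B000/XXXX"
# JSON object mapping GitHub logins to Slack user IDs; unmapped logins fall back to @<github-login>.
export GITHUBID_SLACKID_MAPPING='{"some-github-login": "U01ABCDEF"}'
node pr-review-reminder.js
```

If SLACK_PR_REVIEW_WEBHOOK_URL is left unset, the script prints the rendered message to stdout instead of posting it, which is a convenient way to check the mapping.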

View File

@@ -632,7 +632,7 @@ jobs:
      - name: Unzip binaries
        run: tar -xvf ./bins.tar.gz
      - name: Run sqlness
-       run: RUST_BACKTRACE=1 ./bins/sqlness-runner ${{ matrix.mode.opts }} -c ./tests/cases --bins-dir ./bins --preserve-state
+       run: RUST_BACKTRACE=1 ./bins/sqlness-runner bare ${{ matrix.mode.opts }} -c ./tests/cases --bins-dir ./bins --preserve-state
      - name: Upload sqlness logs
        if: failure()
        uses: actions/upload-artifact@v4

View File

@@ -0,0 +1,36 @@
name: PR Review Reminder
on:
  schedule:
    # Run at 9:00 AM UTC+8 (01:00 AM UTC) on Monday, Wednesday, Friday
    - cron: '0 1 * * 1,3,5'
  workflow_dispatch:
jobs:
  pr-review-reminder:
    name: Send PR Review Reminders
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
    if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install dependencies
        working-directory: .github/scripts
        run: npm ci
      - name: Run PR review reminder
        working-directory: .github/scripts
        run: node pr-review-reminder.js
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SLACK_PR_REVIEW_WEBHOOK_URL: ${{ vars.SLACK_PR_REVIEW_WEBHOOK_URL }}
          GITHUBID_SLACKID_MAPPING: ${{ vars.GITHUBID_SLACKID_MAPPING }}

974
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -99,12 +99,12 @@ rust.unexpected_cfgs = { level = "warn", check-cfg = ['cfg(tokio_unstable)'] }
# See for more detaiils: https://github.com/rust-lang/cargo/issues/11329
ahash = { version = "0.8", features = ["compile-time-rng"] }
aquamarine = "0.6"
- arrow = { version = "56.0", features = ["prettyprint"] }
- arrow-array = { version = "56.0", default-features = false, features = ["chrono-tz"] }
- arrow-buffer = "56.0"
- arrow-flight = "56.0"
- arrow-ipc = { version = "56.0", default-features = false, features = ["lz4", "zstd"] }
- arrow-schema = { version = "56.0", features = ["serde"] }
+ arrow = { version = "56.2", features = ["prettyprint"] }
+ arrow-array = { version = "56.2", default-features = false, features = ["chrono-tz"] }
+ arrow-buffer = "56.2"
+ arrow-flight = "56.2"
+ arrow-ipc = { version = "56.2", default-features = false, features = ["lz4", "zstd"] }
+ arrow-schema = { version = "56.2", features = ["serde"] }
async-stream = "0.3"
async-trait = "0.1"
# Remember to update axum-extra, axum-macros when updating axum
@@ -121,20 +121,21 @@ chrono = { version = "0.4", features = ["serde"] }
chrono-tz = "0.10.1"
clap = { version = "4.4", features = ["derive"] }
config = "0.13.0"
+ const_format = "0.2"
crossbeam-utils = "0.8"
dashmap = "6.1"
- datafusion = "49"
- datafusion-common = "49"
- datafusion-expr = "49"
- datafusion-functions = "49"
- datafusion-functions-aggregate-common = "49"
- datafusion-optimizer = "49"
- datafusion-orc = { git = "https://github.com/GreptimeTeam/datafusion-orc", rev = "a0a5f902158f153119316eaeec868cff3fc8a99d" }
- datafusion-pg-catalog = { git = "https://github.com/datafusion-contrib/datafusion-postgres", rev = "3d1b7c7d5b82dd49bafc2803259365e633f654fa" }
- datafusion-physical-expr = "49"
- datafusion-physical-plan = "49"
- datafusion-sql = "49"
- datafusion-substrait = "49"
+ datafusion = "50"
+ datafusion-common = "50"
+ datafusion-expr = "50"
+ datafusion-functions = "50"
+ datafusion-functions-aggregate-common = "50"
+ datafusion-optimizer = "50"
+ datafusion-orc = "0.5"
+ datafusion-pg-catalog = "0.12.1"
+ datafusion-physical-expr = "50"
+ datafusion-physical-plan = "50"
+ datafusion-sql = "50"
+ datafusion-substrait = "50"
deadpool = "0.12"
deadpool-postgres = "0.14"
derive_builder = "0.20"
@@ -147,7 +148,7 @@ etcd-client = { git = "https://github.com/GreptimeTeam/etcd-client", rev = "f62d
fst = "0.4.7"
futures = "0.3"
futures-util = "0.3"
- greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "3e821d0d405e6733690a4e4352812ba2ff780a3e" }
+ greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "14b9dc40bdc8288742b0cefc7bb024303b7429ef" }
hex = "0.4"
http = "1"
humantime = "2.1"
@@ -180,7 +181,7 @@ otel-arrow-rust = { git = "https://github.com/GreptimeTeam/otel-arrow", rev = "2
    "server",
] }
parking_lot = "0.12"
- parquet = { version = "56.0", default-features = false, features = ["arrow", "async", "object_store"] }
+ parquet = { version = "56.2", default-features = false, features = ["arrow", "async", "object_store"] }
paste = "1.0"
pin-project = "1.0"
pretty_assertions = "1.4.0"
@@ -191,7 +192,7 @@ prost-types = "0.13"
raft-engine = { version = "0.4.1", default-features = false }
rand = "0.9"
ratelimit = "0.10"
- regex = "1.8"
+ regex = "1.12"
regex-automata = "0.4"
reqwest = { version = "0.12", default-features = false, features = [
    "json",
@@ -207,6 +208,7 @@ rstest_reuse = "0.7"
rust_decimal = "1.33"
rustc-hash = "2.0"
# It is worth noting that we should try to avoid using aws-lc-rs until it can be compiled on various platforms.
+ hostname = "0.4.0"
rustls = { version = "0.23.25", default-features = false }
sea-query = "0.32"
serde = { version = "1.0", features = ["derive"] }
@@ -216,10 +218,7 @@ simd-json = "0.15"
similar-asserts = "1.6.0"
smallvec = { version = "1", features = ["serde"] }
snafu = "0.8"
- sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "39e4fc94c3c741981f77e9d63b5ce8c02e0a27ea", features = [
-     "visitor",
-     "serde",
- ] } # branch = "v0.55.x"
+ sqlparser = { version = "0.58.0", default-features = false, features = ["std", "visitor", "serde"] }
sqlx = { version = "0.8", features = [
    "runtime-tokio-rustls",
    "mysql",
@@ -321,16 +320,20 @@ git = "https://github.com/GreptimeTeam/greptime-meter.git"
rev = "5618e779cf2bb4755b499c630fba4c35e91898cb"
[patch.crates-io]
- datafusion = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-common = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-expr = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-functions = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-functions-aggregate-common = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-optimizer = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-physical-expr = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-physical-plan = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-sql = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
- datafusion-substrait = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "7d5214512740b4dfb742b6b3d91ed9affcc2c9d0" }
+ datafusion = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-common = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-expr = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-functions = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-functions-aggregate-common = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-optimizer = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-physical-expr = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-physical-expr-common = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-physical-plan = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-datasource = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-sql = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ datafusion-substrait = { git = "https://github.com/GreptimeTeam/datafusion.git", rev = "fd4b2abcf3c3e43e94951bda452c9fd35243aab0" }
+ sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "4b519a5caa95472cc3988f5556813a583dd35af1" } # branch = "v0.58.x"
+ bytes = { git = "https://github.com/discord9/bytes", rev = "1572ab22c3cbad0e9b6681d1f68eca4139322a2a" }
[profile.release]
debug = 1

View File

@@ -8,7 +8,7 @@ CARGO_BUILD_OPTS := --locked
IMAGE_REGISTRY ?= docker.io
IMAGE_NAMESPACE ?= greptime
IMAGE_TAG ?= latest
- DEV_BUILDER_IMAGE_TAG ?= 2025-05-19-f55023f3-20250829091211
+ DEV_BUILDER_IMAGE_TAG ?= 2025-10-01-8fe17d43-20251011080129
BUILDX_MULTI_PLATFORM_BUILD ?= false
BUILDX_BUILDER_NAME ?= gtbuilder
BASE_IMAGE ?= ubuntu
@@ -169,7 +169,7 @@ nextest: ## Install nextest tools.
.PHONY: sqlness-test
sqlness-test: ## Run sqlness test.
- 	cargo sqlness ${SQLNESS_OPTS}
+ 	cargo sqlness bare ${SQLNESS_OPTS}
RUNS ?= 1
FUZZ_TARGET ?= fuzz_alter_table

View File

@@ -13,6 +13,7 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `default_timezone` | String | Unset | The default timezone of the server. |
+ | `default_column_prefix` | String | Unset | The default column prefix for auto-created time index and value columns. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum current queries allowed to be executed. Zero means unlimited. |
@@ -25,12 +26,14 @@
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
+ | `http.max_total_body_memory` | String | Unset | Maximum total memory for all concurrent HTTP request bodies.<br/>Set to 0 to disable the limit. Default: "0" (unlimited) |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `http.cors_allowed_origins` | Array | Unset | Customize allowed origins for HTTP CORS. |
| `http.prom_validation_mode` | String | `strict` | Whether to enable validation for Prometheus remote write requests.<br/>Available options:<br/>- strict: deny invalid UTF-8 strings (default).<br/>- lossy: allow invalid UTF-8 strings, replace invalid characters with REPLACEMENT_CHARACTER(U+FFFD).<br/>- unchecked: do not valid strings. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
+ | `grpc.max_total_message_memory` | String | Unset | Maximum total memory for all concurrent gRPC request messages.<br/>Set to 0 to disable the limit. Default: "0" (unlimited) |
| `grpc.max_connection_age` | String | Unset | The maximum connection age for gRPC connection.<br/>The value can be a human-readable time string. For example: `10m` for ten minutes or `1h` for one hour.<br/>Refer to https://grpc.io/docs/guides/keepalive/ for more details. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.mode` | String | `disable` | TLS mode. |
@@ -153,7 +156,7 @@
| `region_engine.mito.max_concurrent_scan_files` | Integer | `384` | Maximum number of SST files to scan concurrently. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.min_compaction_interval` | String | `0m` | Minimum time interval between two compactions.<br/>To align with the old behavior, the default value is 0 (no restrictions). |
- | `region_engine.mito.enable_experimental_flat_format` | Bool | `false` | Whether to enable experimental flat format. |
+ | `region_engine.mito.default_experimental_flat_format` | Bool | `false` | Whether to enable experimental flat format as the default format. |
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
@@ -224,6 +227,7 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `default_timezone` | String | Unset | The default timezone of the server. |
+ | `default_column_prefix` | String | Unset | The default column prefix for auto-created time index and value columns. |
| `max_in_flight_write_bytes` | String | Unset | The maximum in-flight write bytes. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
@@ -235,6 +239,7 @@
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
+ | `http.max_total_body_memory` | String | Unset | Maximum total memory for all concurrent HTTP request bodies.<br/>Set to 0 to disable the limit. Default: "0" (unlimited) |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `http.cors_allowed_origins` | Array | Unset | Customize allowed origins for HTTP CORS. |
| `http.prom_validation_mode` | String | `strict` | Whether to enable validation for Prometheus remote write requests.<br/>Available options:<br/>- strict: deny invalid UTF-8 strings (default).<br/>- lossy: allow invalid UTF-8 strings, replace invalid characters with REPLACEMENT_CHARACTER(U+FFFD).<br/>- unchecked: do not valid strings. |
@@ -242,6 +247,7 @@
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:4001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
+ | `grpc.max_total_message_memory` | String | Unset | Maximum total memory for all concurrent gRPC request messages.<br/>Set to 0 to disable the limit. Default: "0" (unlimited) |
| `grpc.flight_compression` | String | `arrow_ipc` | Compression mode for frontend side Arrow IPC service. Available options:<br/>- `none`: disable all compression<br/>- `transport`: only enable gRPC transport compression (zstd)<br/>- `arrow_ipc`: only enable Arrow IPC compression (lz4)<br/>- `all`: enable all compression.<br/>Default to `none` |
| `grpc.max_connection_age` | String | Unset | The maximum connection age for gRPC connection.<br/>The value can be a human-readable time string. For example: `10m` for ten minutes or `1h` for one hour.<br/>Refer to https://grpc.io/docs/guides/keepalive/ for more details. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
@@ -436,6 +442,7 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `node_id` | Integer | Unset | The datanode identifier and should be unique in the cluster. |
+ | `default_column_prefix` | String | Unset | The default column prefix for auto-created time index and value columns. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
@@ -474,7 +481,7 @@
| `meta_client.metadata_cache_ttl` | String | `10m` | TTL of the metadata cache. |
| `meta_client.metadata_cache_tti` | String | `5m` | -- |
| `wal` | -- | -- | The WAL options. |
- | `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
+ | `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka.<br/>- `noop`: it's a no-op WAL provider that does not store any WAL data.<br/>**Notes: any unflushed data will be lost when the datanode is shutdown.** |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
@@ -547,7 +554,7 @@
| `region_engine.mito.max_concurrent_scan_files` | Integer | `384` | Maximum number of SST files to scan concurrently. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.min_compaction_interval` | String | `0m` | Minimum time interval between two compactions.<br/>To align with the old behavior, the default value is 0 (no restrictions). |
- | `region_engine.mito.enable_experimental_flat_format` | Bool | `false` | Whether to enable experimental flat format. |
+ | `region_engine.mito.default_experimental_flat_format` | Bool | `false` | Whether to enable experimental flat format as the default format. |
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |

View File

@@ -2,6 +2,10 @@
## @toml2docs:none-default
node_id = 42
+ ## The default column prefix for auto-created time index and value columns.
+ ## @toml2docs:none-default
+ default_column_prefix = "greptime"
## Start services after regions have obtained leases.
## It will block the datanode start if it can't receive leases in the heartbeat from metasrv.
require_lease_before_startup = false
@@ -118,6 +122,7 @@ metadata_cache_tti = "5m"
## The provider of the WAL.
## - `raft_engine`: the wal is stored in the local file system by raft-engine.
## - `kafka`: it's remote wal that data is stored in Kafka.
+ ## - `noop`: it's a no-op WAL provider that does not store any WAL data.<br/>**Notes: any unflushed data will be lost when the datanode is shutdown.**
provider = "raft_engine"
## The directory to store the WAL files.
@@ -500,8 +505,8 @@ allow_stale_entries = false
## To align with the old behavior, the default value is 0 (no restrictions).
min_compaction_interval = "0m"
- ## Whether to enable experimental flat format.
- enable_experimental_flat_format = false
+ ## Whether to enable experimental flat format as the default format.
+ default_experimental_flat_format = false
## The options for index in Mito engine.
[region_engine.mito.index]

View File

@@ -2,6 +2,10 @@
## @toml2docs:none-default
default_timezone = "UTC"
+ ## The default column prefix for auto-created time index and value columns.
+ ## @toml2docs:none-default
+ default_column_prefix = "greptime"
## The maximum in-flight write bytes.
## @toml2docs:none-default
#+ max_in_flight_write_bytes = "500MB"
@@ -31,6 +35,10 @@ timeout = "0s"
## The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.
## Set to 0 to disable limit.
body_limit = "64MB"
+ ## Maximum total memory for all concurrent HTTP request bodies.
+ ## Set to 0 to disable the limit. Default: "0" (unlimited)
+ ## @toml2docs:none-default
+ #+ max_total_body_memory = "1GB"
## HTTP CORS support, it's turned on by default
## This allows browser to access http APIs without CORS restrictions
enable_cors = true
@@ -54,6 +62,10 @@ bind_addr = "127.0.0.1:4001"
server_addr = "127.0.0.1:4001"
## The number of server worker threads.
runtime_size = 8
+ ## Maximum total memory for all concurrent gRPC request messages.
+ ## Set to 0 to disable the limit. Default: "0" (unlimited)
+ ## @toml2docs:none-default
+ #+ max_total_message_memory = "1GB"
## Compression mode for frontend side Arrow IPC service. Available options:
## - `none`: disable all compression
## - `transport`: only enable gRPC transport compression (zstd)

View File

@@ -2,6 +2,10 @@
## @toml2docs:none-default
default_timezone = "UTC"
+ ## The default column prefix for auto-created time index and value columns.
+ ## @toml2docs:none-default
+ default_column_prefix = "greptime"
## Initialize all regions in the background during the startup.
## By default, it provides services after all regions have been initialized.
init_regions_in_background = false
@@ -36,6 +40,10 @@ timeout = "0s"
## The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.
## Set to 0 to disable limit.
body_limit = "64MB"
+ ## Maximum total memory for all concurrent HTTP request bodies.
+ ## Set to 0 to disable the limit. Default: "0" (unlimited)
+ ## @toml2docs:none-default
+ #+ max_total_body_memory = "1GB"
## HTTP CORS support, it's turned on by default
## This allows browser to access http APIs without CORS restrictions
enable_cors = true
@@ -56,6 +64,10 @@ prom_validation_mode = "strict"
bind_addr = "127.0.0.1:4001"
## The number of server worker threads.
runtime_size = 8
+ ## Maximum total memory for all concurrent gRPC request messages.
+ ## Set to 0 to disable the limit. Default: "0" (unlimited)
+ ## @toml2docs:none-default
+ #+ max_total_message_memory = "1GB"
## The maximum connection age for gRPC connection.
## The value can be a human-readable time string. For example: `10m` for ten minutes or `1h` for one hour.
## Refer to https://grpc.io/docs/guides/keepalive/ for more details.
@@ -584,8 +596,8 @@ allow_stale_entries = false
## To align with the old behavior, the default value is 0 (no restrictions).
min_compaction_interval = "0m"
- ## Whether to enable experimental flat format.
- enable_experimental_flat_format = false
+ ## Whether to enable experimental flat format as the default format.
+ default_experimental_flat_format = false
## The options for index in Mito engine.
[region_engine.mito.index]

View File

@@ -71,6 +71,15 @@ curl -X POST localhost:4000/debug/prof/mem/activate
# Deactivate heap profiling
curl -X POST localhost:4000/debug/prof/mem/deactivate
+ # Activate gdump feature that dumps memory profiling data every time virtual memory usage exceeds previous maximum value.
+ curl -X POST localhost:4000/debug/prof/mem/gdump -d 'activate=true'
+ # Deactivate gdump.
+ curl -X POST localhost:4000/debug/prof/mem/gdump -d 'activate=false'
+ # Retrieve current gdump status.
+ curl -X GET localhost:4000/debug/prof/mem/gdump
```
### Dump memory profiling data
@@ -83,6 +92,9 @@ curl -X POST localhost:4000/debug/prof/mem > greptime.hprof
curl -X POST "localhost:4000/debug/prof/mem?output=flamegraph" > greptime.svg
# or output pprof format
curl -X POST "localhost:4000/debug/prof/mem?output=proto" > greptime.pprof
+ curl -X POST "localhost:4000/debug/prof/bytes" > greptime.svg
```
You can periodically dump profiling data and compare them to find the delta memory usage.
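
One way to follow that advice is to capture two dumps some time apart and diff them; a minimal sketch, assuming the standalone `pprof` tool is available for the comparison step:

```bash
# Take a baseline heap dump, let the workload run for a while, then dump again.
curl -X POST "localhost:4000/debug/prof/mem?output=proto" > baseline.pprof
sleep 600   # or wait for the window you want to inspect
curl -X POST "localhost:4000/debug/prof/mem?output=proto" > current.pprof

# `-base` subtracts the baseline so only memory allocated in between is shown
# (assumes the pprof tool is installed; any pprof-compatible viewer works).
pprof -http=:8080 -base=baseline.pprof current.pprof
```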

18
flake.lock generated
View File

@@ -8,11 +8,11 @@
      "rust-analyzer-src": "rust-analyzer-src"
    },
    "locked": {
-     "lastModified": 1745735608,
-     "narHash": "sha256-L0jzm815XBFfF2wCFmR+M1CF+beIEFj6SxlqVKF59Ec=",
+     "lastModified": 1760078406,
+     "narHash": "sha256-JeJK0ZA845PtkCHkfo4KjeI1mYrsr2s3cxBYKhF4BoE=",
      "owner": "nix-community",
      "repo": "fenix",
-     "rev": "c39a78eba6ed2a022cc3218db90d485077101496",
+     "rev": "351277c60d104944122ee389cdf581c5ce2c6732",
      "type": "github"
    },
    "original": {
@@ -41,11 +41,11 @@
    },
    "nixpkgs": {
      "locked": {
-       "lastModified": 1748162331,
-       "narHash": "sha256-rqc2RKYTxP3tbjA+PB3VMRQNnjesrT0pEofXQTrMsS8=",
+       "lastModified": 1759994382,
+       "narHash": "sha256-wSK+3UkalDZRVHGCRikZ//CyZUJWDJkBDTQX1+G77Ow=",
        "owner": "NixOS",
        "repo": "nixpkgs",
-       "rev": "7c43f080a7f28b2774f3b3f43234ca11661bf334",
+       "rev": "5da4a26309e796daa7ffca72df93dbe53b8164c7",
        "type": "github"
      },
      "original": {
@@ -65,11 +65,11 @@
    "rust-analyzer-src": {
      "flake": false,
      "locked": {
-       "lastModified": 1745694049,
-       "narHash": "sha256-fxvRYH/tS7hGQeg9zCVh5RBcSWT+JGJet7RA8Ss+rC0=",
+       "lastModified": 1760014945,
+       "narHash": "sha256-ySdl7F9+oeWNHVrg3QL/brazqmJvYFEdpGnF3pyoDH8=",
        "owner": "rust-lang",
        "repo": "rust-analyzer",
-       "rev": "d8887c0758bbd2d5f752d5bd405d4491e90e7ed6",
+       "rev": "90d2e1ce4dfe7dc49250a8b88a0f08ffdb9cb23f",
        "type": "github"
      },
      "original": {

View File

@@ -19,7 +19,7 @@
      lib = nixpkgs.lib;
      rustToolchain = fenix.packages.${system}.fromToolchainName {
        name = (lib.importTOML ./rust-toolchain.toml).toolchain.channel;
-       sha256 = "sha256-tJJr8oqX3YD+ohhPK7jlt/7kvKBnBqJVjYtoFr520d4=";
+       sha256 = "sha256-GCGEXGZeJySLND0KU5TdtTrqFV76TF3UdvAHSUegSsk=";
      };
    in
    {

View File

@@ -1,2 +1,2 @@
[toolchain]
- channel = "nightly-2025-05-19"
+ channel = "nightly-2025-10-01"

File diff suppressed because it is too large Load Diff

View File

@@ -12,8 +12,6 @@
// See the License for the specific language governing permissions and
// limitations under the License.
- #![feature(let_chains)]
pub mod error;
pub mod helper;

View File

@@ -16,8 +16,8 @@ use std::collections::HashMap;
use datatypes::schema::{
    COMMENT_KEY, ColumnDefaultConstraint, ColumnSchema, FULLTEXT_KEY, FulltextAnalyzer,
-   FulltextBackend, FulltextOptions, INVERTED_INDEX_KEY, SKIPPING_INDEX_KEY, SkippingIndexOptions,
-   SkippingIndexType,
+   FulltextBackend, FulltextOptions, INVERTED_INDEX_KEY, JSON_STRUCTURE_SETTINGS_KEY,
+   SKIPPING_INDEX_KEY, SkippingIndexOptions, SkippingIndexType,
};
use greptime_proto::v1::{
    Analyzer, FulltextBackend as PbFulltextBackend, SkippingIndexType as PbSkippingIndexType,
@@ -37,8 +37,10 @@ const SKIPPING_INDEX_GRPC_KEY: &str = "skipping_index";
/// Tries to construct a `ColumnSchema` from the given `ColumnDef`.
pub fn try_as_column_schema(column_def: &ColumnDef) -> Result<ColumnSchema> {
-   let data_type =
-       ColumnDataTypeWrapper::try_new(column_def.data_type, column_def.datatype_extension)?;
+   let data_type = ColumnDataTypeWrapper::try_new(
+       column_def.data_type,
+       column_def.datatype_extension.clone(),
+   )?;
    let constraint = if column_def.default_constraint.is_empty() {
        None
@@ -66,6 +68,9 @@ pub fn try_as_column_schema(column_def: &ColumnDef) -> Result<ColumnSchema> {
        if let Some(skipping_index) = options.options.get(SKIPPING_INDEX_GRPC_KEY) {
            metadata.insert(SKIPPING_INDEX_KEY.to_string(), skipping_index.to_owned());
        }
+       if let Some(settings) = options.options.get(JSON_STRUCTURE_SETTINGS_KEY) {
+           metadata.insert(JSON_STRUCTURE_SETTINGS_KEY.to_string(), settings.clone());
+       }
    }
    ColumnSchema::new(&column_def.name, data_type.into(), column_def.is_nullable)
@@ -137,6 +142,11 @@ pub fn options_from_column_schema(column_schema: &ColumnSchema) -> Option<Column
            .options
            .insert(SKIPPING_INDEX_GRPC_KEY.to_string(), skipping_index.clone());
    }
+   if let Some(settings) = column_schema.metadata().get(JSON_STRUCTURE_SETTINGS_KEY) {
+       options
+           .options
+           .insert(JSON_STRUCTURE_SETTINGS_KEY.to_string(), settings.clone());
+   }
    (!options.options.is_empty()).then_some(options)
}

View File

@@ -35,7 +35,7 @@ pub fn userinfo_by_name(username: Option<String>) -> UserInfoRef {
     DefaultUserInfo::with_name(username.unwrap_or_else(|| DEFAULT_USERNAME.to_string()))
 }
-pub fn user_provider_from_option(opt: &String) -> Result<UserProviderRef> {
+pub fn user_provider_from_option(opt: &str) -> Result<UserProviderRef> {
     let (name, content) = opt.split_once(':').with_context(|| InvalidConfigSnafu {
         value: opt.to_string(),
         msg: "UserProviderOption must be in format `<option>:<value>`",
@@ -57,7 +57,7 @@ pub fn user_provider_from_option(opt: &String) -> Result<UserProviderRef> {
     }
 }
-pub fn static_user_provider_from_option(opt: &String) -> Result<StaticUserProvider> {
+pub fn static_user_provider_from_option(opt: &str) -> Result<StaticUserProvider> {
     let (name, content) = opt.split_once(':').with_context(|| InvalidConfigSnafu {
         value: opt.to_string(),
         msg: "UserProviderOption must be in format `<option>:<value>`",

View File

@@ -29,6 +29,7 @@ use crate::information_schema::{InformationExtensionRef, InformationSchemaProvid
use crate::kvbackend::KvBackendCatalogManager; use crate::kvbackend::KvBackendCatalogManager;
use crate::kvbackend::manager::{CATALOG_CACHE_MAX_CAPACITY, SystemCatalog}; use crate::kvbackend::manager::{CATALOG_CACHE_MAX_CAPACITY, SystemCatalog};
use crate::process_manager::ProcessManagerRef; use crate::process_manager::ProcessManagerRef;
use crate::system_schema::numbers_table_provider::NumbersTableProvider;
use crate::system_schema::pg_catalog::PGCatalogProvider; use crate::system_schema::pg_catalog::PGCatalogProvider;
pub struct KvBackendCatalogManagerBuilder { pub struct KvBackendCatalogManagerBuilder {
@@ -119,6 +120,7 @@ impl KvBackendCatalogManagerBuilder {
DEFAULT_CATALOG_NAME.to_string(), DEFAULT_CATALOG_NAME.to_string(),
me.clone(), me.clone(),
)), )),
numbers_table_provider: NumbersTableProvider,
backend, backend,
process_manager, process_manager,
#[cfg(feature = "enterprise")] #[cfg(feature = "enterprise")]

View File

@@ -18,8 +18,7 @@ use std::sync::{Arc, Weak};
use async_stream::try_stream; use async_stream::try_stream;
use common_catalog::consts::{ use common_catalog::consts::{
DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, INFORMATION_SCHEMA_NAME, NUMBERS_TABLE_ID, DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, INFORMATION_SCHEMA_NAME, PG_CATALOG_NAME,
PG_CATALOG_NAME,
}; };
use common_error::ext::BoxedError; use common_error::ext::BoxedError;
use common_meta::cache::{ use common_meta::cache::{
@@ -45,7 +44,6 @@ use table::TableRef;
use table::dist_table::DistTable; use table::dist_table::DistTable;
use table::metadata::{TableId, TableInfoRef}; use table::metadata::{TableId, TableInfoRef};
use table::table::PartitionRules; use table::table::PartitionRules;
use table::table::numbers::{NUMBERS_TABLE_NAME, NumbersTable};
use table::table_name::TableName; use table::table_name::TableName;
use tokio::sync::Semaphore; use tokio::sync::Semaphore;
use tokio_stream::wrappers::ReceiverStream; use tokio_stream::wrappers::ReceiverStream;
@@ -61,6 +59,7 @@ use crate::information_schema::{InformationExtensionRef, InformationSchemaProvid
use crate::kvbackend::TableCacheRef; use crate::kvbackend::TableCacheRef;
use crate::process_manager::ProcessManagerRef; use crate::process_manager::ProcessManagerRef;
use crate::system_schema::SystemSchemaProvider; use crate::system_schema::SystemSchemaProvider;
use crate::system_schema::numbers_table_provider::NumbersTableProvider;
use crate::system_schema::pg_catalog::PGCatalogProvider; use crate::system_schema::pg_catalog::PGCatalogProvider;
/// Access all existing catalog, schema and tables. /// Access all existing catalog, schema and tables.
@@ -555,6 +554,7 @@ pub(super) struct SystemCatalog {
// system_schema_provider for default catalog // system_schema_provider for default catalog
pub(super) information_schema_provider: Arc<InformationSchemaProvider>, pub(super) information_schema_provider: Arc<InformationSchemaProvider>,
pub(super) pg_catalog_provider: Arc<PGCatalogProvider>, pub(super) pg_catalog_provider: Arc<PGCatalogProvider>,
pub(super) numbers_table_provider: NumbersTableProvider,
pub(super) backend: KvBackendRef, pub(super) backend: KvBackendRef,
pub(super) process_manager: Option<ProcessManagerRef>, pub(super) process_manager: Option<ProcessManagerRef>,
#[cfg(feature = "enterprise")] #[cfg(feature = "enterprise")]
@@ -584,9 +584,7 @@ impl SystemCatalog {
PG_CATALOG_NAME if channel == Channel::Postgres => { PG_CATALOG_NAME if channel == Channel::Postgres => {
self.pg_catalog_provider.table_names() self.pg_catalog_provider.table_names()
} }
DEFAULT_SCHEMA_NAME => { DEFAULT_SCHEMA_NAME => self.numbers_table_provider.table_names(),
vec![NUMBERS_TABLE_NAME.to_string()]
}
_ => vec![], _ => vec![],
} }
} }
@@ -604,7 +602,7 @@ impl SystemCatalog {
if schema == INFORMATION_SCHEMA_NAME { if schema == INFORMATION_SCHEMA_NAME {
self.information_schema_provider.table(table).is_some() self.information_schema_provider.table(table).is_some()
} else if schema == DEFAULT_SCHEMA_NAME { } else if schema == DEFAULT_SCHEMA_NAME {
table == NUMBERS_TABLE_NAME self.numbers_table_provider.table_exists(table)
} else if schema == PG_CATALOG_NAME && channel == Channel::Postgres { } else if schema == PG_CATALOG_NAME && channel == Channel::Postgres {
self.pg_catalog_provider.table(table).is_some() self.pg_catalog_provider.table(table).is_some()
} else { } else {
@@ -649,8 +647,8 @@ impl SystemCatalog {
}); });
pg_catalog_provider.table(table_name) pg_catalog_provider.table(table_name)
} }
} else if schema == DEFAULT_SCHEMA_NAME && table_name == NUMBERS_TABLE_NAME { } else if schema == DEFAULT_SCHEMA_NAME {
Some(NumbersTable::table(NUMBERS_TABLE_ID)) self.numbers_table_provider.table(table_name)
} else { } else {
None None
} }

View File

@@ -14,7 +14,6 @@
 #![feature(assert_matches)]
 #![feature(try_blocks)]
-#![feature(let_chains)]
 use std::any::Any;
 use std::fmt::{Debug, Formatter};

View File

@@ -392,15 +392,15 @@ impl MemoryCatalogManager {
if !manager.schema_exist_sync(catalog, schema).unwrap() { if !manager.schema_exist_sync(catalog, schema).unwrap() {
manager manager
.register_schema_sync(RegisterSchemaRequest { .register_schema_sync(RegisterSchemaRequest {
catalog: catalog.to_string(), catalog: catalog.clone(),
schema: schema.to_string(), schema: schema.clone(),
}) })
.unwrap(); .unwrap();
} }
let request = RegisterTableRequest { let request = RegisterTableRequest {
catalog: catalog.to_string(), catalog: catalog.clone(),
schema: schema.to_string(), schema: schema.clone(),
table_name: table.table_info().name.clone(), table_name: table.table_info().name.clone(),
table_id: table.table_info().ident.table_id, table_id: table.table_info().ident.table_id,
table, table,

View File

@@ -56,14 +56,21 @@ pub struct ProcessManager {
#[derive(Debug, Clone)] #[derive(Debug, Clone)]
pub enum QueryStatement { pub enum QueryStatement {
Sql(Statement), Sql(Statement),
Promql(EvalStmt), // The optional string is the alias of the PromQL query.
Promql(EvalStmt, Option<String>),
} }
impl Display for QueryStatement { impl Display for QueryStatement {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result { fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
match self { match self {
QueryStatement::Sql(stmt) => write!(f, "{}", stmt), QueryStatement::Sql(stmt) => write!(f, "{}", stmt),
QueryStatement::Promql(eval_stmt) => write!(f, "{}", eval_stmt), QueryStatement::Promql(eval_stmt, alias) => {
if let Some(alias) = alias {
write!(f, "{} AS {}", eval_stmt, alias)
} else {
write!(f, "{}", eval_stmt)
}
}
} }
} }
} }
@@ -338,9 +345,9 @@ impl SlowQueryTimer {
}; };
match &self.stmt { match &self.stmt {
QueryStatement::Promql(stmt) => { QueryStatement::Promql(stmt, _alias) => {
slow_query_event.is_promql = true; slow_query_event.is_promql = true;
slow_query_event.query = stmt.expr.to_string(); slow_query_event.query = self.stmt.to_string();
slow_query_event.promql_step = Some(stmt.interval.as_millis() as u64); slow_query_event.promql_step = Some(stmt.interval.as_millis() as u64);
let start = stmt let start = stmt

View File

@@ -14,6 +14,7 @@
pub mod information_schema; pub mod information_schema;
mod memory_table; mod memory_table;
pub mod numbers_table_provider;
pub mod pg_catalog; pub mod pg_catalog;
pub mod predicate; pub mod predicate;
mod utils; mod utils;

View File

@@ -48,7 +48,7 @@ use datatypes::schema::SchemaRef;
use lazy_static::lazy_static; use lazy_static::lazy_static;
use paste::paste; use paste::paste;
use process_list::InformationSchemaProcessList; use process_list::InformationSchemaProcessList;
use store_api::sst_entry::{ManifestSstEntry, StorageSstEntry}; use store_api::sst_entry::{ManifestSstEntry, PuffinIndexMetaEntry, StorageSstEntry};
use store_api::storage::{ScanRequest, TableId}; use store_api::storage::{ScanRequest, TableId};
use table::TableRef; use table::TableRef;
use table::metadata::TableType; use table::metadata::TableType;
@@ -68,7 +68,7 @@ use crate::system_schema::information_schema::region_peers::InformationSchemaReg
use crate::system_schema::information_schema::runtime_metrics::InformationSchemaMetrics; use crate::system_schema::information_schema::runtime_metrics::InformationSchemaMetrics;
use crate::system_schema::information_schema::schemata::InformationSchemaSchemata; use crate::system_schema::information_schema::schemata::InformationSchemaSchemata;
use crate::system_schema::information_schema::ssts::{ use crate::system_schema::information_schema::ssts::{
InformationSchemaSstsManifest, InformationSchemaSstsStorage, InformationSchemaSstsIndexMeta, InformationSchemaSstsManifest, InformationSchemaSstsStorage,
}; };
use crate::system_schema::information_schema::table_constraints::InformationSchemaTableConstraints; use crate::system_schema::information_schema::table_constraints::InformationSchemaTableConstraints;
use crate::system_schema::information_schema::tables::InformationSchemaTables; use crate::system_schema::information_schema::tables::InformationSchemaTables;
@@ -263,6 +263,9 @@ impl SystemSchemaProviderInner for InformationSchemaProvider {
SSTS_STORAGE => Some(Arc::new(InformationSchemaSstsStorage::new( SSTS_STORAGE => Some(Arc::new(InformationSchemaSstsStorage::new(
self.catalog_manager.clone(), self.catalog_manager.clone(),
)) as _), )) as _),
SSTS_INDEX_META => Some(Arc::new(InformationSchemaSstsIndexMeta::new(
self.catalog_manager.clone(),
)) as _),
_ => None, _ => None,
} }
} }
@@ -342,6 +345,10 @@ impl InformationSchemaProvider {
SSTS_STORAGE.to_string(), SSTS_STORAGE.to_string(),
self.build_table(SSTS_STORAGE).unwrap(), self.build_table(SSTS_STORAGE).unwrap(),
); );
tables.insert(
SSTS_INDEX_META.to_string(),
self.build_table(SSTS_INDEX_META).unwrap(),
);
} }
tables.insert(TABLES.to_string(), self.build_table(TABLES).unwrap()); tables.insert(TABLES.to_string(), self.build_table(TABLES).unwrap());
@@ -362,7 +369,7 @@ impl InformationSchemaProvider {
} }
#[cfg(feature = "enterprise")] #[cfg(feature = "enterprise")]
for name in self.extra_table_factories.keys() { for name in self.extra_table_factories.keys() {
tables.insert(name.to_string(), self.build_table(name).expect(name)); tables.insert(name.clone(), self.build_table(name).expect(name));
} }
// Add memory tables // Add memory tables
for name in MEMORY_TABLES.iter() { for name in MEMORY_TABLES.iter() {
@@ -456,6 +463,8 @@ pub enum DatanodeInspectKind {
SstManifest, SstManifest,
/// List SST entries discovered in storage layer /// List SST entries discovered in storage layer
SstStorage, SstStorage,
/// List index metadata collected from manifest
SstIndexMeta,
} }
impl DatanodeInspectRequest { impl DatanodeInspectRequest {
@@ -464,6 +473,7 @@ impl DatanodeInspectRequest {
match self.kind { match self.kind {
DatanodeInspectKind::SstManifest => ManifestSstEntry::build_plan(self.scan), DatanodeInspectKind::SstManifest => ManifestSstEntry::build_plan(self.scan),
DatanodeInspectKind::SstStorage => StorageSstEntry::build_plan(self.scan), DatanodeInspectKind::SstStorage => StorageSstEntry::build_plan(self.scan),
DatanodeInspectKind::SstIndexMeta => PuffinIndexMetaEntry::build_plan(self.scan),
} }
} }
} }

View File

@@ -33,7 +33,6 @@ use datatypes::timestamp::TimestampMillisecond;
use datatypes::value::Value; use datatypes::value::Value;
use datatypes::vectors::{ use datatypes::vectors::{
Int64VectorBuilder, StringVectorBuilder, TimestampMillisecondVectorBuilder, Int64VectorBuilder, StringVectorBuilder, TimestampMillisecondVectorBuilder,
UInt32VectorBuilder, UInt64VectorBuilder,
}; };
use serde::Serialize; use serde::Serialize;
use snafu::ResultExt; use snafu::ResultExt;
@@ -50,8 +49,11 @@ const PEER_TYPE_METASRV: &str = "METASRV";
const PEER_ID: &str = "peer_id"; const PEER_ID: &str = "peer_id";
const PEER_TYPE: &str = "peer_type"; const PEER_TYPE: &str = "peer_type";
const PEER_ADDR: &str = "peer_addr"; const PEER_ADDR: &str = "peer_addr";
const CPUS: &str = "cpus"; const PEER_HOSTNAME: &str = "peer_hostname";
const MEMORY_BYTES: &str = "memory_bytes"; const TOTAL_CPU_MILLICORES: &str = "total_cpu_millicores";
const TOTAL_MEMORY_BYTES: &str = "total_memory_bytes";
const CPU_USAGE_MILLICORES: &str = "cpu_usage_millicores";
const MEMORY_USAGE_BYTES: &str = "memory_usage_bytes";
const VERSION: &str = "version"; const VERSION: &str = "version";
const GIT_COMMIT: &str = "git_commit"; const GIT_COMMIT: &str = "git_commit";
const START_TIME: &str = "start_time"; const START_TIME: &str = "start_time";
@@ -66,8 +68,11 @@ const INIT_CAPACITY: usize = 42;
/// - `peer_id`: the peer server id. /// - `peer_id`: the peer server id.
/// - `peer_type`: the peer type, such as `datanode`, `frontend`, `metasrv` etc. /// - `peer_type`: the peer type, such as `datanode`, `frontend`, `metasrv` etc.
/// - `peer_addr`: the peer gRPC address. /// - `peer_addr`: the peer gRPC address.
/// - `cpus`: the number of CPUs of the peer. /// - `peer_hostname`: the hostname of the peer.
/// - `memory_bytes`: the memory bytes of the peer. /// - `total_cpu_millicores`: the total CPU millicores of the peer.
/// - `total_memory_bytes`: the total memory bytes of the peer.
/// - `cpu_usage_millicores`: the CPU usage millicores of the peer.
/// - `memory_usage_bytes`: the memory usage bytes of the peer.
/// - `version`: the build package version of the peer. /// - `version`: the build package version of the peer.
/// - `git_commit`: the build git commit hash of the peer. /// - `git_commit`: the build git commit hash of the peer.
/// - `start_time`: the starting time of the peer. /// - `start_time`: the starting time of the peer.
@@ -94,8 +99,27 @@ impl InformationSchemaClusterInfo {
ColumnSchema::new(PEER_ID, ConcreteDataType::int64_datatype(), false), ColumnSchema::new(PEER_ID, ConcreteDataType::int64_datatype(), false),
ColumnSchema::new(PEER_TYPE, ConcreteDataType::string_datatype(), false), ColumnSchema::new(PEER_TYPE, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(PEER_ADDR, ConcreteDataType::string_datatype(), true), ColumnSchema::new(PEER_ADDR, ConcreteDataType::string_datatype(), true),
ColumnSchema::new(CPUS, ConcreteDataType::uint32_datatype(), false), ColumnSchema::new(PEER_HOSTNAME, ConcreteDataType::string_datatype(), true),
ColumnSchema::new(MEMORY_BYTES, ConcreteDataType::uint64_datatype(), false), ColumnSchema::new(
TOTAL_CPU_MILLICORES,
ConcreteDataType::int64_datatype(),
false,
),
ColumnSchema::new(
TOTAL_MEMORY_BYTES,
ConcreteDataType::int64_datatype(),
false,
),
ColumnSchema::new(
CPU_USAGE_MILLICORES,
ConcreteDataType::int64_datatype(),
false,
),
ColumnSchema::new(
MEMORY_USAGE_BYTES,
ConcreteDataType::int64_datatype(),
false,
),
ColumnSchema::new(VERSION, ConcreteDataType::string_datatype(), false), ColumnSchema::new(VERSION, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(GIT_COMMIT, ConcreteDataType::string_datatype(), false), ColumnSchema::new(GIT_COMMIT, ConcreteDataType::string_datatype(), false),
ColumnSchema::new( ColumnSchema::new(
@@ -155,8 +179,11 @@ struct InformationSchemaClusterInfoBuilder {
peer_ids: Int64VectorBuilder, peer_ids: Int64VectorBuilder,
peer_types: StringVectorBuilder, peer_types: StringVectorBuilder,
peer_addrs: StringVectorBuilder, peer_addrs: StringVectorBuilder,
cpus: UInt32VectorBuilder, peer_hostnames: StringVectorBuilder,
memory_bytes: UInt64VectorBuilder, total_cpu_millicores: Int64VectorBuilder,
total_memory_bytes: Int64VectorBuilder,
cpu_usage_millicores: Int64VectorBuilder,
memory_usage_bytes: Int64VectorBuilder,
versions: StringVectorBuilder, versions: StringVectorBuilder,
git_commits: StringVectorBuilder, git_commits: StringVectorBuilder,
start_times: TimestampMillisecondVectorBuilder, start_times: TimestampMillisecondVectorBuilder,
@@ -173,8 +200,11 @@ impl InformationSchemaClusterInfoBuilder {
peer_ids: Int64VectorBuilder::with_capacity(INIT_CAPACITY), peer_ids: Int64VectorBuilder::with_capacity(INIT_CAPACITY),
peer_types: StringVectorBuilder::with_capacity(INIT_CAPACITY), peer_types: StringVectorBuilder::with_capacity(INIT_CAPACITY),
peer_addrs: StringVectorBuilder::with_capacity(INIT_CAPACITY), peer_addrs: StringVectorBuilder::with_capacity(INIT_CAPACITY),
cpus: UInt32VectorBuilder::with_capacity(INIT_CAPACITY), peer_hostnames: StringVectorBuilder::with_capacity(INIT_CAPACITY),
memory_bytes: UInt64VectorBuilder::with_capacity(INIT_CAPACITY), total_cpu_millicores: Int64VectorBuilder::with_capacity(INIT_CAPACITY),
total_memory_bytes: Int64VectorBuilder::with_capacity(INIT_CAPACITY),
cpu_usage_millicores: Int64VectorBuilder::with_capacity(INIT_CAPACITY),
memory_usage_bytes: Int64VectorBuilder::with_capacity(INIT_CAPACITY),
versions: StringVectorBuilder::with_capacity(INIT_CAPACITY), versions: StringVectorBuilder::with_capacity(INIT_CAPACITY),
git_commits: StringVectorBuilder::with_capacity(INIT_CAPACITY), git_commits: StringVectorBuilder::with_capacity(INIT_CAPACITY),
start_times: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY), start_times: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY),
@@ -203,6 +233,7 @@ impl InformationSchemaClusterInfoBuilder {
(PEER_ID, &Value::from(peer_id)), (PEER_ID, &Value::from(peer_id)),
(PEER_TYPE, &Value::from(peer_type)), (PEER_TYPE, &Value::from(peer_type)),
(PEER_ADDR, &Value::from(node_info.peer.addr.as_str())), (PEER_ADDR, &Value::from(node_info.peer.addr.as_str())),
(PEER_HOSTNAME, &Value::from(node_info.hostname.as_str())),
(VERSION, &Value::from(node_info.version.as_str())), (VERSION, &Value::from(node_info.version.as_str())),
(GIT_COMMIT, &Value::from(node_info.git_commit.as_str())), (GIT_COMMIT, &Value::from(node_info.git_commit.as_str())),
]; ];
@@ -214,6 +245,7 @@ impl InformationSchemaClusterInfoBuilder {
self.peer_ids.push(Some(peer_id)); self.peer_ids.push(Some(peer_id));
self.peer_types.push(Some(peer_type)); self.peer_types.push(Some(peer_type));
self.peer_addrs.push(Some(&node_info.peer.addr)); self.peer_addrs.push(Some(&node_info.peer.addr));
self.peer_hostnames.push(Some(&node_info.hostname));
self.versions.push(Some(&node_info.version)); self.versions.push(Some(&node_info.version));
self.git_commits.push(Some(&node_info.git_commit)); self.git_commits.push(Some(&node_info.git_commit));
if node_info.start_time_ms > 0 { if node_info.start_time_ms > 0 {
@@ -228,8 +260,14 @@ impl InformationSchemaClusterInfoBuilder {
self.start_times.push(None); self.start_times.push(None);
self.uptimes.push(None); self.uptimes.push(None);
} }
self.cpus.push(Some(node_info.cpus)); self.total_cpu_millicores
self.memory_bytes.push(Some(node_info.memory_bytes)); .push(Some(node_info.total_cpu_millicores));
self.total_memory_bytes
.push(Some(node_info.total_memory_bytes));
self.cpu_usage_millicores
.push(Some(node_info.cpu_usage_millicores));
self.memory_usage_bytes
.push(Some(node_info.memory_usage_bytes));
if node_info.last_activity_ts > 0 { if node_info.last_activity_ts > 0 {
self.active_times.push(Some( self.active_times.push(Some(
@@ -253,8 +291,11 @@ impl InformationSchemaClusterInfoBuilder {
Arc::new(self.peer_ids.finish()), Arc::new(self.peer_ids.finish()),
Arc::new(self.peer_types.finish()), Arc::new(self.peer_types.finish()),
Arc::new(self.peer_addrs.finish()), Arc::new(self.peer_addrs.finish()),
Arc::new(self.cpus.finish()), Arc::new(self.peer_hostnames.finish()),
Arc::new(self.memory_bytes.finish()), Arc::new(self.total_cpu_millicores.finish()),
Arc::new(self.total_memory_bytes.finish()),
Arc::new(self.cpu_usage_millicores.finish()),
Arc::new(self.memory_usage_bytes.finish()),
Arc::new(self.versions.finish()), Arc::new(self.versions.finish()),
Arc::new(self.git_commits.finish()), Arc::new(self.git_commits.finish()),
Arc::new(self.start_times.finish()), Arc::new(self.start_times.finish()),

View File

@@ -254,9 +254,9 @@ impl InformationSchemaFlowsBuilder {
.await .await
.map_err(BoxedError::new) .map_err(BoxedError::new)
.context(InternalSnafu)? .context(InternalSnafu)?
.context(FlowInfoNotFoundSnafu { .with_context(|| FlowInfoNotFoundSnafu {
catalog_name: catalog_name.to_string(), catalog_name: catalog_name.clone(),
flow_name: flow_name.to_string(), flow_name: flow_name.clone(),
})?; })?;
self.add_flow(&predicates, flow_id.flow_id(), flow_info, &flow_stat) self.add_flow(&predicates, flow_id.flow_id(), flow_info, &flow_stat)
.await?; .await?;
@@ -273,11 +273,11 @@ impl InformationSchemaFlowsBuilder {
flow_stat: &Option<FlowStat>, flow_stat: &Option<FlowStat>,
) -> Result<()> { ) -> Result<()> {
let row = [ let row = [
(FLOW_NAME, &Value::from(flow_info.flow_name().to_string())), (FLOW_NAME, &Value::from(flow_info.flow_name().clone())),
(FLOW_ID, &Value::from(flow_id)), (FLOW_ID, &Value::from(flow_id)),
( (
TABLE_CATALOG, TABLE_CATALOG,
&Value::from(flow_info.catalog_name().to_string()), &Value::from(flow_info.catalog_name().clone()),
), ),
]; ];
if !predicates.eval(&row) { if !predicates.eval(&row) {

View File

@@ -135,7 +135,7 @@ async fn make_process_list(
for process in queries { for process in queries {
let display_id = DisplayProcessId { let display_id = DisplayProcessId {
server_addr: process.frontend.to_string(), server_addr: process.frontend.clone(),
id: process.id, id: process.id,
} }
.to_string(); .to_string();

View File

@@ -199,10 +199,7 @@ impl InformationSchemaRegionPeersBuilder {
if table_info.table_type == TableType::Temporary { if table_info.table_type == TableType::Temporary {
Ok(None) Ok(None)
} else { } else {
Ok(Some(( Ok(Some((table_info.ident.table_id, table_info.name.clone())))
table_info.ident.table_id,
table_info.name.to_string(),
)))
} }
}); });

View File

@@ -15,20 +15,22 @@
use std::sync::{Arc, Weak}; use std::sync::{Arc, Weak};
use common_catalog::consts::{ use common_catalog::consts::{
INFORMATION_SCHEMA_SSTS_MANIFEST_TABLE_ID, INFORMATION_SCHEMA_SSTS_STORAGE_TABLE_ID, INFORMATION_SCHEMA_SSTS_INDEX_META_TABLE_ID, INFORMATION_SCHEMA_SSTS_MANIFEST_TABLE_ID,
INFORMATION_SCHEMA_SSTS_STORAGE_TABLE_ID,
}; };
use common_error::ext::BoxedError; use common_error::ext::BoxedError;
use common_recordbatch::SendableRecordBatchStream; use common_recordbatch::SendableRecordBatchStream;
use common_recordbatch::adapter::AsyncRecordBatchStreamAdapter; use common_recordbatch::adapter::AsyncRecordBatchStreamAdapter;
use datatypes::schema::SchemaRef; use datatypes::schema::SchemaRef;
use snafu::ResultExt; use snafu::ResultExt;
use store_api::sst_entry::{ManifestSstEntry, StorageSstEntry}; use store_api::sst_entry::{ManifestSstEntry, PuffinIndexMetaEntry, StorageSstEntry};
use store_api::storage::{ScanRequest, TableId}; use store_api::storage::{ScanRequest, TableId};
use crate::CatalogManager; use crate::CatalogManager;
use crate::error::{ProjectSchemaSnafu, Result}; use crate::error::{ProjectSchemaSnafu, Result};
use crate::information_schema::{ use crate::information_schema::{
DatanodeInspectKind, DatanodeInspectRequest, InformationTable, SSTS_MANIFEST, SSTS_STORAGE, DatanodeInspectKind, DatanodeInspectRequest, InformationTable, SSTS_INDEX_META, SSTS_MANIFEST,
SSTS_STORAGE,
}; };
use crate::system_schema::utils; use crate::system_schema::utils;
@@ -140,3 +142,58 @@ impl InformationTable for InformationSchemaSstsStorage {
))) )))
} }
} }
/// Information schema table for index metadata.
pub struct InformationSchemaSstsIndexMeta {
schema: SchemaRef,
catalog_manager: Weak<dyn CatalogManager>,
}
impl InformationSchemaSstsIndexMeta {
pub(super) fn new(catalog_manager: Weak<dyn CatalogManager>) -> Self {
Self {
schema: PuffinIndexMetaEntry::schema(),
catalog_manager,
}
}
}
impl InformationTable for InformationSchemaSstsIndexMeta {
fn table_id(&self) -> TableId {
INFORMATION_SCHEMA_SSTS_INDEX_META_TABLE_ID
}
fn table_name(&self) -> &'static str {
SSTS_INDEX_META
}
fn schema(&self) -> SchemaRef {
self.schema.clone()
}
fn to_stream(&self, request: ScanRequest) -> Result<SendableRecordBatchStream> {
let schema = if let Some(p) = &request.projection {
Arc::new(self.schema.try_project(p).context(ProjectSchemaSnafu)?)
} else {
self.schema.clone()
};
let info_ext = utils::information_extension(&self.catalog_manager)?;
let req = DatanodeInspectRequest {
kind: DatanodeInspectKind::SstIndexMeta,
scan: request,
};
let future = async move {
info_ext
.inspect_datanode(req)
.await
.map_err(BoxedError::new)
.context(common_recordbatch::error::ExternalSnafu)
};
Ok(Box::pin(AsyncRecordBatchStreamAdapter::new(
schema,
Box::pin(future),
)))
}
}

View File

@@ -50,3 +50,4 @@ pub const REGION_STATISTICS: &str = "region_statistics";
 pub const PROCESS_LIST: &str = "process_list";
 pub const SSTS_MANIFEST: &str = "ssts_manifest";
 pub const SSTS_STORAGE: &str = "ssts_storage";
+pub const SSTS_INDEX_META: &str = "ssts_index_meta";

View File

@@ -371,7 +371,8 @@ impl InformationSchemaTablesBuilder {
self.auto_increment.push(Some(0)); self.auto_increment.push(Some(0));
self.row_format.push(Some("Fixed")); self.row_format.push(Some("Fixed"));
self.table_collation.push(Some("utf8_bin")); self.table_collation.push(Some("utf8_bin"));
self.update_time.push(None); self.update_time
.push(Some(table_info.meta.updated_on.timestamp().into()));
self.check_time.push(None); self.check_time.push(None);
// use mariadb default table version number here // use mariadb default table version number here
self.version.push(Some(11)); self.version.push(Some(11));

View File

@@ -0,0 +1,59 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#[cfg(any(test, feature = "testing", debug_assertions))]
use common_catalog::consts::NUMBERS_TABLE_ID;
use table::TableRef;
#[cfg(any(test, feature = "testing", debug_assertions))]
use table::table::numbers::NUMBERS_TABLE_NAME;
#[cfg(any(test, feature = "testing", debug_assertions))]
use table::table::numbers::NumbersTable;
// NumbersTableProvider is a dedicated provider for feature-gating the numbers table.
#[derive(Clone)]
pub struct NumbersTableProvider;
#[cfg(any(test, feature = "testing", debug_assertions))]
impl NumbersTableProvider {
pub(crate) fn table_exists(&self, name: &str) -> bool {
name == NUMBERS_TABLE_NAME
}
pub(crate) fn table_names(&self) -> Vec<String> {
vec![NUMBERS_TABLE_NAME.to_string()]
}
pub(crate) fn table(&self, name: &str) -> Option<TableRef> {
if name == NUMBERS_TABLE_NAME {
Some(NumbersTable::table(NUMBERS_TABLE_ID))
} else {
None
}
}
}
#[cfg(not(any(test, feature = "testing", debug_assertions)))]
impl NumbersTableProvider {
pub(crate) fn table_exists(&self, _name: &str) -> bool {
false
}
pub(crate) fn table_names(&self) -> Vec<String> {
vec![]
}
pub(crate) fn table(&self, _name: &str) -> Option<TableRef> {
None
}
}

View File

@@ -27,6 +27,7 @@ use datafusion::error::DataFusionError;
use datafusion::execution::TaskContext; use datafusion::execution::TaskContext;
use datafusion::physical_plan::stream::RecordBatchStreamAdapter as DfRecordBatchStreamAdapter; use datafusion::physical_plan::stream::RecordBatchStreamAdapter as DfRecordBatchStreamAdapter;
use datafusion_pg_catalog::pg_catalog::catalog_info::CatalogInfo; use datafusion_pg_catalog::pg_catalog::catalog_info::CatalogInfo;
use datafusion_pg_catalog::pg_catalog::context::EmptyContextProvider;
use datafusion_pg_catalog::pg_catalog::{ use datafusion_pg_catalog::pg_catalog::{
PG_CATALOG_TABLES, PgCatalogSchemaProvider, PgCatalogStaticTables, PgCatalogTable, PG_CATALOG_TABLES, PgCatalogSchemaProvider, PgCatalogStaticTables, PgCatalogTable,
}; };
@@ -44,7 +45,7 @@ use crate::system_schema::{
/// [`PGCatalogProvider`] is the provider for a schema named `pg_catalog`, it is not a catalog. /// [`PGCatalogProvider`] is the provider for a schema named `pg_catalog`, it is not a catalog.
pub struct PGCatalogProvider { pub struct PGCatalogProvider {
catalog_name: String, catalog_name: String,
inner: PgCatalogSchemaProvider<CatalogManagerWrapper>, inner: PgCatalogSchemaProvider<CatalogManagerWrapper, EmptyContextProvider>,
tables: HashMap<String, TableRef>, tables: HashMap<String, TableRef>,
table_ids: HashMap<&'static str, u32>, table_ids: HashMap<&'static str, u32>,
} }
@@ -69,6 +70,7 @@ impl PGCatalogProvider {
catalog_manager, catalog_manager,
}, },
Arc::new(static_tables), Arc::new(static_tables),
EmptyContextProvider,
) )
.expect("Failed to initialize PgCatalogSchemaProvider"); .expect("Failed to initialize PgCatalogSchemaProvider");
@@ -166,7 +168,7 @@ impl CatalogInfo for CatalogManagerWrapper {
.await .await
.map_err(|e| DataFusionError::External(Box::new(e))) .map_err(|e| DataFusionError::External(Box::new(e)))
} else { } else {
Ok(vec![self.catalog_name.to_string()]) Ok(vec![self.catalog_name.clone()])
} }
} }

View File

@@ -201,7 +201,7 @@ impl DfTableSourceProvider {
Ok(Arc::new(ViewTable::new( Ok(Arc::new(ViewTable::new(
logical_plan, logical_plan,
Some(view_info.definition.to_string()), Some(view_info.definition.clone()),
))) )))
} }
} }

View File

@@ -61,7 +61,6 @@ servers.workspace = true
 session.workspace = true
 snafu.workspace = true
 store-api.workspace = true
-substrait.workspace = true
 table.workspace = true
 tokio.workspace = true
 tracing-appender.workspace = true

View File

@@ -157,6 +157,7 @@ fn create_table_info(table_id: TableId, table_name: TableName) -> RawTableInfo {
     schema: RawSchema::new(column_schemas),
     engine: "mito".to_string(),
     created_on: chrono::DateTime::default(),
+    updated_on: chrono::DateTime::default(),
     primary_key_indices: vec![],
     next_column_id: columns as u32 + 1,
     value_indices: vec![],

View File

@@ -16,6 +16,7 @@ mod export;
mod import; mod import;
use clap::Subcommand; use clap::Subcommand;
use client::DEFAULT_CATALOG_NAME;
use common_error::ext::BoxedError; use common_error::ext::BoxedError;
use crate::Tool; use crate::Tool;
@@ -37,3 +38,7 @@ impl DataCommand {
} }
} }
} }
pub(crate) fn default_database() -> String {
format!("{DEFAULT_CATALOG_NAME}-*")
}

View File

@@ -30,6 +30,7 @@ use snafu::{OptionExt, ResultExt};
use tokio::sync::Semaphore; use tokio::sync::Semaphore;
use tokio::time::Instant; use tokio::time::Instant;
use crate::data::default_database;
use crate::database::{DatabaseClient, parse_proxy_opts}; use crate::database::{DatabaseClient, parse_proxy_opts};
use crate::error::{ use crate::error::{
EmptyResultSnafu, Error, OpenDalSnafu, OutputDirNotSetSnafu, Result, S3ConfigNotSetSnafu, EmptyResultSnafu, Error, OpenDalSnafu, OutputDirNotSetSnafu, Result, S3ConfigNotSetSnafu,
@@ -63,7 +64,7 @@ pub struct ExportCommand {
output_dir: Option<String>, output_dir: Option<String>,
/// The name of the catalog to export. /// The name of the catalog to export.
#[clap(long, default_value = "greptime-*")] #[clap(long, default_value_t = default_database())]
database: String, database: String,
/// Parallelism of the export. /// Parallelism of the export.

View File

@@ -25,6 +25,7 @@ use snafu::{OptionExt, ResultExt};
use tokio::sync::Semaphore; use tokio::sync::Semaphore;
use tokio::time::Instant; use tokio::time::Instant;
use crate::data::default_database;
use crate::database::{DatabaseClient, parse_proxy_opts}; use crate::database::{DatabaseClient, parse_proxy_opts};
use crate::error::{Error, FileIoSnafu, Result, SchemaNotFoundSnafu}; use crate::error::{Error, FileIoSnafu, Result, SchemaNotFoundSnafu};
use crate::{Tool, database}; use crate::{Tool, database};
@@ -52,7 +53,7 @@ pub struct ImportCommand {
input_dir: String, input_dir: String,
/// The name of the catalog to import. /// The name of the catalog to import.
#[clap(long, default_value = "greptime-*")] #[clap(long, default_value_t = default_database())]
database: String, database: String,
/// Parallelism of the import. /// Parallelism of the import.

View File

@@ -41,7 +41,7 @@ impl DelKeyCommand {
pub async fn build(&self) -> Result<Box<dyn Tool>, BoxedError> { pub async fn build(&self) -> Result<Box<dyn Tool>, BoxedError> {
let kv_backend = self.store.build().await?; let kv_backend = self.store.build().await?;
Ok(Box::new(DelKeyTool { Ok(Box::new(DelKeyTool {
key: self.key.to_string(), key: self.key.clone(),
prefix: self.prefix, prefix: self.prefix,
key_deleter: KeyDeleter::new(kv_backend), key_deleter: KeyDeleter::new(kv_backend),
})) }))

View File

@@ -138,13 +138,7 @@ impl RepairTool {
let table_names = table_names let table_names = table_names
.iter() .iter()
.map(|table_name| { .map(|table_name| (catalog.clone(), schema_name.clone(), table_name.clone()))
(
catalog.to_string(),
schema_name.to_string(),
table_name.to_string(),
)
})
.collect::<Vec<_>>(); .collect::<Vec<_>>();
return Ok(IteratorInput::new_table_names(table_names)); return Ok(IteratorInput::new_table_names(table_names));
} else if !self.table_ids.is_empty() { } else if !self.table_ids.is_empty() {

View File

@@ -32,9 +32,9 @@ pub fn generate_alter_table_expr_for_all_columns(
let schema = &table_info.meta.schema; let schema = &table_info.meta.schema;
let mut alter_table_expr = AlterTableExpr { let mut alter_table_expr = AlterTableExpr {
catalog_name: table_info.catalog_name.to_string(), catalog_name: table_info.catalog_name.clone(),
schema_name: table_info.schema_name.to_string(), schema_name: table_info.schema_name.clone(),
table_name: table_info.name.to_string(), table_name: table_info.name.clone(),
..Default::default() ..Default::default()
}; };

View File

@@ -44,9 +44,9 @@ pub fn generate_create_table_expr(table_info: &RawTableInfo) -> Result<CreateTab
let table_options = HashMap::from(&table_info.meta.options); let table_options = HashMap::from(&table_info.meta.options);
Ok(CreateTableExpr { Ok(CreateTableExpr {
catalog_name: table_info.catalog_name.to_string(), catalog_name: table_info.catalog_name.clone(),
schema_name: table_info.schema_name.to_string(), schema_name: table_info.schema_name.clone(),
table_name: table_info.name.to_string(), table_name: table_info.name.clone(),
desc: String::default(), desc: String::default(),
column_defs, column_defs,
time_index, time_index,
@@ -54,7 +54,7 @@ pub fn generate_create_table_expr(table_info: &RawTableInfo) -> Result<CreateTab
create_if_not_exists: true, create_if_not_exists: true,
table_options, table_options,
table_id: None, table_id: None,
engine: table_info.meta.engine.to_string(), engine: table_info.meta.engine.clone(),
}) })
} }

View File

@@ -18,7 +18,7 @@ use common_error::define_from_tonic_status;
 use common_error::ext::{BoxedError, ErrorExt};
 use common_error::status_code::StatusCode;
 use common_macro::stack_trace_debug;
-use snafu::{Location, Snafu, location};
+use snafu::{Location, Snafu};
 use tonic::Code;
 use tonic::metadata::errors::InvalidMetadataValue;

View File

@@ -29,9 +29,11 @@ base64.workspace = true
cache.workspace = true cache.workspace = true
catalog.workspace = true catalog.workspace = true
chrono.workspace = true chrono.workspace = true
either = "1.15"
clap.workspace = true clap.workspace = true
cli.workspace = true cli.workspace = true
client.workspace = true client.workspace = true
colored = "2.1.0"
common-base.workspace = true common-base.workspace = true
common-catalog.workspace = true common-catalog.workspace = true
common-config.workspace = true common-config.workspace = true
@@ -63,9 +65,11 @@ lazy_static.workspace = true
meta-client.workspace = true meta-client.workspace = true
meta-srv.workspace = true meta-srv.workspace = true
metric-engine.workspace = true metric-engine.workspace = true
mito2.workspace = true
moka.workspace = true moka.workspace = true
nu-ansi-term = "0.46" nu-ansi-term = "0.46"
object-store.workspace = true object-store.workspace = true
parquet = { workspace = true, features = ["object_store"] }
plugins.workspace = true plugins.workspace = true
prometheus.workspace = true prometheus.workspace = true
prost.workspace = true prost.workspace = true
@@ -82,13 +86,17 @@ similar-asserts.workspace = true
snafu.workspace = true snafu.workspace = true
common-stat.workspace = true common-stat.workspace = true
store-api.workspace = true store-api.workspace = true
substrait.workspace = true
table.workspace = true table.workspace = true
tokio.workspace = true tokio.workspace = true
toml.workspace = true toml.workspace = true
tonic.workspace = true tonic.workspace = true
tracing-appender.workspace = true tracing-appender.workspace = true
[target.'cfg(unix)'.dependencies]
pprof = { version = "0.14", features = [
"flamegraph",
] }
[target.'cfg(not(windows))'.dependencies] [target.'cfg(not(windows))'.dependencies]
tikv-jemallocator = "0.6" tikv-jemallocator = "0.6"

View File

@@ -103,12 +103,15 @@ async fn main_body() -> Result<()> {
async fn start(cli: Command) -> Result<()> { async fn start(cli: Command) -> Result<()> {
match cli.subcmd { match cli.subcmd {
SubCommand::Datanode(cmd) => { SubCommand::Datanode(cmd) => match cmd.subcmd {
let opts = cmd.load_options(&cli.global_options)?; datanode::SubCommand::Start(ref start) => {
let plugins = Plugins::new(); let opts = start.load_options(&cli.global_options)?;
let builder = InstanceBuilder::try_new_with_init(opts, plugins).await?; let plugins = Plugins::new();
cmd.build_with(builder).await?.run().await let builder = InstanceBuilder::try_new_with_init(opts, plugins).await?;
} cmd.build_with(builder).await?.run().await
}
datanode::SubCommand::Objbench(ref bench) => bench.run().await,
},
SubCommand::Flownode(cmd) => { SubCommand::Flownode(cmd) => {
cmd.build(cmd.load_options(&cli.global_options)?) cmd.build(cmd.load_options(&cli.global_options)?)
.await? .await?

View File

@@ -13,6 +13,8 @@
// limitations under the License. // limitations under the License.
pub mod builder; pub mod builder;
#[allow(clippy::print_stdout)]
mod objbench;
use std::path::Path; use std::path::Path;
use std::time::Duration; use std::time::Duration;
@@ -23,13 +25,16 @@ use common_config::Configurable;
use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions}; use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions};
use common_telemetry::{info, warn}; use common_telemetry::{info, warn};
use common_wal::config::DatanodeWalConfig; use common_wal::config::DatanodeWalConfig;
use datanode::config::RegionEngineConfig;
use datanode::datanode::Datanode; use datanode::datanode::Datanode;
use meta_client::MetaClientOptions; use meta_client::MetaClientOptions;
use serde::{Deserialize, Serialize};
use snafu::{ResultExt, ensure}; use snafu::{ResultExt, ensure};
use tracing_appender::non_blocking::WorkerGuard; use tracing_appender::non_blocking::WorkerGuard;
use crate::App; use crate::App;
use crate::datanode::builder::InstanceBuilder; use crate::datanode::builder::InstanceBuilder;
use crate::datanode::objbench::ObjbenchCommand;
use crate::error::{ use crate::error::{
LoadLayeredConfigSnafu, MissingConfigSnafu, Result, ShutdownDatanodeSnafu, StartDatanodeSnafu, LoadLayeredConfigSnafu, MissingConfigSnafu, Result, ShutdownDatanodeSnafu, StartDatanodeSnafu,
}; };
@@ -89,7 +94,7 @@ impl App for Instance {
#[derive(Parser)] #[derive(Parser)]
pub struct Command { pub struct Command {
#[clap(subcommand)] #[clap(subcommand)]
subcmd: SubCommand, pub subcmd: SubCommand,
} }
impl Command { impl Command {
@@ -100,13 +105,26 @@ impl Command {
pub fn load_options(&self, global_options: &GlobalOptions) -> Result<DatanodeOptions> { pub fn load_options(&self, global_options: &GlobalOptions) -> Result<DatanodeOptions> {
match &self.subcmd { match &self.subcmd {
SubCommand::Start(cmd) => cmd.load_options(global_options), SubCommand::Start(cmd) => cmd.load_options(global_options),
SubCommand::Objbench(_) => {
// For objbench command, we don't need to load DatanodeOptions
// It's a standalone utility command
let mut opts = datanode::config::DatanodeOptions::default();
opts.sanitize();
Ok(DatanodeOptions {
runtime: Default::default(),
plugins: Default::default(),
component: opts,
})
}
} }
} }
} }
#[derive(Parser)] #[derive(Parser)]
enum SubCommand { pub enum SubCommand {
Start(StartCommand), Start(StartCommand),
/// Object storage benchmark tool
Objbench(ObjbenchCommand),
} }
impl SubCommand { impl SubCommand {
@@ -116,12 +134,33 @@ impl SubCommand {
info!("Building datanode with {:#?}", cmd); info!("Building datanode with {:#?}", cmd);
builder.build().await builder.build().await
} }
SubCommand::Objbench(cmd) => {
cmd.run().await?;
std::process::exit(0);
}
} }
} }
} }
/// Storage engine config
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Default)]
#[serde(default)]
pub struct StorageConfig {
/// The working directory of database
pub data_home: String,
#[serde(flatten)]
pub store: object_store::config::ObjectStoreConfig,
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Default)]
#[serde(default)]
struct StorageConfigWrapper {
storage: StorageConfig,
region_engine: Vec<RegionEngineConfig>,
}
#[derive(Debug, Parser, Default)] #[derive(Debug, Parser, Default)]
struct StartCommand { pub struct StartCommand {
#[clap(long)] #[clap(long)]
node_id: Option<u64>, node_id: Option<u64>,
/// The address to bind the gRPC server. /// The address to bind the gRPC server.
@@ -149,7 +188,7 @@ struct StartCommand {
} }
impl StartCommand { impl StartCommand {
fn load_options(&self, global_options: &GlobalOptions) -> Result<DatanodeOptions> { pub fn load_options(&self, global_options: &GlobalOptions) -> Result<DatanodeOptions> {
let mut opts = DatanodeOptions::load_layered_options( let mut opts = DatanodeOptions::load_layered_options(
self.config_file.as_deref(), self.config_file.as_deref(),
self.env_prefix.as_ref(), self.env_prefix.as_ref(),

View File

@@ -0,0 +1,676 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Instant;
use clap::Parser;
use colored::Colorize;
use datanode::config::RegionEngineConfig;
use datanode::store;
use either::Either;
use mito2::access_layer::{
AccessLayer, AccessLayerRef, Metrics, OperationType, SstWriteRequest, WriteType,
};
use mito2::cache::{CacheManager, CacheManagerRef};
use mito2::config::{FulltextIndexConfig, MitoConfig, Mode};
use mito2::read::Source;
use mito2::sst::file::{FileHandle, FileMeta};
use mito2::sst::file_purger::{FilePurger, FilePurgerRef};
use mito2::sst::index::intermediate::IntermediateManager;
use mito2::sst::index::puffin_manager::PuffinManagerFactory;
use mito2::sst::parquet::reader::ParquetReaderBuilder;
use mito2::sst::parquet::{PARQUET_METADATA_KEY, WriteOptions};
use mito2::worker::write_cache_from_config;
use object_store::ObjectStore;
use regex::Regex;
use snafu::OptionExt;
use store_api::metadata::{RegionMetadata, RegionMetadataRef};
use store_api::path_utils::region_name;
use store_api::region_request::PathType;
use store_api::storage::FileId;
use crate::datanode::{StorageConfig, StorageConfigWrapper};
use crate::error;
/// Object storage benchmark command
#[derive(Debug, Parser)]
pub struct ObjbenchCommand {
/// Path to the object-store config file (TOML). Must deserialize into object_store::config::ObjectStoreConfig.
#[clap(long, value_name = "FILE")]
pub config: PathBuf,
/// Source SST file path in object-store (e.g. "region_dir/<uuid>.parquet").
#[clap(long, value_name = "PATH")]
pub source: String,
/// Verbose output
#[clap(short, long, default_value_t = false)]
pub verbose: bool,
/// Output file path for pprof flamegraph (enables profiling)
#[clap(long, value_name = "FILE")]
pub pprof_file: Option<PathBuf>,
}
fn parse_config(config_path: &PathBuf) -> error::Result<(StorageConfig, MitoConfig)> {
let cfg_str = std::fs::read_to_string(config_path).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("failed to read config {}: {e}", config_path.display()),
}
.build()
})?;
let store_cfg: StorageConfigWrapper = toml::from_str(&cfg_str).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("failed to parse config {}: {e}", config_path.display()),
}
.build()
})?;
let storage_config = store_cfg.storage;
let mito_engine_config = store_cfg
.region_engine
.into_iter()
.filter_map(|c| {
if let RegionEngineConfig::Mito(mito) = c {
Some(mito)
} else {
None
}
})
.next()
.with_context(|| error::IllegalConfigSnafu {
msg: format!("Engine config not found in {:?}", config_path),
})?;
Ok((storage_config, mito_engine_config))
}
impl ObjbenchCommand {
pub async fn run(&self) -> error::Result<()> {
if self.verbose {
common_telemetry::init_default_ut_logging();
}
println!("{}", "Starting objbench with config:".cyan().bold());
// Build object store from config
let (store_cfg, mut mito_engine_config) = parse_config(&self.config)?;
let object_store = build_object_store(&store_cfg).await?;
println!("{} Object store initialized", "".green());
// Prepare source identifiers
let components = parse_file_dir_components(&self.source)?;
println!(
"{} Source path parsed: {}, components: {:?}",
"".green(),
self.source,
components
);
// Load parquet metadata to extract RegionMetadata and file stats
println!("{}", "Loading parquet metadata...".yellow());
let file_size = object_store
.stat(&self.source)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("stat failed: {e}"),
}
.build()
})?
.content_length();
let parquet_meta = load_parquet_metadata(object_store.clone(), &self.source, file_size)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("read parquet metadata failed: {e}"),
}
.build()
})?;
let region_meta = extract_region_metadata(&self.source, &parquet_meta)?;
let num_rows = parquet_meta.file_metadata().num_rows() as u64;
let num_row_groups = parquet_meta.num_row_groups() as u64;
println!(
"{} Metadata loaded - rows: {}, size: {} bytes",
"".green(),
num_rows,
file_size
);
// Build a FileHandle for the source file
let file_meta = FileMeta {
region_id: region_meta.region_id,
file_id: components.file_id,
time_range: Default::default(),
level: 0,
file_size,
available_indexes: Default::default(),
index_file_size: 0,
num_rows,
num_row_groups,
sequence: None,
partition_expr: None,
num_series: 0,
};
let src_handle = FileHandle::new(file_meta, new_noop_file_purger());
// Build the reader for a single file via ParquetReaderBuilder
let table_dir = components.table_dir();
let (src_access_layer, cache_manager) = build_access_layer_simple(
&components,
object_store.clone(),
&mut mito_engine_config,
&store_cfg.data_home,
)
.await?;
let reader_build_start = Instant::now();
let reader = ParquetReaderBuilder::new(
table_dir,
components.path_type,
src_handle.clone(),
object_store.clone(),
)
.expected_metadata(Some(region_meta.clone()))
.build()
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("build reader failed: {e:?}"),
}
.build()
})?;
let reader_build_elapsed = reader_build_start.elapsed();
let total_rows = reader.parquet_metadata().file_metadata().num_rows();
println!("{} Reader built in {:?}", "".green(), reader_build_elapsed);
// Build write request
let fulltext_index_config = FulltextIndexConfig {
create_on_compaction: Mode::Disable,
..Default::default()
};
let write_req = SstWriteRequest {
op_type: OperationType::Flush,
metadata: region_meta,
source: Either::Left(Source::Reader(Box::new(reader))),
cache_manager,
storage: None,
max_sequence: None,
index_options: Default::default(),
index_config: mito_engine_config.index.clone(),
inverted_index_config: MitoConfig::default().inverted_index,
fulltext_index_config,
bloom_filter_index_config: MitoConfig::default().bloom_filter_index,
};
// Write SST
println!("{}", "Writing SST...".yellow());
// Start profiling if pprof_file is specified
#[cfg(unix)]
let profiler_guard = if self.pprof_file.is_some() {
println!("{} Starting profiling...", "".yellow());
Some(
pprof::ProfilerGuardBuilder::default()
.frequency(99)
.blocklist(&["libc", "libgcc", "pthread", "vdso"])
.build()
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to start profiler: {e}"),
}
.build()
})?,
)
} else {
None
};
#[cfg(not(unix))]
if self.pprof_file.is_some() {
eprintln!(
"{}: Profiling is not supported on this platform",
"Warning".yellow()
);
}
let write_start = Instant::now();
let mut metrics = Metrics::new(WriteType::Flush);
let infos = src_access_layer
.write_sst(write_req, &WriteOptions::default(), &mut metrics)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("write_sst failed: {e:?}"),
}
.build()
})?;
let write_elapsed = write_start.elapsed();
// Stop profiling and generate flamegraph if enabled
#[cfg(unix)]
if let (Some(guard), Some(pprof_file)) = (profiler_guard, &self.pprof_file) {
println!("{} Generating flamegraph...", "🔥".yellow());
match guard.report().build() {
Ok(report) => {
let mut flamegraph_data = Vec::new();
if let Err(e) = report.flamegraph(&mut flamegraph_data) {
println!("{}: Failed to generate flamegraph: {}", "Error".red(), e);
} else if let Err(e) = std::fs::write(pprof_file, flamegraph_data) {
println!(
"{}: Failed to write flamegraph to {}: {}",
"Error".red(),
pprof_file.display(),
e
);
} else {
println!(
"{} Flamegraph saved to {}",
"".green(),
pprof_file.display().to_string().cyan()
);
}
}
Err(e) => {
println!("{}: Failed to generate pprof report: {}", "Error".red(), e);
}
}
}
assert_eq!(infos.len(), 1);
let dst_file_id = infos[0].file_id;
let dst_file_path = format!("{}/{}.parquet", components.region_dir(), dst_file_id);
let mut dst_index_path = None;
if infos[0].index_metadata.file_size > 0 {
dst_index_path = Some(format!(
"{}/index/{}.puffin",
components.region_dir(),
dst_file_id
));
}
// Report results with ANSI colors
println!("\n{} {}", "Write complete!".green().bold(), "".green());
println!(" {}: {}", "Destination file".bold(), dst_file_path.cyan());
println!(" {}: {}", "Rows".bold(), total_rows.to_string().cyan());
println!(
" {}: {}",
"File size".bold(),
format!("{} bytes", file_size).cyan()
);
println!(
" {}: {:?}",
"Reader build time".bold(),
reader_build_elapsed
);
println!(" {}: {:?}", "Total time".bold(), write_elapsed);
// Print metrics in a formatted way
println!(" {}: {:?}", "Metrics".bold(), metrics,);
// Print infos
println!(" {}: {:?}", "Index".bold(), infos[0].index_metadata);
// Cleanup
println!("\n{}", "Cleaning up...".yellow());
object_store.delete(&dst_file_path).await.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to delete dest file {}: {}", dst_file_path, e),
}
.build()
})?;
println!("{} Temporary file {} deleted", "".green(), dst_file_path);
if let Some(index_path) = dst_index_path {
object_store.delete(&index_path).await.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to delete dest index file {}: {}", index_path, e),
}
.build()
})?;
println!(
"{} Temporary index file {} deleted",
"".green(),
index_path
);
}
println!("\n{}", "Benchmark completed successfully!".green().bold());
Ok(())
}
}
#[derive(Debug)]
struct FileDirComponents {
catalog: String,
schema: String,
table_id: u32,
region_sequence: u32,
path_type: PathType,
file_id: FileId,
}
impl FileDirComponents {
fn table_dir(&self) -> String {
format!("data/{}/{}/{}", self.catalog, self.schema, self.table_id)
}
fn region_dir(&self) -> String {
let region_name = region_name(self.table_id, self.region_sequence);
match self.path_type {
PathType::Bare => {
format!(
"data/{}/{}/{}/{}",
self.catalog, self.schema, self.table_id, region_name
)
}
PathType::Data => {
format!(
"data/{}/{}/{}/{}/data",
self.catalog, self.schema, self.table_id, region_name
)
}
PathType::Metadata => {
format!(
"data/{}/{}/{}/{}/metadata",
self.catalog, self.schema, self.table_id, region_name
)
}
}
}
}
fn parse_file_dir_components(path: &str) -> error::Result<FileDirComponents> {
// Define the regex pattern to match all three path styles
let pattern =
r"^data/([^/]+)/([^/]+)/([^/]+)/([^/]+)_([^/]+)(?:/data|/metadata)?/(.+).parquet$";
// Compile the regex
let re = Regex::new(pattern).expect("Invalid regex pattern");
// Determine the path type
let path_type = if path.contains("/data/") {
PathType::Data
} else if path.contains("/metadata/") {
PathType::Metadata
} else {
PathType::Bare
};
// Try to match the path
let components = (|| {
let captures = re.captures(path)?;
if captures.len() != 7 {
return None;
}
let mut components = FileDirComponents {
catalog: "".to_string(),
schema: "".to_string(),
table_id: 0,
region_sequence: 0,
path_type,
file_id: FileId::default(),
};
// Extract the components
components.catalog = captures.get(1)?.as_str().to_string();
components.schema = captures.get(2)?.as_str().to_string();
components.table_id = captures[3].parse().ok()?;
components.region_sequence = captures[5].parse().ok()?;
let file_id_str = &captures[6];
components.file_id = FileId::parse_str(file_id_str).ok()?;
Some(components)
})();
components.context(error::IllegalConfigSnafu {
msg: format!("Expect valid source file path, got: {}", path),
})
}
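/// Extracts the `RegionMetadata` stored as JSON under the `PARQUET_METADATA_KEY` key-value
/// entry of the SST's Parquet footer.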
fn extract_region_metadata(
file_path: &str,
meta: &parquet::file::metadata::ParquetMetaData,
) -> error::Result<RegionMetadataRef> {
use parquet::format::KeyValue;
let kvs: Option<&Vec<KeyValue>> = meta.file_metadata().key_value_metadata();
let Some(kvs) = kvs else {
return Err(error::IllegalConfigSnafu {
msg: format!("{file_path}: missing parquet key_value metadata"),
}
.build());
};
let json = kvs
.iter()
.find(|kv| kv.key == PARQUET_METADATA_KEY)
.and_then(|kv| kv.value.as_ref())
.ok_or_else(|| {
error::IllegalConfigSnafu {
msg: format!("{file_path}: key {PARQUET_METADATA_KEY} not found or empty"),
}
.build()
})?;
let region: RegionMetadata = RegionMetadata::from_json(json).map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("invalid region metadata json: {e}"),
}
.build()
})?;
Ok(Arc::new(region))
}
async fn build_object_store(sc: &StorageConfig) -> error::Result<ObjectStore> {
store::new_object_store(sc.store.clone(), &sc.data_home)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to build object store: {e:?}"),
}
.build()
})
}
async fn build_access_layer_simple(
components: &FileDirComponents,
object_store: ObjectStore,
config: &mut MitoConfig,
data_home: &str,
) -> error::Result<(AccessLayerRef, CacheManagerRef)> {
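// Sanitize the index paths, build the puffin and intermediate managers, then assemble the
// cache manager and an access layer rooted at the table directory.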
let _ = config.index.sanitize(data_home, &config.inverted_index);
let puffin_manager = PuffinManagerFactory::new(
&config.index.aux_path,
config.index.staging_size.as_bytes(),
Some(config.index.write_buffer_size.as_bytes() as _),
config.index.staging_ttl,
)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to build access layer: {e:?}"),
}
.build()
})?;
let intermediate_manager = IntermediateManager::init_fs(&config.index.aux_path)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to build IntermediateManager: {e:?}"),
}
.build()
})?
.with_buffer_size(Some(config.index.write_buffer_size.as_bytes() as _));
let cache_manager =
build_cache_manager(config, puffin_manager.clone(), intermediate_manager.clone()).await?;
let layer = AccessLayer::new(
components.table_dir(),
components.path_type,
object_store,
puffin_manager,
intermediate_manager,
);
Ok((Arc::new(layer), cache_manager))
}
async fn build_cache_manager(
config: &MitoConfig,
puffin_manager: PuffinManagerFactory,
intermediate_manager: IntermediateManager,
) -> error::Result<CacheManagerRef> {
let write_cache = write_cache_from_config(config, puffin_manager, intermediate_manager)
.await
.map_err(|e| {
error::IllegalConfigSnafu {
msg: format!("Failed to build write cache: {e:?}"),
}
.build()
})?;
let cache_manager = Arc::new(
CacheManager::builder()
.sst_meta_cache_size(config.sst_meta_cache_size.as_bytes())
.vector_cache_size(config.vector_cache_size.as_bytes())
.page_cache_size(config.page_cache_size.as_bytes())
.selector_result_cache_size(config.selector_result_cache_size.as_bytes())
.index_metadata_size(config.index.metadata_cache_size.as_bytes())
.index_content_size(config.index.content_cache_size.as_bytes())
.index_content_page_size(config.index.content_cache_page_size.as_bytes())
.index_result_cache_size(config.index.result_cache_size.as_bytes())
.puffin_metadata_size(config.index.metadata_cache_size.as_bytes())
.write_cache(write_cache)
.build(),
);
Ok(cache_manager)
}
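/// Returns a file purger that ignores removal requests; the benchmark deletes its temporary
/// outputs explicitly instead of relying on the purger.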
fn new_noop_file_purger() -> FilePurgerRef {
#[derive(Debug)]
struct Noop;
impl FilePurger for Noop {
fn remove_file(&self, _file_meta: FileMeta, _is_delete: bool) {}
}
Arc::new(Noop)
}
async fn load_parquet_metadata(
object_store: ObjectStore,
path: &str,
file_size: u64,
) -> Result<parquet::file::metadata::ParquetMetaData, Box<dyn std::error::Error + Send + Sync>> {
use parquet::file::FOOTER_SIZE;
use parquet::file::metadata::ParquetMetaDataReader;
let actual_size = if file_size == 0 {
object_store.stat(path).await?.content_length()
} else {
file_size
};
if actual_size < FOOTER_SIZE as u64 {
return Err("file too small".into());
}
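// Prefetch the last 64 KiB so that, for most files, a single ranged read covers both the
// 8-byte footer and the metadata that precedes it.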
let prefetch: u64 = 64 * 1024;
let start = actual_size.saturating_sub(prefetch);
let buffer = object_store
.read_with(path)
.range(start..actual_size)
.await?
.to_vec();
let buffer_len = buffer.len();
let mut footer = [0; 8];
footer.copy_from_slice(&buffer[buffer_len - FOOTER_SIZE..]);
let footer = ParquetMetaDataReader::decode_footer_tail(&footer)?;
let metadata_len = footer.metadata_length() as u64;
if actual_size - (FOOTER_SIZE as u64) < metadata_len {
return Err("invalid footer/metadata length".into());
}
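// If the metadata is fully contained in the prefetched buffer, decode it in place;
// otherwise issue a second ranged read that fetches exactly the metadata bytes.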
if (metadata_len as usize) <= buffer_len - FOOTER_SIZE {
let metadata_start = buffer_len - metadata_len as usize - FOOTER_SIZE;
let meta = ParquetMetaDataReader::decode_metadata(
&buffer[metadata_start..buffer_len - FOOTER_SIZE],
)?;
Ok(meta)
} else {
let metadata_start = actual_size - metadata_len - FOOTER_SIZE as u64;
let data = object_store
.read_with(path)
.range(metadata_start..(actual_size - FOOTER_SIZE as u64))
.await?
.to_vec();
let meta = ParquetMetaDataReader::decode_metadata(&data)?;
Ok(meta)
}
}
#[cfg(test)]
mod tests {
use std::path::PathBuf;
use std::str::FromStr;
use common_base::readable_size::ReadableSize;
use store_api::region_request::PathType;
use crate::datanode::objbench::{parse_config, parse_file_dir_components};
#[test]
fn test_parse_dir() {
let meta_path = "data/greptime/public/1024/1024_0000000000/metadata/00020380-009c-426d-953e-b4e34c15af34.parquet";
let c = parse_file_dir_components(meta_path).unwrap();
assert_eq!(
c.file_id.to_string(),
"00020380-009c-426d-953e-b4e34c15af34"
);
assert_eq!(c.catalog, "greptime");
assert_eq!(c.schema, "public");
assert_eq!(c.table_id, 1024);
assert_eq!(c.region_sequence, 0);
assert_eq!(c.path_type, PathType::Metadata);
let c = parse_file_dir_components(
"data/greptime/public/1024/1024_0000000000/data/00020380-009c-426d-953e-b4e34c15af34.parquet",
).unwrap();
assert_eq!(
c.file_id.to_string(),
"00020380-009c-426d-953e-b4e34c15af34"
);
assert_eq!(c.catalog, "greptime");
assert_eq!(c.schema, "public");
assert_eq!(c.table_id, 1024);
assert_eq!(c.region_sequence, 0);
assert_eq!(c.path_type, PathType::Data);
let c = parse_file_dir_components(
"data/greptime/public/1024/1024_0000000000/00020380-009c-426d-953e-b4e34c15af34.parquet",
).unwrap();
assert_eq!(
c.file_id.to_string(),
"00020380-009c-426d-953e-b4e34c15af34"
);
assert_eq!(c.catalog, "greptime");
assert_eq!(c.schema, "public");
assert_eq!(c.table_id, 1024);
assert_eq!(c.region_sequence, 0);
assert_eq!(c.path_type, PathType::Bare);
}
#[test]
fn test_parse_config() {
let path = "../../config/datanode.example.toml";
let (storage, engine) = parse_config(&PathBuf::from_str(path).unwrap()).unwrap();
assert_eq!(storage.data_home, "./greptimedb_data");
assert_eq!(engine.index.staging_size, ReadableSize::gb(2));
}
}

View File

@@ -316,6 +316,13 @@ pub enum Error {
location: Location, location: Location,
source: standalone::error::Error, source: standalone::error::Error,
}, },
#[snafu(display("Invalid WAL provider"))]
InvalidWalProvider {
#[snafu(implicit)]
location: Location,
source: common_wal::error::Error,
},
} }
pub type Result<T> = std::result::Result<T, Error>; pub type Result<T> = std::result::Result<T, Error>;
@@ -373,6 +380,7 @@ impl ErrorExt for Error {
} }
Error::MetaClientInit { source, .. } => source.status_code(), Error::MetaClientInit { source, .. } => source.status_code(),
Error::SchemaNotFound { .. } => StatusCode::DatabaseNotFound, Error::SchemaNotFound { .. } => StatusCode::DatabaseNotFound,
Error::InvalidWalProvider { .. } => StatusCode::InvalidArguments,
} }
} }

View File

@@ -30,6 +30,7 @@ use common_meta::heartbeat::handler::invalidate_table_cache::InvalidateCacheHand
use common_meta::heartbeat::handler::parse_mailbox_message::ParseMailboxMessageHandler; use common_meta::heartbeat::handler::parse_mailbox_message::ParseMailboxMessageHandler;
use common_meta::key::TableMetadataManager; use common_meta::key::TableMetadataManager;
use common_meta::key::flow::FlowMetadataManager; use common_meta::key::flow::FlowMetadataManager;
use common_stat::ResourceStatImpl;
use common_telemetry::info; use common_telemetry::info;
use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions}; use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions};
use common_version::{short_version, verbose_version}; use common_version::{short_version, verbose_version};
@@ -372,11 +373,15 @@ impl StartCommand {
Arc::new(InvalidateCacheHandler::new(layered_cache_registry.clone())), Arc::new(InvalidateCacheHandler::new(layered_cache_registry.clone())),
]); ]);
let mut resource_stat = ResourceStatImpl::default();
resource_stat.start_collect_cpu_usage();
let heartbeat_task = flow::heartbeat::HeartbeatTask::new( let heartbeat_task = flow::heartbeat::HeartbeatTask::new(
&opts, &opts,
meta_client.clone(), meta_client.clone(),
opts.heartbeat.clone(), opts.heartbeat.clone(),
Arc::new(executor), Arc::new(executor),
Arc::new(resource_stat),
); );
let flow_metadata_manager = Arc::new(FlowMetadataManager::new(cached_meta_backend.clone())); let flow_metadata_manager = Arc::new(FlowMetadataManager::new(cached_meta_backend.clone()));

View File

@@ -25,11 +25,14 @@ use clap::Parser;
use client::client_manager::NodeClients; use client::client_manager::NodeClients;
use common_base::Plugins; use common_base::Plugins;
use common_config::{Configurable, DEFAULT_DATA_HOME}; use common_config::{Configurable, DEFAULT_DATA_HOME};
use common_error::ext::BoxedError;
use common_grpc::channel_manager::ChannelConfig; use common_grpc::channel_manager::ChannelConfig;
use common_meta::cache::{CacheRegistryBuilder, LayeredCacheRegistryBuilder}; use common_meta::cache::{CacheRegistryBuilder, LayeredCacheRegistryBuilder};
use common_meta::heartbeat::handler::HandlerGroupExecutor; use common_meta::heartbeat::handler::HandlerGroupExecutor;
use common_meta::heartbeat::handler::invalidate_table_cache::InvalidateCacheHandler; use common_meta::heartbeat::handler::invalidate_table_cache::InvalidateCacheHandler;
use common_meta::heartbeat::handler::parse_mailbox_message::ParseMailboxMessageHandler; use common_meta::heartbeat::handler::parse_mailbox_message::ParseMailboxMessageHandler;
use common_query::prelude::set_default_prefix;
use common_stat::ResourceStatImpl;
use common_telemetry::info; use common_telemetry::info;
use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions}; use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions};
use common_time::timezone::set_default_timezone; use common_time::timezone::set_default_timezone;
@@ -252,10 +255,10 @@ impl StartCommand {
if let Some(addr) = &self.internal_rpc_bind_addr { if let Some(addr) = &self.internal_rpc_bind_addr {
if let Some(internal_grpc) = &mut opts.internal_grpc { if let Some(internal_grpc) = &mut opts.internal_grpc {
internal_grpc.bind_addr = addr.to_string(); internal_grpc.bind_addr = addr.clone();
} else { } else {
let grpc_options = GrpcOptions { let grpc_options = GrpcOptions {
bind_addr: addr.to_string(), bind_addr: addr.clone(),
..Default::default() ..Default::default()
}; };
@@ -265,10 +268,10 @@ impl StartCommand {
if let Some(addr) = &self.internal_rpc_server_addr { if let Some(addr) = &self.internal_rpc_server_addr {
if let Some(internal_grpc) = &mut opts.internal_grpc { if let Some(internal_grpc) = &mut opts.internal_grpc {
internal_grpc.server_addr = addr.to_string(); internal_grpc.server_addr = addr.clone();
} else { } else {
let grpc_options = GrpcOptions { let grpc_options = GrpcOptions {
server_addr: addr.to_string(), server_addr: addr.clone(),
..Default::default() ..Default::default()
}; };
opts.internal_grpc = Some(grpc_options); opts.internal_grpc = Some(grpc_options);
@@ -332,6 +335,9 @@ impl StartCommand {
.context(error::StartFrontendSnafu)?; .context(error::StartFrontendSnafu)?;
set_default_timezone(opts.default_timezone.as_deref()).context(error::InitTimezoneSnafu)?; set_default_timezone(opts.default_timezone.as_deref()).context(error::InitTimezoneSnafu)?;
set_default_prefix(opts.default_column_prefix.as_deref())
.map_err(BoxedError::new)
.context(error::BuildCliSnafu)?;
let meta_client_options = opts let meta_client_options = opts
.meta_client .meta_client
@@ -421,11 +427,15 @@ impl StartCommand {
Arc::new(InvalidateCacheHandler::new(layered_cache_registry.clone())), Arc::new(InvalidateCacheHandler::new(layered_cache_registry.clone())),
]); ]);
let mut resource_stat = ResourceStatImpl::default();
resource_stat.start_collect_cpu_usage();
let heartbeat_task = HeartbeatTask::new( let heartbeat_task = HeartbeatTask::new(
&opts, &opts,
meta_client.clone(), meta_client.clone(),
opts.heartbeat.clone(), opts.heartbeat.clone(),
Arc::new(executor), Arc::new(executor),
Arc::new(resource_stat),
); );
let heartbeat_task = Some(heartbeat_task); let heartbeat_task = Some(heartbeat_task);

View File

@@ -12,13 +12,13 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
#![feature(assert_matches, let_chains)] #![feature(assert_matches)]
use async_trait::async_trait; use async_trait::async_trait;
use common_error::ext::ErrorExt; use common_error::ext::ErrorExt;
use common_error::status_code::StatusCode; use common_error::status_code::StatusCode;
use common_mem_prof::activate_heap_profile; use common_mem_prof::activate_heap_profile;
use common_stat::{get_cpu_limit, get_memory_limit}; use common_stat::{get_total_cpu_millicores, get_total_memory_bytes};
use common_telemetry::{error, info, warn}; use common_telemetry::{error, info, warn};
use crate::error::Result; use crate::error::Result;
@@ -125,7 +125,8 @@ pub fn log_versions(version: &str, short_version: &str, app: &str) {
} }
pub fn create_resource_limit_metrics(app: &str) { pub fn create_resource_limit_metrics(app: &str) {
if let Some(cpu_limit) = get_cpu_limit() { let cpu_limit = get_total_cpu_millicores();
if cpu_limit > 0 {
info!( info!(
"GreptimeDB start with cpu limit in millicores: {}", "GreptimeDB start with cpu limit in millicores: {}",
cpu_limit cpu_limit
@@ -133,7 +134,8 @@ pub fn create_resource_limit_metrics(app: &str) {
CPU_LIMIT.with_label_values(&[app]).set(cpu_limit); CPU_LIMIT.with_label_values(&[app]).set(cpu_limit);
} }
if let Some(memory_limit) = get_memory_limit() { let memory_limit = get_total_memory_bytes();
if memory_limit > 0 {
info!( info!(
"GreptimeDB start with memory limit in bytes: {}", "GreptimeDB start with memory limit in bytes: {}",
memory_limit memory_limit

View File

@@ -19,6 +19,7 @@ use std::{fs, path};
use async_trait::async_trait; use async_trait::async_trait;
use cache::{build_fundamental_cache_registry, with_default_composite_cache_registry}; use cache::{build_fundamental_cache_registry, with_default_composite_cache_registry};
use catalog::information_schema::InformationExtensionRef;
use catalog::kvbackend::KvBackendCatalogManagerBuilder; use catalog::kvbackend::KvBackendCatalogManagerBuilder;
use catalog::process_manager::ProcessManager; use catalog::process_manager::ProcessManager;
use clap::Parser; use clap::Parser;
@@ -40,6 +41,7 @@ use common_meta::region_registry::LeaderRegionRegistry;
use common_meta::sequence::SequenceBuilder; use common_meta::sequence::SequenceBuilder;
use common_meta::wal_options_allocator::{WalOptionsAllocatorRef, build_wal_options_allocator}; use common_meta::wal_options_allocator::{WalOptionsAllocatorRef, build_wal_options_allocator};
use common_procedure::ProcedureManagerRef; use common_procedure::ProcedureManagerRef;
use common_query::prelude::set_default_prefix;
use common_telemetry::info; use common_telemetry::info;
use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions}; use common_telemetry::logging::{DEFAULT_LOGGING_DIR, TracingOptions};
use common_time::timezone::set_default_timezone; use common_time::timezone::set_default_timezone;
@@ -354,6 +356,10 @@ impl StartCommand {
let mut plugins = Plugins::new(); let mut plugins = Plugins::new();
let plugin_opts = opts.plugins; let plugin_opts = opts.plugins;
let mut opts = opts.component; let mut opts = opts.component;
set_default_prefix(opts.default_column_prefix.as_deref())
.map_err(BoxedError::new)
.context(error::BuildCliSnafu)?;
opts.grpc.detect_server_addr(); opts.grpc.detect_server_addr();
let fe_opts = opts.frontend_options(); let fe_opts = opts.frontend_options();
let dn_opts = opts.datanode_options(); let dn_opts = opts.datanode_options();
@@ -404,6 +410,8 @@ impl StartCommand {
procedure_manager.clone(), procedure_manager.clone(),
)); ));
plugins.insert::<InformationExtensionRef>(information_extension.clone());
let process_manager = Arc::new(ProcessManager::new(opts.grpc.server_addr.clone(), None)); let process_manager = Arc::new(ProcessManager::new(opts.grpc.server_addr.clone(), None));
let builder = KvBackendCatalogManagerBuilder::new( let builder = KvBackendCatalogManagerBuilder::new(
information_extension.clone(), information_extension.clone(),
@@ -473,7 +481,11 @@ impl StartCommand {
.step(10) .step(10)
.build(), .build(),
); );
let kafka_options = opts.wal.clone().into(); let kafka_options = opts
.wal
.clone()
.try_into()
.context(error::InvalidWalProviderSnafu)?;
let wal_options_allocator = build_wal_options_allocator(&kafka_options, kv_backend.clone()) let wal_options_allocator = build_wal_options_allocator(&kafka_options, kv_backend.clone())
.await .await
.context(error::BuildWalOptionsAllocatorSnafu)?; .context(error::BuildWalOptionsAllocatorSnafu)?;

View File

@@ -48,6 +48,7 @@ fn test_load_datanode_example_config() {
let expected = GreptimeOptions::<DatanodeOptions> { let expected = GreptimeOptions::<DatanodeOptions> {
component: DatanodeOptions { component: DatanodeOptions {
node_id: Some(42), node_id: Some(42),
default_column_prefix: Some("greptime".to_string()),
meta_client: Some(MetaClientOptions { meta_client: Some(MetaClientOptions {
metasrv_addrs: vec!["127.0.0.1:3002".to_string()], metasrv_addrs: vec!["127.0.0.1:3002".to_string()],
timeout: Duration::from_secs(3), timeout: Duration::from_secs(3),
@@ -113,6 +114,7 @@ fn test_load_frontend_example_config() {
let expected = GreptimeOptions::<FrontendOptions> { let expected = GreptimeOptions::<FrontendOptions> {
component: FrontendOptions { component: FrontendOptions {
default_timezone: Some("UTC".to_string()), default_timezone: Some("UTC".to_string()),
default_column_prefix: Some("greptime".to_string()),
meta_client: Some(MetaClientOptions { meta_client: Some(MetaClientOptions {
metasrv_addrs: vec!["127.0.0.1:3002".to_string()], metasrv_addrs: vec!["127.0.0.1:3002".to_string()],
timeout: Duration::from_secs(3), timeout: Duration::from_secs(3),
@@ -273,6 +275,7 @@ fn test_load_standalone_example_config() {
let expected = GreptimeOptions::<StandaloneOptions> { let expected = GreptimeOptions::<StandaloneOptions> {
component: StandaloneOptions { component: StandaloneOptions {
default_timezone: Some("UTC".to_string()), default_timezone: Some("UTC".to_string()),
default_column_prefix: Some("greptime".to_string()),
wal: DatanodeWalConfig::RaftEngine(RaftEngineConfig { wal: DatanodeWalConfig::RaftEngine(RaftEngineConfig {
dir: Some(format!("{}/{}", DEFAULT_DATA_HOME, WAL_DIR)), dir: Some(format!("{}/{}", DEFAULT_DATA_HOME, WAL_DIR)),
sync_period: Some(Duration::from_secs(10)), sync_period: Some(Duration::from_secs(10)),

View File

@@ -18,9 +18,11 @@ bytes.workspace = true
common-error.workspace = true common-error.workspace = true
common-macro.workspace = true common-macro.workspace = true
futures.workspace = true futures.workspace = true
lazy_static.workspace = true
paste.workspace = true paste.workspace = true
pin-project.workspace = true pin-project.workspace = true
rand.workspace = true rand.workspace = true
regex.workspace = true
serde = { version = "1.0", features = ["derive"] } serde = { version = "1.0", features = ["derive"] }
snafu.workspace = true snafu.workspace = true
tokio.workspace = true tokio.workspace = true

View File

@@ -19,6 +19,7 @@ pub mod plugins;
pub mod range_read; pub mod range_read;
#[allow(clippy::all)] #[allow(clippy::all)]
pub mod readable_size; pub mod readable_size;
pub mod regex_pattern;
pub mod secrets; pub mod secrets;
pub mod serde; pub mod serde;

View File

@@ -75,11 +75,11 @@ impl Plugins {
self.read().is_empty() self.read().is_empty()
} }
fn read(&self) -> RwLockReadGuard<SendSyncAnyMap> { fn read(&self) -> RwLockReadGuard<'_, SendSyncAnyMap> {
self.inner.read().unwrap() self.inner.read().unwrap()
} }
fn write(&self) -> RwLockWriteGuard<SendSyncAnyMap> { fn write(&self) -> RwLockWriteGuard<'_, SendSyncAnyMap> {
self.inner.write().unwrap() self.inner.write().unwrap()
} }
} }

View File

@@ -0,0 +1,22 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use lazy_static::lazy_static;
use regex::Regex;
pub const NAME_PATTERN: &str = r"[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*";
lazy_static! {
pub static ref NAME_PATTERN_REG: Regex = Regex::new(&format!("^{NAME_PATTERN}$")).unwrap();
}
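As a quick illustration (not part of this change set), NAME_PATTERN_REG only accepts names that start with a letter, `_`, `:`, or `-`; digits and the extra characters `.`, `@`, `#` are allowed only after the first character. A minimal sketch of the intended matching behavior:
#[cfg(test)]
mod name_pattern_examples {
    use super::NAME_PATTERN_REG;

    #[test]
    fn matches_expected_names() {
        // Accepted: starts with a letter or `_`/`:`/`-`, followed by letters, digits and `_:-.@#`.
        assert!(NAME_PATTERN_REG.is_match("http_requests_total"));
        assert!(NAME_PATTERN_REG.is_match(":recording-rule.v2"));
        // Rejected: leading digit or embedded whitespace.
        assert!(!NAME_PATTERN_REG.is_match("1metric"));
        assert!(!NAME_PATTERN_REG.is_match("metric name"));
    }
}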

View File

@@ -8,5 +8,6 @@ license.workspace = true
workspace = true workspace = true
[dependencies] [dependencies]
const_format.workspace = true
[dev-dependencies] [dev-dependencies]

View File

@@ -0,0 +1,27 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
fn main() {
// Set DEFAULT_CATALOG_NAME from the environment variable, or fall back to the default value
let default_catalog_name =
std::env::var("DEFAULT_CATALOG_NAME").unwrap_or_else(|_| "greptime".to_string());
println!(
"cargo:rustc-env=DEFAULT_CATALOG_NAME={}",
default_catalog_name
);
// Rerun build script if the environment variable changes
println!("cargo:rerun-if-env-changed=DEFAULT_CATALOG_NAME");
}

View File

@@ -12,13 +12,15 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
use const_format::concatcp;
pub const SYSTEM_CATALOG_NAME: &str = "system"; pub const SYSTEM_CATALOG_NAME: &str = "system";
pub const INFORMATION_SCHEMA_NAME: &str = "information_schema"; pub const INFORMATION_SCHEMA_NAME: &str = "information_schema";
pub const PG_CATALOG_NAME: &str = "pg_catalog"; pub const PG_CATALOG_NAME: &str = "pg_catalog";
pub const SYSTEM_CATALOG_TABLE_NAME: &str = "system_catalog"; pub const SYSTEM_CATALOG_TABLE_NAME: &str = "system_catalog";
pub const DEFAULT_CATALOG_NAME: &str = "greptime"; pub const DEFAULT_CATALOG_NAME: &str = env!("DEFAULT_CATALOG_NAME");
pub const DEFAULT_SCHEMA_NAME: &str = "public"; pub const DEFAULT_SCHEMA_NAME: &str = "public";
pub const DEFAULT_PRIVATE_SCHEMA_NAME: &str = "greptime_private"; pub const DEFAULT_PRIVATE_SCHEMA_NAME: &str = concatcp!(DEFAULT_CATALOG_NAME, "_private");
/// Reserves [0,MIN_USER_FLOW_ID) for internal usage. /// Reserves [0,MIN_USER_FLOW_ID) for internal usage.
/// User defined table id starts from this value. /// User defined table id starts from this value.
@@ -108,6 +110,8 @@ pub const INFORMATION_SCHEMA_PROCESS_LIST_TABLE_ID: u32 = 36;
pub const INFORMATION_SCHEMA_SSTS_MANIFEST_TABLE_ID: u32 = 37; pub const INFORMATION_SCHEMA_SSTS_MANIFEST_TABLE_ID: u32 = 37;
/// id for information_schema.ssts_storage /// id for information_schema.ssts_storage
pub const INFORMATION_SCHEMA_SSTS_STORAGE_TABLE_ID: u32 = 38; pub const INFORMATION_SCHEMA_SSTS_STORAGE_TABLE_ID: u32 = 38;
/// id for information_schema.ssts_index_meta
pub const INFORMATION_SCHEMA_SSTS_INDEX_META_TABLE_ID: u32 = 39;
// ----- End of information_schema tables ----- // ----- End of information_schema tables -----
@@ -148,4 +152,9 @@ pub const TRACE_TABLE_NAME_SESSION_KEY: &str = "trace_table_name";
pub fn trace_services_table_name(trace_table_name: &str) -> String { pub fn trace_services_table_name(trace_table_name: &str) -> String {
format!("{}_services", trace_table_name) format!("{}_services", trace_table_name)
} }
/// Generate the trace operations table name from the trace table name by adding `_operations` suffix.
pub fn trace_operations_table_name(trace_table_name: &str) -> String {
format!("{}_operations", trace_table_name)
}
// ---- End of special table and fields ---- // ---- End of special table and fields ----

View File

@@ -13,13 +13,11 @@ common-error.workspace = true
common-macro.workspace = true common-macro.workspace = true
config.workspace = true config.workspace = true
humantime-serde.workspace = true humantime-serde.workspace = true
num_cpus.workspace = true
object-store.workspace = true object-store.workspace = true
serde.workspace = true serde.workspace = true
serde_json.workspace = true serde_json.workspace = true
serde_with.workspace = true serde_with.workspace = true
snafu.workspace = true snafu.workspace = true
sysinfo.workspace = true
toml.workspace = true toml.workspace = true
[dev-dependencies] [dev-dependencies]

View File

@@ -14,7 +14,6 @@
pub mod config; pub mod config;
pub mod error; pub mod error;
pub mod utils;
use std::time::Duration; use std::time::Duration;

View File

@@ -1,73 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_base::readable_size::ReadableSize;
use sysinfo::System;
/// Get the CPU core number of system, aware of cgroups.
pub fn get_cpus() -> usize {
// This function will check cgroups
num_cpus::get()
}
/// Get the total memory of the system.
/// If `cgroup_limits` is enabled, it will also check it.
pub fn get_sys_total_memory() -> Option<ReadableSize> {
if sysinfo::IS_SUPPORTED_SYSTEM {
let mut sys_info = System::new();
sys_info.refresh_memory();
let mut total_memory = sys_info.total_memory();
// Compare with cgroups memory limit, use smaller values
// This method is only implemented for Linux. It always returns None for all other systems.
if let Some(cgroup_limits) = sys_info.cgroup_limits() {
total_memory = total_memory.min(cgroup_limits.total_memory)
}
Some(ReadableSize(total_memory))
} else {
None
}
}
/// `ResourceSpec` holds the static resource specifications of a node,
/// such as CPU cores and memory capacity. These values are fixed
/// at startup and do not change dynamically during runtime.
#[derive(Debug, Clone, Copy)]
pub struct ResourceSpec {
pub cpus: usize,
pub memory: Option<ReadableSize>,
}
impl Default for ResourceSpec {
fn default() -> Self {
Self {
cpus: get_cpus(),
memory: get_sys_total_memory(),
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_get_cpus() {
assert!(get_cpus() > 0);
}
#[test]
fn test_get_sys_total_memory() {
assert!(get_sys_total_memory().unwrap() > ReadableSize::mb(0));
}
}

View File

@@ -36,7 +36,7 @@ object_store_opendal.workspace = true
orc-rust = { version = "0.6.3", default-features = false, features = ["async"] } orc-rust = { version = "0.6.3", default-features = false, features = ["async"] }
parquet.workspace = true parquet.workspace = true
paste.workspace = true paste.workspace = true
regex = "1.7" regex.workspace = true
serde.workspace = true serde.workspace = true
snafu.workspace = true snafu.workspace = true
strum.workspace = true strum.workspace = true

View File

@@ -150,7 +150,7 @@ impl<
if let Some(ref mut writer) = self.writer { if let Some(ref mut writer) = self.writer {
Ok(writer) Ok(writer)
} else { } else {
let writer = (self.writer_factory)(self.path.to_string()).await?; let writer = (self.writer_factory)(self.path.clone()).await?;
Ok(self.writer.insert(writer)) Ok(self.writer.insert(writer))
} }
} }

View File

@@ -33,7 +33,7 @@ use bytes::{Buf, Bytes};
use datafusion::datasource::physical_plan::FileOpenFuture; use datafusion::datasource::physical_plan::FileOpenFuture;
use datafusion::error::{DataFusionError, Result as DataFusionResult}; use datafusion::error::{DataFusionError, Result as DataFusionResult};
use datafusion::physical_plan::SendableRecordBatchStream; use datafusion::physical_plan::SendableRecordBatchStream;
use futures::StreamExt; use futures::{StreamExt, TryStreamExt};
use object_store::ObjectStore; use object_store::ObjectStore;
use snafu::ResultExt; use snafu::ResultExt;
use tokio_util::compat::FuturesAsyncWriteCompatExt; use tokio_util::compat::FuturesAsyncWriteCompatExt;
@@ -179,7 +179,7 @@ pub fn open_with_decoder<T: ArrowDecoder, F: Fn() -> DataFusionResult<T>>(
Poll::Ready(decoder.flush().transpose()) Poll::Ready(decoder.flush().transpose())
}); });
Ok(stream.boxed()) Ok(stream.map_err(Into::into).boxed())
})) }))
} }

View File

@@ -87,7 +87,7 @@ pub(crate) fn scan_config(
) -> FileScanConfig { ) -> FileScanConfig {
// object_store only recognize the Unix style path, so make it happy. // object_store only recognize the Unix style path, so make it happy.
let filename = &filename.replace('\\', "/"); let filename = &filename.replace('\\', "/");
let file_group = FileGroup::new(vec![PartitionedFile::new(filename.to_string(), 4096)]); let file_group = FileGroup::new(vec![PartitionedFile::new(filename.clone(), 4096)]);
FileScanConfigBuilder::new(ObjectStoreUrl::local_filesystem(), file_schema, file_source) FileScanConfigBuilder::new(ObjectStoreUrl::local_filesystem(), file_schema, file_source)
.with_file_group(file_group) .with_file_group(file_group)

View File

@@ -51,6 +51,7 @@ nalgebra.workspace = true
num = "0.4" num = "0.4"
num-traits = "0.2" num-traits = "0.2"
paste.workspace = true paste.workspace = true
regex.workspace = true
s2 = { version = "0.0.12", optional = true } s2 = { version = "0.0.12", optional = true }
serde.workspace = true serde.workspace = true
serde_json.workspace = true serde_json.workspace = true

View File

@@ -37,6 +37,8 @@ const COMPACT_TYPE_STRICT_WINDOW: &str = "strict_window";
/// Compact type: strict window (short name). /// Compact type: strict window (short name).
const COMPACT_TYPE_STRICT_WINDOW_SHORT: &str = "swcs"; const COMPACT_TYPE_STRICT_WINDOW_SHORT: &str = "swcs";
const DEFAULT_COMPACTION_PARALLELISM: u32 = 1;
#[admin_fn( #[admin_fn(
name = FlushTableFunction, name = FlushTableFunction,
display_name = flush_table, display_name = flush_table,
@@ -95,7 +97,7 @@ pub(crate) async fn compact_table(
query_ctx: &QueryContextRef, query_ctx: &QueryContextRef,
params: &[ValueRef<'_>], params: &[ValueRef<'_>],
) -> Result<Value> { ) -> Result<Value> {
let request = parse_compact_params(params, query_ctx)?; let request = parse_compact_request(params, query_ctx)?;
info!("Compact table request: {:?}", request); info!("Compact table request: {:?}", request);
let affected_rows = table_mutation_handler let affected_rows = table_mutation_handler
@@ -117,37 +119,46 @@ fn compact_signature() -> Signature {
/// - `[<table_name>]`: only the table name is provided, using the default compaction type: regular /// - `[<table_name>]`: only the table name is provided, using the default compaction type: regular
/// - `[<table_name>, <type>]`: specify table name and compaction type. The compaction options will be default. /// - `[<table_name>, <type>]`: specify table name and compaction type. The compaction options will be default.
/// - `[<table_name>, <type>, <options>]`: provides both type and type-specific options. /// - `[<table_name>, <type>, <options>]`: provides both type and type-specific options.
fn parse_compact_params( /// - For `twcs`, it accepts `parallelism=[N]` where N is an unsigned 32-bit number
/// - For `swcs`, it accepts two numeric parameters: `parallelism` and `window`.
fn parse_compact_request(
params: &[ValueRef<'_>], params: &[ValueRef<'_>],
query_ctx: &QueryContextRef, query_ctx: &QueryContextRef,
) -> Result<CompactTableRequest> { ) -> Result<CompactTableRequest> {
ensure!( ensure!(
!params.is_empty(), !params.is_empty() && params.len() <= 3,
InvalidFuncArgsSnafu { InvalidFuncArgsSnafu {
err_msg: "Args cannot be empty", err_msg: format!(
"The length of the args is not correct, expect 1-4, have: {}",
params.len()
),
} }
); );
let (table_name, compact_type) = match params { let (table_name, compact_type, parallelism) = match params {
// 1. Only table name, strategy defaults to twcs and default parallelism.
[ValueRef::String(table_name)] => ( [ValueRef::String(table_name)] => (
table_name, table_name,
compact_request::Options::Regular(Default::default()), compact_request::Options::Regular(Default::default()),
DEFAULT_COMPACTION_PARALLELISM,
), ),
// 2. Both table name and strategy are provided.
[ [
ValueRef::String(table_name), ValueRef::String(table_name),
ValueRef::String(compact_ty_str), ValueRef::String(compact_ty_str),
] => { ] => {
let compact_type = parse_compact_type(compact_ty_str, None)?; let (compact_type, parallelism) = parse_compact_options(compact_ty_str, None)?;
(table_name, compact_type) (table_name, compact_type, parallelism)
} }
// 3. Table name, strategy and strategy specific options
[ [
ValueRef::String(table_name), ValueRef::String(table_name),
ValueRef::String(compact_ty_str), ValueRef::String(compact_ty_str),
ValueRef::String(options_str), ValueRef::String(options_str),
] => { ] => {
let compact_type = parse_compact_type(compact_ty_str, Some(options_str))?; let (compact_type, parallelism) =
(table_name, compact_type) parse_compact_options(compact_ty_str, Some(options_str))?;
(table_name, compact_type, parallelism)
} }
_ => { _ => {
return UnsupportedInputDataTypeSnafu { return UnsupportedInputDataTypeSnafu {
@@ -167,35 +178,126 @@ fn parse_compact_params(
schema_name, schema_name,
table_name, table_name,
compact_options: compact_type, compact_options: compact_type,
parallelism,
}) })
} }
/// Parses compaction strategy type. For `strict_window` or `swcs` strict window compaction is chose, /// Parses compaction strategy type. For `strict_window` or `swcs` strict window compaction is chosen,
/// otherwise choose regular (TWCS) compaction. /// otherwise choose regular (TWCS) compaction.
fn parse_compact_type(type_str: &str, option: Option<&str>) -> Result<compact_request::Options> { fn parse_compact_options(
type_str: &str,
option: Option<&str>,
) -> Result<(compact_request::Options, u32)> {
if type_str.eq_ignore_ascii_case(COMPACT_TYPE_STRICT_WINDOW) if type_str.eq_ignore_ascii_case(COMPACT_TYPE_STRICT_WINDOW)
| type_str.eq_ignore_ascii_case(COMPACT_TYPE_STRICT_WINDOW_SHORT) | type_str.eq_ignore_ascii_case(COMPACT_TYPE_STRICT_WINDOW_SHORT)
{ {
let window_seconds = option let Some(option_str) = option else {
.map(|v| { return Ok((
i64::from_str(v).map_err(|_| { compact_request::Options::StrictWindow(StrictWindow { window_seconds: 0 }),
InvalidFuncArgsSnafu { DEFAULT_COMPACTION_PARALLELISM,
err_msg: format!( ));
"Compact window is expected to be a valid number, provided: {}", };
v
),
}
.build()
})
})
.transpose()?
.unwrap_or(0);
Ok(compact_request::Options::StrictWindow(StrictWindow { // For compatibility, accepts single number as window size.
window_seconds, if let Ok(window_seconds) = i64::from_str(option_str) {
})) return Ok((
compact_request::Options::StrictWindow(StrictWindow { window_seconds }),
DEFAULT_COMPACTION_PARALLELISM,
));
};
// Parse keyword arguments in the form `key1=value1,key2=value2`
let mut window_seconds = 0i64;
let mut parallelism = DEFAULT_COMPACTION_PARALLELISM;
let pairs: Vec<&str> = option_str.split(',').collect();
for pair in pairs {
let kv: Vec<&str> = pair.trim().split('=').collect();
if kv.len() != 2 {
return InvalidFuncArgsSnafu {
err_msg: format!("Invalid key-value pair: {}", pair.trim()),
}
.fail();
}
let key = kv[0].trim();
let value = kv[1].trim();
match key {
"window" | "window_seconds" => {
window_seconds = i64::from_str(value).map_err(|_| {
InvalidFuncArgsSnafu {
err_msg: format!("Invalid value for window: {}", value),
}
.build()
})?;
}
"parallelism" => {
parallelism = value.parse::<u32>().map_err(|_| {
InvalidFuncArgsSnafu {
err_msg: format!("Invalid value for parallelism: {}", value),
}
.build()
})?;
}
_ => {
return InvalidFuncArgsSnafu {
err_msg: format!("Unknown parameter: {}", key),
}
.fail();
}
}
}
Ok((
compact_request::Options::StrictWindow(StrictWindow { window_seconds }),
parallelism,
))
} else { } else {
Ok(compact_request::Options::Regular(Default::default())) // TWCS strategy
let Some(option_str) = option else {
return Ok((
compact_request::Options::Regular(Default::default()),
DEFAULT_COMPACTION_PARALLELISM,
));
};
let mut parallelism = DEFAULT_COMPACTION_PARALLELISM;
let pairs: Vec<&str> = option_str.split(',').collect();
for pair in pairs {
let kv: Vec<&str> = pair.trim().split('=').collect();
if kv.len() != 2 {
return InvalidFuncArgsSnafu {
err_msg: format!("Invalid key-value pair: {}", pair.trim()),
}
.fail();
}
let key = kv[0].trim();
let value = kv[1].trim();
match key {
"parallelism" => {
parallelism = value.parse::<u32>().map_err(|_| {
InvalidFuncArgsSnafu {
err_msg: format!("Invalid value for parallelism: {}", value),
}
.build()
})?;
}
_ => {
return InvalidFuncArgsSnafu {
err_msg: format!("Unknown parameter: {}", key),
}
.fail();
}
}
}
Ok((
compact_request::Options::Regular(Default::default()),
parallelism,
))
} }
} }
@@ -301,7 +403,7 @@ mod tests {
assert_eq!( assert_eq!(
expected, expected,
&parse_compact_params(&params, &QueryContext::arc()).unwrap() &parse_compact_request(&params, &QueryContext::arc()).unwrap()
); );
} }
} }
@@ -316,6 +418,7 @@ mod tests {
schema_name: DEFAULT_SCHEMA_NAME.to_string(), schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(), table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()), compact_options: Options::Regular(Default::default()),
parallelism: 1,
}, },
), ),
( (
@@ -325,6 +428,7 @@ mod tests {
schema_name: DEFAULT_SCHEMA_NAME.to_string(), schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(), table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()), compact_options: Options::Regular(Default::default()),
parallelism: 1,
}, },
), ),
( (
@@ -337,6 +441,7 @@ mod tests {
schema_name: DEFAULT_SCHEMA_NAME.to_string(), schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(), table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()), compact_options: Options::Regular(Default::default()),
parallelism: 1,
}, },
), ),
( (
@@ -346,6 +451,7 @@ mod tests {
schema_name: DEFAULT_SCHEMA_NAME.to_string(), schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(), table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()), compact_options: Options::Regular(Default::default()),
parallelism: 1,
}, },
), ),
( (
@@ -355,6 +461,7 @@ mod tests {
schema_name: DEFAULT_SCHEMA_NAME.to_string(), schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(), table_name: "table".to_string(),
compact_options: Options::StrictWindow(StrictWindow { window_seconds: 0 }), compact_options: Options::StrictWindow(StrictWindow { window_seconds: 0 }),
parallelism: 1,
}, },
), ),
( (
@@ -366,15 +473,7 @@ mod tests {
compact_options: Options::StrictWindow(StrictWindow { compact_options: Options::StrictWindow(StrictWindow {
window_seconds: 3600, window_seconds: 3600,
}), }),
}, parallelism: 1,
),
(
&["table", "regular", "abcd"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()),
}, },
), ),
( (
@@ -386,12 +485,82 @@ mod tests {
compact_options: Options::StrictWindow(StrictWindow { compact_options: Options::StrictWindow(StrictWindow {
window_seconds: 120, window_seconds: 120,
}), }),
parallelism: 1,
},
),
// Test with parallelism parameter
(
&["table", "regular", "parallelism=4"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()),
parallelism: 4,
},
),
(
&["table", "strict_window", "window=3600,parallelism=2"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::StrictWindow(StrictWindow {
window_seconds: 3600,
}),
parallelism: 2,
},
),
(
&["table", "strict_window", "window=3600"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::StrictWindow(StrictWindow {
window_seconds: 3600,
}),
parallelism: 1,
},
),
(
&["table", "strict_window", "window_seconds=7200"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::StrictWindow(StrictWindow {
window_seconds: 7200,
}),
parallelism: 1,
},
),
(
&["table", "strict_window", "window=1800"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::StrictWindow(StrictWindow {
window_seconds: 1800,
}),
parallelism: 1,
},
),
(
&["table", "regular", "parallelism=8"],
CompactTableRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "table".to_string(),
compact_options: Options::Regular(Default::default()),
parallelism: 8,
}, },
), ),
]); ]);
assert!( assert!(
parse_compact_params( parse_compact_request(
&["table", "strict_window", "abc"] &["table", "strict_window", "abc"]
.into_iter() .into_iter()
.map(ValueRef::String) .map(ValueRef::String)
@@ -402,7 +571,7 @@ mod tests {
); );
assert!( assert!(
parse_compact_params( parse_compact_request(
&["a.b.table", "strict_window", "abc"] &["a.b.table", "strict_window", "abc"]
.into_iter() .into_iter()
.map(ValueRef::String) .map(ValueRef::String)
@@ -411,5 +580,88 @@ mod tests {
) )
.is_err() .is_err()
); );
// Test invalid parallelism
assert!(
parse_compact_request(
&["table", "regular", "options", "invalid"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
// Test too many parameters
assert!(
parse_compact_request(
&["table", "regular", "options", "4", "extra"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
// Test invalid keyword argument format
assert!(
parse_compact_request(
&["table", "strict_window", "window"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
// Test invalid keyword
assert!(
parse_compact_request(
&["table", "strict_window", "invalid_key=123"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
assert!(
parse_compact_request(
&["table", "regular", "abcd"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
// Test invalid window value
assert!(
parse_compact_request(
&["table", "strict_window", "window=abc"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
// Test invalid parallelism in options string
assert!(
parse_compact_request(
&["table", "strict_window", "parallelism=abc"]
.into_iter()
.map(ValueRef::String)
.collect::<Vec<_>>(),
&QueryContext::arc(),
)
.is_err()
);
} }
} }

View File

@@ -22,12 +22,15 @@
//! `foo_merge`'s input arg is the same as `foo_state`'s output, and its output is the same as `foo`'s input. //! `foo_merge`'s input arg is the same as `foo_state`'s output, and its output is the same as `foo`'s input.
//! //!
use std::hash::{Hash, Hasher};
use std::sync::Arc; use std::sync::Arc;
use arrow::array::StructArray; use arrow::array::StructArray;
use arrow_schema::{FieldRef, Fields}; use arrow_schema::{FieldRef, Fields};
use common_telemetry::debug; use common_telemetry::debug;
use datafusion::functions_aggregate::all_default_aggregate_functions; use datafusion::functions_aggregate::all_default_aggregate_functions;
use datafusion::functions_aggregate::count::Count;
use datafusion::functions_aggregate::min_max::{Max, Min};
use datafusion::optimizer::AnalyzerRule; use datafusion::optimizer::AnalyzerRule;
use datafusion::optimizer::analyzer::type_coercion::TypeCoercion; use datafusion::optimizer::analyzer::type_coercion::TypeCoercion;
use datafusion::physical_planner::create_aggregate_expr_and_maybe_filter; use datafusion::physical_planner::create_aggregate_expr_and_maybe_filter;
@@ -272,7 +275,7 @@ impl StateMergeHelper {
} }
/// Wrapper to make an aggregate function out of a state function. /// Wrapper to make an aggregate function out of a state function.
#[derive(Debug, Clone, PartialEq, Eq)] #[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct StateWrapper { pub struct StateWrapper {
inner: AggregateUDF, inner: AggregateUDF,
name: String, name: String,
@@ -412,6 +415,51 @@ impl AggregateUDFImpl for StateWrapper {
fn coerce_types(&self, arg_types: &[DataType]) -> datafusion_common::Result<Vec<DataType>> { fn coerce_types(&self, arg_types: &[DataType]) -> datafusion_common::Result<Vec<DataType>> {
self.inner.coerce_types(arg_types) self.inner.coerce_types(arg_types)
} }
fn value_from_stats(
&self,
statistics_args: &datafusion_expr::StatisticsArgs,
) -> Option<ScalarValue> {
let inner = self.inner().inner().as_any();
// Only count/min/max need special handling here to take their result from statistics;
// their result is identical to that of the corresponding state function (e.g. count_state), so it can be returned directly.
let can_use_stat = inner.is::<Count>() || inner.is::<Max>() || inner.is::<Min>();
if !can_use_stat {
return None;
}
// Fix the return type by extracting the first field's data type from the struct type
let state_type = if let DataType::Struct(fields) = &statistics_args.return_type {
if fields.is_empty() {
return None;
}
fields[0].data_type().clone()
} else {
return None;
};
let fixed_args = datafusion_expr::StatisticsArgs {
statistics: statistics_args.statistics,
return_type: &state_type,
is_distinct: statistics_args.is_distinct,
exprs: statistics_args.exprs,
};
let ret = self.inner().value_from_stats(&fixed_args)?;
// Wrap the result into a struct scalar value
let fields = if let DataType::Struct(fields) = &statistics_args.return_type {
fields
} else {
return None;
};
let array = ret.to_array().ok()?;
let struct_array = StructArray::new(fields.clone(), vec![array], None);
let ret = ScalarValue::Struct(Arc::new(struct_array));
Some(ret)
}
} }
/// The wrapper's input is the same as the original aggregate function's input, /// The wrapper's input is the same as the original aggregate function's input,
@@ -616,6 +664,20 @@ impl AggregateUDFImpl for MergeWrapper {
} }
} }
impl PartialEq for MergeWrapper {
fn eq(&self, other: &Self) -> bool {
self.inner == other.inner
}
}
impl Eq for MergeWrapper {}
impl Hash for MergeWrapper {
fn hash<H: Hasher>(&self, state: &mut H) {
self.inner.hash(state);
}
}
/// The merge accumulator, which modify `update_batch`'s behavior to accept one struct array which /// The merge accumulator, which modify `update_batch`'s behavior to accept one struct array which
/// include the state fields of original aggregate function, and merge said states into original accumulator /// include the state fields of original aggregate function, and merge said states into original accumulator
/// the output is the same as original aggregate function /// the output is the same as original aggregate function

View File

@@ -39,8 +39,7 @@ use datafusion::prelude::SessionContext;
use datafusion_common::arrow::array::AsArray; use datafusion_common::arrow::array::AsArray;
use datafusion_common::arrow::datatypes::{Float64Type, UInt64Type}; use datafusion_common::arrow::datatypes::{Float64Type, UInt64Type};
use datafusion_common::{Column, TableReference}; use datafusion_common::{Column, TableReference};
use datafusion_expr::expr::AggregateFunction; use datafusion_expr::expr::{AggregateFunction, NullTreatment};
use datafusion_expr::sqlparser::ast::NullTreatment;
use datafusion_expr::{ use datafusion_expr::{
Aggregate, ColumnarValue, Expr, LogicalPlan, ScalarFunctionArgs, SortExpr, TableScan, lit, Aggregate, ColumnarValue, Expr, LogicalPlan, ScalarFunctionArgs, SortExpr, TableScan, lit,
}; };

View File

@@ -68,7 +68,7 @@ impl CountHash {
} }
} }
#[derive(Debug, Clone)] #[derive(Debug, Clone, Eq, PartialEq, Hash)]
pub struct CountHash { pub struct CountHash {
signature: Signature, signature: Signature,
} }

View File

@@ -15,7 +15,7 @@
use std::borrow::Cow; use std::borrow::Cow;
use std::sync::Arc; use std::sync::Arc;
use arrow::array::{Array, ArrayRef, AsArray, BinaryArray, StringArray}; use arrow::array::{Array, ArrayRef, AsArray, BinaryArray, LargeStringArray, StringArray};
use arrow_schema::{DataType, Field}; use arrow_schema::{DataType, Field};
use datafusion::logical_expr::{Signature, TypeSignature, Volatility}; use datafusion::logical_expr::{Signature, TypeSignature, Volatility};
use datafusion_common::{Result, ScalarValue}; use datafusion_common::{Result, ScalarValue};
@@ -63,7 +63,7 @@ impl VectorProduct {
} }
let t = args.schema.field(0).data_type(); let t = args.schema.field(0).data_type();
if !matches!(t, DataType::Utf8 | DataType::Binary) { if !matches!(t, DataType::Utf8 | DataType::LargeUtf8 | DataType::Binary) {
return Err(datafusion_common::DataFusionError::Internal(format!( return Err(datafusion_common::DataFusionError::Internal(format!(
"unexpected input datatype {t} when creating `VEC_PRODUCT`" "unexpected input datatype {t} when creating `VEC_PRODUCT`"
))); )));
@@ -91,6 +91,13 @@ impl VectorProduct {
.map(|x| x.map(Cow::Owned)) .map(|x| x.map(Cow::Owned))
.collect::<Result<Vec<_>>>()? .collect::<Result<Vec<_>>>()?
} }
DataType::LargeUtf8 => {
let arr: &LargeStringArray = values[0].as_string();
arr.iter()
.filter_map(|x| x.map(|s| parse_veclit_from_strlit(s).map_err(Into::into)))
.map(|x: Result<Vec<f32>>| x.map(Cow::Owned))
.collect::<Result<Vec<_>>>()?
}
DataType::Binary => { DataType::Binary => {
let arr: &BinaryArray = values[0].as_binary(); let arr: &BinaryArray = values[0].as_binary();
arr.iter() arr.iter()

View File

@@ -14,7 +14,7 @@
use std::sync::Arc; use std::sync::Arc;
use arrow::array::{Array, ArrayRef, AsArray, BinaryArray, StringArray}; use arrow::array::{Array, ArrayRef, AsArray, BinaryArray, LargeStringArray, StringArray};
use arrow_schema::{DataType, Field}; use arrow_schema::{DataType, Field};
use datafusion_common::{Result, ScalarValue}; use datafusion_common::{Result, ScalarValue};
use datafusion_expr::{ use datafusion_expr::{
@@ -63,7 +63,7 @@ impl VectorSum {
} }
let t = args.schema.field(0).data_type(); let t = args.schema.field(0).data_type();
if !matches!(t, DataType::Utf8 | DataType::Binary) { if !matches!(t, DataType::Utf8 | DataType::LargeUtf8 | DataType::Binary) {
return Err(datafusion_common::DataFusionError::Internal(format!( return Err(datafusion_common::DataFusionError::Internal(format!(
"unexpected input datatype {t} when creating `VEC_SUM`" "unexpected input datatype {t} when creating `VEC_SUM`"
))); )));
@@ -98,6 +98,21 @@ impl VectorSum {
*self.inner(vec_column.len()) += vec_column; *self.inner(vec_column.len()) += vec_column;
} }
} }
DataType::LargeUtf8 => {
let arr: &LargeStringArray = values[0].as_string();
for s in arr.iter() {
let Some(s) = s else {
if is_update {
self.has_null = true;
self.sum = None;
}
return Ok(());
};
let values = parse_veclit_from_strlit(s)?;
let vec_column = DVectorView::from_slice(&values, values.len());
*self.inner(vec_column.len()) += vec_column;
}
}
DataType::Binary => { DataType::Binary => {
let arr: &BinaryArray = values[0].as_binary(); let arr: &BinaryArray = values[0].as_binary();
for b in arr.iter() { for b in arr.iter() {

View File

@@ -34,6 +34,7 @@ use crate::scalars::json::JsonFunction;
use crate::scalars::matches::MatchesFunction; use crate::scalars::matches::MatchesFunction;
use crate::scalars::matches_term::MatchesTermFunction; use crate::scalars::matches_term::MatchesTermFunction;
use crate::scalars::math::MathFunction; use crate::scalars::math::MathFunction;
use crate::scalars::string::register_string_functions;
use crate::scalars::timestamp::TimestampFunction; use crate::scalars::timestamp::TimestampFunction;
use crate::scalars::uddsketch_calc::UddSketchCalcFunction; use crate::scalars::uddsketch_calc::UddSketchCalcFunction;
use crate::scalars::vector::VectorFunction as VectorScalarFunction; use crate::scalars::vector::VectorFunction as VectorScalarFunction;
@@ -71,7 +72,7 @@ impl FunctionRegistry {
for alias in func.aliases() { for alias in func.aliases() {
let func: ScalarFunctionFactory = func.clone().into(); let func: ScalarFunctionFactory = func.clone().into();
let alias = ScalarFunctionFactory { let alias = ScalarFunctionFactory {
name: alias.to_string(), name: alias.clone(),
..func ..func
}; };
self.register(alias); self.register(alias);
@@ -154,6 +155,9 @@ pub static FUNCTION_REGISTRY: LazyLock<Arc<FunctionRegistry>> = LazyLock::new(||
// Json related functions // Json related functions
JsonFunction::register(&function_registry); JsonFunction::register(&function_registry);
// String related functions
register_string_functions(&function_registry);
// Vector related functions // Vector related functions
VectorScalarFunction::register(&function_registry); VectorScalarFunction::register(&function_registry);
VectorAggrFunction::register(&function_registry); VectorAggrFunction::register(&function_registry);

View File

@@ -38,7 +38,7 @@ pub(crate) fn one_of_sigs2(args1: Vec<DataType>, args2: Vec<DataType>) -> Signat
/// Cast a [`ValueRef`] to u64, returns `None` if fails /// Cast a [`ValueRef`] to u64, returns `None` if fails
pub fn cast_u64(value: &ValueRef) -> Result<Option<u64>> { pub fn cast_u64(value: &ValueRef) -> Result<Option<u64>> {
cast((*value).into(), &ConcreteDataType::uint64_datatype()) cast(value.clone().into(), &ConcreteDataType::uint64_datatype())
.context(InvalidInputTypeSnafu { .context(InvalidInputTypeSnafu {
err_msg: format!( err_msg: format!(
"Failed to cast input into uint64, actual type: {:#?}", "Failed to cast input into uint64, actual type: {:#?}",
@@ -50,7 +50,7 @@ pub fn cast_u64(value: &ValueRef) -> Result<Option<u64>> {
/// Cast a [`ValueRef`] to u32, returns `None` if fails
pub fn cast_u32(value: &ValueRef) -> Result<Option<u32>> {
- cast((*value).into(), &ConcreteDataType::uint32_datatype())
+ cast(value.clone().into(), &ConcreteDataType::uint32_datatype())
.context(InvalidInputTypeSnafu {
err_msg: format!(
"Failed to cast input into uint32, actual type: {:#?}",

View File

@@ -12,7 +12,6 @@
// See the License for the specific language governing permissions and
// limitations under the License.
- #![feature(let_chains)]
#![feature(try_blocks)]
#![feature(assert_matches)]

View File

@@ -20,6 +20,7 @@ pub mod json;
pub mod matches;
pub mod matches_term;
pub mod math;
+ pub(crate) mod string;
pub mod vector;
pub(crate) mod hll_count;

View File

@@ -20,7 +20,9 @@ use common_query::error;
use common_time::{Date, Timestamp};
use datafusion_common::DataFusionError;
use datafusion_common::arrow::array::{Array, AsArray, StringViewBuilder};
- use datafusion_common::arrow::datatypes::{ArrowTimestampType, DataType, Date32Type, TimeUnit};
+ use datafusion_common::arrow::datatypes::{
+     ArrowTimestampType, DataType, Date32Type, Date64Type, TimeUnit,
+ };
use datafusion_expr::{ColumnarValue, ScalarFunctionArgs, Signature};
use snafu::ResultExt;
@@ -40,6 +42,7 @@ impl Default for DateFormatFunction {
signature: helper::one_of_sigs2(
vec![
DataType::Date32,
+ DataType::Date64,
DataType::Timestamp(TimeUnit::Second, None),
DataType::Timestamp(TimeUnit::Millisecond, None),
DataType::Timestamp(TimeUnit::Microsecond, None),
@@ -115,6 +118,29 @@ impl Function for DateFormatFunction {
builder.append_option(result.as_deref());
}
}
DataType::Date64 => {
let left = left.as_primitive::<Date64Type>();
for i in 0..size {
let date = left.is_valid(i).then(|| {
let ms = left.value(i);
Timestamp::new_millisecond(ms)
});
let format = formats.is_valid(i).then(|| formats.value(i));
let result = match (date, format) {
(Some(ts), Some(fmt)) => {
Some(ts.as_formatted_string(fmt, Some(timezone)).map_err(|e| {
DataFusionError::Execution(format!(
"cannot format {ts:?} as '{fmt}': {e}"
))
})?)
}
_ => None,
};
builder.append_option(result.as_deref());
}
}
x => {
return Err(DataFusionError::Execution(format!(
"unsupported input data type {x}"
@@ -137,7 +163,9 @@ mod tests {
use std::sync::Arc;
use arrow_schema::Field;
- use datafusion_common::arrow::array::{Date32Array, StringArray, TimestampSecondArray};
+ use datafusion_common::arrow::array::{
+     Date32Array, Date64Array, StringArray, TimestampSecondArray,
+ };
use datafusion_common::config::ConfigOptions;
use datafusion_expr::{TypeSignature, Volatility};
@@ -166,7 +194,7 @@ mod tests {
Signature {
type_signature: TypeSignature::OneOf(sigs),
volatility: Volatility::Immutable
- } if sigs.len() == 5));
+ } if sigs.len() == 6));
}
#[test]
@@ -213,6 +241,50 @@ mod tests {
}
}
#[test]
fn test_date64_date_format() {
let f = DateFormatFunction::default();
let dates = vec![Some(123000), None, Some(42000), None];
let formats = vec![
"%Y-%m-%d %T.%3f",
"%Y-%m-%d %T.%3f",
"%Y-%m-%d %T.%3f",
"%Y-%m-%d %T.%3f",
];
let results = [
Some("1970-01-01 00:02:03.000"),
None,
Some("1970-01-01 00:00:42.000"),
None,
];
let mut config_options = ConfigOptions::default();
config_options.extensions.insert(FunctionContext::default());
let config_options = Arc::new(config_options);
let args = ScalarFunctionArgs {
args: vec![
ColumnarValue::Array(Arc::new(Date64Array::from(dates))),
ColumnarValue::Array(Arc::new(StringArray::from_iter_values(formats))),
],
arg_fields: vec![],
number_rows: 4,
return_field: Arc::new(Field::new("x", DataType::Utf8View, false)),
config_options,
};
let result = f
.invoke_with_args(args)
.and_then(|x| x.to_array(4))
.unwrap();
let vector = result.as_string_view();
assert_eq!(4, vector.len());
for (actual, expect) in vector.iter().zip(results) {
assert_eq!(actual, expect);
}
}
#[test]
fn test_date_date_format() {
let f = DateFormatFunction::default();
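
Note: the new Date64 arm works because an Arrow Date64 value is milliseconds since the Unix epoch, so the diff reuses the millisecond-timestamp path via Timestamp::new_millisecond. A rough chrono-based sketch of the same conversion, shown only for illustration and not part of the diff:

use chrono::{TimeZone, Utc};

fn main() {
    // First row of the test above: 123_000 ms => 1970-01-01 00:02:03.000 UTC.
    let date64_ms: i64 = 123_000;
    let formatted = Utc
        .timestamp_millis_opt(date64_ms)
        .single()
        .map(|dt| dt.format("%Y-%m-%d %H:%M:%S%.3f").to_string());
    assert_eq!(formatted.as_deref(), Some("1970-01-01 00:02:03.000"));
}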

View File

@@ -41,7 +41,7 @@ where
let right: &<R as Scalar>::VectorType = unsafe { Helper::static_cast(right.inner()) };
let b = right.get_data(0);
- let it = left.iter_data().map(|a| f(a, b, ctx));
+ let it = left.iter_data().map(|a| f(a, b.clone(), ctx));
<O as Scalar>::VectorType::from_owned_iterator(it)
}
@@ -62,7 +62,7 @@ where
let a = left.get_data(0);
let right: &<R as Scalar>::VectorType = unsafe { Helper::static_cast(r) };
- let it = right.iter_data().map(|b| f(a, b, ctx));
+ let it = right.iter_data().map(|b| f(a.clone(), b, ctx));
<O as Scalar>::VectorType::from_owned_iterator(it)
}

View File

@@ -76,7 +76,7 @@ impl Function for GeohashFunction {
}
fn return_type(&self, _: &[DataType]) -> datafusion_common::Result<DataType> {
- Ok(DataType::Utf8)
+ Ok(DataType::Utf8View)
}
fn signature(&self) -> &Signature {
@@ -176,7 +176,7 @@ impl Function for GeohashNeighboursFunction {
Ok(DataType::List(Arc::new(Field::new(
"item",
DataType::Utf8View,
- false,
+ true,
))))
}

View File

@@ -355,9 +355,9 @@ impl Function for H3CellCenterLatLng {
fn return_type(&self, _: &[DataType]) -> datafusion_common::Result<DataType> {
Ok(DataType::List(Arc::new(Field::new(
- "x",
+ "item",
DataType::Float64,
- false,
+ true,
))))
}

View File

@@ -309,7 +309,7 @@ fn is_ipv6_in_range(ip: &Ipv6Addr, cidr_base: &Ipv6Addr, prefix_len: u8) -> Opti
}
// If there's a partial byte to check
- if prefix_len % 8 != 0 && full_bytes < 16 {
+ if !prefix_len.is_multiple_of(8) && full_bytes < 16 {
let bits_to_check = prefix_len % 8;
let mask = 0xFF_u8 << (8 - bits_to_check);
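
Note: the rewrite above is a direct equivalence: for a nonzero divisor, x.is_multiple_of(n) is the same as x % n == 0 on unsigned integers. A quick check, assuming a toolchain where unsigned is_multiple_of is stable; not part of the diff:

fn main() {
    // Equivalent for the nonzero divisor used above (8).
    for prefix_len in 0u8..=128 {
        assert_eq!(prefix_len.is_multiple_of(8), prefix_len % 8 == 0);
    }
}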

View File

@@ -0,0 +1,26 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! String scalar functions
mod regexp_extract;
pub(crate) use regexp_extract::RegexpExtractFunction;
use crate::function_registry::FunctionRegistry;
/// Register all string functions
pub fn register_string_functions(registry: &FunctionRegistry) {
RegexpExtractFunction::register(registry);
}

View File

@@ -0,0 +1,339 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Implementation of REGEXP_EXTRACT function
use std::fmt;
use std::sync::Arc;
use datafusion_common::DataFusionError;
use datafusion_common::arrow::array::{Array, AsArray, LargeStringBuilder};
use datafusion_common::arrow::compute::cast;
use datafusion_common::arrow::datatypes::DataType;
use datafusion_expr::{ColumnarValue, ScalarFunctionArgs, Signature, TypeSignature, Volatility};
use regex::{Regex, RegexBuilder};
use crate::function::Function;
use crate::function_registry::FunctionRegistry;
const NAME: &str = "regexp_extract";
// Safety limits
const MAX_REGEX_SIZE: usize = 1024 * 1024; // compiled regex heap cap
const MAX_DFA_SIZE: usize = 2 * 1024 * 1024; // lazy DFA cap
const MAX_TOTAL_RESULT_SIZE: usize = 64 * 1024 * 1024; // total batch cap
const MAX_SINGLE_MATCH: usize = 1024 * 1024; // per-row cap
const MAX_PATTERN_LEN: usize = 10_000; // pattern text length cap
/// REGEXP_EXTRACT function implementation
/// Extracts the first substring matching the given regular expression pattern.
/// If no match is found, returns NULL.
///
#[derive(Debug)]
pub struct RegexpExtractFunction {
signature: Signature,
}
impl RegexpExtractFunction {
pub fn register(registry: &FunctionRegistry) {
registry.register_scalar(RegexpExtractFunction::default());
}
}
impl Default for RegexpExtractFunction {
fn default() -> Self {
Self {
signature: Signature::one_of(
vec![
TypeSignature::Exact(vec![DataType::Utf8View, DataType::Utf8]),
TypeSignature::Exact(vec![DataType::Utf8View, DataType::Utf8View]),
TypeSignature::Exact(vec![DataType::Utf8, DataType::Utf8View]),
TypeSignature::Exact(vec![DataType::LargeUtf8, DataType::Utf8View]),
TypeSignature::Exact(vec![DataType::Utf8View, DataType::LargeUtf8]),
TypeSignature::Exact(vec![DataType::Utf8, DataType::Utf8]),
TypeSignature::Exact(vec![DataType::LargeUtf8, DataType::Utf8]),
TypeSignature::Exact(vec![DataType::Utf8, DataType::LargeUtf8]),
TypeSignature::Exact(vec![DataType::LargeUtf8, DataType::LargeUtf8]),
],
Volatility::Immutable,
),
}
}
}
impl fmt::Display for RegexpExtractFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", NAME.to_ascii_uppercase())
}
}
impl Function for RegexpExtractFunction {
fn name(&self) -> &str {
NAME
}
// Always return LargeUtf8 for simplicity and safety
fn return_type(&self, _: &[DataType]) -> datafusion_common::Result<DataType> {
Ok(DataType::LargeUtf8)
}
fn signature(&self) -> &Signature {
&self.signature
}
fn invoke_with_args(
&self,
args: ScalarFunctionArgs,
) -> datafusion_common::Result<ColumnarValue> {
if args.args.len() != 2 {
return Err(DataFusionError::Execution(
"REGEXP_EXTRACT requires exactly two arguments (text, pattern)".to_string(),
));
}
// Keep original ColumnarValue variants for scalar-pattern fast path
let pattern_is_scalar = matches!(args.args[1], ColumnarValue::Scalar(_));
let arrays = ColumnarValue::values_to_arrays(&args.args)?;
let text_array = &arrays[0];
let pattern_array = &arrays[1];
// Cast both to LargeUtf8 for uniform access (supports Utf8/Utf8View/Dictionary<String>)
let text_large = cast(text_array.as_ref(), &DataType::LargeUtf8).map_err(|e| {
DataFusionError::Execution(format!("REGEXP_EXTRACT: text cast failed: {e}"))
})?;
let pattern_large = cast(pattern_array.as_ref(), &DataType::LargeUtf8).map_err(|e| {
DataFusionError::Execution(format!("REGEXP_EXTRACT: pattern cast failed: {e}"))
})?;
let text = text_large.as_string::<i64>();
let pattern = pattern_large.as_string::<i64>();
let len = text.len();
// Pre-size result builder with conservative estimate
let mut estimated_total = 0usize;
for i in 0..len {
if !text.is_null(i) {
estimated_total = estimated_total.saturating_add(text.value_length(i) as usize);
if estimated_total > MAX_TOTAL_RESULT_SIZE {
return Err(DataFusionError::ResourcesExhausted(format!(
"REGEXP_EXTRACT total output exceeds {} bytes",
MAX_TOTAL_RESULT_SIZE
)));
}
}
}
let mut builder = LargeStringBuilder::with_capacity(len, estimated_total);
// Fast path: if pattern is scalar, compile once
let compiled_scalar: Option<Regex> = if pattern_is_scalar && len > 0 && !pattern.is_null(0)
{
Some(compile_regex_checked(pattern.value(0))?)
} else {
None
};
for i in 0..len {
if text.is_null(i) || pattern.is_null(i) {
builder.append_null();
continue;
}
let s = text.value(i);
let pat = pattern.value(i);
// Compile or reuse regex
let re = if let Some(ref compiled) = compiled_scalar {
compiled
} else {
// TODO: For performance-critical applications with repeating patterns,
// consider adding a small LRU cache here
&compile_regex_checked(pat)?
};
// First match only
if let Some(m) = re.find(s) {
let m_str = m.as_str();
if m_str.len() > MAX_SINGLE_MATCH {
return Err(DataFusionError::Execution(
"REGEXP_EXTRACT match exceeds per-row limit (1MB)".to_string(),
));
}
builder.append_value(m_str);
} else {
builder.append_null();
}
}
Ok(ColumnarValue::Array(Arc::new(builder.finish())))
}
}
// Compile a regex with safety checks
fn compile_regex_checked(pattern: &str) -> datafusion_common::Result<Regex> {
if pattern.len() > MAX_PATTERN_LEN {
return Err(DataFusionError::Execution(format!(
"REGEXP_EXTRACT pattern too long (> {} chars)",
MAX_PATTERN_LEN
)));
}
RegexBuilder::new(pattern)
.size_limit(MAX_REGEX_SIZE)
.dfa_size_limit(MAX_DFA_SIZE)
.build()
.map_err(|e| {
DataFusionError::Execution(format!("REGEXP_EXTRACT invalid pattern '{}': {e}", pattern))
})
}
#[cfg(test)]
mod tests {
use datafusion_common::arrow::array::StringArray;
use datafusion_common::arrow::datatypes::Field;
use datafusion_expr::ScalarFunctionArgs;
use super::*;
#[test]
fn test_regexp_extract_function_basic() {
let text_array = Arc::new(StringArray::from(vec!["version 1.2.3", "no match here"]));
let pattern_array = Arc::new(StringArray::from(vec!["\\d+\\.\\d+\\.\\d+", "\\d+"]));
let args = ScalarFunctionArgs {
args: vec![
ColumnarValue::Array(text_array),
ColumnarValue::Array(pattern_array),
],
arg_fields: vec![
Arc::new(Field::new("arg_0", DataType::Utf8, false)),
Arc::new(Field::new("arg_1", DataType::Utf8, false)),
],
return_field: Arc::new(Field::new("result", DataType::LargeUtf8, true)),
number_rows: 2,
config_options: Arc::new(datafusion_common::config::ConfigOptions::default()),
};
let function = RegexpExtractFunction::default();
let result = function.invoke_with_args(args).unwrap();
if let ColumnarValue::Array(array) = result {
let string_array = array.as_string::<i64>();
assert_eq!(string_array.value(0), "1.2.3");
assert!(string_array.is_null(1)); // no match should return NULL
} else {
panic!("Expected array result");
}
}
#[test]
fn test_regexp_extract_phone_number() {
let text_array = Arc::new(StringArray::from(vec!["Phone: 123-456-7890", "No phone"]));
let pattern_array = Arc::new(StringArray::from(vec![
"\\d{3}-\\d{3}-\\d{4}",
"\\d{3}-\\d{3}-\\d{4}",
]));
let args = ScalarFunctionArgs {
args: vec![
ColumnarValue::Array(text_array),
ColumnarValue::Array(pattern_array),
],
arg_fields: vec![
Arc::new(Field::new("arg_0", DataType::Utf8, false)),
Arc::new(Field::new("arg_1", DataType::Utf8, false)),
],
return_field: Arc::new(Field::new("result", DataType::LargeUtf8, true)),
number_rows: 2,
config_options: Arc::new(datafusion_common::config::ConfigOptions::default()),
};
let function = RegexpExtractFunction::default();
let result = function.invoke_with_args(args).unwrap();
if let ColumnarValue::Array(array) = result {
let string_array = array.as_string::<i64>();
assert_eq!(string_array.value(0), "123-456-7890");
assert!(string_array.is_null(1)); // no match should return NULL
} else {
panic!("Expected array result");
}
}
#[test]
fn test_regexp_extract_email() {
let text_array = Arc::new(StringArray::from(vec![
"Email: user@domain.com",
"Invalid email",
]));
let pattern_array = Arc::new(StringArray::from(vec![
"[a-zA-Z0-9]+@[a-zA-Z0-9]+\\.[a-zA-Z]+",
"[a-zA-Z0-9]+@[a-zA-Z0-9]+\\.[a-zA-Z]+",
]));
let args = ScalarFunctionArgs {
args: vec![
ColumnarValue::Array(text_array),
ColumnarValue::Array(pattern_array),
],
arg_fields: vec![
Arc::new(Field::new("arg_0", DataType::Utf8, false)),
Arc::new(Field::new("arg_1", DataType::Utf8, false)),
],
return_field: Arc::new(Field::new("result", DataType::LargeUtf8, true)),
number_rows: 2,
config_options: Arc::new(datafusion_common::config::ConfigOptions::default()),
};
let function = RegexpExtractFunction::default();
let result = function.invoke_with_args(args).unwrap();
if let ColumnarValue::Array(array) = result {
let string_array = array.as_string::<i64>();
assert_eq!(string_array.value(0), "user@domain.com");
assert!(string_array.is_null(1)); // no match should return NULL
} else {
panic!("Expected array result");
}
}
#[test]
fn test_regexp_extract_with_nulls() {
let text_array = Arc::new(StringArray::from(vec![Some("test 123"), None]));
let pattern_array = Arc::new(StringArray::from(vec![Some("\\d+"), Some("\\d+")]));
let args = ScalarFunctionArgs {
args: vec![
ColumnarValue::Array(text_array),
ColumnarValue::Array(pattern_array),
],
arg_fields: vec![
Arc::new(Field::new("arg_0", DataType::Utf8, true)),
Arc::new(Field::new("arg_1", DataType::Utf8, false)),
],
return_field: Arc::new(Field::new("result", DataType::LargeUtf8, true)),
number_rows: 2,
config_options: Arc::new(datafusion_common::config::ConfigOptions::default()),
};
let function = RegexpExtractFunction::default();
let result = function.invoke_with_args(args).unwrap();
if let ColumnarValue::Array(array) = result {
let string_array = array.as_string::<i64>();
assert_eq!(string_array.value(0), "123");
assert!(string_array.is_null(1)); // NULL input should return NULL
} else {
panic!("Expected array result");
}
}
}
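
Note: as a standalone illustration of the per-row semantics implemented above (first match or NULL, with the same regex size caps), here is a hedged sketch using the regex crate directly rather than the GreptimeDB function machinery; it is not part of the diff.

use regex::RegexBuilder;

fn regexp_extract_once(text: &str, pattern: &str) -> Option<String> {
    // Same caps as MAX_REGEX_SIZE / MAX_DFA_SIZE in the implementation above.
    // (The real function returns an execution error for invalid patterns;
    // this sketch simply maps them to None.)
    let re = RegexBuilder::new(pattern)
        .size_limit(1024 * 1024)
        .dfa_size_limit(2 * 1024 * 1024)
        .build()
        .ok()?;
    // First match only; no match maps to NULL (None here).
    re.find(text).map(|m| m.as_str().to_string())
}

fn main() {
    assert_eq!(
        regexp_extract_once("version 1.2.3", r"\d+\.\d+\.\d+").as_deref(),
        Some("1.2.3")
    );
    assert_eq!(regexp_extract_once("no match here", r"\d{4}"), None);
}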

View File

@@ -14,6 +14,7 @@
use std::any::Any;
use std::fmt::{Debug, Formatter};
+ use std::hash::{Hash, Hasher};
use datafusion::arrow::datatypes::DataType;
use datafusion::logical_expr::{ScalarFunctionArgs, ScalarUDFImpl};
@@ -33,6 +34,20 @@ impl Debug for ScalarUdf {
}
}
impl PartialEq for ScalarUdf {
fn eq(&self, other: &Self) -> bool {
self.function.signature() == other.function.signature()
}
}
impl Eq for ScalarUdf {}
impl Hash for ScalarUdf {
fn hash<H: Hasher>(&self, state: &mut H) {
self.function.signature().hash(state)
}
}
impl ScalarUDFImpl for ScalarUdf {
fn as_any(&self) -> &dyn Any {
self
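
Note: the new PartialEq/Eq/Hash impls above delegate to the wrapped function's Signature, so ScalarUdf values can be compared and used as hash keys. A small sketch of the property being relied on, assuming only that Signature: Eq + Hash (as the impls above require); not part of the diff:

use std::collections::HashSet;
use datafusion_expr::{Signature, Volatility};

fn main() {
    let a = Signature::any(1, Volatility::Immutable);
    let b = Signature::any(1, Volatility::Immutable);
    let mut seen = HashSet::new();
    assert!(seen.insert(a));
    // An identical signature hashes and compares equal, so it is deduplicated.
    assert!(!seen.insert(b));
}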

View File

@@ -36,7 +36,7 @@ pub fn as_veclit(arg: &ScalarValue) -> Result<Option<Cow<'_, [f32]>>> {
/// Convert a u8 slice to a vector literal.
pub fn binlit_as_veclit(bytes: &[u8]) -> Result<Cow<'_, [f32]>> {
- if bytes.len() % std::mem::size_of::<f32>() != 0 {
+ if !bytes.len().is_multiple_of(size_of::<f32>()) {
return InvalidFuncArgsSnafu {
err_msg: format!("Invalid binary length of vector: {}", bytes.len()),
}

View File

@@ -16,6 +16,9 @@ mod version;
use std::sync::Arc;
+ use common_catalog::consts::{
+     DEFAULT_PRIVATE_SCHEMA_NAME, INFORMATION_SCHEMA_NAME, PG_CATALOG_NAME,
+ };
use datafusion::arrow::array::{ArrayRef, StringArray, as_boolean_array};
use datafusion::catalog::TableFunction;
use datafusion::common::ScalarValue;
@@ -32,10 +35,36 @@ use crate::system::define_nullary_udf;
const CURRENT_SCHEMA_FUNCTION_NAME: &str = "current_schema";
const CURRENT_SCHEMAS_FUNCTION_NAME: &str = "current_schemas";
const SESSION_USER_FUNCTION_NAME: &str = "session_user";
+ const CURRENT_DATABASE_FUNCTION_NAME: &str = "current_database";
define_nullary_udf!(CurrentSchemaFunction);
define_nullary_udf!(CurrentSchemasFunction);
define_nullary_udf!(SessionUserFunction);
+ define_nullary_udf!(CurrentDatabaseFunction);
impl Function for CurrentDatabaseFunction {
fn name(&self) -> &str {
CURRENT_DATABASE_FUNCTION_NAME
}
fn return_type(&self, _: &[DataType]) -> datafusion_common::Result<DataType> {
Ok(DataType::Utf8View)
}
fn signature(&self) -> &Signature {
&self.signature
}
fn invoke_with_args(
&self,
args: ScalarFunctionArgs,
) -> datafusion_common::Result<ColumnarValue> {
let func_ctx = find_function_context(&args)?;
let db = func_ctx.query_ctx.current_catalog().to_string();
Ok(ColumnarValue::Scalar(ScalarValue::Utf8View(Some(db))))
}
}
// Though "current_schema" can be aliased to "database", to not cause any breaking changes, // Though "current_schema" can be aliased to "database", to not cause any breaking changes,
// we are not doing it: not until https://github.com/apache/datafusion/issues/17469 is resolved. // we are not doing it: not until https://github.com/apache/datafusion/issues/17469 is resolved.
@@ -117,9 +146,9 @@ impl Function for CurrentSchemasFunction {
let mut values = vec!["public"]; let mut values = vec!["public"];
// include implicit schemas // include implicit schemas
if input.value(0) { if input.value(0) {
values.push("information_schema"); values.push(INFORMATION_SCHEMA_NAME);
values.push("pg_catalog"); values.push(PG_CATALOG_NAME);
values.push("greptime_private"); values.push(DEFAULT_PRIVATE_SCHEMA_NAME);
} }
let list_array = SingleRowListArrayBuilder::new(Arc::new(StringArray::from(values))); let list_array = SingleRowListArrayBuilder::new(Arc::new(StringArray::from(values)));
@@ -141,6 +170,7 @@ impl PGCatalogFunction {
registry.register_scalar(CurrentSchemaFunction::default());
registry.register_scalar(CurrentSchemasFunction::default());
registry.register_scalar(SessionUserFunction::default());
+ registry.register_scalar(CurrentDatabaseFunction::default());
registry.register(pg_catalog::format_type::create_format_type_udf());
registry.register(pg_catalog::create_pg_get_partkeydef_udf());
registry.register(pg_catalog::has_privilege_udf::create_has_privilege_udf(
@@ -164,7 +194,10 @@ impl PGCatalogFunction {
registry.register(pg_catalog::create_pg_get_userbyid_udf());
registry.register(pg_catalog::create_pg_table_is_visible());
registry.register(pg_catalog::pg_get_expr_udf::create_pg_get_expr_udf());
- // TODO(sunng87): upgrade datafusion to add
- //registry.register(pg_catalog::create_pg_encoding_to_char_udf());
+ registry.register(pg_catalog::create_pg_encoding_to_char_udf());
+ registry.register(pg_catalog::create_pg_relation_size_udf());
+ registry.register(pg_catalog::create_pg_total_relation_size_udf());
+ registry.register(pg_catalog::create_pg_stat_get_numscans());
+ registry.register(pg_catalog::create_pg_get_constraintdef());
}
}

View File

@@ -1,123 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashMap;
use api::helper::ColumnDataTypeWrapper;
use api::v1::{Column, DeleteRequest as GrpcDeleteRequest};
use datatypes::prelude::ConcreteDataType;
use snafu::{ResultExt, ensure};
use table::requests::DeleteRequest;
use crate::error::{ColumnDataTypeSnafu, IllegalDeleteRequestSnafu, Result};
use crate::insert::add_values_to_builder;
pub fn to_table_delete_request(
catalog_name: &str,
schema_name: &str,
request: GrpcDeleteRequest,
) -> Result<DeleteRequest> {
let row_count = request.row_count as usize;
let mut key_column_values = HashMap::with_capacity(request.key_columns.len());
for Column {
column_name,
values,
null_mask,
datatype,
datatype_extension,
..
} in request.key_columns
{
let Some(values) = values else { continue };
let datatype: ConcreteDataType =
ColumnDataTypeWrapper::try_new(datatype, datatype_extension)
.context(ColumnDataTypeSnafu)?
.into();
let vector = add_values_to_builder(datatype, values, row_count, null_mask)?;
ensure!(
key_column_values
.insert(column_name.clone(), vector)
.is_none(),
IllegalDeleteRequestSnafu {
reason: format!("Duplicated column '{column_name}' in delete request.")
}
);
}
Ok(DeleteRequest {
catalog_name: catalog_name.to_string(),
schema_name: schema_name.to_string(),
table_name: request.table_name,
key_column_values,
})
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use api::v1::ColumnDataType;
use api::v1::column::Values;
use datatypes::prelude::{ScalarVector, VectorRef};
use datatypes::vectors::{Int32Vector, StringVector};
use super::*;
#[test]
fn test_to_table_delete_request() {
let grpc_request = GrpcDeleteRequest {
table_name: "foo".to_string(),
key_columns: vec![
Column {
column_name: "id".to_string(),
values: Some(Values {
i32_values: vec![1, 2, 3],
..Default::default()
}),
datatype: ColumnDataType::Int32 as i32,
..Default::default()
},
Column {
column_name: "name".to_string(),
values: Some(Values {
string_values: vec!["a".to_string(), "b".to_string(), "c".to_string()],
..Default::default()
}),
datatype: ColumnDataType::String as i32,
..Default::default()
},
],
row_count: 3,
};
let mut request =
to_table_delete_request("foo_catalog", "foo_schema", grpc_request).unwrap();
assert_eq!(request.catalog_name, "foo_catalog");
assert_eq!(request.schema_name, "foo_schema");
assert_eq!(request.table_name, "foo");
assert_eq!(
Arc::new(Int32Vector::from_slice(vec![1, 2, 3])) as VectorRef,
request.key_column_values.remove("id").unwrap()
);
assert_eq!(
Arc::new(StringVector::from_slice(&["a", "b", "c"])) as VectorRef,
request.key_column_values.remove("name").unwrap()
);
assert!(request.key_column_values.is_empty());
}
}

View File

@@ -25,13 +25,6 @@ use store_api::metadata::MetadataError;
#[snafu(visibility(pub))]
#[stack_trace_debug]
pub enum Error {
#[snafu(display("Illegal delete request, reason: {reason}"))]
IllegalDeleteRequest {
reason: String,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Column datatype error"))] #[snafu(display("Column datatype error"))]
ColumnDataType { ColumnDataType {
#[snafu(implicit)] #[snafu(implicit)]
@@ -65,13 +58,6 @@ pub enum Error {
location: Location,
},
#[snafu(display("Failed to create vector"))]
CreateVector {
#[snafu(implicit)]
location: Location,
source: datatypes::error::Error,
},
#[snafu(display("Missing required field in protobuf, field: {}", field))] #[snafu(display("Missing required field in protobuf, field: {}", field))]
MissingField { MissingField {
field: String, field: String,
@@ -87,13 +73,6 @@ pub enum Error {
source: api::error::Error,
},
#[snafu(display("Unexpected values length, reason: {}", reason))]
UnexpectedValuesLength {
reason: String,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Unknown location type: {}", location_type))] #[snafu(display("Unknown location type: {}", location_type))]
UnknownLocationType { UnknownLocationType {
location_type: i32, location_type: i32,
@@ -189,18 +168,13 @@ pub type Result<T> = std::result::Result<T, Error>;
impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
- Error::IllegalDeleteRequest { .. } => StatusCode::InvalidArguments,
Error::ColumnDataType { .. } => StatusCode::Internal,
Error::DuplicatedTimestampColumn { .. }
| Error::DuplicatedColumnName { .. }
| Error::MissingTimestampColumn { .. } => StatusCode::InvalidArguments,
- Error::CreateVector { .. } => StatusCode::InvalidArguments,
Error::MissingField { .. } => StatusCode::InvalidArguments,
Error::InvalidColumnDef { source, .. } => source.status_code(),
- Error::UnexpectedValuesLength { .. } | Error::UnknownLocationType { .. } => {
-     StatusCode::InvalidArguments
- }
+ Error::UnknownLocationType { .. } => StatusCode::InvalidArguments,
Error::UnknownColumnDataType { .. } | Error::InvalidStringIndexColumnType { .. } => {
StatusCode::InvalidArguments

View File

@@ -1,80 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::helper;
use api::v1::column::Values;
use common_base::BitVec;
use datatypes::data_type::{ConcreteDataType, DataType};
use datatypes::prelude::VectorRef;
use snafu::{ResultExt, ensure};
use crate::error::{CreateVectorSnafu, Result, UnexpectedValuesLengthSnafu};
pub(crate) fn add_values_to_builder(
data_type: ConcreteDataType,
values: Values,
row_count: usize,
null_mask: Vec<u8>,
) -> Result<VectorRef> {
if null_mask.is_empty() {
Ok(helper::pb_values_to_vector_ref(&data_type, values))
} else {
let builder = &mut data_type.create_mutable_vector(row_count);
let values = helper::pb_values_to_values(&data_type, values);
let null_mask = BitVec::from_vec(null_mask);
ensure!(
null_mask.count_ones() + values.len() == row_count,
UnexpectedValuesLengthSnafu {
reason: "If null_mask is not empty, the sum of the number of nulls and the length of values must be equal to row_count."
}
);
let mut idx_of_values = 0;
for idx in 0..row_count {
match is_null(&null_mask, idx) {
Some(true) => builder.push_null(),
_ => {
builder
.try_push_value_ref(values[idx_of_values].as_value_ref())
.context(CreateVectorSnafu)?;
idx_of_values += 1
}
}
}
Ok(builder.to_vector())
}
}
fn is_null(null_mask: &BitVec, idx: usize) -> Option<bool> {
null_mask.get(idx).as_deref().copied()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_is_null() {
let null_mask = BitVec::from_slice(&[0b0000_0001, 0b0000_1000]);
assert_eq!(Some(true), is_null(&null_mask, 0));
assert_eq!(Some(false), is_null(&null_mask, 1));
assert_eq!(Some(false), is_null(&null_mask, 10));
assert_eq!(Some(true), is_null(&null_mask, 11));
assert_eq!(Some(false), is_null(&null_mask, 12));
assert_eq!(None, is_null(&null_mask, 16));
assert_eq!(None, is_null(&null_mask, 99));
}
}

View File

@@ -13,9 +13,7 @@
// limitations under the License.
mod alter;
- pub mod delete;
pub mod error;
- pub mod insert;
pub mod util;
pub use alter::{alter_expr_to_request, create_table_schema};

View File

@@ -167,7 +167,7 @@ pub fn build_create_table_expr(
default_constraint: vec![],
semantic_type,
comment: String::new(),
- datatype_extension: *datatype_extension,
+ datatype_extension: datatype_extension.clone(),
options: options.clone(),
});
}
@@ -209,7 +209,7 @@ pub fn extract_new_columns(
default_constraint: vec![],
semantic_type: expr.semantic_type,
comment: String::new(),
- datatype_extension: *expr.datatype_extension,
+ datatype_extension: expr.datatype_extension.clone(),
options: expr.options.clone(),
});
AddColumn {
@@ -425,7 +425,7 @@ mod tests {
ConcreteDataType::from(
ColumnDataTypeWrapper::try_new(
decimal_column.data_type,
- decimal_column.datatype_extension,
+ decimal_column.datatype_extension.clone(),
)
.unwrap()
)
@@ -520,6 +520,7 @@ mod tests {
.as_ref()
.unwrap()
.datatype_extension
+ .clone()
)
.unwrap()
)

View File

@@ -479,7 +479,7 @@ impl Pool {
})
}
- fn entry(&self, addr: String) -> Entry<String, Channel> {
+ fn entry(&self, addr: String) -> Entry<'_, String, Channel> {
self.channels.entry(addr)
}

View File

@@ -325,7 +325,7 @@ fn build_struct(
let result = #fn_name(handler, query_ctx, &[]).await
.map_err(|e| datafusion_common::DataFusionError::Execution(format!("Function execution error: {}", e.output_msg())))?;
- builder.push_value_ref(result.as_value_ref());
+ builder.push_value_ref(&result.as_value_ref());
} else {
for i in 0..rows_num {
let args: Vec<_> = columns.iter()
@@ -335,7 +335,7 @@ fn build_struct(
let result = #fn_name(handler, query_ctx, &args).await
.map_err(|e| datafusion_common::DataFusionError::Execution(format!("Function execution error: {}", e.output_msg())))?;
- builder.push_value_ref(result.as_value_ref());
+ builder.push_value_ref(&result.as_value_ref());
}
}
@@ -345,6 +345,20 @@ fn build_struct(
Ok(datafusion_expr::ColumnarValue::Array(result_vector.to_arrow_array()))
}
}
impl PartialEq for #name {
fn eq(&self, other: &Self) -> bool {
self.signature == other.signature
}
}
impl Eq for #name {}
impl std::hash::Hash for #name {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.signature.hash(state)
}
}
}
.into()
}

Some files were not shown because too many files have changed in this diff.