Compare commits

...

39 Commits

Author SHA1 Message Date
Yingwen
cd55202136 feat: unordered scanner scans data by time ranges (#4757)
* feat: define range meta

* feat: group ranges

* feat: split range

* feat: build ranges from the scan input

* feat: get partition range from range meta

* feat: build file range

* feat: unordered scan read by ranges

* feat: wip for mem ranges

* feat: build ranges

* feat: remove unused codes

* chore: update comments

* feat: update metrics

* chore: address review comments

* chore: debug assertion
2024-09-29 07:14:48 +00:00
Lei, HUANG
50cb59587d chore: replace anymap with anymap2 (#4781) 2024-09-29 06:27:35 +00:00
Lei, HUANG
0a82b12d08 fix(sqlness): sqlness isolation (#4780)
* fix: isolate logs

* fix: copy cases

* fix: clippy
2024-09-29 05:54:01 +00:00
LFC
d9f2f0ccf0 feat: add a new status code for "external" errors (#4775)
* feat: add a new status code for "external" errors

* Update src/auth/src/error.rs

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>

* support mysql cli cleartext auth

* resolve PR comments

---------

Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
2024-09-29 03:38:50 +00:00
Ning Sun
cedbbcf2b8 fix: update pgwire for potential issue with connection establish (#4783) 2024-09-28 19:27:31 +00:00
Ning Sun
d6be44bc7f fix: dead loop on detecting postgres ssl handshake (#4778) 2024-09-28 01:39:29 +00:00
JohnsonLee
3a46c1b235 fix: use information_schema returns Unknown database (#4774)
* fix: use information_schema returns Unknown database 'information_schema'

* test: make sure 'use information_schema' succeeds
2024-09-27 23:15:48 +00:00
Lei, HUANG
934bc13967 feat(mito): limit compaction output file size (#4754)
* Clarify documentation for CompactionOutput struct

 Updated the documentation for the `CompactionOutput` struct to specify that the output time range is only relevant for windowed compaction.

* Add max_output_file_size to TwcsPicker and TwcsOptions

 - Introduced `max_output_file_size` to `TwcsPicker` struct and its logic to enforce output file size limits during compaction.
 - Updated `TwcsOptions` to include `max_output_file_size` and adjusted related tests.
 - Modified `new_picker` function to initialize `TwcsPicker` with the new `max_output_file_size` field.

* feat/limit-compaction-output-size:
 Refactor compaction picker and TWCS to support append mode and improve options handling

 - Update compaction picker to accept a reference to options and append mode flag
 - Modify TWCS picker logic to consider append mode when filtering deleted rows
 - Remove VersionControl usage in compactor and simplify return type
 - Adjust enforce_max_output_size logic in TWCS picker to handle max output file size
 - Add append mode flag to TwcsPicker struct
 - Fix incorrect condition in TWCS picker for enforcing max output size
 - Update region options tests to reflect new max output file size format (1GB and 7MB)
 - Simplify InvalidTableOptionSnafu error handling in create_parser
 - Add `compaction.twcs.max_output_file_size` to mito engine option keys

* resolve some comments
2024-09-27 11:17:36 +00:00
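The `max_output_file_size` option above caps the size of each compaction output. As a rough illustration of the idea, here is a minimal sketch that greedily groups input files so each output group stays under the limit; the `FileMeta` struct and function below are illustrative assumptions, not the actual `TwcsPicker` code in mito:

```Rust
/// Illustrative input descriptor; real mito file metadata is richer.
struct FileMeta {
    file_size: u64,
}

/// Greedily split compaction inputs into output groups whose estimated
/// total size stays under `max_output_file_size` (0 means no limit).
fn enforce_max_output_size(
    inputs: Vec<FileMeta>,
    max_output_file_size: u64,
) -> Vec<Vec<FileMeta>> {
    if max_output_file_size == 0 {
        return vec![inputs];
    }
    let mut outputs = Vec::new();
    let mut current: Vec<FileMeta> = Vec::new();
    let mut current_size = 0u64;
    for file in inputs {
        // Close the current group once adding this file would exceed the cap.
        if !current.is_empty() && current_size + file.file_size > max_output_file_size {
            outputs.push(std::mem::take(&mut current));
            current_size = 0;
        }
        current_size += file.file_size;
        current.push(file);
    }
    if !current.is_empty() {
        outputs.push(current);
    }
    outputs
}
```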
Weny Xu
4045298cb2 feat: add region_statistics table (#4771)
* refactor: introduce `region_statistic`

* refactor: move DatanodeStat related structs to common_meta

* chore: add comments

* feat: implement `list_region_stats` for `ClusterInfo` trait

* feat: add `region_statistics` table

* feat: add table_id and region_number fields

* chore: rename unused snafu

* chore: update sqlness results

* chore: avoid printing source in error msg

* chore: move `procedure_info` under `greptime` catalog

* chore: apply suggestions from CR

* Update src/common/meta/src/datanode.rs

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

---------

Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2024-09-27 09:54:52 +00:00
Lanqing Yang
cc4106cbd2 feat: protect datanode with concurrency limit. (#4699)
Add a concurrency limit in the region server to protect the datanode from query overload.
2024-09-27 06:12:57 +00:00
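For context, a minimal sketch of how such a limit can be enforced with `tokio::sync::Semaphore`; the names here are hypothetical, and treating zero as unlimited mirrors the `max_concurrent_queries` option documented later in this diff:

```Rust
use std::sync::Arc;

use tokio::sync::{OwnedSemaphorePermit, Semaphore};

/// Hypothetical limiter: a permit is held for the lifetime of each query.
struct QueryLimiter {
    // `None` when the limit is 0, i.e. unlimited.
    semaphore: Option<Arc<Semaphore>>,
}

enum QueryPermit {
    Unlimited,
    Limited(OwnedSemaphorePermit),
}

impl QueryLimiter {
    fn new(max_concurrent_queries: usize) -> Self {
        let semaphore = (max_concurrent_queries > 0)
            .then(|| Arc::new(Semaphore::new(max_concurrent_queries)));
        Self { semaphore }
    }

    /// Returns `None` when the server is saturated, letting the caller
    /// reject the query instead of queueing unbounded work.
    fn try_start_query(&self) -> Option<QueryPermit> {
        match &self.semaphore {
            None => Some(QueryPermit::Unlimited),
            Some(s) => s.clone().try_acquire_owned().ok().map(QueryPermit::Limited),
        }
    }
}
```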
localhost
627a326273 refactor: Change the error type in the pipeline crate from String to Error (#4763)
* chore: in process

* chore: change pipeline crate error type

* chore: improve event error

* chore: fix by pr comment

* chore: use snafu context replace ok_or_else

* refactor: update snafu usage

---------

Co-authored-by: Ning Sun <sunng@protonmail.com>
2024-09-25 19:32:34 +00:00
discord9
0274e752ae chore: make sure aws-lc-sys wouldn't be built (#4767)
* chore: make sure aws-lc-sys wouldn't be built

* fix: bash

* refactor: multiple line bash
2024-09-25 17:11:37 +00:00
Lei, HUANG
cd4bf239d0 chore: relax table name constraint (#4766)
chore/relax-table-name-constraint: Updated NAME_PATTERN to allow '@' and '#' characters and adjusted tests for new table name validation rules.
2024-09-25 02:45:18 +00:00
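For illustration only, since the exact `NAME_PATTERN` is not shown in this diff: a relaxed validation along these lines would accept '@' and '#' in table names. The character set below is an assumption, using the `regex` crate:

```Rust
use regex::Regex;

/// Hypothetical approximation of the relaxed table-name check; the real
/// NAME_PATTERN in GreptimeDB differs in detail. Production code would
/// compile the regex once (e.g. via `lazy_static`) instead of per call.
fn is_valid_table_name(name: &str) -> bool {
    Regex::new(r"^[a-zA-Z_:@#][a-zA-Z0-9_:@#\-]*$")
        .expect("static pattern is valid")
        .is_match(name)
}
```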
Ning Sun
e3c0b5482f feat: returning warning instead of error on unsupported SET statement (#4761)
* feat: add capability to send warning to pgclient

* fix: refactor query context to carry query scope data

* feat: return a warning for unsupported postgres statement
2024-09-24 08:45:55 +00:00
Weny Xu
d1b252736d refactor: unify the styling in create_or_alter_tables_on_demand (#4756)
* refactor: refactor `create_or_alter_tables_on_demand`

* chore: apply suggestions from CR
2024-09-24 03:48:16 +00:00
discord9
54f6e13d13 fix: Release CI & make rustls use ring (#4750)
* feat: new dev-build

* fix: copy

* chore: rm dbg run

* fix: binstall install script

* chore: typos

* fix: update useable image

* fix: properly uninstall gcc-9

* chore: another try

* chore: print current cc version

* chore: another try to release

* fix: use gcc-10 by `update-alternatives`

* chore: update dev-build image

* remove gcc-9 again and install make/cmake

* use new image

* alias gcc-10/g++-10 to every variant

* again.....

* fix....

* again release ....

* chore: remove auto remove

* feat: rustls default to ring

* chore: update Cargo.lock

* chore: update Cargo.lock

* Apply suggestions from code review

---------

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2024-09-24 03:11:12 +00:00
Ruihang Xia
5c64f0ce09 refactor!: simplify NativeType trait and remove percentile UDAF (#4758)
* refactor!: simplify NativeType trait and remove percentile UDAF

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove NativeType

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* recover a mis-deleted case

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-09-23 10:55:20 +00:00
Ruihang Xia
2feddca1cb feat: include order by to commutativity rule set (#4753)
* feat: include order by to commutativity rule set

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* tune sqlness replace interceptor

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-09-23 08:35:06 +00:00
Ning Sun
0f99218386 feat: list/array/timezone support for postgres output (#4727)
* feat: list/array support for postgres output

* fix: implement time zone support for postgresql

* feat: add a geohash function that returns array

* fix: typo

* fix: lint warnings

* test: add sqlness test

* refactor: check resolution range before convert value

* fix: test result for sqlness

* feat: upgrade pgwire apis
2024-09-22 02:39:38 +00:00
Weny Xu
163cea81c2 feat: migrate local WAL regions (#4715)
* feat: allow flushing region before migrating

* fix: fix unit tests

* feat: allow setting `flush_timeout`

* feat: skip replaying memtable

* fix: fix unit tests

* test: add more tests

* refactor: simplify timeout logic

* test: add unit tests

* test: add unit tests

* chore: update comments

* fix: fix unit tests

* fix: fmt and clippy

* feat: change default timeout to 30s

* fix: throw `ExceededDeadline` error

* test: add tests for `downgrade_region_with_retry`

* chore: apply suggestions from CR

* chore: apply suggestions from CR

* chore: apply suggestions from CR

* chore: update proto to `3633474`

* refactor: refactor `upgrade_region_with_retry`

* chore: apply suggestions from CR
2024-09-20 08:27:20 +00:00
taobo
0c9b8eb0d2 feat: improve observability for procedure (#4675)
* feat: improve observability for procedure

* fix: test error

* test: add sqlness test for information_schema.procedure_info

* fix: sqlness test error

* fix: cr comment

* chore: update proto version

* fix: apply cr comment

* update version

* fix: cr comment

* optimize procedure type output format

* upgrade dep version

* fix: clippy error

* fix: `procedure` borrowed error

* fix: optimize code
2024-09-20 06:07:53 +00:00
Ning Sun
75c6fad1a3 feat: add more h3 scalar functions (#4707)
* feat: add more h3 scalar functions

* chore: comment up
2024-09-20 04:19:50 +00:00
Yingwen
e12ffbeb2f feat: flush other workers if still need flush (#4746) 2024-09-20 02:55:31 +00:00
discord9
c4e52ebf91 feat: use new image for gcc-10 (#4748)
feat: use new image
2024-09-20 02:34:45 +00:00
Yingwen
f02410c39b fix: disable field pruning in last non null mode (#4740)
* fix: don't prune fields in last non null mode

* test: add sqlness test for field pruning

* test: add flush

* refine implementation

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2024-09-20 00:35:37 +00:00
Ruihang Xia
f5cf25b0db refactor: remove DfPlan wrapper (#4733)
* refactor: remove DfPlan wrapper

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unused errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix test assertion

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-09-19 12:29:33 +00:00
liyang
1acda74c26 fix: cannot input tag for the dev-builder image (#4743) 2024-09-19 11:14:34 +00:00
Yohan Wal
95787825f1 build(deps): use original jsonb repo (#4742) 2024-09-19 09:44:44 +00:00
Weny Xu
49004391d3 chore(fuzz): print table name for debugging (#4738)
* chore(fuzz): print table name for debugging

* chore: apply suggestions
2024-09-19 09:40:10 +00:00
discord9
d0f5b2ad7d fix: use gcc-10 in release dev build (#4741) 2024-09-19 09:34:06 +00:00
Yohan Wal
0295f8dbea docs: json datatype rfc (#4515)
* docs: json datatype rfc

* docs: turn to a jsonb proposal

* chore: fix typo

* feat: add store and query process

* fix: typo

* fix: use query nodes instead of query plans

* feat: a detailed overview of query

* fix: grammar

* fix: use independent cast function

* fix: unify cast function

* fix: refine, make statements clear

* docs: update rfc according to impl

* docs: refine

* docs: fix wrong arrows

* docs: refine

* docs: fix some errors qaq
2024-09-19 05:49:10 +00:00
Ning Sun
8786624515 feat: improve support for postgres extended protocol (#4721)
* feat: improve support for postgres extended protocol

* fix: lint fix

* fix: test code

* fix: adopt upstream

* refactor: remove dup code

* refactor: avoid copy on error message
2024-09-19 05:30:56 +00:00
shuiyisong
52d627e37d chore: add log ingest interceptor (#4734)
* chore: add log ingest interceptor

* chore: rename

* chore: update interceptor signature
2024-09-19 05:14:47 +00:00
Lei, HUANG
b5f7138d33 refactor(tables): improve tables performance (#4737)
* chore: cherrypick 52e8eebb2dbbbe81179583c05094004a5eedd7fd

* refactor/tables: Change variable from immutable to mutable in KvBackendCatalogManager's method

* refactor/tables: Replace unbounded channel with bounded and use semaphore for concurrency control in KvBackendCatalogManager

* refactor/tables: Add common-runtime dependency and update KvBackendCatalogManager to use common_runtime::spawn_global

* refactor/tables: Await on sending error through channel in KvBackendCatalogManager
2024-09-19 04:44:02 +00:00
Ning Sun
08bd40333c feat: add an option to turn on compression for arrow output (#4730)
* feat: add an option to turn on compression for arrow output

* fix: typo
2024-09-19 04:38:41 +00:00
discord9
d1e0602c76 fix: opensrv Use After Free update (#4732)
* chore: version skew

* fix: even more version skew

* feat: use `ring` instead of `aws-lc` to avoid needing the nasm assembler on windows

* feat: use `ring` for pgwire

* feat: change to use `aws-lc-sys` on windows instead

* feat: change back to use `ring`

* chore: provide CryptoProvider

* feat: use upstream repo

* feat: install ring crypto lib in main

* chore: use same fn to install in tests

* feat: make pgwire use `ring`
2024-09-19 04:12:13 +00:00
Weny Xu
befb6d85f0 fix: determine region role by using is_readonly (#4725)
fix: correct `is_writable` behavior
2024-09-18 22:17:39 +00:00
Yohan Wal
f73fb82133 feat: add respective json_is UDFs for JSON type (#4726)
* feat: add respective json_is UDFs

* refactor: rename to_json to parse_json

* chore: happy clippy

* chore: some rename

* fix: small fixes
2024-09-18 11:07:30 +00:00
shuiyisong
50b3bb4c0d fix: sort cargo toml (#4735) 2024-09-18 09:19:05 +00:00
262 changed files with 9888 additions and 4561 deletions

View File

@@ -50,7 +50,7 @@ runs:
BUILDX_MULTI_PLATFORM_BUILD=all \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
-IMAGE_TAG=${{ inputs.version }}
+DEV_BUILDER_IMAGE_TAG=${{ inputs.version }}
- name: Build and push dev-builder-centos image
shell: bash
@@ -61,7 +61,7 @@ runs:
BUILDX_MULTI_PLATFORM_BUILD=amd64 \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
-IMAGE_TAG=${{ inputs.version }}
+DEV_BUILDER_IMAGE_TAG=${{ inputs.version }}
- name: Build and push dev-builder-android image # Only build image for amd64 platform.
shell: bash
@@ -71,6 +71,6 @@ runs:
BASE_IMAGE=android \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
-IMAGE_TAG=${{ inputs.version }} && \
+DEV_BUILDER_IMAGE_TAG=${{ inputs.version }} && \
docker push ${{ inputs.dockerhub-image-registry }}/${{ inputs.dockerhub-image-namespace }}/dev-builder-android:${{ inputs.version }}

View File

@@ -269,6 +269,13 @@ jobs:
- name: Install cargo-gc-bin
shell: bash
run: cargo install cargo-gc-bin
- name: Check aws-lc-sys will not build
shell: bash
run: |
if cargo tree -i aws-lc-sys -e features | grep -q aws-lc-sys; then
echo "Found aws-lc-sys, which has compilation problems on older gcc versions. Please replace it with ring until its building experience improves."
exit 1
fi
- name: Build greptime binary
shell: bash
# `cargo gc` will invoke `cargo build` with specified args

Cargo.lock (generated, 1960 lines changed)

File diff suppressed because it is too large.

View File

@@ -90,7 +90,7 @@ aquamarine = "0.3"
arrow = { version = "51.0.0", features = ["prettyprint"] }
arrow-array = { version = "51.0.0", default-features = false, features = ["chrono-tz"] }
arrow-flight = "51.0"
-arrow-ipc = { version = "51.0.0", default-features = false, features = ["lz4"] }
+arrow-ipc = { version = "51.0.0", default-features = false, features = ["lz4", "zstd"] }
arrow-schema = { version = "51.0", features = ["serde"] }
async-stream = "0.3"
async-trait = "0.1"
@@ -99,7 +99,7 @@ base64 = "0.21"
bigdecimal = "0.4.2"
bitflags = "2.4.1"
bytemuck = "1.12"
-bytes = { version = "1.5", features = ["serde"] }
+bytes = { version = "1.7", features = ["serde"] }
chrono = { version = "0.4", features = ["serde"] }
clap = { version = "4.4", features = ["derive"] }
config = "0.13.0"
@@ -120,11 +120,11 @@ etcd-client = { version = "0.13" }
fst = "0.4.7"
futures = "0.3"
futures-util = "0.3"
-greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "973f49cde88a582fb65755cc572ebcf6fb93ccf7" }
+greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "36334744c7020734dcb4a6b8d24d52ae7ed53fe1" }
humantime = "2.1"
humantime-serde = "1.1"
itertools = "0.10"
-jsonb = { git = "https://github.com/CookiePieWw/jsonb.git", rev = "d0166c130fce903bf6c58643417a3173a6172d31", default-features = false }
+jsonb = { git = "https://github.com/datafuselabs/jsonb.git", rev = "46ad50fc71cf75afbf98eec455f7892a6387c1fc", default-features = false }
lazy_static = "1.4"
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "80eb97c24c88af4dd9a86f8bbaf50e741d4eb8cd" }
mockall = "0.11.4"
@@ -166,10 +166,10 @@ serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["float_roundtrip"] }
serde_with = "3"
shadow-rs = "0.31"
+similar-asserts = "1.6.0"
smallvec = { version = "1", features = ["serde"] }
snafu = "0.8"
sysinfo = "0.30"
-similar-asserts = "1.6.0"
# on branch v0.44.x
sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "54a267ac89c09b11c0c88934690530807185d3e7", features = [
"visitor",
@@ -245,6 +245,15 @@ store-api = { path = "src/store-api" }
substrait = { path = "src/common/substrait" }
table = { path = "src/table" }
[patch.crates-io]
# change all rustls dependencies to use our fork to default to `ring` to make it "just work"
hyper-rustls = { git = "https://github.com/GreptimeTeam/hyper-rustls" }
rustls = { git = "https://github.com/GreptimeTeam/rustls" }
tokio-rustls = { git = "https://github.com/GreptimeTeam/tokio-rustls" }
# This is commented out since we are not using aws-lc-sys. If we need it, uncomment this line or use a release after this commit; otherwise it won't compile with gcc < 8.1
# see https://github.com/aws/aws-lc-rs/pull/526
# aws-lc-sys = { git ="https://github.com/aws/aws-lc-rs", rev = "556558441e3494af4b156ae95ebc07ebc2fd38aa" }
[workspace.dependencies.meter-macros]
git = "https://github.com/GreptimeTeam/greptime-meter.git"
rev = "80eb97c24c88af4dd9a86f8bbaf50e741d4eb8cd"

View File

@@ -8,7 +8,7 @@ CARGO_BUILD_OPTS := --locked
IMAGE_REGISTRY ?= docker.io
IMAGE_NAMESPACE ?= greptime
IMAGE_TAG ?= latest
-DEV_BUILDER_IMAGE_TAG ?= 2024-06-06-b4b105ad-20240827021230
+DEV_BUILDER_IMAGE_TAG ?= 2024-06-06-5674c14f-20240920110415
BUILDX_MULTI_PLATFORM_BUILD ?= false
BUILDX_BUILDER_NAME ?= gtbuilder
BASE_IMAGE ?= ubuntu

View File

@@ -17,6 +17,7 @@
| `default_timezone` | String | Unset | The default timezone of the server. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum concurrent queries allowed to be executed. Zero means unlimited. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
@@ -335,6 +336,7 @@
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum concurrent queries allowed to be executed. Zero means unlimited. |
| `rpc_addr` | String | Unset | Deprecated, use `grpc.addr` instead. |
| `rpc_hostname` | String | Unset | Deprecated, use `grpc.hostname` instead. |
| `rpc_runtime_size` | Integer | Unset | Deprecated, use `grpc.runtime_size` instead. |

View File

@@ -19,6 +19,9 @@ enable_telemetry = true
## Parallelism of initializing regions.
init_regions_parallelism = 16
## The maximum concurrent queries allowed to be executed. Zero means unlimited.
max_concurrent_queries = 0
## Deprecated, use `grpc.addr` instead.
## @toml2docs:none-default
rpc_addr = "127.0.0.1:3001"

View File

@@ -15,6 +15,9 @@ init_regions_in_background = false
## Parallelism of initializing regions.
init_regions_parallelism = 16
## The maximum concurrent queries allowed to be executed. Zero means unlimited.
max_concurrent_queries = 0
## The runtime options.
#+ [runtime]
## The number of threads to execute the runtime for global read operations.

View File

@@ -0,0 +1,50 @@
#!/bin/bash
set -euxo pipefail
cd "$(mktemp -d)"
# Fix version to v1.6.6; this differs from the latest version in the original install script at
# https://raw.githubusercontent.com/cargo-bins/cargo-binstall/main/install-from-binstall-release.sh
base_url="https://github.com/cargo-bins/cargo-binstall/releases/download/v1.6.6/cargo-binstall-"
os="$(uname -s)"
if [ "$os" == "Darwin" ]; then
url="${base_url}universal-apple-darwin.zip"
curl -LO --proto '=https' --tlsv1.2 -sSf "$url"
unzip cargo-binstall-universal-apple-darwin.zip
elif [ "$os" == "Linux" ]; then
machine="$(uname -m)"
if [ "$machine" == "armv7l" ]; then
machine="armv7"
fi
target="${machine}-unknown-linux-musl"
if [ "$machine" == "armv7" ]; then
target="${target}eabihf"
fi
url="${base_url}${target}.tgz"
curl -L --proto '=https' --tlsv1.2 -sSf "$url" | tar -xvzf -
elif [ "${OS-}" = "Windows_NT" ]; then
machine="$(uname -m)"
target="${machine}-pc-windows-msvc"
url="${base_url}${target}.zip"
curl -LO --proto '=https' --tlsv1.2 -sSf "$url"
unzip "cargo-binstall-${target}.zip"
else
echo "Unsupported OS ${os}"
exit 1
fi
./cargo-binstall -y --force cargo-binstall
CARGO_HOME="${CARGO_HOME:-$HOME/.cargo}"
if ! [[ ":$PATH:" == *":$CARGO_HOME/bin:"* ]]; then
if [ -n "${CI:-}" ] && [ -n "${GITHUB_PATH:-}" ]; then
echo "$CARGO_HOME/bin" >> "$GITHUB_PATH"
else
echo
printf "\033[0;31mYour path is missing %s, you might want to add it.\033[0m\n" "$CARGO_HOME/bin"
echo
fi
fi

View File

@@ -32,7 +32,9 @@ RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install cargo-binstall with a specific version to adapt the current rust toolchain.
# Note: if we use the latest version, we may encounter the following `use of unstable library feature 'io_error_downcast'` error.
RUN cargo install cargo-binstall --version 1.6.6 --locked
# compiling from source takes too long, so we use the precompiled binary instead
COPY $DOCKER_BUILD_ROOT/docker/dev-builder/binstall/pull_binstall.sh /usr/local/bin/pull_binstall.sh
RUN chmod +x /usr/local/bin/pull_binstall.sh && /usr/local/bin/pull_binstall.sh
# Install nextest.
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -24,6 +24,15 @@ RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
python3.10 \
python3.10-dev
# https://github.com/GreptimeTeam/greptimedb/actions/runs/10935485852/job/30357457188#step:3:7106
# `aws-lc-sys` requires gcc >= 10.3.0 to work, hence gcc is aliased to gcc-10
RUN apt-get remove -y gcc-9 g++-9 cpp-9 && \
apt-get install -y gcc-10 g++-10 cpp-10 make cmake && \
ln -sf /usr/bin/gcc-10 /usr/bin/gcc && ln -sf /usr/bin/g++-10 /usr/bin/g++ && \
ln -sf /usr/bin/gcc-10 /usr/bin/cc && \
ln -sf /usr/bin/g++-10 /usr/bin/cpp && ln -sf /usr/bin/g++-10 /usr/bin/c++ && \
cc --version && gcc --version && g++ --version && cpp --version && c++ --version
# Remove Python 3.8 and install pip.
RUN apt-get -y purge python3.8 && \
apt-get -y autoremove && \
@@ -57,7 +66,9 @@ RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install cargo-binstall with a specific version to adapt the current rust toolchain.
# Note: if we use the latest version, we may encounter the following `use of unstable library feature 'io_error_downcast'` error.
RUN cargo install cargo-binstall --version 1.6.6 --locked
# compiling from source takes too long, so we use the precompiled binary instead
COPY $DOCKER_BUILD_ROOT/docker/dev-builder/binstall/pull_binstall.sh /usr/local/bin/pull_binstall.sh
RUN chmod +x /usr/local/bin/pull_binstall.sh && /usr/local/bin/pull_binstall.sh
# Install nextest.
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -0,0 +1,197 @@
---
Feature Name: Json Datatype
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/4230
Date: 2024-8-6
Author: "Yuhan Wang <profsyb@gmail.com>"
---
# Summary
This RFC proposes a method for storing and querying JSON data in the database.
# Motivation
JSON is widely used across various scenarios. Direct support for writing and querying JSON can significantly enhance the database's flexibility.
# Details
## Storage and Query
GreptimeDB's type system is built on Arrow/DataFusion, where each data type in GreptimeDB corresponds to a data type in Arrow/DataFusion. The proposed JSON type will be implemented on top of the existing `Binary` type, leveraging the current `datatype::value::Value` and `datatype::vectors::BinaryVector` implementations, utilizing the JSONB format as the encoding of JSON data. JSON data is stored and processed similarly to binary data within the storage layer and query engine.
This approach introduces problems when handling insertions and queries of JSON columns.
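As a minimal sketch of the encoding step, assuming the `jsonb` crate pinned in the workspace dependencies, a string/JSONB round-trip looks roughly like this; the storage layer only ever sees the binary form:

```Rust
/// Sketch of the string <-> JSONB round-trip with the `jsonb` crate.
fn round_trip(raw: &str) -> Result<String, jsonb::Error> {
    // Parse the textual JSON and encode it into JSONB bytes; this binary
    // form is what the Binary-backed JSON column actually stores.
    let encoded: Vec<u8> = jsonb::parse_value(raw.as_bytes())?.to_vec();
    // Decode the JSONB bytes back into a human-readable JSON string.
    Ok(jsonb::to_string(&encoded))
}
```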
## Insertion
Users commonly write JSON data as strings, so we need to convert between strings and JSONB. There are two ways to do this:
1. MySQL and PostgreSQL servers provide auto-conversions between strings and JSONB. When a string is inserted into a JSON column, the server will try to parse the string as JSON and convert it to JSONB. The non-JSON strings will be rejected.
2. A function `parse_json` is provided to convert string to JSONB. If the string is not a valid JSON string, the function will return an error.
For example, in MySQL client:
```SQL
CREATE TABLE IF NOT EXISTS test (
ts TIMESTAMP TIME INDEX,
a INT,
b JSON
);
INSERT INTO test VALUES(
0,
0,
'{
"name": "jHl2oDDnPc1i2OzlP5Y",
"timestamp": "2024-07-25T04:33:11.369386Z",
"attributes": { "event_attributes": 48.28667 }
}'
);
INSERT INTO test VALUES(
0,
0,
parse_json('{
"name": "jHl2oDDnPc1i2OzlP5Y",
"timestamp": "2024-07-25T04:33:11.369386Z",
"attributes": { "event_attributes": 48.28667 }
}')
);
```
Both inserts are valid.
The dataflow of the insertion process is as follows:
```
Insert JSON strings directly through client:
Parse Insert
String(Serialized JSON)┌──────────┐Arrow Binary(JSONB)┌──────┐Arrow Binary(JSONB)
Client ---------------------->│ Server │------------------>│ Mito │------------------> Storage
└──────────┘ └──────┘
(Server identifies JSON type and performs auto-conversion)
Insert JSON strings through parse_json function:
Parse Insert
String(Serialized JSON)┌──────────┐String(Serialized JSON)┌─────┐Arrow Binary(JSONB)┌──────┐Arrow Binary(JSONB)
Client ---------------------->│ Server │---------------------->│ UDF │------------------>│ Mito │------------------> Storage
└──────────┘ └─────┘ └──────┘
(Conversion is performed by UDF inside Query Engine)
```
Servers identify JSON columns through the column schema and perform auto-conversions. However, when using prepared statements with bound parameters, the cached DataFusion plans generated for the prepared statements cannot identify JSON columns. In that case, the servers identify JSON columns from the given parameters and perform auto-conversions.
The following is an example of inserting JSON data through prepared statements:
```Rust
sqlx::query(
"create table test(ts timestamp time index, j json)",
)
.execute(&pool)
.await
.unwrap();
let json = serde_json::json!({
"code": 200,
"success": true,
"payload": {
"features": [
"serde",
"json"
],
"homepage": null
}
});
// Valid, can identify serde_json::Value as JSON type
sqlx::query("insert into test values($1, $2)")
.bind(i)
.bind(json)
.execute(&pool)
.await
.unwrap();
// Invalid, cannot identify String as JSON type
sqlx::query("insert into test values($1, $2)")
.bind(i)
.bind(json.to_string())
.execute(&pool)
.await
.unwrap();
```
## Query
Correspondingly, users prefer to read JSON data as strings, so we need to convert JSON data to strings before presenting it. There are also two ways to do this: auto-conversions on the MySQL and PostgreSQL servers, and the function `json_to_string`.
For example, in MySQL client:
```SQL
SELECT b FROM test;
SELECT json_to_string(b) FROM test;
```
Both return the JSON as a human-readable string.
Specifically, to perform auto-conversions, we attach a marker to the `metadata` of the `Field` in the Arrow/DataFusion schema when scanning a JSON column, so that frontend servers can identify JSON data and convert it to strings.
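A rough sketch of such tagging in Arrow terms follows; the metadata key and value are assumptions for illustration, not the exact marker GreptimeDB uses:

```Rust
use std::collections::HashMap;

use arrow_schema::{DataType, Field};

/// Build a Binary field tagged as JSON so a frontend server can spot it
/// and auto-convert the JSONB bytes to strings. The "greptime:datatype"
/// key is hypothetical.
fn json_field(name: &str) -> Field {
    Field::new(name, DataType::Binary, true).with_metadata(HashMap::from([(
        "greptime:datatype".to_string(),
        "json".to_string(),
    )]))
}
```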
The dataflow of the query process is as follows:
```
Query directly through client:
Decode Scan
String(Serialized JSON)┌──────────┐Arrow Binary(JSONB)┌──────────────┐Arrow Binary(JSONB)
Client <----------------------│ Server │<------------------│ Query Engine │<----------------- Storage
└──────────┘ └──────────────┘
(Server identifies JSON type and performs auto-conversion based on column metadata)
Query through json_to_string function:
Scan & Decode
String(Serialized JSON)┌──────────┐String(Serialized JSON)┌──────────────┐Arrow Binary(JSONB)
Client <----------------------│ Server │<----------------------│ Query Engine │<----------------- Storage
└──────────┘ └──────────────┘
(Conversion is performed by UDF inside Query Engine)
```
However, if a function uses the JSON type as its return type, the metadata method mentioned above is not applicable. Thus, functions over JSON should specify their return type explicitly instead of returning a JSON type; for example, `json_get_int` and `json_get_float` return data of `INT` and `FLOAT` type respectively.
## Functions
As with common JSON types in other databases, JSON data can be queried with functions.
For example:
```SQL
CREATE TABLE IF NOT EXISTS test (
ts TIMESTAMP TIME INDEX,
a INT,
b JSON
);
INSERT INTO test VALUES(
0,
0,
'{
"name": "jHl2oDDnPc1i2OzlP5Y",
"timestamp": "2024-07-25T04:33:11.369386Z",
"attributes": { "event_attributes": 48.28667 }
}'
);
SELECT json_get_string(b, 'name') FROM test;
+---------------------+
| b.name |
+---------------------+
| jHl2oDDnPc1i2OzlP5Y |
+---------------------+
SELECT json_get_float(b, 'attributes.event_attributes') FROM test;
+--------------------------------+
| b.attributes.event_attributes |
+--------------------------------+
| 48.28667 |
+--------------------------------+
```
And more functions can be added in the future.
# Drawbacks
As a general purpose JSON data type, JSONB may not be as efficient as specialized data types for specific scenarios.
The auto-conversion mechanism is not supported in all scenarios. We need to find workarounds for these scenarios.
# Alternatives
Extract and flatten the JSON schema and store it in a structured format through a pipeline. For nested data, we can provide nested types like `STRUCT` or `ARRAY`.

View File

@@ -75,6 +75,16 @@ pub enum Password<'a> {
PgMD5(HashedPassword<'a>, Salt<'a>),
}
impl Password<'_> {
pub fn r#type(&self) -> &str {
match self {
Password::PlainText(_) => "plain_text",
Password::MysqlNativePassword(_, _) => "mysql_native_password",
Password::PgMD5(_, _) => "pg_md5",
}
}
}
pub fn auth_mysql(
auth_data: HashedPassword,
salt: Salt,

View File

@@ -89,7 +89,7 @@ impl ErrorExt for Error {
Error::FileWatch { .. } => StatusCode::InvalidArguments,
Error::InternalState { .. } => StatusCode::Unexpected,
Error::Io { .. } => StatusCode::StorageUnavailable,
-Error::AuthBackend { .. } => StatusCode::Internal,
+Error::AuthBackend { source, .. } => source.status_code(),
Error::UserNotFound { .. } => StatusCode::UserNotFound,
Error::UnsupportedPasswordType { .. } => StatusCode::UnsupportedPasswordType,

View File

@@ -57,6 +57,11 @@ pub trait UserProvider: Send + Sync {
self.authorize(catalog, schema, &user_info).await?;
Ok(user_info)
}
/// Returns whether this user provider implementation is backed by an external system.
fn external(&self) -> bool {
false
}
}
fn load_credential_from_file(filepath: &str) -> Result<Option<HashMap<String, Vec<u8>>>> {

View File

@@ -22,8 +22,10 @@ common-config.workspace = true
common-error.workspace = true
common-macro.workspace = true
common-meta.workspace = true
common-procedure.workspace = true
common-query.workspace = true
common-recordbatch.workspace = true
common-runtime.workspace = true
common-telemetry.workspace = true
common-time.workspace = true
common-version.workspace = true
@@ -48,6 +50,7 @@ sql.workspace = true
store-api.workspace = true
table.workspace = true
tokio.workspace = true
tokio-stream = "0.1"
[dev-dependencies]
cache.workspace = true

View File

@@ -50,13 +50,20 @@ pub enum Error {
source: BoxedError,
},
-#[snafu(display("Failed to list nodes in cluster: {source}"))]
+#[snafu(display("Failed to list nodes in cluster"))]
ListNodes {
#[snafu(implicit)]
location: Location,
source: BoxedError,
},
#[snafu(display("Failed to list region stats in cluster"))]
ListRegionStats {
#[snafu(implicit)]
location: Location,
source: BoxedError,
},
#[snafu(display("Failed to list flows in catalog {catalog}"))]
ListFlows {
#[snafu(implicit)]
@@ -82,6 +89,33 @@ pub enum Error {
location: Location,
},
#[snafu(display("Failed to get procedure client in {mode} mode"))]
GetProcedureClient {
mode: String,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to list procedures"))]
ListProcedures {
#[snafu(implicit)]
location: Location,
source: BoxedError,
},
#[snafu(display("Procedure id not found"))]
ProcedureIdNotFound {
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to convert proto data"))]
ConvertProtoData {
#[snafu(implicit)]
location: Location,
source: BoxedError,
},
#[snafu(display("Failed to re-compile script due to internal error"))]
CompileScriptInternal {
#[snafu(implicit)]
@@ -266,7 +300,9 @@ impl ErrorExt for Error {
| Error::FindRegionRoutes { .. }
| Error::CacheNotFound { .. }
| Error::CastManager { .. }
-| Error::Json { .. } => StatusCode::Unexpected,
+| Error::Json { .. }
+| Error::GetProcedureClient { .. }
+| Error::ProcedureIdNotFound { .. } => StatusCode::Unexpected,
Error::ViewPlanColumnsChanged { .. } => StatusCode::InvalidArguments,
@@ -283,7 +319,10 @@ impl ErrorExt for Error {
| Error::ListNodes { source, .. }
| Error::ListSchemas { source, .. }
| Error::ListTables { source, .. }
-| Error::ListFlows { source, .. } => source.status_code(),
+| Error::ListFlows { source, .. }
+| Error::ListProcedures { source, .. }
+| Error::ListRegionStats { source, .. }
+| Error::ConvertProtoData { source, .. } => source.status_code(),
Error::CreateTable { source, .. } => source.status_code(),

View File

@@ -31,6 +31,7 @@ use common_meta::key::table_info::TableInfoValue;
use common_meta::key::table_name::TableNameKey;
use common_meta::key::{TableMetadataManager, TableMetadataManagerRef};
use common_meta::kv_backend::KvBackendRef;
use common_procedure::ProcedureManagerRef;
use futures_util::stream::BoxStream;
use futures_util::{StreamExt, TryStreamExt};
use meta_client::client::MetaClient;
@@ -42,6 +43,8 @@ use table::dist_table::DistTable;
use table::table::numbers::{NumbersTable, NUMBERS_TABLE_NAME};
use table::table_name::TableName;
use table::TableRef;
use tokio::sync::Semaphore;
use tokio_stream::wrappers::ReceiverStream;
use crate::error::{
CacheNotFoundSnafu, GetTableCacheSnafu, InvalidTableInfoInCatalogSnafu, ListCatalogsSnafu,
@@ -61,12 +64,18 @@ use crate::CatalogManager;
#[derive(Clone)]
pub struct KvBackendCatalogManager {
mode: Mode,
/// Only available in `Distributed` mode.
meta_client: Option<Arc<MetaClient>>,
/// Manages partition rules.
partition_manager: PartitionRuleManagerRef,
/// Manages table metadata.
table_metadata_manager: TableMetadataManagerRef,
/// A sub-CatalogManager that handles system tables
system_catalog: SystemCatalog,
/// Cache registry for all caches.
cache_registry: LayeredCacheRegistryRef,
/// Only available in `Standalone` mode.
procedure_manager: Option<ProcedureManagerRef>,
}
const CATALOG_CACHE_MAX_CAPACITY: u64 = 128;
@@ -77,6 +86,7 @@ impl KvBackendCatalogManager {
meta_client: Option<Arc<MetaClient>>,
backend: KvBackendRef,
cache_registry: LayeredCacheRegistryRef,
procedure_manager: Option<ProcedureManagerRef>,
) -> Arc<Self> {
Arc::new_cyclic(|me| Self {
mode,
@@ -104,6 +114,7 @@ impl KvBackendCatalogManager {
backend,
},
cache_registry,
procedure_manager,
})
}
@@ -130,6 +141,10 @@ impl KvBackendCatalogManager {
pub fn table_metadata_manager_ref(&self) -> &TableMetadataManagerRef {
&self.table_metadata_manager
}
pub fn procedure_manager(&self) -> Option<ProcedureManagerRef> {
self.procedure_manager.clone()
}
}
#[async_trait::async_trait]
@@ -179,21 +194,18 @@ impl CatalogManager for KvBackendCatalogManager {
schema: &str,
query_ctx: Option<&QueryContext>,
) -> Result<Vec<String>> {
-let stream = self
+let mut tables = self
.table_metadata_manager
.table_name_manager()
-.tables(catalog, schema);
-let mut tables = stream
+.tables(catalog, schema)
+.map_ok(|(table_name, _)| table_name)
.try_collect::<Vec<_>>()
.await
.map_err(BoxedError::new)
-.context(ListTablesSnafu { catalog, schema })?
-.into_iter()
-.map(|(k, _)| k)
-.collect::<Vec<_>>();
-tables.extend_from_slice(&self.system_catalog.table_names(schema, query_ctx));
+.context(ListTablesSnafu { catalog, schema })?;
-Ok(tables.into_iter().collect())
+tables.extend(self.system_catalog.table_names(schema, query_ctx));
+Ok(tables)
}
async fn catalog_exists(&self, catalog: &str) -> Result<bool> {
@@ -303,36 +315,68 @@ impl CatalogManager for KvBackendCatalogManager {
}
});
let table_id_stream = self
.table_metadata_manager
.table_name_manager()
.tables(catalog, schema)
.map_ok(|(_, v)| v.table_id());
const BATCH_SIZE: usize = 128;
let user_tables = try_stream!({
const CONCURRENCY: usize = 8;
let (tx, rx) = tokio::sync::mpsc::channel(64);
let metadata_manager = self.table_metadata_manager.clone();
let catalog = catalog.to_string();
let schema = schema.to_string();
let semaphore = Arc::new(Semaphore::new(CONCURRENCY));
common_runtime::spawn_global(async move {
let table_id_stream = metadata_manager
.table_name_manager()
.tables(&catalog, &schema)
.map_ok(|(_, v)| v.table_id());
// Split table ids into chunks
let mut table_id_chunks = table_id_stream.ready_chunks(BATCH_SIZE);
while let Some(table_ids) = table_id_chunks.next().await {
let table_ids = table_ids
let table_ids = match table_ids
.into_iter()
.collect::<std::result::Result<Vec<_>, _>>()
.map_err(BoxedError::new)
.context(ListTablesSnafu { catalog, schema })?;
.context(ListTablesSnafu {
catalog: &catalog,
schema: &schema,
}) {
Ok(table_ids) => table_ids,
Err(e) => {
let _ = tx.send(Err(e)).await;
return;
}
};
let table_info_values = self
.table_metadata_manager
.table_info_manager()
.batch_get(&table_ids)
.await
.context(TableMetadataManagerSnafu)?;
let metadata_manager = metadata_manager.clone();
let tx = tx.clone();
let semaphore = semaphore.clone();
common_runtime::spawn_global(async move {
// we don't explicitly close the semaphore so just ignore the potential error.
let _ = semaphore.acquire().await;
let table_info_values = match metadata_manager
.table_info_manager()
.batch_get(&table_ids)
.await
.context(TableMetadataManagerSnafu)
{
Ok(table_info_values) => table_info_values,
Err(e) => {
let _ = tx.send(Err(e)).await;
return;
}
};
for table_info_value in table_info_values.into_values() {
yield build_table(table_info_value)?;
}
for table in table_info_values.into_values().map(build_table) {
if tx.send(table).await.is_err() {
return;
}
}
});
}
});
let user_tables = ReceiverStream::new(rx);
Box::pin(sys_tables.chain(user_tables))
}
}

View File

@@ -18,7 +18,9 @@ pub mod flows;
mod information_memory_table;
pub mod key_column_usage;
mod partitions;
mod procedure_info;
mod region_peers;
mod region_statistics;
mod runtime_metrics;
pub mod schemata;
mod table_constraints;
@@ -188,6 +190,16 @@ impl SystemSchemaProviderInner for InformationSchemaProvider {
self.catalog_name.clone(),
self.flow_metadata_manager.clone(),
)) as _),
PROCEDURE_INFO => Some(
Arc::new(procedure_info::InformationSchemaProcedureInfo::new(
self.catalog_manager.clone(),
)) as _,
),
REGION_STATISTICS => Some(Arc::new(
region_statistics::InformationSchemaRegionStatistics::new(
self.catalog_manager.clone(),
),
) as _),
_ => None,
}
}
@@ -235,6 +247,14 @@ impl InformationSchemaProvider {
CLUSTER_INFO.to_string(),
self.build_table(CLUSTER_INFO).unwrap(),
);
tables.insert(
PROCEDURE_INFO.to_string(),
self.build_table(PROCEDURE_INFO).unwrap(),
);
tables.insert(
REGION_STATISTICS.to_string(),
self.build_table(REGION_STATISTICS).unwrap(),
);
}
tables.insert(TABLES.to_string(), self.build_table(TABLES).unwrap());
@@ -250,7 +270,6 @@ impl InformationSchemaProvider {
self.build_table(TABLE_CONSTRAINTS).unwrap(),
);
tables.insert(FLOWS.to_string(), self.build_table(FLOWS).unwrap());
// Add memory tables
for name in MEMORY_TABLES.iter() {
tables.insert((*name).to_string(), self.build_table(name).expect(name));

View File

@@ -0,0 +1,310 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::{Arc, Weak};
use api::v1::meta::{ProcedureMeta, ProcedureStatus};
use arrow_schema::SchemaRef as ArrowSchemaRef;
use common_catalog::consts::INFORMATION_SCHEMA_PROCEDURE_INFO_TABLE_ID;
use common_config::Mode;
use common_error::ext::BoxedError;
use common_meta::ddl::{ExecutorContext, ProcedureExecutor};
use common_meta::rpc::procedure;
use common_procedure::{ProcedureInfo, ProcedureState};
use common_recordbatch::adapter::RecordBatchStreamAdapter;
use common_recordbatch::{RecordBatch, SendableRecordBatchStream};
use common_time::timestamp::Timestamp;
use datafusion::execution::TaskContext;
use datafusion::physical_plan::stream::RecordBatchStreamAdapter as DfRecordBatchStreamAdapter;
use datafusion::physical_plan::streaming::PartitionStream as DfPartitionStream;
use datafusion::physical_plan::SendableRecordBatchStream as DfSendableRecordBatchStream;
use datatypes::prelude::{ConcreteDataType, ScalarVectorBuilder, VectorRef};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::timestamp::TimestampMillisecond;
use datatypes::value::Value;
use datatypes::vectors::{StringVectorBuilder, TimestampMillisecondVectorBuilder};
use snafu::ResultExt;
use store_api::storage::{ScanRequest, TableId};
use super::PROCEDURE_INFO;
use crate::error::{
ConvertProtoDataSnafu, CreateRecordBatchSnafu, GetProcedureClientSnafu, InternalSnafu,
ListProceduresSnafu, ProcedureIdNotFoundSnafu, Result,
};
use crate::system_schema::information_schema::{InformationTable, Predicates};
use crate::system_schema::utils;
use crate::CatalogManager;
const PROCEDURE_ID: &str = "procedure_id";
const PROCEDURE_TYPE: &str = "procedure_type";
const START_TIME: &str = "start_time";
const END_TIME: &str = "end_time";
const STATUS: &str = "status";
const LOCK_KEYS: &str = "lock_keys";
const INIT_CAPACITY: usize = 42;
/// The `PROCEDURE_INFO` table provides information about the procedures of the cluster.
///
/// - `procedure_id`: the unique identifier of the procedure.
/// - `procedure_type`: the type of the procedure.
/// - `start_time`: the starting execution time of the procedure.
/// - `end_time`: the ending execution time of the procedure.
/// - `status`: the status of the procedure.
/// - `lock_keys`: the lock keys of the procedure.
///
pub(super) struct InformationSchemaProcedureInfo {
schema: SchemaRef,
catalog_manager: Weak<dyn CatalogManager>,
}
impl InformationSchemaProcedureInfo {
pub(super) fn new(catalog_manager: Weak<dyn CatalogManager>) -> Self {
Self {
schema: Self::schema(),
catalog_manager,
}
}
pub(crate) fn schema() -> SchemaRef {
Arc::new(Schema::new(vec![
ColumnSchema::new(PROCEDURE_ID, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(PROCEDURE_TYPE, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(
START_TIME,
ConcreteDataType::timestamp_millisecond_datatype(),
true,
),
ColumnSchema::new(
END_TIME,
ConcreteDataType::timestamp_millisecond_datatype(),
true,
),
ColumnSchema::new(STATUS, ConcreteDataType::string_datatype(), false),
ColumnSchema::new(LOCK_KEYS, ConcreteDataType::string_datatype(), true),
]))
}
fn builder(&self) -> InformationSchemaProcedureInfoBuilder {
InformationSchemaProcedureInfoBuilder::new(
self.schema.clone(),
self.catalog_manager.clone(),
)
}
}
impl InformationTable for InformationSchemaProcedureInfo {
fn table_id(&self) -> TableId {
INFORMATION_SCHEMA_PROCEDURE_INFO_TABLE_ID
}
fn table_name(&self) -> &'static str {
PROCEDURE_INFO
}
fn schema(&self) -> SchemaRef {
self.schema.clone()
}
fn to_stream(&self, request: ScanRequest) -> Result<SendableRecordBatchStream> {
let schema = self.schema.arrow_schema().clone();
let mut builder = self.builder();
let stream = Box::pin(DfRecordBatchStreamAdapter::new(
schema,
futures::stream::once(async move {
builder
.make_procedure_info(Some(request))
.await
.map(|x| x.into_df_record_batch())
.map_err(Into::into)
}),
));
Ok(Box::pin(
RecordBatchStreamAdapter::try_new(stream)
.map_err(BoxedError::new)
.context(InternalSnafu)?,
))
}
}
struct InformationSchemaProcedureInfoBuilder {
schema: SchemaRef,
catalog_manager: Weak<dyn CatalogManager>,
procedure_ids: StringVectorBuilder,
procedure_types: StringVectorBuilder,
start_times: TimestampMillisecondVectorBuilder,
end_times: TimestampMillisecondVectorBuilder,
statuses: StringVectorBuilder,
lock_keys: StringVectorBuilder,
}
impl InformationSchemaProcedureInfoBuilder {
fn new(schema: SchemaRef, catalog_manager: Weak<dyn CatalogManager>) -> Self {
Self {
schema,
catalog_manager,
procedure_ids: StringVectorBuilder::with_capacity(INIT_CAPACITY),
procedure_types: StringVectorBuilder::with_capacity(INIT_CAPACITY),
start_times: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY),
end_times: TimestampMillisecondVectorBuilder::with_capacity(INIT_CAPACITY),
statuses: StringVectorBuilder::with_capacity(INIT_CAPACITY),
lock_keys: StringVectorBuilder::with_capacity(INIT_CAPACITY),
}
}
/// Construct the `information_schema.procedure_info` virtual table
async fn make_procedure_info(&mut self, request: Option<ScanRequest>) -> Result<RecordBatch> {
let predicates = Predicates::from_scan_request(&request);
let mode = utils::running_mode(&self.catalog_manager)?.unwrap_or(Mode::Standalone);
match mode {
Mode::Standalone => {
if let Some(procedure_manager) = utils::procedure_manager(&self.catalog_manager)? {
let procedures = procedure_manager
.list_procedures()
.await
.map_err(BoxedError::new)
.context(ListProceduresSnafu)?;
for procedure in procedures {
self.add_procedure(
&predicates,
procedure.state.as_str_name().to_string(),
procedure,
);
}
} else {
return GetProcedureClientSnafu { mode: "standalone" }.fail();
}
}
Mode::Distributed => {
if let Some(meta_client) = utils::meta_client(&self.catalog_manager)? {
let procedures = meta_client
.list_procedures(&ExecutorContext::default())
.await
.map_err(BoxedError::new)
.context(ListProceduresSnafu)?;
for procedure in procedures.procedures {
self.add_procedure_info(&predicates, procedure)?;
}
} else {
return GetProcedureClientSnafu {
mode: "distributed",
}
.fail();
}
}
};
self.finish()
}
fn add_procedure(
&mut self,
predicates: &Predicates,
status: String,
procedure_info: ProcedureInfo,
) {
let ProcedureInfo {
id,
type_name,
start_time_ms,
end_time_ms,
lock_keys,
..
} = procedure_info;
let pid = id.to_string();
let start_time = TimestampMillisecond(Timestamp::new_millisecond(start_time_ms));
let end_time = TimestampMillisecond(Timestamp::new_millisecond(end_time_ms));
let lock_keys = lock_keys.join(",");
let row = [
(PROCEDURE_ID, &Value::from(pid.clone())),
(PROCEDURE_TYPE, &Value::from(type_name.clone())),
(START_TIME, &Value::from(start_time)),
(END_TIME, &Value::from(end_time)),
(STATUS, &Value::from(status.clone())),
(LOCK_KEYS, &Value::from(lock_keys.clone())),
];
if !predicates.eval(&row) {
return;
}
self.procedure_ids.push(Some(&pid));
self.procedure_types.push(Some(&type_name));
self.start_times.push(Some(start_time));
self.end_times.push(Some(end_time));
self.statuses.push(Some(&status));
self.lock_keys.push(Some(&lock_keys));
}
fn add_procedure_info(
&mut self,
predicates: &Predicates,
procedure: ProcedureMeta,
) -> Result<()> {
let pid = match procedure.id {
Some(pid) => pid,
None => return ProcedureIdNotFoundSnafu {}.fail(),
};
let pid = procedure::pb_pid_to_pid(&pid)
.map_err(BoxedError::new)
.context(ConvertProtoDataSnafu)?;
let status = ProcedureStatus::try_from(procedure.status)
.map(|v| v.as_str_name())
.unwrap_or("Unknown")
.to_string();
let procedure_info = ProcedureInfo {
id: pid,
type_name: procedure.type_name,
start_time_ms: procedure.start_time_ms,
end_time_ms: procedure.end_time_ms,
state: ProcedureState::Running,
lock_keys: procedure.lock_keys,
};
self.add_procedure(predicates, status, procedure_info);
Ok(())
}
fn finish(&mut self) -> Result<RecordBatch> {
let columns: Vec<VectorRef> = vec![
Arc::new(self.procedure_ids.finish()),
Arc::new(self.procedure_types.finish()),
Arc::new(self.start_times.finish()),
Arc::new(self.end_times.finish()),
Arc::new(self.statuses.finish()),
Arc::new(self.lock_keys.finish()),
];
RecordBatch::new(self.schema.clone(), columns).context(CreateRecordBatchSnafu)
}
}
impl DfPartitionStream for InformationSchemaProcedureInfo {
fn schema(&self) -> &ArrowSchemaRef {
self.schema.arrow_schema()
}
fn execute(&self, _: Arc<TaskContext>) -> DfSendableRecordBatchStream {
let schema = self.schema.arrow_schema().clone();
let mut builder = self.builder();
Box::pin(DfRecordBatchStreamAdapter::new(
schema,
futures::stream::once(async move {
builder
.make_procedure_info(None)
.await
.map(|x| x.into_df_record_batch())
.map_err(Into::into)
}),
))
}
}

View File

@@ -0,0 +1,257 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::{Arc, Weak};
use arrow_schema::SchemaRef as ArrowSchemaRef;
use common_catalog::consts::INFORMATION_SCHEMA_REGION_STATISTICS_TABLE_ID;
use common_config::Mode;
use common_error::ext::BoxedError;
use common_meta::cluster::ClusterInfo;
use common_meta::datanode::RegionStat;
use common_recordbatch::adapter::RecordBatchStreamAdapter;
use common_recordbatch::{DfSendableRecordBatchStream, RecordBatch, SendableRecordBatchStream};
use common_telemetry::tracing::warn;
use datafusion::execution::TaskContext;
use datafusion::physical_plan::stream::RecordBatchStreamAdapter as DfRecordBatchStreamAdapter;
use datafusion::physical_plan::streaming::PartitionStream as DfPartitionStream;
use datatypes::prelude::{ConcreteDataType, ScalarVectorBuilder, VectorRef};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::value::Value;
use datatypes::vectors::{StringVectorBuilder, UInt32VectorBuilder, UInt64VectorBuilder};
use snafu::ResultExt;
use store_api::storage::{ScanRequest, TableId};
use super::{InformationTable, REGION_STATISTICS};
use crate::error::{CreateRecordBatchSnafu, InternalSnafu, ListRegionStatsSnafu, Result};
use crate::information_schema::Predicates;
use crate::system_schema::utils;
use crate::CatalogManager;
const REGION_ID: &str = "region_id";
const TABLE_ID: &str = "table_id";
const REGION_NUMBER: &str = "region_number";
const MEMTABLE_SIZE: &str = "memtable_size";
const MANIFEST_SIZE: &str = "manifest_size";
const SST_SIZE: &str = "sst_size";
const ENGINE: &str = "engine";
const REGION_ROLE: &str = "region_role";
const INIT_CAPACITY: usize = 42;
/// The `REGION_STATISTICS` table provides information about region statistics, including the following fields:
///
/// - `region_id`: The region id.
/// - `table_id`: The table id.
/// - `region_number`: The region number.
/// - `memtable_size`: The memtable size in bytes.
/// - `manifest_size`: The manifest size in bytes.
/// - `sst_size`: The sst size in bytes.
/// - `engine`: The engine type.
/// - `region_role`: The region role.
///
pub(super) struct InformationSchemaRegionStatistics {
schema: SchemaRef,
catalog_manager: Weak<dyn CatalogManager>,
}
impl InformationSchemaRegionStatistics {
pub(super) fn new(catalog_manager: Weak<dyn CatalogManager>) -> Self {
Self {
schema: Self::schema(),
catalog_manager,
}
}
pub(crate) fn schema() -> SchemaRef {
Arc::new(Schema::new(vec![
ColumnSchema::new(REGION_ID, ConcreteDataType::uint64_datatype(), false),
ColumnSchema::new(TABLE_ID, ConcreteDataType::uint32_datatype(), false),
ColumnSchema::new(REGION_NUMBER, ConcreteDataType::uint32_datatype(), false),
ColumnSchema::new(MEMTABLE_SIZE, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(MANIFEST_SIZE, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(SST_SIZE, ConcreteDataType::uint64_datatype(), true),
ColumnSchema::new(ENGINE, ConcreteDataType::string_datatype(), true),
ColumnSchema::new(REGION_ROLE, ConcreteDataType::string_datatype(), true),
]))
}
fn builder(&self) -> InformationSchemaRegionStatisticsBuilder {
InformationSchemaRegionStatisticsBuilder::new(
self.schema.clone(),
self.catalog_manager.clone(),
)
}
}
impl InformationTable for InformationSchemaRegionStatistics {
fn table_id(&self) -> TableId {
INFORMATION_SCHEMA_REGION_STATISTICS_TABLE_ID
}
fn table_name(&self) -> &'static str {
REGION_STATISTICS
}
fn schema(&self) -> SchemaRef {
self.schema.clone()
}
fn to_stream(&self, request: ScanRequest) -> Result<SendableRecordBatchStream> {
let schema = self.schema.arrow_schema().clone();
let mut builder = self.builder();
let stream = Box::pin(DfRecordBatchStreamAdapter::new(
schema,
futures::stream::once(async move {
builder
.make_region_statistics(Some(request))
.await
.map(|x| x.into_df_record_batch())
.map_err(Into::into)
}),
));
Ok(Box::pin(
RecordBatchStreamAdapter::try_new(stream)
.map_err(BoxedError::new)
.context(InternalSnafu)?,
))
}
}
struct InformationSchemaRegionStatisticsBuilder {
schema: SchemaRef,
catalog_manager: Weak<dyn CatalogManager>,
region_ids: UInt64VectorBuilder,
table_ids: UInt32VectorBuilder,
region_numbers: UInt32VectorBuilder,
memtable_sizes: UInt64VectorBuilder,
manifest_sizes: UInt64VectorBuilder,
sst_sizes: UInt64VectorBuilder,
engines: StringVectorBuilder,
region_roles: StringVectorBuilder,
}
impl InformationSchemaRegionStatisticsBuilder {
fn new(schema: SchemaRef, catalog_manager: Weak<dyn CatalogManager>) -> Self {
Self {
schema,
catalog_manager,
region_ids: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
table_ids: UInt32VectorBuilder::with_capacity(INIT_CAPACITY),
region_numbers: UInt32VectorBuilder::with_capacity(INIT_CAPACITY),
memtable_sizes: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
manifest_sizes: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
sst_sizes: UInt64VectorBuilder::with_capacity(INIT_CAPACITY),
engines: StringVectorBuilder::with_capacity(INIT_CAPACITY),
region_roles: StringVectorBuilder::with_capacity(INIT_CAPACITY),
}
}
/// Construct a new `InformationSchemaRegionStatistics` from the collected data.
async fn make_region_statistics(
&mut self,
request: Option<ScanRequest>,
) -> Result<RecordBatch> {
let predicates = Predicates::from_scan_request(&request);
let mode = utils::running_mode(&self.catalog_manager)?.unwrap_or(Mode::Standalone);
match mode {
Mode::Standalone => {
// TODO(weny): implement it
}
Mode::Distributed => {
if let Some(meta_client) = utils::meta_client(&self.catalog_manager)? {
let region_stats = meta_client
.list_region_stats()
.await
.map_err(BoxedError::new)
.context(ListRegionStatsSnafu)?;
for region_stat in region_stats {
self.add_region_statistic(&predicates, region_stat);
}
} else {
warn!("Meta client is not available");
}
}
}
self.finish()
}
fn add_region_statistic(&mut self, predicate: &Predicates, region_stat: RegionStat) {
let row = [
(REGION_ID, &Value::from(region_stat.id.as_u64())),
(TABLE_ID, &Value::from(region_stat.id.table_id())),
(REGION_NUMBER, &Value::from(region_stat.id.region_number())),
(MEMTABLE_SIZE, &Value::from(region_stat.memtable_size)),
(MANIFEST_SIZE, &Value::from(region_stat.manifest_size)),
(SST_SIZE, &Value::from(region_stat.sst_size)),
(ENGINE, &Value::from(region_stat.engine.as_str())),
(REGION_ROLE, &Value::from(region_stat.role.to_string())),
];
if !predicate.eval(&row) {
return;
}
self.region_ids.push(Some(region_stat.id.as_u64()));
self.table_ids.push(Some(region_stat.id.table_id()));
self.region_numbers
.push(Some(region_stat.id.region_number()));
self.memtable_sizes.push(Some(region_stat.memtable_size));
self.manifest_sizes.push(Some(region_stat.manifest_size));
self.sst_sizes.push(Some(region_stat.sst_size));
self.engines.push(Some(&region_stat.engine));
self.region_roles.push(Some(&region_stat.role.to_string()));
}
fn finish(&mut self) -> Result<RecordBatch> {
let columns: Vec<VectorRef> = vec![
Arc::new(self.region_ids.finish()),
Arc::new(self.table_ids.finish()),
Arc::new(self.region_numbers.finish()),
Arc::new(self.memtable_sizes.finish()),
Arc::new(self.manifest_sizes.finish()),
Arc::new(self.sst_sizes.finish()),
Arc::new(self.engines.finish()),
Arc::new(self.region_roles.finish()),
];
RecordBatch::new(self.schema.clone(), columns).context(CreateRecordBatchSnafu)
}
}
impl DfPartitionStream for InformationSchemaRegionStatistics {
fn schema(&self) -> &ArrowSchemaRef {
self.schema.arrow_schema()
}
fn execute(&self, _: Arc<TaskContext>) -> DfSendableRecordBatchStream {
let schema = self.schema.arrow_schema().clone();
let mut builder = self.builder();
Box::pin(DfRecordBatchStreamAdapter::new(
schema,
futures::stream::once(async move {
builder
.make_region_statistics(None)
.await
.map(|x| x.into_df_record_batch())
.map_err(Into::into)
}),
))
}
}

View File

@@ -45,3 +45,5 @@ pub const TABLE_CONSTRAINTS: &str = "table_constraints";
pub const CLUSTER_INFO: &str = "cluster_info";
pub const VIEWS: &str = "views";
pub const FLOWS: &str = "flows";
pub const PROCEDURE_INFO: &str = "procedure_info";
pub const REGION_STATISTICS: &str = "region_statistics";

View File

@@ -18,6 +18,7 @@ use std::sync::{Arc, Weak};
use common_config::Mode;
use common_meta::key::TableMetadataManagerRef;
use common_procedure::ProcedureManagerRef;
use meta_client::client::MetaClient;
use snafu::OptionExt;
@@ -68,3 +69,17 @@ pub fn table_meta_manager(
.downcast_ref::<KvBackendCatalogManager>()
.map(|manager| manager.table_metadata_manager_ref().clone()))
}
/// Try to get the `[ProcedureManagerRef]` from `[CatalogManager]` weak reference.
pub fn procedure_manager(
catalog_manager: &Weak<dyn CatalogManager>,
) -> Result<Option<ProcedureManagerRef>> {
let catalog_manager = catalog_manager
.upgrade()
.context(UpgradeWeakCatalogManagerRefSnafu)?;
Ok(catalog_manager
.as_any()
.downcast_ref::<KvBackendCatalogManager>()
.and_then(|manager| manager.procedure_manager()))
}

View File

@@ -327,6 +327,7 @@ mod tests {
None,
backend.clone(),
layered_cache_registry,
None,
);
let table_metadata_manager = TableMetadataManager::new(backend);
let mut view_info = common_meta::key::test_utils::new_test_table_info(1024, vec![]);

View File

@@ -15,10 +15,11 @@
#![doc = include_str!("../../../../README.md")]
use clap::{Parser, Subcommand};
use cmd::error::Result;
use cmd::error::{InitTlsProviderSnafu, Result};
use cmd::options::GlobalOptions;
use cmd::{cli, datanode, flownode, frontend, metasrv, standalone, App};
use common_version::version;
use servers::install_ring_crypto_provider;
#[derive(Parser)]
#[command(name = "greptime", author, version, long_version = version(), about)]
@@ -94,6 +95,7 @@ async fn main() -> Result<()> {
async fn main_body() -> Result<()> {
setup_human_panic();
install_ring_crypto_provider().map_err(|msg| InitTlsProviderSnafu { msg }.build())?;
start(Command::parse()).await
}

View File

@@ -35,7 +35,6 @@ use either::Either;
use meta_client::client::MetaClientBuilder;
use query::datafusion::DatafusionQueryEngine;
use query::parser::QueryLanguageParser;
use query::plan::LogicalPlan;
use query::query_engine::{DefaultSerializer, QueryEngineState};
use query::QueryEngine;
use rustyline::error::ReadlineError;
@@ -179,7 +178,7 @@ impl Repl {
.await
.context(PlanStatementSnafu)?;
let LogicalPlan::DfPlan(plan) = query_engine
let plan = query_engine
.optimize(&query_engine.engine_context(query_ctx), &plan)
.context(PlanStatementSnafu)?;
@@ -281,6 +280,7 @@ async fn create_query_engine(meta_addr: &str) -> Result<DatafusionQueryEngine> {
Some(meta_client.clone()),
cached_meta_backend.clone(),
layered_cache_registry,
None,
);
let plugins: Plugins = Default::default();
let state = Arc::new(QueryEngineState::new(

View File

@@ -24,6 +24,12 @@ use snafu::{Location, Snafu};
#[snafu(visibility(pub))]
#[stack_trace_debug]
pub enum Error {
#[snafu(display("Failed to install ring crypto provider: {}", msg))]
InitTlsProvider {
#[snafu(implicit)]
location: Location,
msg: String,
},
#[snafu(display("Failed to create default catalog and schema"))]
InitMetadata {
#[snafu(implicit)]
@@ -369,9 +375,10 @@ impl ErrorExt for Error {
}
Error::SubstraitEncodeLogicalPlan { source, .. } => source.status_code(),
Error::SerdeJson { .. } | Error::FileIo { .. } | Error::SpawnThread { .. } => {
StatusCode::Unexpected
}
Error::SerdeJson { .. }
| Error::FileIo { .. }
| Error::SpawnThread { .. }
| Error::InitTlsProvider { .. } => StatusCode::Unexpected,
Error::Other { source, .. } => source.status_code(),

View File

@@ -274,6 +274,7 @@ impl StartCommand {
Some(meta_client.clone()),
cached_meta_backend.clone(),
layered_cache_registry.clone(),
None,
);
let table_metadata_manager =

View File

@@ -320,6 +320,7 @@ impl StartCommand {
Some(meta_client.clone()),
cached_meta_backend.clone(),
layered_cache_registry.clone(),
None,
);
let executor = HandlerGroupExecutor::new(vec![

View File

@@ -482,6 +482,7 @@ impl StartCommand {
None,
kv_backend.clone(),
layered_cache_registry.clone(),
Some(procedure_manager.clone()),
);
let table_metadata_manager =

View File

@@ -8,7 +8,7 @@ license.workspace = true
workspace = true
[dependencies]
anymap = "1.0.0-beta.2"
anymap2 = "0.13"
async-trait.workspace = true
bitvec = "1.0"
bytes.workspace = true

View File

@@ -12,20 +12,21 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::sync::{Arc, RwLock, RwLockReadGuard, RwLockWriteGuard};
/// [`Plugins`] is a wrapper of [AnyMap](https://github.com/chris-morgan/anymap) and provides a thread-safe way to store and retrieve plugins.
use anymap2::SendSyncAnyMap;
/// [`Plugins`] is a wrapper of [anymap2](https://github.com/azriel91/anymap2) and provides a thread-safe way to store and retrieve plugins.
/// Make it Cloneable and we can treat it like an Arc struct.
#[derive(Default, Clone)]
pub struct Plugins {
inner: Arc<RwLock<anymap::Map<dyn Any + Send + Sync>>>,
inner: Arc<RwLock<SendSyncAnyMap>>,
}
impl Plugins {
pub fn new() -> Self {
Self {
inner: Arc::new(RwLock::new(anymap::Map::new())),
inner: Arc::new(RwLock::new(SendSyncAnyMap::new())),
}
}
@@ -61,11 +62,11 @@ impl Plugins {
self.read().is_empty()
}
fn read(&self) -> RwLockReadGuard<anymap::Map<dyn Any + Send + Sync>> {
fn read(&self) -> RwLockReadGuard<SendSyncAnyMap> {
self.inner.read().unwrap()
}
fn write(&self) -> RwLockWriteGuard<anymap::Map<dyn Any + Send + Sync>> {
fn write(&self) -> RwLockWriteGuard<SendSyncAnyMap> {
self.inner.write().unwrap()
}
}
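// Note: `SendSyncAnyMap` is keyed by `TypeId`, so `Plugins` holds at most one
// value per concrete type; inserting a second value of the same type replaces
// the first.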

View File

@@ -98,6 +98,11 @@ pub const INFORMATION_SCHEMA_CLUSTER_INFO_TABLE_ID: u32 = 31;
pub const INFORMATION_SCHEMA_VIEW_TABLE_ID: u32 = 32;
/// id for information_schema.FLOWS
pub const INFORMATION_SCHEMA_FLOW_TABLE_ID: u32 = 33;
/// id for information_schema.procedure_info
pub const INFORMATION_SCHEMA_PROCEDURE_INFO_TABLE_ID: u32 = 34;
/// id for information_schema.region_statistics
pub const INFORMATION_SCHEMA_REGION_STATISTICS_TABLE_ID: u32 = 35;
/// ----- End of information_schema tables -----
/// ----- Begin of pg_catalog tables -----

View File

@@ -38,6 +38,8 @@ pub enum StatusCode {
Cancelled = 1005,
/// Illegal state, can be exposed to users.
IllegalState = 1006,
/// Caused by an error originating from an external system.
External = 1007,
// ====== End of common status code ================
// ====== Begin of SQL related status code =========
@@ -162,7 +164,8 @@ impl StatusCode {
| StatusCode::InvalidAuthHeader
| StatusCode::AccessDenied
| StatusCode::PermissionDenied
| StatusCode::RequestOutdated => false,
| StatusCode::RequestOutdated
| StatusCode::External => false,
}
}
@@ -177,7 +180,9 @@ impl StatusCode {
| StatusCode::IllegalState
| StatusCode::EngineExecuteQuery
| StatusCode::StorageUnavailable
| StatusCode::RuntimeResourcesExhausted => true,
| StatusCode::RuntimeResourcesExhausted
| StatusCode::External => true,
StatusCode::Success
| StatusCode::Unsupported
| StatusCode::InvalidArguments
@@ -256,7 +261,7 @@ macro_rules! define_into_tonic_status {
pub fn status_to_tonic_code(status_code: StatusCode) -> Code {
match status_code {
StatusCode::Success => Code::Ok,
StatusCode::Unknown => Code::Unknown,
StatusCode::Unknown | StatusCode::External => Code::Unknown,
StatusCode::Unsupported => Code::Unimplemented,
StatusCode::Unexpected
| StatusCode::IllegalState

View File

@@ -27,6 +27,7 @@ common-time.workspace = true
common-version.workspace = true
datafusion.workspace = true
datatypes.workspace = true
derive_more = { version = "1", default-features = false, features = ["display"] }
geohash = { version = "0.13", optional = true }
h3o = { version = "0.6", optional = true }
jsonb.workspace = true

View File

@@ -16,7 +16,6 @@ mod argmax;
mod argmin;
mod diff;
mod mean;
mod percentile;
mod polyval;
mod scipy_stats_norm_cdf;
mod scipy_stats_norm_pdf;
@@ -28,7 +27,6 @@ pub use argmin::ArgminAccumulatorCreator;
use common_query::logical_plan::AggregateFunctionCreatorRef;
pub use diff::DiffAccumulatorCreator;
pub use mean::MeanAccumulatorCreator;
pub use percentile::PercentileAccumulatorCreator;
pub use polyval::PolyvalAccumulatorCreator;
pub use scipy_stats_norm_cdf::ScipyStatsNormCdfAccumulatorCreator;
pub use scipy_stats_norm_pdf::ScipyStatsNormPdfAccumulatorCreator;
@@ -91,7 +89,6 @@ impl AggregateFunctions {
register_aggr_func!("polyval", 2, PolyvalAccumulatorCreator);
register_aggr_func!("argmax", 1, ArgmaxAccumulatorCreator);
register_aggr_func!("argmin", 1, ArgminAccumulatorCreator);
register_aggr_func!("percentile", 2, PercentileAccumulatorCreator);
register_aggr_func!("scipystatsnormcdf", 2, ScipyStatsNormCdfAccumulatorCreator);
register_aggr_func!("scipystatsnormpdf", 2, ScipyStatsNormPdfAccumulatorCreator);
}

View File

@@ -1,436 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::Arc;
use common_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
self, BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu,
FromScalarValueSnafu, InvalidInputColSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
use datatypes::prelude::*;
use datatypes::types::OrdPrimitive;
use datatypes::value::{ListValue, OrderedFloat};
use datatypes::vectors::{ConstantVector, Float64Vector, Helper, ListVector};
use datatypes::with_match_primitive_type_id;
use num::NumCast;
use snafu::{ensure, OptionExt, ResultExt};
// https://numpy.org/doc/stable/reference/generated/numpy.percentile.html?highlight=percentile#numpy.percentile
// When p is 50, the percentile is the median.
// We keep two heaps, `greater` and `not_greater`:
// `not_greater` holds the values that are not greater than the current p-value,
// `greater` holds the values that are greater than the current p-value.
// This mirrors numpy's percentile:
// Given a vector V of length N, the q-th percentile of V is the value q/100 of the way from the minimum to the maximum in a sorted copy of V.
// The values and distances of the two nearest neighbors, together with the method parameter, determine the percentile
// when the normalized ranking does not match the location of q exactly.
// The result is the same as the median if q=50, the minimum if q=0 and the maximum if q=100.
// The method parameter selects the interpolation used when the desired quantile lies between two data points i < j.
// With 'q' the percentile, 'n' the sample size, and alpha and beta correction constants, the virtual index "i + g"
// of the quantile in the sorted sample is
// i + g = (q / 100) * (n - alpha - beta + 1) + alpha - 1
// with 'i' being the floor and 'g' the fractional part of the result.
// The default method is linear, where
// alpha = 1
// beta = 1
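// Worked example with the linear method: for the sorted sample V = [4, 7, 10]
// (n = 3) and q = 40, the virtual index is (n - 1) * q / 100 = 0.8, so i = 0 and
// g = 0.8, and the result is V[0] * (1 - 0.8) + V[1] * 0.8 = 6.4, matching
// numpy.percentile([10, 7, 4], 40) in the tests below.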
#[derive(Debug, Default)]
pub struct Percentile<T>
where
T: WrapperType,
{
greater: BinaryHeap<Reverse<OrdPrimitive<T>>>,
not_greater: BinaryHeap<OrdPrimitive<T>>,
n: u64,
p: Option<f64>,
}
impl<T> Percentile<T>
where
T: WrapperType,
{
fn push(&mut self, value: T) {
let value = OrdPrimitive::<T>(value);
self.n += 1;
if self.not_greater.is_empty() {
self.not_greater.push(value);
return;
}
// Keep `not_greater` at length floor + 1, so that peeking `not_greater`
// yields array[floor] and peeking `greater` yields array[floor + 1].
let p = self.p.unwrap_or(0.0_f64);
let floor = (((self.n - 1) as f64) * p / (100_f64)).floor();
if value <= *self.not_greater.peek().unwrap() {
self.not_greater.push(value);
if self.not_greater.len() > (floor + 1.0) as usize {
self.greater.push(Reverse(self.not_greater.pop().unwrap()));
}
} else {
self.greater.push(Reverse(value));
if self.not_greater.len() < (floor + 1.0) as usize {
self.not_greater.push(self.greater.pop().unwrap().0);
}
}
}
}
impl<T> Accumulator for Percentile<T>
where
T: WrapperType,
{
fn state(&self) -> Result<Vec<Value>> {
let nums = self
.greater
.iter()
.map(|x| &x.0)
.chain(self.not_greater.iter())
.map(|&n| n.into())
.collect::<Vec<Value>>();
Ok(vec![
Value::List(ListValue::new(nums, T::LogicalType::build_data_type())),
self.p.into(),
])
}
fn update_batch(&mut self, values: &[VectorRef]) -> Result<()> {
if values.is_empty() {
return Ok(());
}
ensure!(values.len() == 2, InvalidInputStateSnafu);
ensure!(values[0].len() == values[1].len(), InvalidInputStateSnafu);
if values[0].len() == 0 {
return Ok(());
}
// The first column holds the sample values; the second holds the percentile parameter p.
let column = &values[0];
let mut len = 1;
let column: &<T as Scalar>::VectorType = if column.is_const() {
len = column.len();
let column: &ConstantVector = unsafe { Helper::static_cast(column) };
unsafe { Helper::static_cast(column.inner()) }
} else {
unsafe { Helper::static_cast(column) }
};
let x = &values[1];
let x = Helper::check_get_scalar::<f64>(x).context(error::InvalidInputTypeSnafu {
err_msg: "expecting \"POLYVAL\" function's second argument to be float64",
})?;
// `get(0)` is safe because we have checked `values[1].len() == values[0].len() != 0`
let first = x.get(0);
ensure!(!first.is_null(), InvalidInputColSnafu);
for i in 1..x.len() {
ensure!(first == x.get(i), InvalidInputColSnafu);
}
let first = match first {
Value::Float64(OrderedFloat(v)) => v,
// unreachable because we have checked `first` is not null and is float64 above
_ => unreachable!(),
};
if let Some(p) = self.p {
ensure!(p == first, InvalidInputColSnafu);
} else {
self.p = Some(first);
};
(0..len).for_each(|_| {
for v in column.iter_data().flatten() {
self.push(v);
}
});
Ok(())
}
fn merge_batch(&mut self, states: &[VectorRef]) -> Result<()> {
if states.is_empty() {
return Ok(());
}
ensure!(
states.len() == 2,
BadAccumulatorImplSnafu {
err_msg: "expect 2 states in `merge_batch`"
}
);
let p = &states[1];
let p = p
.as_any()
.downcast_ref::<Float64Vector>()
.with_context(|| DowncastVectorSnafu {
err_msg: format!(
"expect float64vector, got vector type {}",
p.vector_type_name()
),
})?;
let p = p.get(0);
if p.is_null() {
return Ok(());
}
let p = match p {
Value::Float64(OrderedFloat(p)) => p,
_ => unreachable!(),
};
self.p = Some(p);
let values = &states[0];
let values = values
.as_any()
.downcast_ref::<ListVector>()
.with_context(|| DowncastVectorSnafu {
err_msg: format!(
"expect ListVector, got vector type {}",
values.vector_type_name()
),
})?;
for value in values.values_iter() {
if let Some(value) = value.context(FromScalarValueSnafu)? {
let column: &<T as Scalar>::VectorType = unsafe { Helper::static_cast(&value) };
for v in column.iter_data().flatten() {
self.push(v);
}
}
}
Ok(())
}
fn evaluate(&self) -> Result<Value> {
if self.not_greater.is_empty() {
assert!(
self.greater.is_empty(),
"not expected in two-heap percentile algorithm, there must be a bug when implementing it"
);
}
let not_greater = self.not_greater.peek();
if not_greater.is_none() {
return Ok(Value::Null);
}
let not_greater = (*self.not_greater.peek().unwrap()).as_primitive();
let percentile = if self.greater.is_empty() {
NumCast::from(not_greater).unwrap()
} else {
let greater = self.greater.peek().unwrap();
let p = if let Some(p) = self.p {
p
} else {
return Ok(Value::Null);
};
let fract = (((self.n - 1) as f64) * p / 100_f64).fract();
let not_greater_v: f64 = NumCast::from(not_greater).unwrap();
let greater_v: f64 = NumCast::from(greater.0.as_primitive()).unwrap();
not_greater_v * (1.0 - fract) + greater_v * fract
};
Ok(Value::from(percentile))
}
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct PercentileAccumulatorCreator {}
impl AggregateFunctionCreator for PercentileAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
let creator: AccumulatorCreatorFunction = Arc::new(move |types: &[ConcreteDataType]| {
let input_type = &types[0];
with_match_primitive_type_id!(
input_type.logical_type_id(),
|$S| {
Ok(Box::new(Percentile::<<$S as LogicalPrimitiveType>::Wrapper>::default()))
},
{
let err_msg = format!(
"\"PERCENTILE\" aggregate function not support data type {:?}",
input_type.logical_type_id(),
);
CreateAccumulatorSnafu { err_msg }.fail()?
}
)
});
creator
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 2, InvalidInputStateSnafu);
// checked above: there are exactly two input types (the values and p)
Ok(ConcreteDataType::float64_datatype())
}
fn state_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 2, InvalidInputStateSnafu);
Ok(vec![
ConcreteDataType::list_datatype(input_types.into_iter().next().unwrap()),
ConcreteDataType::float64_datatype(),
])
}
}
#[cfg(test)]
mod test {
use datatypes::vectors::{Float64Vector, Int32Vector};
use super::*;
#[test]
fn test_update_batch() {
// test update empty batch, expect not updating anything
let mut percentile = Percentile::<i32>::default();
percentile.update_batch(&[]).unwrap();
assert!(percentile.not_greater.is_empty());
assert!(percentile.greater.is_empty());
assert_eq!(Value::Null, percentile.evaluate().unwrap());
// test update one not-null value
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(42)])),
Arc::new(Float64Vector::from(vec![Some(100.0_f64)])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(42.0_f64), percentile.evaluate().unwrap());
// test update one null value
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Option::<i32>::None])),
Arc::new(Float64Vector::from(vec![Some(100.0_f64)])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::Null, percentile.evaluate().unwrap());
// test update no null-value batch
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(-1i32), Some(1), Some(2)])),
Arc::new(Float64Vector::from(vec![
Some(100.0_f64),
Some(100.0_f64),
Some(100.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(2_f64), percentile.evaluate().unwrap());
// test update null-value batch
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(-2i32), None, Some(3), Some(4)])),
Arc::new(Float64Vector::from(vec![
Some(100.0_f64),
Some(100.0_f64),
Some(100.0_f64),
Some(100.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(4_f64), percentile.evaluate().unwrap());
// test update with constant vector
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(ConstantVector::new(
Arc::new(Int32Vector::from_vec(vec![4])),
2,
)),
Arc::new(Float64Vector::from(vec![Some(100.0_f64), Some(100.0_f64)])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(4_f64), percentile.evaluate().unwrap());
// test left border
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(-1i32), Some(1), Some(2)])),
Arc::new(Float64Vector::from(vec![
Some(0.0_f64),
Some(0.0_f64),
Some(0.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(-1.0_f64), percentile.evaluate().unwrap());
// test medium
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(-1i32), Some(1), Some(2)])),
Arc::new(Float64Vector::from(vec![
Some(50.0_f64),
Some(50.0_f64),
Some(50.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(1.0_f64), percentile.evaluate().unwrap());
// test right border
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(-1i32), Some(1), Some(2)])),
Arc::new(Float64Vector::from(vec![
Some(100.0_f64),
Some(100.0_f64),
Some(100.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(2.0_f64), percentile.evaluate().unwrap());
// the following is the result of numpy.percentile
// numpy.percentile
// a = np.array([[10,7,4]])
// np.percentile(a,40)
// >> 6.400000000000
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(10i32), Some(7), Some(4)])),
Arc::new(Float64Vector::from(vec![
Some(40.0_f64),
Some(40.0_f64),
Some(40.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(Value::from(6.400000000_f64), percentile.evaluate().unwrap());
// the following is the result of numpy.percentile
// a = np.array([[10,7,4]])
// np.percentile(a,95)
// >> 9.7000000000000011
let mut percentile = Percentile::<i32>::default();
let v: Vec<VectorRef> = vec![
Arc::new(Int32Vector::from(vec![Some(10i32), Some(7), Some(4)])),
Arc::new(Float64Vector::from(vec![
Some(95.0_f64),
Some(95.0_f64),
Some(95.0_f64),
])),
];
percentile.update_batch(&v).unwrap();
assert_eq!(
Value::from(9.700_000_000_000_001_f64),
percentile.evaluate().unwrap()
);
}
}

View File

@@ -16,8 +16,7 @@ use std::sync::Arc;
mod geohash;
mod h3;
use geohash::GeohashFunction;
use h3::H3Function;
use geohash::{GeohashFunction, GeohashNeighboursFunction};
use crate::function_registry::FunctionRegistry;
@@ -25,7 +24,21 @@ pub(crate) struct GeoFunctions;
impl GeoFunctions {
pub fn register(registry: &FunctionRegistry) {
// geohash
registry.register(Arc::new(GeohashFunction));
registry.register(Arc::new(H3Function));
registry.register(Arc::new(GeohashNeighboursFunction));
// h3 family
registry.register(Arc::new(h3::H3LatLngToCell));
registry.register(Arc::new(h3::H3LatLngToCellString));
registry.register(Arc::new(h3::H3CellBase));
registry.register(Arc::new(h3::H3CellCenterChild));
registry.register(Arc::new(h3::H3CellCenterLat));
registry.register(Arc::new(h3::H3CellCenterLng));
registry.register(Arc::new(h3::H3CellIsPentagon));
registry.register(Arc::new(h3::H3CellParent));
registry.register(Arc::new(h3::H3CellResolution));
registry.register(Arc::new(h3::H3CellToString));
registry.register(Arc::new(h3::H3IsNeighbour));
registry.register(Arc::new(h3::H3StringToCell));
}
}

View File

@@ -20,23 +20,69 @@ use common_query::error::{self, InvalidFuncArgsSnafu, Result};
use common_query::prelude::{Signature, TypeSignature};
use datafusion::logical_expr::Volatility;
use datatypes::prelude::ConcreteDataType;
use datatypes::scalars::ScalarVectorBuilder;
use datatypes::value::Value;
use datatypes::vectors::{MutableVector, StringVectorBuilder, VectorRef};
use datatypes::scalars::{Scalar, ScalarVectorBuilder};
use datatypes::value::{ListValue, Value};
use datatypes::vectors::{ListVectorBuilder, MutableVector, StringVectorBuilder, VectorRef};
use geohash::Coord;
use snafu::{ensure, ResultExt};
use crate::function::{Function, FunctionContext};
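// Geohash strings are between 1 and 12 characters long, so resolution
// arguments are validated against [1, 12] below.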
macro_rules! ensure_resolution_usize {
($v: ident) => {
if !($v > 0 && $v <= 12) {
Err(BoxedError::new(PlainError::new(
format!("Invalid geohash resolution {}, expect value: [1, 12]", $v),
StatusCode::EngineExecuteQuery,
)))
.context(error::ExecuteSnafu)
} else {
Ok($v as usize)
}
};
}
fn try_into_resolution(v: Value) -> Result<usize> {
match v {
Value::Int8(v) => {
ensure_resolution_usize!(v)
}
Value::Int16(v) => {
ensure_resolution_usize!(v)
}
Value::Int32(v) => {
ensure_resolution_usize!(v)
}
Value::Int64(v) => {
ensure_resolution_usize!(v)
}
Value::UInt8(v) => {
ensure_resolution_usize!(v)
}
Value::UInt16(v) => {
ensure_resolution_usize!(v)
}
Value::UInt32(v) => {
ensure_resolution_usize!(v)
}
Value::UInt64(v) => {
ensure_resolution_usize!(v)
}
_ => unreachable!(),
}
}
/// Function that returns the geohash string for a given geospatial coordinate.
#[derive(Clone, Debug, Default)]
pub struct GeohashFunction;
const NAME: &str = "geohash";
impl GeohashFunction {
const NAME: &'static str = "geohash";
}
impl Function for GeohashFunction {
fn name(&self) -> &str {
NAME
Self::NAME
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
@@ -93,17 +139,7 @@ impl Function for GeohashFunction {
for i in 0..size {
let lat = lat_vec.get(i).as_f64_lossy();
let lon = lon_vec.get(i).as_f64_lossy();
let r = match resolution_vec.get(i) {
Value::Int8(v) => v as usize,
Value::Int16(v) => v as usize,
Value::Int32(v) => v as usize,
Value::Int64(v) => v as usize,
Value::UInt8(v) => v as usize,
Value::UInt16(v) => v as usize,
Value::UInt32(v) => v as usize,
Value::UInt64(v) => v as usize,
_ => unreachable!(),
};
let r = try_into_resolution(resolution_vec.get(i))?;
let result = match (lat, lon) {
(Some(lat), Some(lon)) => {
@@ -130,6 +166,134 @@ impl Function for GeohashFunction {
impl fmt::Display for GeohashFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", NAME)
write!(f, "{}", Self::NAME)
}
}
/// Function that returns the geohash neighbours, as a list of strings, for a given geospatial coordinate.
#[derive(Clone, Debug, Default)]
pub struct GeohashNeighboursFunction;
impl GeohashNeighboursFunction {
const NAME: &'static str = "geohash_neighbours";
}
impl Function for GeohashNeighboursFunction {
fn name(&self) -> &str {
GeohashNeighboursFunction::NAME
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::list_datatype(
ConcreteDataType::string_datatype(),
))
}
fn signature(&self) -> Signature {
let mut signatures = Vec::new();
for coord_type in &[
ConcreteDataType::float32_datatype(),
ConcreteDataType::float64_datatype(),
] {
for resolution_type in &[
ConcreteDataType::int8_datatype(),
ConcreteDataType::int16_datatype(),
ConcreteDataType::int32_datatype(),
ConcreteDataType::int64_datatype(),
ConcreteDataType::uint8_datatype(),
ConcreteDataType::uint16_datatype(),
ConcreteDataType::uint32_datatype(),
ConcreteDataType::uint64_datatype(),
] {
signatures.push(TypeSignature::Exact(vec![
// latitude
coord_type.clone(),
// longitude
coord_type.clone(),
// resolution
resolution_type.clone(),
]));
}
}
Signature::one_of(signatures, Volatility::Stable)
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 3,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 3, provided : {}",
columns.len()
),
}
);
let lat_vec = &columns[0];
let lon_vec = &columns[1];
let resolution_vec = &columns[2];
let size = lat_vec.len();
let mut results =
ListVectorBuilder::with_type_capacity(ConcreteDataType::string_datatype(), size);
for i in 0..size {
let lat = lat_vec.get(i).as_f64_lossy();
let lon = lon_vec.get(i).as_f64_lossy();
let r = try_into_resolution(resolution_vec.get(i))?;
let result = match (lat, lon) {
(Some(lat), Some(lon)) => {
let coord = Coord { x: lon, y: lat };
let encoded = geohash::encode(coord, r)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("Geohash error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?;
let neighbours = geohash::neighbors(&encoded)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("Geohash error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?;
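// Return the 8 neighbours in a fixed order: n, nw, w, sw, s, se, e, ne
// (counter-clockwise from north).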
Some(ListValue::new(
vec![
neighbours.n,
neighbours.nw,
neighbours.w,
neighbours.sw,
neighbours.s,
neighbours.se,
neighbours.e,
neighbours.ne,
]
.into_iter()
.map(Value::from)
.collect(),
ConcreteDataType::string_datatype(),
))
}
_ => None,
};
if let Some(list_value) = result {
results.push(Some(list_value.as_scalar_ref()));
} else {
results.push(None);
}
}
Ok(results.to_vector())
}
}
impl fmt::Display for GeohashNeighboursFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", GeohashNeighboursFunction::NAME)
}
}

View File

@@ -12,7 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt;
use std::str::FromStr;
use common_error::ext::{BoxedError, PlainError};
use common_error::status_code::StatusCode;
@@ -22,23 +22,118 @@ use datafusion::logical_expr::Volatility;
use datatypes::prelude::ConcreteDataType;
use datatypes::scalars::ScalarVectorBuilder;
use datatypes::value::Value;
use datatypes::vectors::{MutableVector, StringVectorBuilder, VectorRef};
use h3o::{LatLng, Resolution};
use datatypes::vectors::{
BooleanVectorBuilder, Float64VectorBuilder, MutableVector, StringVectorBuilder,
UInt64VectorBuilder, UInt8VectorBuilder, VectorRef,
};
use derive_more::Display;
use h3o::{CellIndex, LatLng, Resolution};
use snafu::{ensure, ResultExt};
use crate::function::{Function, FunctionContext};
/// Function that returns [h3] encoding string for a given geospatial coordinate.
/// Function that returns the [h3] encoded cell id for a given geospatial coordinate.
///
/// [h3]: https://h3geo.org/
#[derive(Clone, Debug, Default)]
pub struct H3Function;
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3LatLngToCell;
const NAME: &str = "h3";
impl Function for H3Function {
impl Function for H3LatLngToCell {
fn name(&self) -> &str {
NAME
"h3_latlng_to_cell"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint64_datatype())
}
fn signature(&self) -> Signature {
let mut signatures = Vec::new();
for coord_type in &[
ConcreteDataType::float32_datatype(),
ConcreteDataType::float64_datatype(),
] {
for resolution_type in &[
ConcreteDataType::int8_datatype(),
ConcreteDataType::int16_datatype(),
ConcreteDataType::int32_datatype(),
ConcreteDataType::int64_datatype(),
ConcreteDataType::uint8_datatype(),
ConcreteDataType::uint16_datatype(),
ConcreteDataType::uint32_datatype(),
ConcreteDataType::uint64_datatype(),
] {
signatures.push(TypeSignature::Exact(vec![
// latitude
coord_type.clone(),
// longitude
coord_type.clone(),
// resolution
resolution_type.clone(),
]));
}
}
Signature::one_of(signatures, Volatility::Stable)
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 3,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 3, provided : {}",
columns.len()
),
}
);
let lat_vec = &columns[0];
let lon_vec = &columns[1];
let resolution_vec = &columns[2];
let size = lat_vec.len();
let mut results = UInt64VectorBuilder::with_capacity(size);
for i in 0..size {
let lat = lat_vec.get(i).as_f64_lossy();
let lon = lon_vec.get(i).as_f64_lossy();
let r = value_to_resolution(resolution_vec.get(i))?;
let result = match (lat, lon) {
(Some(lat), Some(lon)) => {
let coord = LatLng::new(lat, lon)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?;
let encoded: u64 = coord.to_cell(r).into();
Some(encoded)
}
_ => None,
};
results.push(result);
}
Ok(results.to_vector())
}
}
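// Note: H3 cell indexes are 64-bit integers, hence the uint64 return type
// used for cells throughout this file.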
/// Function that returns the [h3] encoded cell id in string form for a given
/// geospatial coordinate.
///
/// [h3]: https://h3geo.org/
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3LatLngToCellString;
impl Function for H3LatLngToCellString {
fn name(&self) -> &str {
"h3_latlng_to_cell_string"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
@@ -95,17 +190,7 @@ impl Function for H3Function {
for i in 0..size {
let lat = lat_vec.get(i).as_f64_lossy();
let lon = lon_vec.get(i).as_f64_lossy();
let r = match resolution_vec.get(i) {
Value::Int8(v) => v as u8,
Value::Int16(v) => v as u8,
Value::Int32(v) => v as u8,
Value::Int64(v) => v as u8,
Value::UInt8(v) => v,
Value::UInt16(v) => v as u8,
Value::UInt32(v) => v as u8,
Value::UInt64(v) => v as u8,
_ => unreachable!(),
};
let r = value_to_resolution(resolution_vec.get(i))?;
let result = match (lat, lon) {
(Some(lat), Some(lon)) => {
@@ -117,14 +202,6 @@ impl Function for H3Function {
))
})
.context(error::ExecuteSnafu)?;
let r = Resolution::try_from(r)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?;
let encoded = coord.to_cell(r).to_string();
Some(encoded)
}
@@ -138,8 +215,585 @@ impl Function for H3Function {
}
}
impl fmt::Display for H3Function {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", NAME)
/// Function that converts cell id to its string form
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellToString;
impl Function for H3CellToString {
fn name(&self) -> &str {
"h3_cell_to_string"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::string_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let size = cell_vec.len();
let mut results = StringVectorBuilder::with_capacity(size);
for i in 0..size {
let cell_id_string = cell_from_value(cell_vec.get(i))?.map(|c| c.to_string());
results.push(cell_id_string.as_deref());
}
Ok(results.to_vector())
}
}
/// Function that converts a cell id in string form to its uint64 representation
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3StringToCell;
impl Function for H3StringToCell {
fn name(&self) -> &str {
"h3_string_to_cell"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint64_datatype())
}
fn signature(&self) -> Signature {
Signature::new(
TypeSignature::Exact(vec![ConcreteDataType::string_datatype()]),
Volatility::Stable,
)
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let string_vec = &columns[0];
let size = string_vec.len();
let mut results = UInt64VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = string_vec.get(i);
let cell_id = match cell {
Value::String(v) => Some(
CellIndex::from_str(v.as_utf8())
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?
.into(),
),
_ => None,
};
results.push(cell_id);
}
Ok(results.to_vector())
}
}
/// Function that returns centroid latitude of given cell id
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellCenterLat;
impl Function for H3CellCenterLat {
fn name(&self) -> &str {
"h3_cell_center_lat"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::float64_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let size = cell_vec.len();
let mut results = Float64VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let lat = cell.map(|cell| LatLng::from(cell).lat());
results.push(lat);
}
Ok(results.to_vector())
}
}
/// Function that returns centroid longitude of given cell id
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellCenterLng;
impl Function for H3CellCenterLng {
fn name(&self) -> &str {
"h3_cell_center_lng"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::float64_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let size = cell_vec.len();
let mut results = Float64VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let lng = cell.map(|cell| LatLng::from(cell).lng());
results.push(lng);
}
Ok(results.to_vector())
}
}
/// Function that returns resolution of given cell id
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellResolution;
impl Function for H3CellResolution {
fn name(&self) -> &str {
"h3_cell_resolution"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint8_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let size = cell_vec.len();
let mut results = UInt8VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let res = cell.map(|cell| cell.resolution().into());
results.push(res);
}
Ok(results.to_vector())
}
}
/// Function that returns base cell of given cell id
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellBase;
impl Function for H3CellBase {
fn name(&self) -> &str {
"h3_cell_base"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint8_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let size = cell_vec.len();
let mut results = UInt8VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let res = cell.map(|cell| cell.base_cell().into());
results.push(res);
}
Ok(results.to_vector())
}
}
/// Function that checks if the given cell id is a pentagon
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellIsPentagon;
impl Function for H3CellIsPentagon {
fn name(&self) -> &str {
"h3_cell_is_pentagon"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::boolean_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 1, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let size = cell_vec.len();
let mut results = BooleanVectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let res = cell.map(|cell| cell.is_pentagon());
results.push(res);
}
Ok(results.to_vector())
}
}
/// Function that returns center child cell of given cell id
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellCenterChild;
impl Function for H3CellCenterChild {
fn name(&self) -> &str {
"h3_cell_center_child"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint64_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell_and_resolution()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 2,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 2, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let res_vec = &columns[1];
let size = cell_vec.len();
let mut results = UInt64VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let res = value_to_resolution(res_vec.get(i))?;
let result = cell
.and_then(|cell| cell.center_child(res))
.map(|c| c.into());
results.push(result);
}
Ok(results.to_vector())
}
}
/// Function that returns parent cell of given cell id and resolution
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3CellParent;
impl Function for H3CellParent {
fn name(&self) -> &str {
"h3_cell_parent"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint64_datatype())
}
fn signature(&self) -> Signature {
signature_of_cell_and_resolution()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 2,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 2, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let res_vec = &columns[1];
let size = cell_vec.len();
let mut results = UInt64VectorBuilder::with_capacity(size);
for i in 0..size {
let cell = cell_from_value(cell_vec.get(i))?;
let res = value_to_resolution(res_vec.get(i))?;
let result = cell.and_then(|cell| cell.parent(res)).map(|c| c.into());
results.push(result);
}
Ok(results.to_vector())
}
}
/// Function that checks if two cells are neighbours
#[derive(Clone, Debug, Default, Display)]
#[display("{}", self.name())]
pub struct H3IsNeighbour;
impl Function for H3IsNeighbour {
fn name(&self) -> &str {
"h3_is_neighbour"
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::boolean_datatype())
}
fn signature(&self) -> Signature {
signature_of_double_cell()
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 2,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect 2, provided : {}",
columns.len()
),
}
);
let cell_vec = &columns[0];
let cell2_vec = &columns[1];
let size = cell_vec.len();
let mut results = BooleanVectorBuilder::with_capacity(size);
for i in 0..size {
let result = match (
cell_from_value(cell_vec.get(i))?,
cell_from_value(cell2_vec.get(i))?,
) {
(Some(cell_this), Some(cell_that)) => {
let is_neighbour = cell_this
.is_neighbor_with(cell_that)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?;
Some(is_neighbour)
}
_ => None,
};
results.push(result);
}
Ok(results.to_vector())
}
}
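/// Converts a numeric `Value` into an H3 `Resolution`. H3 defines resolutions
/// 0..=15, and `Resolution::try_from` rejects anything else; note that the
/// value is first narrowed with `as u8`, so larger integers wrap before the
/// range check.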
fn value_to_resolution(v: Value) -> Result<Resolution> {
let r = match v {
Value::Int8(v) => v as u8,
Value::Int16(v) => v as u8,
Value::Int32(v) => v as u8,
Value::Int64(v) => v as u8,
Value::UInt8(v) => v,
Value::UInt16(v) => v as u8,
Value::UInt32(v) => v as u8,
Value::UInt64(v) => v as u8,
_ => unreachable!(),
};
Resolution::try_from(r)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)
}
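/// Cell ids may arrive as either uint64 or int64 columns.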
fn signature_of_cell() -> Signature {
let mut signatures = Vec::new();
for cell_type in &[
ConcreteDataType::uint64_datatype(),
ConcreteDataType::int64_datatype(),
] {
signatures.push(TypeSignature::Exact(vec![cell_type.clone()]));
}
Signature::one_of(signatures, Volatility::Stable)
}
fn signature_of_double_cell() -> Signature {
let mut signatures = Vec::new();
let cell_types = &[
ConcreteDataType::uint64_datatype(),
ConcreteDataType::int64_datatype(),
];
for cell_type in cell_types {
for cell_type2 in cell_types {
signatures.push(TypeSignature::Exact(vec![
cell_type.clone(),
cell_type2.clone(),
]));
}
}
Signature::one_of(signatures, Volatility::Stable)
}
fn signature_of_cell_and_resolution() -> Signature {
let mut signatures = Vec::new();
for cell_type in &[
ConcreteDataType::uint64_datatype(),
ConcreteDataType::int64_datatype(),
] {
for resolution_type in &[
ConcreteDataType::int8_datatype(),
ConcreteDataType::int16_datatype(),
ConcreteDataType::int32_datatype(),
ConcreteDataType::int64_datatype(),
ConcreteDataType::uint8_datatype(),
ConcreteDataType::uint16_datatype(),
ConcreteDataType::uint32_datatype(),
ConcreteDataType::uint64_datatype(),
] {
signatures.push(TypeSignature::Exact(vec![
cell_type.clone(),
resolution_type.clone(),
]));
}
}
Signature::one_of(signatures, Volatility::Stable)
}
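/// Interprets an Int64/UInt64 `Value` as an H3 cell index, validating it via
/// `CellIndex::try_from`; non-integer values yield `None`.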
fn cell_from_value(v: Value) -> Result<Option<CellIndex>> {
let cell = match v {
Value::Int64(v) => Some(
CellIndex::try_from(v as u64)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?,
),
Value::UInt64(v) => Some(
CellIndex::try_from(v)
.map_err(|e| {
BoxedError::new(PlainError::new(
format!("H3 error: {}", e),
StatusCode::EngineExecuteQuery,
))
})
.context(error::ExecuteSnafu)?,
),
_ => None,
};
Ok(cell)
}

View File

@@ -14,12 +14,16 @@
use std::sync::Arc;
mod json_get;
mod json_is;
mod json_to_string;
mod to_json;
mod parse_json;
use json_get::{JsonGetBool, JsonGetFloat, JsonGetInt, JsonGetString};
use json_is::{
JsonIsArray, JsonIsBool, JsonIsFloat, JsonIsInt, JsonIsNull, JsonIsObject, JsonIsString,
};
use json_to_string::JsonToStringFunction;
use to_json::ToJsonFunction;
use parse_json::ParseJsonFunction;
use crate::function_registry::FunctionRegistry;
@@ -28,11 +32,19 @@ pub(crate) struct JsonFunction;
impl JsonFunction {
pub fn register(registry: &FunctionRegistry) {
registry.register(Arc::new(JsonToStringFunction));
registry.register(Arc::new(ToJsonFunction));
registry.register(Arc::new(ParseJsonFunction));
registry.register(Arc::new(JsonGetInt));
registry.register(Arc::new(JsonGetFloat));
registry.register(Arc::new(JsonGetString));
registry.register(Arc::new(JsonGetBool));
registry.register(Arc::new(JsonIsNull));
registry.register(Arc::new(JsonIsInt));
registry.register(Arc::new(JsonIsFloat));
registry.register(Arc::new(JsonIsString));
registry.register(Arc::new(JsonIsBool));
registry.register(Arc::new(JsonIsArray));
registry.register(Arc::new(JsonIsObject));
}
}

View File

@@ -47,7 +47,7 @@ fn get_json_by_path(json: &[u8], path: &str) -> Option<Vec<u8>> {
/// If the path does not exist or the value is not the type specified, return `NULL`.
macro_rules! json_get {
// e.g. name = JsonGetInt, type = Int64, rust_type = i64, doc = "Get the value from the JSONB by the given path and return it as an integer."
($name: ident, $type: ident, $rust_type: ident, $doc:expr) => {
($name:ident, $type:ident, $rust_type:ident, $doc:expr) => {
paste::paste! {
#[doc = $doc]
#[derive(Clone, Debug, Default)]

View File

@@ -0,0 +1,215 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt::{self, Display};
use common_query::error::{InvalidFuncArgsSnafu, Result, UnsupportedInputDataTypeSnafu};
use common_query::prelude::Signature;
use datafusion::logical_expr::Volatility;
use datatypes::data_type::ConcreteDataType;
use datatypes::prelude::VectorRef;
use datatypes::scalars::ScalarVectorBuilder;
use datatypes::vectors::{BooleanVectorBuilder, MutableVector};
use snafu::ensure;
use crate::function::{Function, FunctionContext};
/// Generates a function that checks whether the input JSONB value is of the given JSON type.
macro_rules! json_is {
($name:ident, $json_type:ident, $doc:expr) => {
paste::paste! {
#[derive(Clone, Debug, Default)]
pub struct $name;
impl Function for $name {
fn name(&self) -> &str {
stringify!([<$name:snake>])
}
fn return_type(&self, _input_types: &[ConcreteDataType]) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::boolean_datatype())
}
fn signature(&self) -> Signature {
Signature::exact(vec![ConcreteDataType::json_datatype()], Volatility::Immutable)
}
fn eval(&self, _func_ctx: FunctionContext, columns: &[VectorRef]) -> Result<VectorRef> {
ensure!(
columns.len() == 1,
InvalidFuncArgsSnafu {
err_msg: format!(
"The length of the args is not correct, expect exactly one, have: {}",
columns.len()
),
}
);
let jsons = &columns[0];
let size = jsons.len();
let datatype = jsons.data_type();
let mut results = BooleanVectorBuilder::with_capacity(size);
match datatype {
// JSON data type uses binary vector
ConcreteDataType::Binary(_) => {
for i in 0..size {
let json = jsons.get_ref(i);
let json = json.as_binary();
let result = match json {
Ok(Some(json)) => {
Some(jsonb::[<is_ $json_type>](json))
}
_ => None,
};
results.push(result);
}
}
_ => {
return UnsupportedInputDataTypeSnafu {
function: stringify!([<$name:snake>]),
datatypes: columns.iter().map(|c| c.data_type()).collect::<Vec<_>>(),
}
.fail();
}
}
Ok(results.to_vector())
}
}
impl Display for $name {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", stringify!([<$name:snake>]).to_ascii_uppercase())
}
}
}
}
}
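// For example, `json_is!(JsonIsNull, null, ...)` below expands to a `JsonIsNull`
// function named "json_is_null" that applies `jsonb::is_null` to each row of the
// input binary (JSONB) vector.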
json_is!(JsonIsNull, null, "Checks if the input JSONB is null");
json_is!(
JsonIsBool,
boolean,
"Checks if the input JSONB is a boolean type JSON value"
);
json_is!(
JsonIsInt,
i64,
"Checks if the input JSONB is a integer type JSON value"
);
json_is!(
JsonIsFloat,
number,
"Checks if the input JSONB is a JSON float"
);
json_is!(
JsonIsString,
string,
"Checks if the input JSONB is a JSON string"
);
json_is!(
JsonIsArray,
array,
"Checks if the input JSONB is a JSON array"
);
json_is!(
JsonIsObject,
object,
"Checks if the input JSONB is a JSON object"
);
#[cfg(test)]
mod tests {
use std::sync::Arc;
use datatypes::scalars::ScalarVector;
use datatypes::vectors::BinaryVector;
use super::*;
#[test]
fn test_json_is_functions() {
let json_is_functions: [&dyn Function; 6] = [
&JsonIsBool,
&JsonIsInt,
&JsonIsFloat,
&JsonIsString,
&JsonIsArray,
&JsonIsObject,
];
let expected_names = [
"json_is_bool",
"json_is_int",
"json_is_float",
"json_is_string",
"json_is_array",
"json_is_object",
];
for (func, expected_name) in json_is_functions.iter().zip(expected_names.iter()) {
assert_eq!(func.name(), *expected_name);
assert_eq!(
func.return_type(&[ConcreteDataType::json_datatype()])
.unwrap(),
ConcreteDataType::boolean_datatype()
);
assert_eq!(
func.signature(),
Signature::exact(
vec![ConcreteDataType::json_datatype()],
Volatility::Immutable
)
);
}
let json_strings = [
r#"true"#,
r#"1"#,
r#"1.0"#,
r#""The pig fly through a castle, and has been attracted by the princess.""#,
r#"[1, 2]"#,
r#"{"a": 1}"#,
];
let expected_results = [
[true, false, false, false, false, false],
[false, true, false, false, false, false],
// Integers are also floats
[false, true, true, false, false, false],
[false, false, false, true, false, false],
[false, false, false, false, true, false],
[false, false, false, false, false, true],
];
let jsonbs = json_strings
.iter()
.map(|s| {
let value = jsonb::parse_value(s.as_bytes()).unwrap();
value.to_vec()
})
.collect::<Vec<_>>();
let json_vector = BinaryVector::from_vec(jsonbs);
let args: Vec<VectorRef> = vec![Arc::new(json_vector)];
for (func, expected_result) in json_is_functions.iter().zip(expected_results.iter()) {
let vector = func.eval(FunctionContext::default(), &args).unwrap();
assert_eq!(vector.len(), json_strings.len());
for (i, expected) in expected_result.iter().enumerate() {
let result = vector.get_ref(i);
let result = result.as_boolean().unwrap().unwrap();
assert_eq!(result, *expected);
}
}
}
}

View File

@@ -119,7 +119,7 @@ mod tests {
use super::*;
#[test]
fn test_get_by_path_function() {
fn test_json_to_string_function() {
let json_to_string = JsonToStringFunction;
assert_eq!("json_to_string", json_to_string.name());

View File

@@ -27,11 +27,11 @@ use crate::function::{Function, FunctionContext};
/// Parses the `String` into `JSONB`.
#[derive(Clone, Debug, Default)]
pub struct ToJsonFunction;
pub struct ParseJsonFunction;
const NAME: &str = "to_json";
const NAME: &str = "parse_json";
impl Function for ToJsonFunction {
impl Function for ParseJsonFunction {
fn name(&self) -> &str {
NAME
}
@@ -101,9 +101,9 @@ impl Function for ToJsonFunction {
}
}
impl Display for ToJsonFunction {
impl Display for ParseJsonFunction {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "TO_JSON")
write!(f, "PARSE_JSON")
}
}
@@ -119,17 +119,17 @@ mod tests {
#[test]
fn test_get_by_path_function() {
let to_json = ToJsonFunction;
let parse_json = ParseJsonFunction;
assert_eq!("to_json", to_json.name());
assert_eq!("parse_json", parse_json.name());
assert_eq!(
ConcreteDataType::json_datatype(),
to_json
parse_json
.return_type(&[ConcreteDataType::json_datatype()])
.unwrap()
);
assert!(matches!(to_json.signature(),
assert!(matches!(parse_json.signature(),
Signature {
type_signature: TypeSignature::Exact(valid_types),
volatility: Volatility::Immutable
@@ -152,13 +152,12 @@ mod tests {
let json_string_vector = StringVector::from_vec(json_strings.to_vec());
let args: Vec<VectorRef> = vec![Arc::new(json_string_vector)];
let vector = to_json.eval(FunctionContext::default(), &args).unwrap();
let vector = parse_json.eval(FunctionContext::default(), &args).unwrap();
assert_eq!(3, vector.len());
for (i, gt) in jsonbs.iter().enumerate() {
let result = vector.get_ref(i);
let result = result.as_binary().unwrap().unwrap();
// remove whitespaces
assert_eq!(gt, result);
}
}

View File

@@ -25,13 +25,13 @@ use session::context::QueryContextRef;
use crate::handlers::ProcedureServiceHandlerRef;
use crate::helper::cast_u64;
const DEFAULT_REPLAY_TIMEOUT_SECS: u64 = 10;
const DEFAULT_TIMEOUT_SECS: u64 = 30;
/// A function to migrate a region from source peer to target peer.
/// Returns the submitted procedure id if success. Only available in cluster mode.
///
/// - `migrate_region(region_id, from_peer, to_peer)`, with default replay WAL timeout(10 seconds).
/// - `migrate_region(region_id, from_peer, to_peer, timeout(secs))`
/// - `migrate_region(region_id, from_peer, to_peer)`, with the default timeout (30 seconds).
/// - `migrate_region(region_id, from_peer, to_peer, timeout(secs))`.
///
/// The parameters:
/// - `region_id`: the region id
@@ -48,18 +48,13 @@ pub(crate) async fn migrate_region(
_ctx: &QueryContextRef,
params: &[ValueRef<'_>],
) -> Result<Value> {
let (region_id, from_peer, to_peer, replay_timeout) = match params.len() {
let (region_id, from_peer, to_peer, timeout) = match params.len() {
3 => {
let region_id = cast_u64(&params[0])?;
let from_peer = cast_u64(&params[1])?;
let to_peer = cast_u64(&params[2])?;
(
region_id,
from_peer,
to_peer,
Some(DEFAULT_REPLAY_TIMEOUT_SECS),
)
(region_id, from_peer, to_peer, Some(DEFAULT_TIMEOUT_SECS))
}
4 => {
@@ -82,14 +77,14 @@ pub(crate) async fn migrate_region(
}
};
match (region_id, from_peer, to_peer, replay_timeout) {
(Some(region_id), Some(from_peer), Some(to_peer), Some(replay_timeout)) => {
match (region_id, from_peer, to_peer, timeout) {
(Some(region_id), Some(from_peer), Some(to_peer), Some(timeout)) => {
let pid = procedure_service_handler
.migrate_region(MigrateRegionRequest {
region_id,
from_peer,
to_peer,
replay_timeout: Duration::from_secs(replay_timeout),
timeout: Duration::from_secs(timeout),
})
.await?;
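
The argument handling above follows a simple arity-defaulting pattern: a three-argument call falls back to `DEFAULT_TIMEOUT_SECS`, while a fourth argument overrides it. A self-contained sketch of the same shape (names here are illustrative, not from the codebase):

```rust
const DEFAULT_TIMEOUT_SECS: u64 = 30;

// Illustrative sketch of the arity-defaulting pattern used by migrate_region:
// a missing fourth argument falls back to the default timeout.
fn parse_args(params: &[u64]) -> Result<(u64, u64, u64, u64), String> {
    match params.len() {
        3 => Ok((params[0], params[1], params[2], DEFAULT_TIMEOUT_SECS)),
        4 => Ok((params[0], params[1], params[2], params[3])),
        n => Err(format!("expected 3 or 4 arguments, got {n}")),
    }
}
```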

View File

@@ -15,6 +15,7 @@ workspace = true
anymap2 = "0.13.0"
api.workspace = true
async-recursion = "1.0"
async-stream = "0.3"
async-trait.workspace = true
base64.workspace = true
bytes.workspace = true

View File

@@ -20,6 +20,7 @@ use regex::Regex;
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt, ResultExt};
use crate::datanode::RegionStat;
use crate::error::{
DecodeJsonSnafu, EncodeJsonSnafu, Error, FromUtf8Snafu, InvalidNodeInfoKeySnafu,
InvalidRoleSnafu, ParseNumSnafu, Result,
@@ -47,6 +48,9 @@ pub trait ClusterInfo {
role: Option<Role>,
) -> std::result::Result<Vec<NodeInfo>, Self::Error>;
/// List all region stats in the cluster.
async fn list_region_stats(&self) -> std::result::Result<Vec<RegionStat>, Self::Error>;
// TODO(jeremy): Other info, like region status, etc.
}

View File

@@ -0,0 +1,413 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashSet;
use std::str::FromStr;
use api::v1::meta::{HeartbeatRequest, RequestHeader};
use common_time::util as time_util;
use lazy_static::lazy_static;
use regex::Regex;
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt, ResultExt};
use store_api::region_engine::{RegionRole, RegionStatistic};
use store_api::storage::RegionId;
use table::metadata::TableId;
use crate::error::Result;
use crate::{error, ClusterId};
pub(crate) const DATANODE_LEASE_PREFIX: &str = "__meta_datanode_lease";
const INACTIVE_REGION_PREFIX: &str = "__meta_inactive_region";
const DATANODE_STAT_PREFIX: &str = "__meta_datanode_stat";
pub const REGION_STATISTIC_KEY: &str = "__region_statistic";
lazy_static! {
pub(crate) static ref DATANODE_LEASE_KEY_PATTERN: Regex =
Regex::new(&format!("^{DATANODE_LEASE_PREFIX}-([0-9]+)-([0-9]+)$")).unwrap();
static ref DATANODE_STAT_KEY_PATTERN: Regex =
Regex::new(&format!("^{DATANODE_STAT_PREFIX}-([0-9]+)-([0-9]+)$")).unwrap();
static ref INACTIVE_REGION_KEY_PATTERN: Regex = Regex::new(&format!(
"^{INACTIVE_REGION_PREFIX}-([0-9]+)-([0-9]+)-([0-9]+)$"
))
.unwrap();
}
/// The statistics of a datanode, reported via heartbeat.
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct Stat {
pub timestamp_millis: i64,
pub cluster_id: ClusterId,
// The datanode Id.
pub id: u64,
// The datanode address.
pub addr: String,
/// The read capacity units during this period
pub rcus: i64,
/// The write capacity units during this period
pub wcus: i64,
/// How many regions on this node
pub region_num: u64,
pub region_stats: Vec<RegionStat>,
// The node epoch is used to check whether the node has restarted or redeployed.
pub node_epoch: u64,
}
/// The statistics of a region.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RegionStat {
/// The region_id.
pub id: RegionId,
/// The read capacity units during this period
pub rcus: i64,
/// The write capacity units during this period
pub wcus: i64,
/// Approximate bytes of this region
pub approximate_bytes: i64,
/// The engine name.
pub engine: String,
/// The region role.
pub role: RegionRole,
/// The size of the memtable in bytes.
pub memtable_size: u64,
/// The size of the manifest in bytes.
pub manifest_size: u64,
/// The size of the SST files in bytes.
pub sst_size: u64,
}
impl Stat {
#[inline]
pub fn is_empty(&self) -> bool {
self.region_stats.is_empty()
}
pub fn stat_key(&self) -> DatanodeStatKey {
DatanodeStatKey {
cluster_id: self.cluster_id,
node_id: self.id,
}
}
/// Returns the ([RegionId], [RegionRole]) pairs for all regions on this node.
pub fn regions(&self) -> Vec<(RegionId, RegionRole)> {
self.region_stats.iter().map(|s| (s.id, s.role)).collect()
}
/// Returns all table ids in the region stats.
pub fn table_ids(&self) -> HashSet<TableId> {
self.region_stats.iter().map(|s| s.id.table_id()).collect()
}
/// Retains the active region stats and updates the rcus, wcus, and region_num.
pub fn retain_active_region_stats(&mut self, inactive_region_ids: &HashSet<RegionId>) {
if inactive_region_ids.is_empty() {
return;
}
self.region_stats
.retain(|r| !inactive_region_ids.contains(&r.id));
self.rcus = self.region_stats.iter().map(|s| s.rcus).sum();
self.wcus = self.region_stats.iter().map(|s| s.wcus).sum();
self.region_num = self.region_stats.len() as u64;
}
}
impl TryFrom<&HeartbeatRequest> for Stat {
type Error = Option<RequestHeader>;
fn try_from(value: &HeartbeatRequest) -> std::result::Result<Self, Self::Error> {
let HeartbeatRequest {
header,
peer,
region_stats,
node_epoch,
..
} = value;
match (header, peer) {
(Some(header), Some(peer)) => {
let region_stats = region_stats
.iter()
.map(RegionStat::from)
.collect::<Vec<_>>();
Ok(Self {
timestamp_millis: time_util::current_time_millis(),
cluster_id: header.cluster_id,
// datanode id
id: peer.id,
// datanode address
addr: peer.addr.clone(),
rcus: region_stats.iter().map(|s| s.rcus).sum(),
wcus: region_stats.iter().map(|s| s.wcus).sum(),
region_num: region_stats.len() as u64,
region_stats,
node_epoch: *node_epoch,
})
}
(header, _) => Err(header.clone()),
}
}
}
impl From<&api::v1::meta::RegionStat> for RegionStat {
fn from(value: &api::v1::meta::RegionStat) -> Self {
let region_stat = value
.extensions
.get(REGION_STATISTIC_KEY)
.and_then(|value| RegionStatistic::deserialize_from_slice(value))
.unwrap_or_default();
Self {
id: RegionId::from_u64(value.region_id),
rcus: value.rcus,
wcus: value.wcus,
approximate_bytes: value.approximate_bytes,
engine: value.engine.to_string(),
role: RegionRole::from(value.role()),
memtable_size: region_stat.memtable_size,
manifest_size: region_stat.manifest_size,
sst_size: region_stat.sst_size,
}
}
}
/// The key of the datanode stat in the memory store.
///
/// The format is `__meta_datanode_stat-{cluster_id}-{node_id}`.
#[derive(Debug, Clone, Copy, Eq, PartialEq, Hash)]
pub struct DatanodeStatKey {
pub cluster_id: ClusterId,
pub node_id: u64,
}
impl DatanodeStatKey {
/// The key prefix.
pub fn prefix_key() -> Vec<u8> {
format!("{DATANODE_STAT_PREFIX}-").into_bytes()
}
/// The key prefix with the cluster id.
pub fn key_prefix_with_cluster_id(cluster_id: ClusterId) -> String {
format!("{DATANODE_STAT_PREFIX}-{cluster_id}-")
}
}
impl From<DatanodeStatKey> for Vec<u8> {
fn from(value: DatanodeStatKey) -> Self {
format!(
"{}-{}-{}",
DATANODE_STAT_PREFIX, value.cluster_id, value.node_id
)
.into_bytes()
}
}
impl FromStr for DatanodeStatKey {
type Err = error::Error;
fn from_str(key: &str) -> Result<Self> {
let caps = DATANODE_STAT_KEY_PATTERN
.captures(key)
.context(error::InvalidStatKeySnafu { key })?;
ensure!(caps.len() == 3, error::InvalidStatKeySnafu { key });
let cluster_id = caps[1].to_string();
let node_id = caps[2].to_string();
let cluster_id: u64 = cluster_id.parse().context(error::ParseNumSnafu {
err_msg: format!("invalid cluster_id: {cluster_id}"),
})?;
let node_id: u64 = node_id.parse().context(error::ParseNumSnafu {
err_msg: format!("invalid node_id: {node_id}"),
})?;
Ok(Self {
cluster_id,
node_id,
})
}
}
impl TryFrom<Vec<u8>> for DatanodeStatKey {
type Error = error::Error;
fn try_from(bytes: Vec<u8>) -> Result<Self> {
String::from_utf8(bytes)
.context(error::FromUtf8Snafu {
name: "DatanodeStatKey",
})
.map(|x| x.parse())?
}
}
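
Taken together, `From<DatanodeStatKey> for Vec<u8>` and the `FromStr`/`TryFrom` impls give a round-trippable key codec. A usage sketch built only from items defined in this file:

```rust
// Round-trip sketch for the key codec above.
fn key_round_trip() {
    let key = DatanodeStatKey { cluster_id: 3, node_id: 101 };
    let bytes: Vec<u8> = key.into();
    assert_eq!(b"__meta_datanode_stat-3-101".to_vec(), bytes);
    let parsed = DatanodeStatKey::try_from(bytes).unwrap();
    assert_eq!(3, parsed.cluster_id);
    assert_eq!(101, parsed.node_id);
}
```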
/// The value of the datanode stat in the memory store.
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(transparent)]
pub struct DatanodeStatValue {
pub stats: Vec<Stat>,
}
impl DatanodeStatValue {
/// Get the latest number of regions.
pub fn region_num(&self) -> Option<u64> {
self.stats.last().map(|x| x.region_num)
}
/// Get the latest node addr.
pub fn node_addr(&self) -> Option<String> {
self.stats.last().map(|x| x.addr.clone())
}
}
impl TryFrom<DatanodeStatValue> for Vec<u8> {
type Error = error::Error;
fn try_from(stats: DatanodeStatValue) -> Result<Self> {
Ok(serde_json::to_string(&stats)
.context(error::SerializeToJsonSnafu {
input: format!("{stats:?}"),
})?
.into_bytes())
}
}
impl FromStr for DatanodeStatValue {
type Err = error::Error;
fn from_str(value: &str) -> Result<Self> {
serde_json::from_str(value).context(error::DeserializeFromJsonSnafu { input: value })
}
}
impl TryFrom<Vec<u8>> for DatanodeStatValue {
type Error = error::Error;
fn try_from(value: Vec<u8>) -> Result<Self> {
String::from_utf8(value)
.context(error::FromUtf8Snafu {
name: "DatanodeStatValue",
})
.map(|x| x.parse())?
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_stat_key() {
let stat = Stat {
cluster_id: 3,
id: 101,
region_num: 10,
..Default::default()
};
let stat_key = stat.stat_key();
assert_eq!(3, stat_key.cluster_id);
assert_eq!(101, stat_key.node_id);
}
#[test]
fn test_stat_val_round_trip() {
let stat = Stat {
cluster_id: 0,
id: 101,
region_num: 100,
..Default::default()
};
let stat_val = DatanodeStatValue { stats: vec![stat] };
let bytes: Vec<u8> = stat_val.try_into().unwrap();
let stat_val: DatanodeStatValue = bytes.try_into().unwrap();
let stats = stat_val.stats;
assert_eq!(1, stats.len());
let stat = stats.first().unwrap();
assert_eq!(0, stat.cluster_id);
assert_eq!(101, stat.id);
assert_eq!(100, stat.region_num);
}
#[test]
fn test_get_addr_from_stat_val() {
let empty = DatanodeStatValue { stats: vec![] };
let addr = empty.node_addr();
assert!(addr.is_none());
let stat_val = DatanodeStatValue {
stats: vec![
Stat {
addr: "1".to_string(),
..Default::default()
},
Stat {
addr: "2".to_string(),
..Default::default()
},
Stat {
addr: "3".to_string(),
..Default::default()
},
],
};
let addr = stat_val.node_addr().unwrap();
assert_eq!("3", addr);
}
#[test]
fn test_get_region_num_from_stat_val() {
let empty = DatanodeStatValue { stats: vec![] };
let region_num = empty.region_num();
assert!(region_num.is_none());
let wrong = DatanodeStatValue {
stats: vec![Stat {
region_num: 0,
..Default::default()
}],
};
let right = wrong.region_num();
assert_eq!(Some(0), right);
let stat_val = DatanodeStatValue {
stats: vec![
Stat {
region_num: 1,
..Default::default()
},
Stat {
region_num: 0,
..Default::default()
},
Stat {
region_num: 2,
..Default::default()
},
],
};
let region_num = stat_val.region_num().unwrap();
assert_eq!(2, region_num);
}
}

View File

@@ -15,6 +15,7 @@
use std::collections::HashMap;
use std::sync::Arc;
use api::v1::meta::ProcedureDetailResponse;
use common_telemetry::tracing_context::W3cTrace;
use store_api::storage::{RegionId, RegionNumber, TableId};
@@ -82,6 +83,8 @@ pub trait ProcedureExecutor: Send + Sync {
ctx: &ExecutorContext,
pid: &str,
) -> Result<ProcedureStateResponse>;
async fn list_procedures(&self, ctx: &ExecutorContext) -> Result<ProcedureDetailResponse>;
}
pub type ProcedureExecutorRef = Arc<dyn ProcedureExecutor>;

View File

@@ -14,6 +14,7 @@
use std::sync::Arc;
use api::v1::meta::ProcedureDetailResponse;
use common_procedure::{
watcher, BoxedProcedureLoader, Output, ProcedureId, ProcedureManagerRef, ProcedureWithId,
};
@@ -825,6 +826,15 @@ impl ProcedureExecutor for DdlManager {
Ok(procedure::procedure_state_to_pb_response(&state))
}
async fn list_procedures(&self, _ctx: &ExecutorContext) -> Result<ProcedureDetailResponse> {
let metas = self
.procedure_manager
.list_procedures()
.await
.context(QueryProcedureSnafu)?;
Ok(procedure::procedure_details_to_pb_response(metas))
}
}
#[cfg(test)]

View File

@@ -218,6 +218,24 @@ pub enum Error {
error: JsonError,
},
#[snafu(display("Failed to serialize to json: {}", input))]
SerializeToJson {
input: String,
#[snafu(source)]
error: serde_json::error::Error,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to deserialize from json: {}", input))]
DeserializeFromJson {
input: String,
#[snafu(source)]
error: serde_json::error::Error,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Payload not exist"))]
PayloadNotExist {
#[snafu(implicit)]
@@ -531,13 +549,20 @@ pub enum Error {
location: Location,
},
#[snafu(display("Invalid node info key: {}", key))]
#[snafu(display("Invalid node info key: {}", key))]
InvalidNodeInfoKey {
key: String,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Invalid node stat key: {}", key))]
InvalidStatKey {
key: String,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to parse number: {}", err_msg))]
ParseNum {
err_msg: String,
@@ -627,7 +652,9 @@ impl ErrorExt for Error {
| EtcdTxnFailed { .. }
| ConnectEtcd { .. }
| MoveValues { .. }
| GetCache { .. } => StatusCode::Internal,
| GetCache { .. }
| SerializeToJson { .. }
| DeserializeFromJson { .. } => StatusCode::Internal,
ValueNotExist { .. } => StatusCode::Unexpected,
@@ -700,6 +727,7 @@ impl ErrorExt for Error {
| InvalidNumTopics { .. }
| SchemaNotFound { .. }
| InvalidNodeInfoKey { .. }
| InvalidStatKey { .. }
| ParseNum { .. }
| InvalidRole { .. }
| EmptyDdlTasks { .. } => StatusCode::InvalidArguments,

View File

@@ -132,11 +132,20 @@ impl OpenRegion {
pub struct DowngradeRegion {
/// The [RegionId].
pub region_id: RegionId,
/// The timeout for waiting for the region to flush.
///
/// `None` means the region will not be flushed before downgrading.
#[serde(default)]
pub flush_timeout: Option<Duration>,
}
impl Display for DowngradeRegion {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(f, "DowngradeRegion(region_id={})", self.region_id)
write!(
f,
"DowngradeRegion(region_id={}, flush_timeout={:?})",
self.region_id, self.flush_timeout,
)
}
}
@@ -152,7 +161,7 @@ pub struct UpgradeRegion {
/// `None` means no waiting;
/// this is useful for verifying whether the leader region is ready.
#[serde(with = "humantime_serde")]
pub wait_for_replay_timeout: Option<Duration>,
pub replay_timeout: Option<Duration>,
/// The hint for replaying memtable.
#[serde(default)]
pub location_id: Option<u64>,

View File

@@ -144,7 +144,7 @@ use crate::rpc::router::{region_distribution, RegionRoute, RegionStatus};
use crate::rpc::store::BatchDeleteRequest;
use crate::DatanodeId;
pub const NAME_PATTERN: &str = r"[a-zA-Z_:-][a-zA-Z0-9_:\-\.]*";
pub const NAME_PATTERN: &str = r"[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*";
pub const MAINTENANCE_KEY: &str = "__maintenance";
const DATANODE_TABLE_KEY_PREFIX: &str = "__dn_table";
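
The widened `NAME_PATTERN` accepts `@` and `#` from the second character on (the first character class is unchanged). A quick check sketch; anchors are added here for a full match, which may differ from how the pattern is applied in the codebase:

```rust
use regex::Regex;

// '@' and '#' are legal only after the first character.
fn name_pattern_examples() {
    let re = Regex::new(r"^[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*$").unwrap();
    assert!(re.is_match("metric@host"));
    assert!(re.is_match("a#1"));
    assert!(!re.is_match("@leading"));
}
```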

View File

@@ -147,7 +147,8 @@ impl CatalogManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(catalog_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -167,7 +167,8 @@ impl DatanodeTableManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(datanode_table_value_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -200,7 +200,8 @@ impl FlowNameManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(flow_name_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -180,7 +180,8 @@ impl FlowRouteManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(flow_route_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -180,7 +180,8 @@ impl FlownodeFlowManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(flownode_flow_key_decoder),
);
)
.into_stream();
Box::pin(stream.map_ok(|key| (key.flow_id(), key.partition_id())))
}

View File

@@ -207,7 +207,8 @@ impl TableFlowManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(table_flow_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -230,7 +230,8 @@ impl SchemaManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(schema_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -259,7 +259,8 @@ impl TableNameManager {
req,
DEFAULT_PAGE_SIZE,
Arc::new(table_decoder),
);
)
.into_stream();
Box::pin(stream)
}

View File

@@ -22,6 +22,7 @@
pub mod cache;
pub mod cache_invalidator;
pub mod cluster;
pub mod datanode;
pub mod ddl;
pub mod ddl_manager;
pub mod distributed_time_constants;

View File

@@ -12,14 +12,11 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::VecDeque;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll};
use async_stream::try_stream;
use common_telemetry::debug;
use futures::future::BoxFuture;
use futures::{ready, FutureExt, Stream};
use futures::Stream;
use snafu::ensure;
use crate::error::{self, Result};
@@ -30,17 +27,6 @@ use crate::util::get_next_prefix_key;
pub type KeyValueDecoderFn<T> = dyn Fn(KeyValue) -> Result<T> + Send + Sync;
enum PaginationStreamState<T> {
/// At the start of reading.
Init,
/// Decoding key value pairs.
Decoding(SimpleKeyValueDecoder<T>),
/// Retrieving data from backend.
Reading(BoxFuture<'static, Result<(PaginationStreamFactory, Option<RangeResponse>)>>),
/// Error
Error,
}
/// The Range Request's default page size.
///
/// It depends on the upstream KvStore server-side gRPC message size limitation.
@@ -65,8 +51,6 @@ struct PaginationStreamFactory {
/// keys.
pub range_end: Vec<u8>,
/// page_size is the pagination page size.
page_size: usize,
/// keys_only when set returns only the keys and not the values.
pub keys_only: bool,
@@ -89,7 +73,6 @@ impl PaginationStreamFactory {
kv: kv.clone(),
key,
range_end,
page_size,
keys_only,
more,
adaptive_page_size: if page_size == 0 {
@@ -137,7 +120,7 @@ impl PaginationStreamFactory {
}
}
async fn read_next(mut self) -> Result<(Self, Option<RangeResponse>)> {
async fn read_next(&mut self) -> Result<Option<RangeResponse>> {
if self.more {
let resp = self
.adaptive_range(RangeRequest {
@@ -151,33 +134,22 @@ impl PaginationStreamFactory {
let key = resp
.kvs
.last()
.map(|kv| kv.key.clone())
.unwrap_or_else(Vec::new);
.map(|kv| kv.key.as_slice())
.unwrap_or_default();
let next_key = get_next_prefix_key(&key);
Ok((
Self {
kv: self.kv,
key: next_key,
range_end: self.range_end,
page_size: self.page_size,
keys_only: self.keys_only,
more: resp.more,
adaptive_page_size: self.adaptive_page_size,
},
Some(resp),
))
let next_key = get_next_prefix_key(key);
self.key = next_key;
self.more = resp.more;
Ok(Some(resp))
} else {
Ok((self, None))
Ok(None)
}
}
}
pub struct PaginationStream<T> {
state: PaginationStreamState<T>,
decoder_fn: Arc<KeyValueDecoderFn<T>>,
factory: Option<PaginationStreamFactory>,
factory: PaginationStreamFactory,
}
impl<T> PaginationStream<T> {
@@ -189,82 +161,28 @@ impl<T> PaginationStream<T> {
decoder_fn: Arc<KeyValueDecoderFn<T>>,
) -> Self {
Self {
state: PaginationStreamState::Init,
decoder_fn,
factory: Some(PaginationStreamFactory::new(
factory: PaginationStreamFactory::new(
&kv,
req.key,
req.range_end,
page_size,
req.keys_only,
true,
)),
),
}
}
}
struct SimpleKeyValueDecoder<T> {
kv: VecDeque<KeyValue>,
decoder: Arc<KeyValueDecoderFn<T>>,
}
impl<T> Iterator for SimpleKeyValueDecoder<T> {
type Item = Result<T>;
fn next(&mut self) -> Option<Self::Item> {
if let Some(kv) = self.kv.pop_front() {
Some((self.decoder)(kv))
} else {
None
}
}
}
impl<T> Stream for PaginationStream<T> {
type Item = Result<T>;
fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
loop {
match &mut self.state {
PaginationStreamState::Decoding(decoder) => match decoder.next() {
Some(Ok(result)) => return Poll::Ready(Some(Ok(result))),
Some(Err(e)) => {
self.state = PaginationStreamState::Error;
return Poll::Ready(Some(Err(e)));
}
None => self.state = PaginationStreamState::Init,
},
PaginationStreamState::Init => {
let factory = self.factory.take().expect("lost factory");
if !factory.more {
// Ensures the factory always exists.
self.factory = Some(factory);
return Poll::Ready(None);
}
let fut = factory.read_next().boxed();
self.state = PaginationStreamState::Reading(fut);
impl<T> PaginationStream<T> {
pub fn into_stream(mut self) -> impl Stream<Item = Result<T>> {
try_stream!({
while let Some(resp) = self.factory.read_next().await? {
for kv in resp.kvs {
yield (self.decoder_fn)(kv)?
}
PaginationStreamState::Reading(f) => match ready!(f.poll_unpin(cx)) {
Ok((factory, Some(resp))) => {
self.factory = Some(factory);
let decoder = SimpleKeyValueDecoder {
kv: resp.kvs.into(),
decoder: self.decoder_fn.clone(),
};
self.state = PaginationStreamState::Decoding(decoder);
}
Ok((factory, None)) => {
self.factory = Some(factory);
self.state = PaginationStreamState::Init;
}
Err(e) => {
self.state = PaginationStreamState::Error;
return Poll::Ready(Some(Err(e)));
}
},
PaginationStreamState::Error => return Poll::Ready(None), // Ends the stream as error happens.
}
}
})
}
}
@@ -333,7 +251,8 @@ mod tests {
},
DEFAULT_PAGE_SIZE,
Arc::new(decoder),
);
)
.into_stream();
let kv = stream.try_collect::<Vec<_>>().await.unwrap();
assert!(kv.is_empty());
@@ -374,6 +293,7 @@ mod tests {
Arc::new(decoder),
);
let kv = stream
.into_stream()
.try_collect::<Vec<_>>()
.await
.unwrap()
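
The refactor above swaps a hand-written `poll_next` state machine (`Init`/`Reading`/`Decoding`/`Error`) for `async_stream::try_stream!`, which generates the state machine from straight-line async code. A self-contained sketch of the same shape, with a hypothetical `fetch_page` standing in for the KV range request:

```rust
use async_stream::try_stream;
use futures::{pin_mut, Stream, StreamExt};

#[derive(Debug)]
struct Error;

// Hypothetical paginated backend: three pages, then exhaustion.
async fn fetch_page(page: usize) -> Result<Option<Vec<u32>>, Error> {
    Ok((page < 3).then(|| vec![page as u32 * 10, page as u32 * 10 + 1]))
}

// The macro turns this loop into a Stream; `?` ends the stream with an
// error item, replacing the explicit Error state of the old code.
fn into_stream() -> impl Stream<Item = Result<u32, Error>> {
    try_stream! {
        let mut page = 0;
        while let Some(items) = fetch_page(page).await? {
            for item in items {
                yield item;
            }
            page += 1;
        }
    }
}

#[tokio::main]
async fn main() {
    let stream = into_stream();
    pin_mut!(stream);
    while let Some(item) = stream.next().await {
        println!("{item:?}");
    }
}
```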

View File

@@ -16,10 +16,11 @@ use std::time::Duration;
pub use api::v1::meta::{MigrateRegionResponse, ProcedureStateResponse};
use api::v1::meta::{
ProcedureId as PbProcedureId, ProcedureStateResponse as PbProcedureStateResponse,
ProcedureDetailResponse as PbProcedureDetailResponse, ProcedureId as PbProcedureId,
ProcedureMeta as PbProcedureMeta, ProcedureStateResponse as PbProcedureStateResponse,
ProcedureStatus as PbProcedureStatus,
};
use common_procedure::{ProcedureId, ProcedureState};
use common_procedure::{ProcedureId, ProcedureInfo, ProcedureState};
use snafu::ResultExt;
use crate::error::{ParseProcedureIdSnafu, Result};
@@ -30,7 +31,7 @@ pub struct MigrateRegionRequest {
pub region_id: u64,
pub from_peer: u64,
pub to_peer: u64,
pub replay_timeout: Duration,
pub timeout: Duration,
}
/// Cast the protobuf [`ProcedureId`] to common [`ProcedureId`].
@@ -49,9 +50,9 @@ pub fn pid_to_pb_pid(pid: ProcedureId) -> PbProcedureId {
}
}
/// Cast the common [`ProcedureState`] to pb [`ProcedureStateResponse`].
pub fn procedure_state_to_pb_response(state: &ProcedureState) -> PbProcedureStateResponse {
let (status, error) = match state {
/// Cast the [`ProcedureState`] to protobuf [`PbProcedureStatus`].
pub fn procedure_state_to_pb_state(state: &ProcedureState) -> (PbProcedureStatus, String) {
match state {
ProcedureState::Running => (PbProcedureStatus::Running, String::default()),
ProcedureState::Done { .. } => (PbProcedureStatus::Done, String::default()),
ProcedureState::Retrying { error } => (PbProcedureStatus::Retrying, error.to_string()),
@@ -62,8 +63,12 @@ pub fn procedure_state_to_pb_response(state: &ProcedureState) -> PbProcedureStat
ProcedureState::RollingBack { error } => {
(PbProcedureStatus::RollingBack, error.to_string())
}
};
}
}
/// Cast the common [`ProcedureState`] to pb [`ProcedureStateResponse`].
pub fn procedure_state_to_pb_response(state: &ProcedureState) -> PbProcedureStateResponse {
let (status, error) = procedure_state_to_pb_state(state);
PbProcedureStateResponse {
status: status.into(),
error,
@@ -71,6 +76,28 @@ pub fn procedure_state_to_pb_response(state: &ProcedureState) -> PbProcedureStat
}
}
pub fn procedure_details_to_pb_response(metas: Vec<ProcedureInfo>) -> PbProcedureDetailResponse {
let procedures = metas
.into_iter()
.map(|meta| {
let (status, error) = procedure_state_to_pb_state(&meta.state);
PbProcedureMeta {
id: Some(pid_to_pb_pid(meta.id)),
type_name: meta.type_name.to_string(),
status: status.into(),
start_time_ms: meta.start_time_ms,
end_time_ms: meta.end_time_ms,
lock_keys: meta.lock_keys,
error,
}
})
.collect();
PbProcedureDetailResponse {
procedures,
..Default::default()
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;

View File

@@ -172,7 +172,8 @@ impl StateStore for KvStateStore {
req,
self.max_num_per_range_request.unwrap_or_default(),
Arc::new(decode_kv),
);
)
.into_stream();
let stream = stream.map(move |r| {
let path = path.clone();

View File

@@ -19,6 +19,7 @@ common-error.workspace = true
common-macro.workspace = true
common-runtime.workspace = true
common-telemetry.workspace = true
common-time.workspace = true
futures.workspace = true
humantime-serde.workspace = true
object-store.workspace = true

View File

@@ -26,7 +26,7 @@ pub mod watcher;
pub use crate::error::{Error, Result};
pub use crate::procedure::{
BoxedProcedure, BoxedProcedureLoader, Context, ContextProvider, LockKey, Output, ParseIdError,
Procedure, ProcedureId, ProcedureManager, ProcedureManagerRef, ProcedureState, ProcedureWithId,
Status, StringKey,
Procedure, ProcedureId, ProcedureInfo, ProcedureManager, ProcedureManagerRef, ProcedureState,
ProcedureWithId, Status, StringKey,
};
pub use crate::watcher::Watcher;

View File

@@ -16,7 +16,7 @@ mod runner;
mod rwlock;
use std::collections::{HashMap, VecDeque};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::atomic::{AtomicBool, AtomicI64, Ordering};
use std::sync::{Arc, Mutex, RwLock};
use std::time::{Duration, Instant};
@@ -35,7 +35,7 @@ use crate::error::{
StartRemoveOutdatedMetaTaskSnafu, StopRemoveOutdatedMetaTaskSnafu,
};
use crate::local::runner::Runner;
use crate::procedure::{BoxedProcedureLoader, InitProcedureState};
use crate::procedure::{BoxedProcedureLoader, InitProcedureState, ProcedureInfo};
use crate::store::{ProcedureMessage, ProcedureMessages, ProcedureStore, StateStoreRef};
use crate::{
BoxedProcedure, ContextProvider, LockKey, ProcedureId, ProcedureManager, ProcedureState,
@@ -57,6 +57,8 @@ const META_TTL: Duration = Duration::from_secs(60 * 10);
pub(crate) struct ProcedureMeta {
/// Id of this procedure.
id: ProcedureId,
/// Type name of this procedure.
type_name: String,
/// Parent procedure id.
parent_id: Option<ProcedureId>,
/// Notify to wait for subprocedures.
@@ -69,6 +71,10 @@ pub(crate) struct ProcedureMeta {
state_receiver: Receiver<ProcedureState>,
/// Id of child procedures.
children: Mutex<Vec<ProcedureId>>,
/// Start execution time of this procedure.
start_time_ms: AtomicI64,
/// End execution time of this procedure.
end_time_ms: AtomicI64,
}
impl ProcedureMeta {
@@ -77,6 +83,7 @@ impl ProcedureMeta {
procedure_state: ProcedureState,
parent_id: Option<ProcedureId>,
lock_key: LockKey,
type_name: &str,
) -> ProcedureMeta {
let (state_sender, state_receiver) = watch::channel(procedure_state);
ProcedureMeta {
@@ -87,6 +94,9 @@ impl ProcedureMeta {
state_sender,
state_receiver,
children: Mutex::new(Vec::new()),
start_time_ms: AtomicI64::new(0),
end_time_ms: AtomicI64::new(0),
type_name: type_name.to_string(),
}
}
@@ -117,6 +127,18 @@ impl ProcedureMeta {
fn num_children(&self) -> usize {
self.children.lock().unwrap().len()
}
/// Updates the start time of the procedure.
fn set_start_time_ms(&self) {
self.start_time_ms
.store(common_time::util::current_time_millis(), Ordering::Relaxed);
}
/// Updates the end time of the procedure.
fn set_end_time_ms(&self) {
self.end_time_ms
.store(common_time::util::current_time_millis(), Ordering::Relaxed);
}
}
/// Reference counted pointer to [ProcedureMeta].
@@ -210,6 +232,22 @@ impl ManagerContext {
procedures.get(&procedure_id).map(|meta| meta.state())
}
/// Returns the [ProcedureInfo] of all procedures.
fn list_procedure(&self) -> Vec<ProcedureInfo> {
let procedures = self.procedures.read().unwrap();
procedures
.values()
.map(|meta| ProcedureInfo {
id: meta.id,
type_name: meta.type_name.clone(),
start_time_ms: meta.start_time_ms.load(Ordering::Relaxed),
end_time_ms: meta.end_time_ms.load(Ordering::Relaxed),
state: meta.state(),
lock_keys: meta.lock_key.get_keys(),
})
.collect()
}
/// Returns the [Watcher] of specific `procedure_id`.
fn watcher(&self, procedure_id: ProcedureId) -> Option<Watcher> {
let procedures = self.procedures.read().unwrap();
@@ -438,6 +476,7 @@ impl LocalManager {
procedure_state,
None,
procedure.lock_key(),
procedure.type_name(),
));
let runner = Runner {
meta: meta.clone(),
@@ -641,6 +680,10 @@ impl ProcedureManager for LocalManager {
fn procedure_watcher(&self, procedure_id: ProcedureId) -> Option<Watcher> {
self.manager_ctx.watcher(procedure_id)
}
async fn list_procedures(&self) -> Result<Vec<ProcedureInfo>> {
Ok(self.manager_ctx.list_procedure())
}
}
struct RemoveOutdatedMetaFunction {
@@ -675,6 +718,7 @@ pub(crate) mod test_util {
ProcedureState::Running,
None,
LockKey::default(),
"ProcedureAdapter",
)
}
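
The start/end timestamps above are plain `AtomicI64`s written with relaxed ordering, so recording them never contends with the procedure's locks. A standalone sketch of the pattern, with `std::time` standing in for `common_time::util::current_time_millis`:

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Stand-in for common_time::util::current_time_millis().
fn current_time_millis() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_millis() as i64
}

#[derive(Default)]
struct ExecutionTimes {
    start_time_ms: AtomicI64,
    end_time_ms: AtomicI64,
}

impl ExecutionTimes {
    // Relaxed ordering suffices: the two counters are independent and are
    // only read for display in listings, never for synchronization.
    fn mark_start(&self) {
        self.start_time_ms
            .store(current_time_millis(), Ordering::Relaxed);
    }
    fn mark_end(&self) {
        self.end_time_ms
            .store(current_time_millis(), Ordering::Relaxed);
    }
}
```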

View File

@@ -27,7 +27,9 @@ use crate::error::{self, ProcedurePanicSnafu, Result, RollbackTimesExceededSnafu
use crate::local::{ManagerContext, ProcedureMeta, ProcedureMetaRef};
use crate::procedure::{Output, StringKey};
use crate::store::{ProcedureMessage, ProcedureStore};
use crate::{BoxedProcedure, Context, Error, ProcedureId, ProcedureState, ProcedureWithId, Status};
use crate::{
BoxedProcedure, Context, Error, Procedure, ProcedureId, ProcedureState, ProcedureWithId, Status,
};
/// A guard to cleanup procedure state.
struct ProcedureGuard {
@@ -129,7 +131,9 @@ impl Runner {
// Execute the procedure. We need to release the lock whenever the execution
// is successful or fail.
self.meta.set_start_time_ms();
self.execute_procedure_in_loop().await;
self.meta.set_end_time_ms();
// We can't remove the metadata of the procedure now as users and its parent might
// need to query its state.
@@ -368,6 +372,7 @@ impl Runner {
procedure_state,
Some(self.meta.id),
procedure.lock_key(),
procedure.type_name(),
));
let runner = Runner {
meta: meta.clone(),

View File

@@ -159,6 +159,14 @@ impl<T: Procedure + ?Sized> Procedure for Box<T> {
(**self).execute(ctx).await
}
async fn rollback(&mut self, ctx: &Context) -> Result<()> {
(**self).rollback(ctx).await
}
fn rollback_supported(&self) -> bool {
(**self).rollback_supported()
}
fn dump(&self) -> Result<String> {
(**self).dump()
}
@@ -227,6 +235,11 @@ impl LockKey {
pub fn keys_to_lock(&self) -> impl Iterator<Item = &StringKey> {
self.0.iter()
}
/// Returns the keys to lock.
pub fn get_keys(&self) -> Vec<String> {
self.0.iter().map(|key| format!("{:?}", key)).collect()
}
}
/// Boxed [Procedure].
@@ -374,6 +387,18 @@ impl ProcedureState {
_ => None,
}
}
/// Returns the string name of the state variant.
pub fn as_str_name(&self) -> &str {
match self {
ProcedureState::Running => "Running",
ProcedureState::Done { .. } => "Done",
ProcedureState::Retrying { .. } => "Retrying",
ProcedureState::Failed { .. } => "Failed",
ProcedureState::PrepareRollback { .. } => "PrepareRollback",
ProcedureState::RollingBack { .. } => "RollingBack",
}
}
}
/// The initial procedure state.
@@ -412,11 +437,30 @@ pub trait ProcedureManager: Send + Sync + 'static {
/// Returns a [Watcher] to watch [ProcedureState] of specific procedure.
fn procedure_watcher(&self, procedure_id: ProcedureId) -> Option<Watcher>;
/// Returns the details of all procedures.
async fn list_procedures(&self) -> Result<Vec<ProcedureInfo>>;
}
/// Ref-counted pointer to the [ProcedureManager].
pub type ProcedureManagerRef = Arc<dyn ProcedureManager>;
#[derive(Debug, Clone)]
pub struct ProcedureInfo {
/// Id of this procedure.
pub id: ProcedureId,
/// Type name of this procedure.
pub type_name: String,
/// Start execution time of this procedure.
pub start_time_ms: i64,
/// End execution time of this procedure.
pub end_time_ms: i64,
/// Status of this procedure.
pub state: ProcedureState,
/// Lock keys of this procedure.
pub lock_keys: Vec<String>,
}
#[cfg(test)]
mod tests {
use common_error::mock::MockError;

View File

@@ -53,6 +53,7 @@ prost.workspace = true
query.workspace = true
reqwest.workspace = true
serde.workspace = true
serde_json.workspace = true
servers.workspace = true
session.workspace = true
snafu.workspace = true

View File

@@ -305,6 +305,7 @@ pub struct DatanodeOptions {
pub meta_client: Option<MetaClientOptions>,
pub wal: DatanodeWalConfig,
pub storage: StorageConfig,
pub max_concurrent_queries: usize,
/// Options for different store engines.
pub region_engine: Vec<RegionEngineConfig>,
pub logging: LoggingOptions,
@@ -339,6 +340,7 @@ impl Default for DatanodeOptions {
meta_client: None,
wal: DatanodeWalConfig::default(),
storage: StorageConfig::default(),
max_concurrent_queries: 0,
region_engine: vec![
RegionEngineConfig::Mito(MitoConfig::default()),
RegionEngineConfig::File(FileEngineConfig::default()),

View File

@@ -314,7 +314,7 @@ impl DatanodeBuilder {
&self,
event_listener: RegionServerEventListenerRef,
) -> Result<RegionServer> {
let opts = &self.opts;
let opts: &DatanodeOptions = &self.opts;
let query_engine_factory = QueryEngineFactory::new_with_plugins(
// query engine in datanode only executes plan with resolved table source.
@@ -334,6 +334,9 @@ impl DatanodeBuilder {
common_runtime::global_runtime(),
event_listener,
table_provider_factory,
opts.max_concurrent_queries,
// TODO: re-evaluate the hardcoded timeout in the next version of the datanode concurrency limiter.
Duration::from_millis(100),
);
let object_store_manager = Self::build_object_store_manager(&opts.storage).await?;

View File

@@ -22,6 +22,7 @@ use common_macro::stack_trace_debug;
use snafu::{Location, Snafu};
use store_api::storage::RegionId;
use table::error::Error as TableError;
use tokio::time::error::Elapsed;
/// Business error of datanode.
#[derive(Snafu)]
@@ -347,6 +348,22 @@ pub enum Error {
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to acquire permit, source closed"))]
ConcurrentQueryLimiterClosed {
#[snafu(source)]
error: tokio::sync::AcquireError,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to acquire permit under timeouts"))]
ConcurrentQueryLimiterTimeout {
#[snafu(source)]
error: Elapsed,
#[snafu(implicit)]
location: Location,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -411,6 +428,9 @@ impl ErrorExt for Error {
FindLogicalRegions { source, .. } => source.status_code(),
BuildMitoEngine { source, .. } => source.status_code(),
ConcurrentQueryLimiterClosed { .. } | ConcurrentQueryLimiterTimeout { .. } => {
StatusCode::RegionBusy
}
}
}

View File

@@ -12,11 +12,13 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Duration;
use api::v1::meta::{HeartbeatRequest, NodeInfo, Peer, RegionRole, RegionStat};
use common_meta::datanode::REGION_STATISTIC_KEY;
use common_meta::distributed_time_constants::META_KEEP_ALIVE_INTERVAL_SECS;
use common_meta::heartbeat::handler::parse_mailbox_message::ParseMailboxMessageHandler;
use common_meta::heartbeat::handler::{
@@ -320,16 +322,25 @@ impl HeartbeatTask {
region_server
.reportable_regions()
.into_iter()
.map(|stat| RegionStat {
region_id: stat.region_id.as_u64(),
engine: stat.engine,
role: RegionRole::from(stat.role).into(),
// TODO(weny): w/rcus
rcus: 0,
wcus: 0,
approximate_bytes: region_server.region_disk_usage(stat.region_id).unwrap_or(0),
// TODO(weny): add extensions
extensions: Default::default(),
.map(|stat| {
let region_stat = region_server
.region_statistic(stat.region_id)
.unwrap_or_default();
let mut extensions = HashMap::new();
if let Some(serialized) = region_stat.serialize_to_vec() {
extensions.insert(REGION_STATISTIC_KEY.to_string(), serialized);
}
RegionStat {
region_id: stat.region_id.as_u64(),
engine: stat.engine,
role: RegionRole::from(stat.role).into(),
// TODO(weny): w/rcus
rcus: 0,
wcus: 0,
approximate_bytes: region_stat.estimated_disk_size() as i64,
extensions,
}
})
.collect()
}
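
The hand-off above serializes the datanode's `RegionStatistic` into the heartbeat's `extensions` map under `REGION_STATISTIC_KEY`; the metasrv side (the `From<&api::v1::meta::RegionStat>` impl earlier in this comparison) deserializes it, falling back to defaults. A hedged round-trip sketch, using `serde_json` in place of whatever encoding `serialize_to_vec` actually uses:

```rust
use std::collections::HashMap;

use serde::{Deserialize, Serialize};

const REGION_STATISTIC_KEY: &str = "__region_statistic";

// Stand-in for store_api's RegionStatistic; field names follow the diff.
#[derive(Debug, Default, PartialEq, Serialize, Deserialize)]
struct RegionStatistic {
    memtable_size: u64,
    manifest_size: u64,
    sst_size: u64,
}

fn main() {
    let stat = RegionStatistic { memtable_size: 1, manifest_size: 2, sst_size: 3 };

    // Datanode side: tuck the serialized statistic into the extensions map.
    let mut extensions: HashMap<String, Vec<u8>> = HashMap::new();
    extensions.insert(
        REGION_STATISTIC_KEY.to_string(),
        serde_json::to_vec(&stat).unwrap(),
    );

    // Metasrv side: deserialize, defaulting if the key is absent or corrupt.
    let decoded: RegionStatistic = extensions
        .get(REGION_STATISTIC_KEY)
        .and_then(|bytes| serde_json::from_slice(bytes).ok())
        .unwrap_or_default();
    assert_eq!(stat, decoded);
}
```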

View File

@@ -37,6 +37,7 @@ use crate::region_server::RegionServer;
pub struct RegionHeartbeatResponseHandler {
region_server: RegionServer,
catchup_tasks: TaskTracker<()>,
downgrade_tasks: TaskTracker<()>,
}
/// Handler of the instruction.
@@ -47,12 +48,22 @@ pub type InstructionHandler =
pub struct HandlerContext {
region_server: RegionServer,
catchup_tasks: TaskTracker<()>,
downgrade_tasks: TaskTracker<()>,
}
impl HandlerContext {
fn region_ident_to_region_id(region_ident: &RegionIdent) -> RegionId {
RegionId::new(region_ident.table_id, region_ident.region_number)
}
#[cfg(test)]
pub fn new_for_test(region_server: RegionServer) -> Self {
Self {
region_server,
catchup_tasks: TaskTracker::new(),
downgrade_tasks: TaskTracker::new(),
}
}
}
impl RegionHeartbeatResponseHandler {
@@ -61,6 +72,7 @@ impl RegionHeartbeatResponseHandler {
Self {
region_server,
catchup_tasks: TaskTracker::new(),
downgrade_tasks: TaskTracker::new(),
}
}
@@ -107,11 +119,13 @@ impl HeartbeatResponseHandler for RegionHeartbeatResponseHandler {
let mailbox = ctx.mailbox.clone();
let region_server = self.region_server.clone();
let catchup_tasks = self.catchup_tasks.clone();
let downgrade_tasks = self.downgrade_tasks.clone();
let handler = Self::build_handler(instruction)?;
let _handle = common_runtime::spawn_global(async move {
let reply = handler(HandlerContext {
region_server,
catchup_tasks,
downgrade_tasks,
})
.await;
@@ -129,6 +143,7 @@ mod tests {
use std::assert_matches::assert_matches;
use std::collections::HashMap;
use std::sync::Arc;
use std::time::Duration;
use common_meta::heartbeat::mailbox::{
HeartbeatMailbox, IncomingMessage, MailboxRef, MessageMeta,
@@ -197,6 +212,7 @@ mod tests {
// Downgrade region
let instruction = Instruction::DowngradeRegion(DowngradeRegion {
region_id: RegionId::new(2048, 1),
flush_timeout: Some(Duration::from_secs(1)),
});
assert!(heartbeat_handler
.is_acceptable(&heartbeat_env.create_handler_ctx((meta.clone(), instruction))));
@@ -205,7 +221,7 @@ mod tests {
let instruction = Instruction::UpgradeRegion(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout: None,
replay_timeout: None,
location_id: None,
});
assert!(
@@ -392,7 +408,10 @@ mod tests {
// Should be ok, if we try to downgrade it twice.
for _ in 0..2 {
let meta = MessageMeta::new_test(1, "test", "dn-1", "me-0");
let instruction = Instruction::DowngradeRegion(DowngradeRegion { region_id });
let instruction = Instruction::DowngradeRegion(DowngradeRegion {
region_id,
flush_timeout: Some(Duration::from_secs(1)),
});
let mut ctx = heartbeat_env.create_handler_ctx((meta, instruction));
let control = heartbeat_handler.handle(&mut ctx).await.unwrap();
@@ -413,6 +432,7 @@ mod tests {
let meta = MessageMeta::new_test(1, "test", "dn-1", "me-0");
let instruction = Instruction::DowngradeRegion(DowngradeRegion {
region_id: RegionId::new(2048, 1),
flush_timeout: Some(Duration::from_secs(1)),
});
let mut ctx = heartbeat_env.create_handler_ctx((meta, instruction));
let control = heartbeat_handler.handle(&mut ctx).await.unwrap();

View File

@@ -13,38 +13,399 @@
// limitations under the License.
use common_meta::instruction::{DowngradeRegion, DowngradeRegionReply, InstructionReply};
use common_telemetry::tracing::info;
use common_telemetry::warn;
use futures_util::future::BoxFuture;
use store_api::region_engine::SetReadonlyResponse;
use store_api::region_request::{RegionFlushRequest, RegionRequest};
use store_api::storage::RegionId;
use crate::heartbeat::handler::HandlerContext;
use crate::heartbeat::task_tracker::WaitResult;
impl HandlerContext {
async fn set_readonly_gracefully(&self, region_id: RegionId) -> InstructionReply {
match self.region_server.set_readonly_gracefully(region_id).await {
Ok(SetReadonlyResponse::Success { last_entry_id }) => {
InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id,
exists: true,
error: None,
})
}
Ok(SetReadonlyResponse::NotFound) => {
InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id: None,
exists: false,
error: None,
})
}
Err(err) => InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id: None,
exists: true,
error: Some(format!("{err:?}")),
}),
}
}
pub(crate) fn handle_downgrade_region_instruction(
self,
DowngradeRegion { region_id }: DowngradeRegion,
DowngradeRegion {
region_id,
flush_timeout,
}: DowngradeRegion,
) -> BoxFuture<'static, InstructionReply> {
Box::pin(async move {
match self.region_server.set_readonly_gracefully(region_id).await {
Ok(SetReadonlyResponse::Success { last_entry_id }) => {
InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id,
exists: true,
error: None,
})
}
Ok(SetReadonlyResponse::NotFound) => {
InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id: None,
exists: false,
error: None,
})
}
Err(err) => InstructionReply::DowngradeRegion(DowngradeRegionReply {
let Some(writable) = self.region_server.is_writable(region_id) else {
return InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id: None,
exists: true,
error: Some(format!("{err:?}")),
}),
exists: false,
error: None,
});
};
// The region is not writable; skip the flush and downgrade directly.
if !writable {
return self.set_readonly_gracefully(region_id).await;
}
let region_server_moved = self.region_server.clone();
if let Some(flush_timeout) = flush_timeout {
let register_result = self
.downgrade_tasks
.try_register(
region_id,
Box::pin(async move {
info!("Flush region: {region_id} before downgrading region");
region_server_moved
.handle_request(
region_id,
RegionRequest::Flush(RegionFlushRequest {
row_group_size: None,
}),
)
.await?;
Ok(())
}),
)
.await;
if register_result.is_busy() {
warn!("Another flush task is running for the region: {region_id}");
}
let mut watcher = register_result.into_watcher();
let result = self.downgrade_tasks.wait(&mut watcher, flush_timeout).await;
match result {
WaitResult::Timeout => {
InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id: None,
exists: true,
error: Some(format!(
"Flush region: {region_id} before downgrading region is timeout"
)),
})
}
WaitResult::Finish(Ok(_)) => self.set_readonly_gracefully(region_id).await,
WaitResult::Finish(Err(err)) => {
InstructionReply::DowngradeRegion(DowngradeRegionReply {
last_entry_id: None,
exists: true,
error: Some(format!("{err:?}")),
})
}
}
} else {
self.set_readonly_gracefully(region_id).await
}
})
}
}
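
The handler above bounds the pre-downgrade flush with `flush_timeout`, while the `TaskTracker` deduplicates concurrent flushes per region. A simplified sketch of just the timeout step, with plain tokio primitives standing in for the project's `TaskTracker`/`WaitResult` types:

```rust
use std::time::Duration;
use tokio::time::timeout;

// Simplified: None means downgrade without flushing; Some(t) bounds how
// long we wait for the flush before reporting a timeout to the metasrv.
async fn flush_then_downgrade(flush_timeout: Option<Duration>) -> Result<(), String> {
    let flush = async {
        // Stand-in for RegionRequest::Flush handled by the region server.
        tokio::time::sleep(Duration::from_millis(50)).await;
        Ok::<_, String>(())
    };
    match flush_timeout {
        None => Ok(()), // no flush requested
        Some(t) => match timeout(t, flush).await {
            Err(_) => Err("flush before downgrading timed out".to_string()),
            Ok(result) => result, // flush finished within the budget
        },
    }
}
```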
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use std::time::Duration;
use common_meta::instruction::{DowngradeRegion, InstructionReply};
use mito2::engine::MITO_ENGINE_NAME;
use store_api::region_engine::{RegionRole, SetReadonlyResponse};
use store_api::region_request::RegionRequest;
use store_api::storage::RegionId;
use tokio::time::Instant;
use crate::error;
use crate::heartbeat::handler::HandlerContext;
use crate::tests::{mock_region_server, MockRegionEngine};
#[tokio::test]
async fn test_region_not_exist() {
let mut mock_region_server = mock_region_server();
let (mock_engine, _) = MockRegionEngine::new(MITO_ENGINE_NAME);
mock_region_server.register_engine(mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let region_id = RegionId::new(1024, 1);
let waits = vec![None, Some(Duration::from_millis(100u64))];
for flush_timeout in waits {
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout,
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(!reply.exists);
assert!(reply.error.is_none());
assert!(reply.last_entry_id.is_none());
}
}
}
#[tokio::test]
async fn test_region_readonly() {
let mock_region_server = mock_region_server();
let region_id = RegionId::new(1024, 1);
let (mock_engine, _) =
MockRegionEngine::with_custom_apply_fn(MITO_ENGINE_NAME, |region_engine| {
region_engine.mock_role = Some(Some(RegionRole::Follower));
region_engine.handle_request_mock_fn = Some(Box::new(|_, req| {
if let RegionRequest::Flush(_) = req {
// Should be unreachable.
unreachable!();
};
Ok(0)
}));
region_engine.handle_set_readonly_gracefully_mock_fn =
Some(Box::new(|_| Ok(SetReadonlyResponse::success(Some(1024)))))
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let waits = vec![None, Some(Duration::from_millis(100u64))];
for flush_timeout in waits {
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout,
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply.error.is_none());
assert_eq!(reply.last_entry_id.unwrap(), 1024);
}
}
}
#[tokio::test]
async fn test_region_flush_timeout() {
let mock_region_server = mock_region_server();
let region_id = RegionId::new(1024, 1);
let (mock_engine, _) =
MockRegionEngine::with_custom_apply_fn(MITO_ENGINE_NAME, |region_engine| {
region_engine.mock_role = Some(Some(RegionRole::Leader));
region_engine.handle_request_delay = Some(Duration::from_secs(100));
region_engine.handle_set_readonly_gracefully_mock_fn =
Some(Box::new(|_| Ok(SetReadonlyResponse::success(Some(1024)))))
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let flush_timeout = Duration::from_millis(100);
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout: Some(flush_timeout),
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply.error.unwrap().contains("timeout"));
assert!(reply.last_entry_id.is_none());
}
}
#[tokio::test]
async fn test_region_flush_timeout_and_retry() {
let mock_region_server = mock_region_server();
let region_id = RegionId::new(1024, 1);
let (mock_engine, _) =
MockRegionEngine::with_custom_apply_fn(MITO_ENGINE_NAME, |region_engine| {
region_engine.mock_role = Some(Some(RegionRole::Leader));
region_engine.handle_request_delay = Some(Duration::from_millis(300));
region_engine.handle_set_readonly_gracefully_mock_fn =
Some(Box::new(|_| Ok(SetReadonlyResponse::success(Some(1024)))))
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let waits = vec![
Some(Duration::from_millis(100u64)),
Some(Duration::from_millis(100u64)),
];
for flush_timeout in waits {
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout,
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply.error.unwrap().contains("timeout"));
assert!(reply.last_entry_id.is_none());
}
}
let timer = Instant::now();
let reply = handler_context
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout: Some(Duration::from_millis(500)),
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
// Must take less than 300 ms.
assert!(timer.elapsed().as_millis() < 300);
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply.error.is_none());
assert_eq!(reply.last_entry_id.unwrap(), 1024);
}
}
#[tokio::test]
async fn test_region_flush_timeout_and_retry_error() {
let mock_region_server = mock_region_server();
let region_id = RegionId::new(1024, 1);
let (mock_engine, _) =
MockRegionEngine::with_custom_apply_fn(MITO_ENGINE_NAME, |region_engine| {
region_engine.mock_role = Some(Some(RegionRole::Leader));
region_engine.handle_request_delay = Some(Duration::from_millis(300));
region_engine.handle_request_mock_fn = Some(Box::new(|_, _| {
error::UnexpectedSnafu {
violated: "mock flush failed",
}
.fail()
}));
region_engine.handle_set_readonly_gracefully_mock_fn =
Some(Box::new(|_| Ok(SetReadonlyResponse::success(Some(1024)))))
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let waits = vec![
Some(Duration::from_millis(100u64)),
Some(Duration::from_millis(100u64)),
];
for flush_timeout in waits {
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout,
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply.error.unwrap().contains("timeout"));
assert!(reply.last_entry_id.is_none());
}
}
let timer = Instant::now();
let reply = handler_context
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout: Some(Duration::from_millis(500)),
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
// Must take less than 300 ms.
assert!(timer.elapsed().as_millis() < 300);
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply.error.unwrap().contains("flush failed"));
assert!(reply.last_entry_id.is_none());
}
}
#[tokio::test]
async fn test_set_region_readonly_not_found() {
let mock_region_server = mock_region_server();
let region_id = RegionId::new(1024, 1);
let (mock_engine, _) =
MockRegionEngine::with_custom_apply_fn(MITO_ENGINE_NAME, |region_engine| {
region_engine.mock_role = Some(Some(RegionRole::Leader));
region_engine.handle_set_readonly_gracefully_mock_fn =
Some(Box::new(|_| Ok(SetReadonlyResponse::NotFound)));
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout: None,
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(!reply.exists);
assert!(reply.error.is_none());
assert!(reply.last_entry_id.is_none());
}
}
#[tokio::test]
async fn test_set_region_readonly_error() {
let mock_region_server = mock_region_server();
let region_id = RegionId::new(1024, 1);
let (mock_engine, _) =
MockRegionEngine::with_custom_apply_fn(MITO_ENGINE_NAME, |region_engine| {
region_engine.mock_role = Some(Some(RegionRole::Leader));
region_engine.handle_set_readonly_gracefully_mock_fn = Some(Box::new(|_| {
error::UnexpectedSnafu {
violated: "Failed to set region to readonly",
}
.fail()
}));
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext::new_for_test(mock_region_server);
let reply = handler_context
.clone()
.handle_downgrade_region_instruction(DowngradeRegion {
region_id,
flush_timeout: None,
})
.await;
assert_matches!(reply, InstructionReply::DowngradeRegion(_));
if let InstructionReply::DowngradeRegion(reply) = reply {
assert!(reply.exists);
assert!(reply
.error
.unwrap()
.contains("Failed to set region to readonly"));
assert!(reply.last_entry_id.is_none());
}
}
}

View File

@@ -26,7 +26,7 @@ impl HandlerContext {
UpgradeRegion {
region_id,
last_entry_id,
wait_for_replay_timeout,
replay_timeout,
location_id,
}: UpgradeRegion,
) -> BoxFuture<'static, InstructionReply> {
@@ -78,7 +78,7 @@ impl HandlerContext {
}
// Returns immediately
let Some(wait_for_replay_timeout) = wait_for_replay_timeout else {
let Some(replay_timeout) = replay_timeout else {
return InstructionReply::UpgradeRegion(UpgradeRegionReply {
ready: false,
exists: true,
@@ -88,10 +88,7 @@ impl HandlerContext {
// We don't care that it returns a newly registered or running task.
let mut watcher = register_result.into_watcher();
let result = self
.catchup_tasks
.wait(&mut watcher, wait_for_replay_timeout)
.await;
let result = self.catchup_tasks.wait(&mut watcher, replay_timeout).await;
match result {
WaitResult::Timeout => InstructionReply::UpgradeRegion(UpgradeRegionReply {
@@ -129,7 +126,6 @@ mod tests {
use crate::error;
use crate::heartbeat::handler::HandlerContext;
use crate::heartbeat::task_tracker::TaskTracker;
use crate::tests::{mock_region_server, MockRegionEngine};
#[tokio::test]
@@ -138,21 +134,18 @@ mod tests {
let (mock_engine, _) = MockRegionEngine::new(MITO_ENGINE_NAME);
mock_region_server.register_engine(mock_engine);
let handler_context = HandlerContext {
region_server: mock_region_server,
catchup_tasks: TaskTracker::new(),
};
let handler_context = HandlerContext::new_for_test(mock_region_server);
let region_id = RegionId::new(1024, 1);
let waits = vec![None, Some(Duration::from_millis(100u64))];
for wait_for_replay_timeout in waits {
for replay_timeout in waits {
let reply = handler_context
.clone()
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout,
replay_timeout,
location_id: None,
})
.await;
@@ -180,20 +173,17 @@ mod tests {
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext {
region_server: mock_region_server,
catchup_tasks: TaskTracker::new(),
};
let handler_context = HandlerContext::new_for_test(mock_region_server);
let waits = vec![None, Some(Duration::from_millis(100u64))];
for wait_for_replay_timeout in waits {
for replay_timeout in waits {
let reply = handler_context
.clone()
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout,
replay_timeout,
location_id: None,
})
.await;
@@ -222,20 +212,17 @@ mod tests {
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext {
region_server: mock_region_server,
catchup_tasks: TaskTracker::new(),
};
let handler_context = HandlerContext::new_for_test(mock_region_server);
let waits = vec![None, Some(Duration::from_millis(100u64))];
for wait_for_replay_timeout in waits {
for replay_timeout in waits {
let reply = handler_context
.clone()
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout,
replay_timeout,
location_id: None,
})
.await;
@@ -269,17 +256,14 @@ mod tests {
Some(Duration::from_millis(100u64)),
];
let handler_context = HandlerContext {
region_server: mock_region_server,
catchup_tasks: TaskTracker::new(),
};
let handler_context = HandlerContext::new_for_test(mock_region_server);
for wait_for_replay_timeout in waits {
for replay_timeout in waits {
let reply = handler_context
.clone()
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
wait_for_replay_timeout,
replay_timeout,
last_entry_id: None,
location_id: None,
})
@@ -298,7 +282,7 @@ mod tests {
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout: Some(Duration::from_millis(500)),
replay_timeout: Some(Duration::from_millis(500)),
location_id: None,
})
.await;
@@ -333,17 +317,14 @@ mod tests {
});
mock_region_server.register_test_region(region_id, mock_engine);
let handler_context = HandlerContext {
region_server: mock_region_server,
catchup_tasks: TaskTracker::new(),
};
let handler_context = HandlerContext::new_for_test(mock_region_server);
let reply = handler_context
.clone()
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout: None,
replay_timeout: None,
location_id: None,
})
.await;
@@ -361,7 +342,7 @@ mod tests {
.handle_upgrade_region_instruction(UpgradeRegion {
region_id,
last_entry_id: None,
wait_for_replay_timeout: Some(Duration::from_millis(200)),
replay_timeout: Some(Duration::from_millis(200)),
location_id: None,
})
.await;

View File

@@ -16,6 +16,7 @@ use std::collections::HashMap;
use std::fmt::Debug;
use std::ops::Deref;
use std::sync::{Arc, RwLock};
use std::time::Duration;
use api::region::RegionResponse;
use api::v1::region::{region_request, RegionResponse as RegionResponseV1};
@@ -53,15 +54,18 @@ use snafu::{ensure, OptionExt, ResultExt};
use store_api::metric_engine_consts::{
FILE_ENGINE_NAME, LOGICAL_TABLE_METADATA_KEY, METRIC_ENGINE_NAME,
};
use store_api::region_engine::{RegionEngineRef, RegionRole, SetReadonlyResponse};
use store_api::region_engine::{RegionEngineRef, RegionRole, RegionStatistic, SetReadonlyResponse};
use store_api::region_request::{
AffectedRows, RegionCloseRequest, RegionOpenRequest, RegionRequest,
};
use store_api::storage::RegionId;
use tokio::sync::{Semaphore, SemaphorePermit};
use tokio::time::timeout;
use tonic::{Request, Response, Result as TonicResult};
use crate::error::{
self, BuildRegionRequestsSnafu, DataFusionSnafu, DecodeLogicalPlanSnafu,
self, BuildRegionRequestsSnafu, ConcurrentQueryLimiterClosedSnafu,
ConcurrentQueryLimiterTimeoutSnafu, DataFusionSnafu, DecodeLogicalPlanSnafu,
ExecuteLogicalPlanSnafu, FindLogicalRegionsSnafu, HandleBatchOpenRequestSnafu,
HandleRegionRequestSnafu, NewPlanDecoderSnafu, RegionEngineNotFoundSnafu, RegionNotFoundSnafu,
RegionNotReadySnafu, Result, StopRegionEngineSnafu, UnexpectedSnafu, UnsupportedOutputSnafu,
@@ -90,6 +94,8 @@ impl RegionServer {
runtime,
event_listener,
Arc::new(DummyTableProviderFactory),
0,
Duration::from_millis(0),
)
}
@@ -98,6 +104,8 @@ impl RegionServer {
runtime: Runtime,
event_listener: RegionServerEventListenerRef,
table_provider_factory: TableProviderFactoryRef,
max_concurrent_queries: usize,
concurrent_query_limiter_timeout: Duration,
) -> Self {
Self {
inner: Arc::new(RegionServerInner::new(
@@ -105,6 +113,10 @@ impl RegionServer {
runtime,
event_listener,
table_provider_factory,
RegionServerParallelism::from_opts(
max_concurrent_queries,
concurrent_query_limiter_timeout,
),
)),
}
}
@@ -167,6 +179,11 @@ impl RegionServer {
&self,
request: api::v1::region::QueryRequest,
) -> Result<SendableRecordBatchStream> {
let _permit = if let Some(p) = &self.inner.parallelism {
Some(p.acquire().await?)
} else {
None
};
let region_id = RegionId::from_u64(request.region_id);
let provider = self.table_provider(region_id).await?;
let catalog_list = Arc::new(DummyCatalogList::with_table_provider(provider));
@@ -200,6 +217,11 @@ impl RegionServer {
#[tracing::instrument(skip_all)]
pub async fn handle_read(&self, request: QueryRequest) -> Result<SendableRecordBatchStream> {
let _permit = if let Some(p) = &self.inner.parallelism {
Some(p.acquire().await?)
} else {
None
};
let provider = self.table_provider(request.region_id).await?;
struct RegionDataSourceInjector {
@@ -290,9 +312,9 @@ impl RegionServer {
self.inner.runtime.clone()
}
pub fn region_disk_usage(&self, region_id: RegionId) -> Option<i64> {
pub fn region_statistic(&self, region_id: RegionId) -> Option<RegionStatistic> {
match self.inner.region_map.get(&region_id) {
Some(e) => e.region_disk_usage(region_id),
Some(e) => e.region_statistic(region_id),
None => None,
}
}
@@ -450,6 +472,36 @@ struct RegionServerInner {
runtime: Runtime,
event_listener: RegionServerEventListenerRef,
table_provider_factory: TableProviderFactoryRef,
// The maximum number of queries allowed to execute at the same time.
// Acts as a last line of defense on the datanode against query overload.
parallelism: Option<RegionServerParallelism>,
}
struct RegionServerParallelism {
semaphore: Semaphore,
timeout: Duration,
}
impl RegionServerParallelism {
pub fn from_opts(
max_concurrent_queries: usize,
concurrent_query_limiter_timeout: Duration,
) -> Option<Self> {
if max_concurrent_queries == 0 {
return None;
}
Some(RegionServerParallelism {
semaphore: Semaphore::new(max_concurrent_queries),
timeout: concurrent_query_limiter_timeout,
})
}
pub async fn acquire(&self) -> Result<SemaphorePermit> {
timeout(self.timeout, self.semaphore.acquire())
.await
.context(ConcurrentQueryLimiterTimeoutSnafu)?
.context(ConcurrentQueryLimiterClosedSnafu)
}
}
enum CurrentEngine {
@@ -478,6 +530,7 @@ impl RegionServerInner {
runtime: Runtime,
event_listener: RegionServerEventListenerRef,
table_provider_factory: TableProviderFactoryRef,
parallelism: Option<RegionServerParallelism>,
) -> Self {
Self {
engines: RwLock::new(HashMap::new()),
@@ -486,6 +539,7 @@ impl RegionServerInner {
runtime,
event_listener,
table_provider_factory,
parallelism,
}
}
@@ -843,7 +897,7 @@ impl RegionServerInner {
let result = self
.query_engine
.execute(request.plan.into(), query_ctx)
.execute(request.plan, query_ctx)
.await
.context(ExecuteLogicalPlanSnafu)?;
@@ -1284,4 +1338,23 @@ mod tests {
assert(result);
}
}
#[tokio::test]
async fn test_region_server_parallelism() {
let p = RegionServerParallelism::from_opts(2, Duration::from_millis(1)).unwrap();
let first_query = p.acquire().await;
assert!(first_query.is_ok());
let second_query = p.acquire().await;
assert!(second_query.is_ok());
let third_query = p.acquire().await;
assert!(third_query.is_err());
let err = third_query.unwrap_err();
assert_eq!(
err.output_msg(),
"Failed to acquire permit under timeouts: deadline has elapsed".to_string()
);
drop(first_query);
let fourth_query = p.acquire().await;
assert!(fourth_query.is_ok());
}
}
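
The limiter above is nothing more than a Tokio semaphore whose acquisition is bounded by a timeout; when max_concurrent_queries is 0, no limiter is constructed at all. A minimal self-contained sketch of the same pattern, assuming tokio with the sync/time/macros features (names and error strings here are illustrative, not the crate's API):

use std::time::Duration;
use tokio::sync::{Semaphore, SemaphorePermit};
use tokio::time::timeout;

struct QueryLimiter {
    semaphore: Semaphore,
    wait: Duration,
}

impl QueryLimiter {
    fn new(max_concurrency: usize, wait: Duration) -> Self {
        Self {
            semaphore: Semaphore::new(max_concurrency),
            wait,
        }
    }

    // The permit releases its slot when dropped; callers hold it for the
    // lifetime of the query, exactly like the `_permit` bindings above.
    async fn acquire(&self) -> Result<SemaphorePermit<'_>, &'static str> {
        timeout(self.wait, self.semaphore.acquire())
            .await
            .map_err(|_| "deadline has elapsed")? // timed out waiting for a slot
            .map_err(|_| "semaphore closed") // semaphore was closed
    }
}

#[tokio::main]
async fn main() {
    let limiter = QueryLimiter::new(1, Duration::from_millis(10));
    let first = limiter.acquire().await;
    assert!(first.is_ok());
    // The only slot is taken, so this acquisition times out.
    assert!(limiter.acquire().await.is_err());
    drop(first);
    assert!(limiter.acquire().await.is_ok());
}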


@@ -24,14 +24,16 @@ use common_function::scalars::aggregate::AggregateFunctionMetaRef;
use common_query::prelude::ScalarUdf;
use common_query::Output;
use common_runtime::Runtime;
use datafusion_expr::LogicalPlan;
use query::dataframe::DataFrame;
use query::plan::LogicalPlan;
use query::planner::LogicalPlanner;
use query::query_engine::{DescribeResult, QueryEngineState};
use query::{QueryEngine, QueryEngineContext};
use session::context::QueryContextRef;
use store_api::metadata::RegionMetadataRef;
use store_api::region_engine::{RegionEngine, RegionRole, RegionScannerRef, SetReadonlyResponse};
use store_api::region_engine::{
RegionEngine, RegionRole, RegionScannerRef, RegionStatistic, SetReadonlyResponse,
};
use store_api::region_request::{AffectedRows, RegionRequest};
use store_api::storage::{RegionId, ScanRequest};
use table::TableRef;
@@ -103,10 +105,14 @@ pub fn mock_region_server() -> RegionServer {
pub type MockRequestHandler =
Box<dyn Fn(RegionId, RegionRequest) -> Result<AffectedRows, Error> + Send + Sync>;
pub type MockSetReadonlyGracefullyHandler =
Box<dyn Fn(RegionId) -> Result<SetReadonlyResponse, Error> + Send + Sync>;
pub struct MockRegionEngine {
sender: Sender<(RegionId, RegionRequest)>,
pub(crate) handle_request_delay: Option<Duration>,
pub(crate) handle_request_mock_fn: Option<MockRequestHandler>,
pub(crate) handle_set_readonly_gracefully_mock_fn: Option<MockSetReadonlyGracefullyHandler>,
pub(crate) mock_role: Option<Option<RegionRole>>,
engine: String,
}
@@ -120,6 +126,7 @@ impl MockRegionEngine {
handle_request_delay: None,
sender: tx,
handle_request_mock_fn: None,
handle_set_readonly_gracefully_mock_fn: None,
mock_role: None,
engine: engine.to_string(),
}),
@@ -138,6 +145,7 @@ impl MockRegionEngine {
handle_request_delay: None,
sender: tx,
handle_request_mock_fn: Some(mock_fn),
handle_set_readonly_gracefully_mock_fn: None,
mock_role: None,
engine: engine.to_string(),
}),
@@ -157,6 +165,7 @@ impl MockRegionEngine {
handle_request_delay: None,
sender: tx,
handle_request_mock_fn: None,
handle_set_readonly_gracefully_mock_fn: None,
mock_role: None,
engine: engine.to_string(),
};
@@ -203,7 +212,7 @@ impl RegionEngine for MockRegionEngine {
unimplemented!()
}
fn region_disk_usage(&self, _region_id: RegionId) -> Option<i64> {
fn region_statistic(&self, _region_id: RegionId) -> Option<RegionStatistic> {
unimplemented!()
}
@@ -217,9 +226,13 @@ impl RegionEngine for MockRegionEngine {
async fn set_readonly_gracefully(
&self,
_region_id: RegionId,
region_id: RegionId,
) -> Result<SetReadonlyResponse, BoxedError> {
unimplemented!()
if let Some(mock_fn) = &self.handle_set_readonly_gracefully_mock_fn {
return mock_fn(region_id).map_err(BoxedError::new);
};
unreachable!()
}
fn role(&self, _region_id: RegionId) -> Option<RegionRole> {


@@ -48,7 +48,7 @@ pub use list_type::ListType;
pub use null_type::NullType;
pub use primitive_type::{
Float32Type, Float64Type, Int16Type, Int32Type, Int64Type, Int8Type, LogicalPrimitiveType,
NativeType, OrdPrimitive, UInt16Type, UInt32Type, UInt64Type, UInt8Type, WrapperType,
OrdPrimitive, UInt16Type, UInt32Type, UInt64Type, UInt8Type, WrapperType,
};
pub use string_type::StringType;
pub use time_type::{


@@ -18,7 +18,6 @@ use std::fmt;
use arrow::datatypes::{ArrowNativeType, ArrowPrimitiveType, DataType as ArrowDataType};
use common_time::interval::IntervalUnit;
use common_time::{Date, DateTime};
use num::NumCast;
use serde::{Deserialize, Serialize};
use snafu::OptionExt;
@@ -31,27 +30,6 @@ use crate::types::{DateTimeType, DateType};
use crate::value::{Value, ValueRef};
use crate::vectors::{MutableVector, PrimitiveVector, PrimitiveVectorBuilder, Vector};
/// Data types that can be used as arrow's native type.
pub trait NativeType: ArrowNativeType + NumCast {}
macro_rules! impl_native_type {
($Type: ident) => {
impl NativeType for $Type {}
};
}
impl_native_type!(u8);
impl_native_type!(u16);
impl_native_type!(u32);
impl_native_type!(u64);
impl_native_type!(i8);
impl_native_type!(i16);
impl_native_type!(i32);
impl_native_type!(i64);
impl_native_type!(i128);
impl_native_type!(f32);
impl_native_type!(f64);
/// Represents the wrapper type that wraps a native type using the `newtype pattern`,
/// such as [Date](`common_time::Date`) is a wrapper type for the underlying native
/// type `i32`.
@@ -70,7 +48,7 @@ pub trait WrapperType:
/// Logical primitive type that this wrapper type belongs to.
type LogicalType: LogicalPrimitiveType<Wrapper = Self, Native = Self::Native>;
/// The underlying native type.
type Native: NativeType;
type Native: ArrowNativeType;
/// Convert native type into this wrapper type.
fn from_native(value: Self::Native) -> Self;
@@ -84,7 +62,7 @@ pub trait LogicalPrimitiveType: 'static + Sized {
/// Arrow primitive type of this logical type.
type ArrowPrimitive: ArrowPrimitiveType<Native = Self::Native>;
/// Native (physical) type of this logical type.
type Native: NativeType;
type Native: ArrowNativeType;
/// Wrapper type that the vector returns.
type Wrapper: WrapperType<LogicalType = Self, Native = Self::Native>
+ for<'a> Scalar<VectorType = PrimitiveVector<Self>, RefType<'a> = Self::Wrapper>
@@ -107,7 +85,7 @@ pub trait LogicalPrimitiveType: 'static + Sized {
/// A new type for [WrapperType], complementing the `Ord` feature for it. Wrapping non-ordered
/// primitive types like `f32` and `f64` in `OrdPrimitive` lets them be used in places that
/// require `Ord`. For example, in `Median` or `Percentile` UDAFs.
/// require `Ord`. For example, in `Median` UDAFs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct OrdPrimitive<T: WrapperType>(pub T);
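
The doc comment above states why such a wrapper exists: f32 and f64 are only PartialOrd because of NaN. A minimal sketch of the idea, based on f64::total_cmp rather than the crate's actual implementation:

use std::cmp::Ordering;

// Illustrative only: a newtype giving `f64` a total order so it can be
// sorted or used where `Ord` is required, NaN included.
#[derive(Debug, Clone, Copy)]
struct OrdF64(f64);

impl PartialEq for OrdF64 {
    fn eq(&self, other: &Self) -> bool {
        self.0.total_cmp(&other.0) == Ordering::Equal
    }
}

impl Eq for OrdF64 {}

impl PartialOrd for OrdF64 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for OrdF64 {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp implements IEEE 754 totalOrder, a total order over
        // every float value, including NaN.
        self.0.total_cmp(&other.0)
    }
}

fn main() {
    let mut xs = vec![OrdF64(2.5), OrdF64(f64::NAN), OrdF64(-1.0)];
    xs.sort(); // a bare Vec<f64> cannot `sort`, since f64 is not Ord
    println!("{:?}", xs);
}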


@@ -26,7 +26,8 @@ use object_store::ObjectStore;
use snafu::{ensure, OptionExt};
use store_api::metadata::RegionMetadataRef;
use store_api::region_engine::{
RegionEngine, RegionRole, RegionScannerRef, SetReadonlyResponse, SinglePartitionScanner,
RegionEngine, RegionRole, RegionScannerRef, RegionStatistic, SetReadonlyResponse,
SinglePartitionScanner,
};
use store_api::region_request::{
AffectedRows, RegionCloseRequest, RegionCreateRequest, RegionDropRequest, RegionOpenRequest,
@@ -108,7 +109,7 @@ impl RegionEngine for FileRegionEngine {
self.inner.stop().await.map_err(BoxedError::new)
}
fn region_disk_usage(&self, _: RegionId) -> Option<i64> {
fn region_statistic(&self, _: RegionId) -> Option<RegionStatistic> {
None
}


@@ -41,7 +41,6 @@ use datafusion_expr::{
BinaryExpr, Expr, Operator, Projection, ScalarUDFImpl, Signature, TypeSignature, Volatility,
};
use query::parser::QueryLanguageParser;
use query::plan::LogicalPlan;
use query::query_engine::DefaultSerializer;
use query::QueryEngine;
use snafu::ResultExt;
@@ -111,7 +110,6 @@ pub async fn sql_to_flow_plan(
.await
.map_err(BoxedError::new)
.context(ExternalSnafu)?;
let LogicalPlan::DfPlan(plan) = plan;
let opted_plan = apply_df_optimizer(plan).await?;


@@ -163,7 +163,6 @@ mod test {
use itertools::Itertools;
use prost::Message;
use query::parser::QueryLanguageParser;
use query::plan::LogicalPlan;
use query::query_engine::DefaultSerializer;
use query::QueryEngine;
use session::context::QueryContext;
@@ -274,7 +273,6 @@ mod test {
.plan(stmt, QueryContext::arc())
.await
.unwrap();
let LogicalPlan::DfPlan(plan) = plan;
let plan = apply_df_optimizer(plan).await.unwrap();
// encode then decode so to rely on the impl of conversion from logical plan to substrait plan
@@ -297,7 +295,6 @@ mod test {
.plan(stmt, QueryContext::arc())
.await
.unwrap();
let LogicalPlan::DfPlan(plan) = plan;
let plan = apply_df_optimizer(plan).await;
assert!(plan.is_err());


@@ -37,6 +37,7 @@ common-runtime.workspace = true
common-telemetry.workspace = true
common-time.workspace = true
common-version.workspace = true
datafusion-expr.workspace = true
datanode.workspace = true
humantime-serde.workspace = true
lazy_static.workspace = true


@@ -40,6 +40,7 @@ use common_procedure::options::ProcedureConfig;
use common_procedure::ProcedureManagerRef;
use common_query::Output;
use common_telemetry::{debug, error, tracing};
use datafusion_expr::LogicalPlan;
use log_store::raft_engine::RaftEngineBackend;
use operator::delete::DeleterRef;
use operator::insert::InserterRef;
@@ -48,7 +49,6 @@ use pipeline::pipeline_operator::PipelineOperator;
use prometheus::HistogramTimer;
use query::metrics::OnDone;
use query::parser::{PromQuery, QueryLanguageParser, QueryStatement};
use query::plan::LogicalPlan;
use query::query_engine::options::{validate_catalog_and_schema, QueryOptions};
use query::query_engine::DescribeResult;
use query::QueryEngineRef;
@@ -359,7 +359,6 @@ impl SqlQueryHandler for Instance {
.schema_exists(catalog, schema, None)
.await
.context(error::CatalogSnafu)
.map(|b| b && !self.catalog_manager.is_reserved_schema_name(schema))
}
}


@@ -20,7 +20,10 @@ use auth::{PermissionChecker, PermissionCheckerRef, PermissionReq};
use client::Output;
use common_error::ext::BoxedError;
use pipeline::{GreptimeTransformer, Pipeline, PipelineInfo, PipelineVersion};
use servers::error::{AuthSnafu, ExecuteGrpcRequestSnafu, PipelineSnafu, Result as ServerResult};
use servers::error::{
AuthSnafu, Error as ServerError, ExecuteGrpcRequestSnafu, PipelineSnafu, Result as ServerResult,
};
use servers::interceptor::{LogIngestInterceptor, LogIngestInterceptorRef};
use servers::query_handler::LogHandler;
use session::context::QueryContextRef;
use snafu::ResultExt;
@@ -40,6 +43,12 @@ impl LogHandler for Instance {
.check_permission(ctx.current_user(), PermissionReq::LogWrite)
.context(AuthSnafu)?;
let log = self
.plugins
.get::<LogIngestInterceptorRef<ServerError>>()
.as_ref()
.pre_ingest(log, ctx.clone())?;
self.handle_log_inserts(log, ctx).await
}
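
The plugin lookup above follows the optional-interceptor convention: if no LogIngestInterceptorRef was registered, pre_ingest degrades to a pass-through. A generic sketch of that convention (trait shape and names hypothetical, not the servers crate's API):

// Hypothetical shape of an optional pre-ingest hook.
trait Interceptor {
    fn pre_ingest(&self, payload: String) -> Result<String, String>;
}

fn ingest(hook: Option<&dyn Interceptor>, payload: String) -> Result<String, String> {
    let payload = match hook {
        Some(h) => h.pre_ingest(payload)?, // a plugin may rewrite or reject
        None => payload,                   // no plugin registered: pass through
    };
    Ok(payload) // the actual insert path would run here
}

struct Reject;
impl Interceptor for Reject {
    fn pre_ingest(&self, _payload: String) -> Result<String, String> {
        Err("rejected".to_string())
    }
}

fn main() {
    assert!(ingest(None, "log line".into()).is_ok());
    assert!(ingest(Some(&Reject), "log line".into()).is_err());
}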


@@ -18,11 +18,13 @@ use std::sync::Arc;
use auth::UserProviderRef;
use common_base::Plugins;
use common_config::{Configurable, Mode};
use servers::error::Error as ServerError;
use servers::grpc::builder::GrpcServerBuilder;
use servers::grpc::greptime_handler::GreptimeRequestHandler;
use servers::grpc::{GrpcOptions, GrpcServer, GrpcServerConfig};
use servers::http::event::LogValidatorRef;
use servers::http::{HttpServer, HttpServerBuilder};
use servers::interceptor::LogIngestInterceptorRef;
use servers::metrics_handler::MetricsHandler;
use servers::mysql::server::{MysqlServer, MysqlSpawnConfig, MysqlSpawnRef};
use servers::postgres::PostgresServer;
@@ -81,8 +83,10 @@ where
Some(self.instance.clone()),
);
builder = builder
.with_log_ingest_handler(self.instance.clone(), self.plugins.get::<LogValidatorRef>());
let validator = self.plugins.get::<LogValidatorRef>();
let ingest_interceptor = self.plugins.get::<LogIngestInterceptorRef<ServerError>>();
builder =
builder.with_log_ingest_handler(self.instance.clone(), validator, ingest_interceptor);
if let Some(user_provider) = self.plugins.get::<UserProviderRef>() {
builder = builder.with_user_provider(user_provider);


@@ -264,9 +264,8 @@ impl KvBackend for RaftEngineBackend {
let mut response = BatchGetResponse {
kvs: Vec::with_capacity(req.keys.len()),
};
let engine = self.engine.read().unwrap();
for key in req.keys {
let Some(value) = engine.get(SYSTEM_NAMESPACE, &key) else {
let Some(value) = self.engine.read().unwrap().get(SYSTEM_NAMESPACE, &key) else {
continue;
};
response.kvs.push(KeyValue { key, value });
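
The batch_get change above is a small but real win: the RwLock read guard is taken once before the loop instead of being re-acquired for every key. The same pattern, on a hypothetical store:

use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical store, only to show the locking pattern.
struct Store {
    map: RwLock<HashMap<String, Vec<u8>>>,
}

impl Store {
    fn batch_get(&self, keys: &[String]) -> Vec<Vec<u8>> {
        // Take the read guard once for the whole batch; re-locking per key
        // pays the lock overhead N times and lets writers interleave.
        let map = self.map.read().unwrap();
        keys.iter().filter_map(|k| map.get(k).cloned()).collect()
    }
}

fn main() {
    let store = Store { map: RwLock::new(HashMap::new()) };
    store.map.write().unwrap().insert("a".into(), vec![1u8]);
    assert_eq!(store.batch_get(&["a".into(), "b".into()]), vec![vec![1u8]]);
}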


@@ -22,13 +22,14 @@ mod cluster;
mod store;
mod util;
use api::v1::meta::Role;
use api::v1::meta::{ProcedureDetailResponse, Role};
use cluster::Client as ClusterClient;
use common_error::ext::BoxedError;
use common_grpc::channel_manager::{ChannelConfig, ChannelManager};
use common_meta::cluster::{
ClusterInfo, MetasrvStatus, NodeInfo, NodeInfoKey, NodeStatus, Role as ClusterRole,
};
use common_meta::datanode::{DatanodeStatKey, DatanodeStatValue, RegionStat};
use common_meta::ddl::{ExecutorContext, ProcedureExecutor};
use common_meta::error::{self as meta_error, Result as MetaResult};
use common_meta::rpc::ddl::{SubmitDdlTaskRequest, SubmitDdlTaskResponse};
@@ -259,6 +260,16 @@ impl ProcedureExecutor for MetaClient {
.map_err(BoxedError::new)
.context(meta_error::ExternalSnafu)
}
async fn list_procedures(&self, _ctx: &ExecutorContext) -> MetaResult<ProcedureDetailResponse> {
self.procedure_client()
.map_err(BoxedError::new)
.context(meta_error::ExternalSnafu)?
.list_procedures()
.await
.map_err(BoxedError::new)
.context(meta_error::ExternalSnafu)
}
}
#[async_trait::async_trait]
@@ -317,6 +328,28 @@ impl ClusterInfo for MetaClient {
Ok(nodes)
}
async fn list_region_stats(&self) -> Result<Vec<RegionStat>> {
let cluster_client = self.cluster_client()?;
let range_prefix = DatanodeStatKey::key_prefix_with_cluster_id(self.id.0);
let req = RangeRequest::new().with_prefix(range_prefix);
let mut datanode_stats = cluster_client
.range(req)
.await?
.kvs
.into_iter()
.map(|kv| DatanodeStatValue::try_from(kv.value).context(ConvertMetaRequestSnafu))
.collect::<Result<Vec<_>>>()?;
let region_stats = datanode_stats
.iter_mut()
.flat_map(|datanode_stat| {
let last = datanode_stat.stats.pop();
last.map(|stat| stat.region_stats).unwrap_or_default()
})
.collect::<Vec<_>>();
Ok(region_stats)
}
}
impl MetaClient {
@@ -473,7 +506,7 @@ impl MetaClient {
request.region_id,
request.from_peer,
request.to_peer,
request.replay_timeout,
request.timeout,
)
.await
}
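
A note on list_region_stats above: each DatanodeStatValue carries a short history of heartbeat stats, and stats.pop() takes only the last (presumably freshest) sample per datanode before flattening out its region stats. The reduction, with types pared down:

// Pared-down stand-ins for DatanodeStatValue and RegionStat.
#[derive(Debug)]
struct RegionStat {
    region_id: u64,
}

struct DatanodeStat {
    // Each entry holds the region stats reported by one heartbeat,
    // ordered oldest to newest.
    stats: Vec<Vec<RegionStat>>,
}

fn latest_region_stats(mut datanodes: Vec<DatanodeStat>) -> Vec<RegionStat> {
    datanodes
        .iter_mut()
        .flat_map(|d| d.stats.pop().unwrap_or_default())
        .collect()
}

fn main() {
    let dn = DatanodeStat {
        stats: vec![
            vec![RegionStat { region_id: 1 }],
            vec![RegionStat { region_id: 1 }, RegionStat { region_id: 2 }],
        ],
    };
    // Only the newest heartbeat's two region stats survive.
    assert_eq!(latest_region_stats(vec![dn]).len(), 2);
}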


@@ -18,8 +18,9 @@ use std::time::Duration;
use api::v1::meta::procedure_service_client::ProcedureServiceClient;
use api::v1::meta::{
DdlTaskRequest, DdlTaskResponse, MigrateRegionRequest, MigrateRegionResponse, ProcedureId,
ProcedureStateResponse, QueryProcedureRequest, ResponseHeader, Role,
DdlTaskRequest, DdlTaskResponse, MigrateRegionRequest, MigrateRegionResponse,
ProcedureDetailRequest, ProcedureDetailResponse, ProcedureId, ProcedureStateResponse,
QueryProcedureRequest, ResponseHeader, Role,
};
use common_grpc::channel_manager::ChannelManager;
use common_telemetry::tracing_context::TracingContext;
@@ -76,19 +77,24 @@ impl Client {
/// - `region_id`: the migrated region id
/// - `from_peer`: the source datanode id
/// - `to_peer`: the target datanode id
/// - `replay_timeout`: replay WAL timeout after migration.
/// - `timeout`: timeout for downgrading region and upgrading region operations
pub async fn migrate_region(
&self,
region_id: u64,
from_peer: u64,
to_peer: u64,
replay_timeout: Duration,
timeout: Duration,
) -> Result<MigrateRegionResponse> {
let inner = self.inner.read().await;
inner
.migrate_region(region_id, from_peer, to_peer, replay_timeout)
.migrate_region(region_id, from_peer, to_peer, timeout)
.await
}
pub async fn list_procedures(&self) -> Result<ProcedureDetailResponse> {
let inner = self.inner.read().await;
inner.list_procedures().await
}
}
#[derive(Debug)]
@@ -210,13 +216,13 @@ impl Inner {
region_id: u64,
from_peer: u64,
to_peer: u64,
replay_timeout: Duration,
timeout: Duration,
) -> Result<MigrateRegionResponse> {
let mut req = MigrateRegionRequest {
region_id,
from_peer,
to_peer,
replay_timeout_secs: replay_timeout.as_secs() as u32,
timeout_secs: timeout.as_secs() as u32,
..Default::default()
};
@@ -279,4 +285,23 @@ impl Inner {
)
.await
}
async fn list_procedures(&self) -> Result<ProcedureDetailResponse> {
let mut req = ProcedureDetailRequest::default();
req.set_header(
self.id,
self.role,
TracingContext::from_current_span().to_w3c(),
);
self.with_retry(
"list procedure",
move |mut client| {
let req = req.clone();
async move { client.details(req).await.map(|res| res.into_inner()) }
},
|resp: &ProcedureDetailResponse| &resp.header,
)
.await
}
}
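
One detail worth keeping in mind about the renamed parameter above: the wire field carries whole seconds (timeout.as_secs() as u32), so sub-second durations silently truncate to zero. A two-assertion demonstration:

use std::time::Duration;

fn main() {
    // as_secs() truncates toward zero: a 500 ms timeout becomes 0 seconds.
    assert_eq!(Duration::from_millis(500).as_secs() as u32, 0);
    assert_eq!(Duration::from_secs(60).as_secs() as u32, 60);
}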


@@ -23,6 +23,7 @@ use api::v1::meta::{
RangeRequest as PbRangeRequest, RangeResponse as PbRangeResponse, ResponseHeader,
};
use common_grpc::channel_manager::ChannelManager;
use common_meta::datanode::{DatanodeStatKey, DatanodeStatValue};
use common_meta::kv_backend::{KvBackend, ResettableKvBackendRef, TxnService};
use common_meta::rpc::store::{
BatchDeleteRequest, BatchDeleteResponse, BatchGetRequest, BatchGetResponse, BatchPutRequest,
@@ -37,7 +38,6 @@ use snafu::{ensure, OptionExt, ResultExt};
use crate::error;
use crate::error::{match_for_io_error, Result};
use crate::key::{DatanodeStatKey, DatanodeStatValue};
use crate::metasrv::ElectionRef;
pub type MetaPeerClientRef = Arc<MetaPeerClient>;
@@ -217,7 +217,11 @@ impl MetaPeerClient {
pub async fn get_node_cnt(&self) -> Result<i32> {
let kvs = self.get_dn_key_value(true).await?;
kvs.into_iter()
.map(|kv| kv.key.try_into())
.map(|kv| {
kv.key
.try_into()
.context(error::InvalidDatanodeStatFormatSnafu {})
})
.collect::<Result<HashSet<DatanodeStatKey>>>()
.map(|hash_set| hash_set.len() as i32)
}
@@ -319,7 +323,14 @@ impl MetaPeerClient {
fn to_stat_kv_map(kvs: Vec<KeyValue>) -> Result<HashMap<DatanodeStatKey, DatanodeStatValue>> {
let mut map = HashMap::with_capacity(kvs.len());
for kv in kvs {
let _ = map.insert(kv.key.try_into()?, kv.value.try_into()?);
let _ = map.insert(
kv.key
.try_into()
.context(error::InvalidDatanodeStatFormatSnafu {})?,
kv.value
.try_into()
.context(error::InvalidDatanodeStatFormatSnafu {})?,
);
}
Ok(map)
}
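
The pattern used throughout these hunks — `.try_into().context(error::InvalidDatanodeStatFormatSnafu {})` — is snafu's context selector: it wraps the source error in a higher-level variant while recording where it happened. A minimal, self-contained sketch assuming the snafu crate, with a hypothetical error variant:

use snafu::{ResultExt, Snafu};

#[derive(Debug, Snafu)]
enum Error {
    // Snafu generates an `InvalidDatanodeStatFormatSnafu` context selector
    // for this variant; `context(...)` converts the source error into it.
    #[snafu(display("Invalid datanode stat format"))]
    InvalidDatanodeStatFormat { source: std::num::ParseIntError },
}

fn parse_stat(raw: &str) -> Result<u64, Error> {
    raw.parse::<u64>().context(InvalidDatanodeStatFormatSnafu)
}

fn main() {
    assert!(parse_stat("42").is_ok());
    assert!(parse_stat("not-a-number").is_err());
}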
@@ -356,12 +367,11 @@ fn need_retry(error: &error::Error) -> bool {
#[cfg(test)]
mod tests {
use api::v1::meta::{Error, ErrorCode, ResponseHeader};
use common_meta::datanode::{DatanodeStatKey, DatanodeStatValue, Stat};
use common_meta::rpc::KeyValue;
use super::{check_resp_header, to_stat_kv_map, Context};
use crate::error;
use crate::handler::node_stat::Stat;
use crate::key::{DatanodeStatKey, DatanodeStatValue};
#[test]
fn test_to_stat_kv_map() {


@@ -32,6 +32,13 @@ use crate::pubsub::Message;
#[snafu(visibility(pub))]
#[stack_trace_debug]
pub enum Error {
#[snafu(display("Exceeded deadline, operation: {}", operation))]
ExceededDeadline {
#[snafu(implicit)]
location: Location,
operation: String,
},
#[snafu(display("The target peer is unavailable temporally: {}", peer_id))]
PeerUnavailable {
#[snafu(implicit)]
@@ -278,22 +285,6 @@ pub enum Error {
location: Location,
},
#[snafu(display("Failed to parse stat key from utf8"))]
StatKeyFromUtf8 {
#[snafu(source)]
error: std::string::FromUtf8Error,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to parse stat value from utf8"))]
StatValueFromUtf8 {
#[snafu(source)]
error: std::string::FromUtf8Error,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Failed to parse invalid region key from utf8"))]
InvalidRegionKeyFromUtf8 {
#[snafu(source)]
@@ -712,6 +703,13 @@ pub enum Error {
source: common_meta::error::Error,
},
#[snafu(display("Invalid datanode stat format"))]
InvalidDatanodeStatFormat {
#[snafu(implicit)]
location: Location,
source: common_meta::error::Error,
},
#[snafu(display("Failed to serialize options to TOML"))]
TomlFormat {
#[snafu(implicit)]
@@ -783,7 +781,8 @@ impl ErrorExt for Error {
| Error::Join { .. }
| Error::WeightArray { .. }
| Error::NotSetWeightArray { .. }
| Error::PeerUnavailable { .. } => StatusCode::Internal,
| Error::PeerUnavailable { .. }
| Error::ExceededDeadline { .. } => StatusCode::Internal,
Error::Unsupported { .. } => StatusCode::Unsupported,
@@ -807,8 +806,6 @@ impl ErrorExt for Error {
| Error::TomlFormat { .. } => StatusCode::InvalidArguments,
Error::LeaseKeyFromUtf8 { .. }
| Error::LeaseValueFromUtf8 { .. }
| Error::StatKeyFromUtf8 { .. }
| Error::StatValueFromUtf8 { .. }
| Error::InvalidRegionKeyFromUtf8 { .. }
| Error::TableRouteNotFound { .. }
| Error::TableInfoNotFound { .. }
@@ -822,7 +819,8 @@ impl ErrorExt for Error {
| Error::MigrationRunning { .. } => StatusCode::Unexpected,
Error::TableNotFound { .. } => StatusCode::TableNotFound,
Error::SaveClusterInfo { source, .. }
| Error::InvalidClusterInfoFormat { source, .. } => source.status_code(),
| Error::InvalidClusterInfoFormat { source, .. }
| Error::InvalidDatanodeStatFormat { source, .. } => source.status_code(),
Error::InvalidateTableCache { source, .. } => source.status_code(),
Error::SubmitProcedure { source, .. }
| Error::WaitProcedure { source, .. }


@@ -22,6 +22,7 @@ use api::v1::meta::{
HeartbeatRequest, HeartbeatResponse, MailboxMessage, RegionLease, RequestHeader,
ResponseHeader, Role, PROTOCOL_VERSION,
};
use common_meta::datanode::Stat;
use common_meta::instruction::{Instruction, InstructionReply};
use common_meta::sequence::Sequence;
use common_telemetry::{debug, info, warn};
@@ -32,7 +33,6 @@ use store_api::storage::RegionId;
use tokio::sync::mpsc::Sender;
use tokio::sync::{oneshot, Notify, RwLock};
use self::node_stat::Stat;
use crate::error::{self, DeserializeFromJsonSnafu, Result, UnexpectedInstructionReplySnafu};
use crate::metasrv::Context;
use crate::metrics::{METRIC_META_HANDLER_EXECUTE, METRIC_META_HEARTBEAT_CONNECTION_NUM};
@@ -48,7 +48,6 @@ pub mod failure_handler;
pub mod filter_inactive_region_stats;
pub mod keep_lease_handler;
pub mod mailbox_handler;
pub mod node_stat;
pub mod on_leader_start_handler;
pub mod publish_heartbeat_handler;
pub mod region_lease_handler;


@@ -15,6 +15,7 @@
use std::cmp::Ordering;
use api::v1::meta::{HeartbeatRequest, Role};
use common_meta::datanode::{DatanodeStatKey, DatanodeStatValue, Stat};
use common_meta::instruction::CacheIdent;
use common_meta::key::node_address::{NodeAddressKey, NodeAddressValue};
use common_meta::key::{MetadataKey, MetadataValue};
@@ -25,9 +26,7 @@ use dashmap::DashMap;
use snafu::ResultExt;
use crate::error::{self, Result};
use crate::handler::node_stat::Stat;
use crate::handler::{HandleControl, HeartbeatAccumulator, HeartbeatHandler};
use crate::key::{DatanodeStatKey, DatanodeStatValue};
use crate::metasrv::Context;
const MAX_CACHED_STATS_PER_KEY: usize = 10;
@@ -138,7 +137,8 @@ impl HeartbeatHandler for CollectStatsHandler {
let value: Vec<u8> = DatanodeStatValue {
stats: epoch_stats.drain_all(),
}
.try_into()?;
.try_into()
.context(error::InvalidDatanodeStatFormatSnafu {})?;
let put = PutRequest {
key,
value,
@@ -198,6 +198,7 @@ mod tests {
use std::sync::Arc;
use common_meta::cache_invalidator::DummyCacheInvalidator;
use common_meta::datanode::DatanodeStatKey;
use common_meta::key::TableMetadataManager;
use common_meta::kv_backend::memory::MemoryKvBackend;
use common_meta::sequence::SequenceBuilder;
@@ -205,7 +206,6 @@ mod tests {
use super::*;
use crate::cluster::MetaPeerClientBuilder;
use crate::handler::{HeartbeatMailbox, Pushers};
use crate::key::DatanodeStatKey;
use crate::service::store::cached_kv::LeaderCachedKvBackend;
#[tokio::test]


@@ -13,9 +13,9 @@
// limitations under the License.
use api::v1::meta::{HeartbeatRequest, Role};
use common_meta::datanode::Stat;
use common_telemetry::{info, warn};
use super::node_stat::Stat;
use crate::error::Result;
use crate::handler::{HandleControl, HeartbeatAccumulator, HeartbeatHandler};
use crate::metasrv::Context;


@@ -64,12 +64,12 @@ impl HeartbeatHandler for RegionFailureHandler {
mod tests {
use api::v1::meta::HeartbeatRequest;
use common_catalog::consts::default_engine;
use common_meta::datanode::{RegionStat, Stat};
use store_api::region_engine::RegionRole;
use store_api::storage::RegionId;
use tokio::sync::oneshot;
use crate::handler::failure_handler::RegionFailureHandler;
use crate::handler::node_stat::{RegionStat, Stat};
use crate::handler::{HeartbeatAccumulator, HeartbeatHandler};
use crate::metasrv::builder::MetasrvBuilder;
use crate::region::supervisor::tests::new_test_supervisor;
@@ -93,7 +93,9 @@ mod tests {
approximate_bytes: 0,
engine: default_engine().to_string(),
role: RegionRole::Follower,
extensions: Default::default(),
memtable_size: 0,
manifest_size: 0,
sst_size: 0,
}
}
acc.stat = Some(Stat {


@@ -1,170 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::{HashMap, HashSet};
use api::v1::meta::{HeartbeatRequest, RequestHeader};
use common_meta::ClusterId;
use common_time::util as time_util;
use serde::{Deserialize, Serialize};
use store_api::region_engine::RegionRole;
use store_api::storage::RegionId;
use table::metadata::TableId;
use crate::key::DatanodeStatKey;
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct Stat {
pub timestamp_millis: i64,
pub cluster_id: ClusterId,
// The datanode Id.
pub id: u64,
// The datanode address.
pub addr: String,
/// The read capacity units during this period
pub rcus: i64,
/// The write capacity units during this period
pub wcus: i64,
/// How many regions on this node
pub region_num: u64,
pub region_stats: Vec<RegionStat>,
// The node epoch is used to check whether the node has restarted or been redeployed.
pub node_epoch: u64,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RegionStat {
/// The region_id.
pub id: RegionId,
/// The read capacity units during this period
pub rcus: i64,
/// The write capacity units during this period
pub wcus: i64,
/// Approximate bytes of this region
pub approximate_bytes: i64,
/// The engine name.
pub engine: String,
/// The region role.
pub role: RegionRole,
/// The extension info of this region
pub extensions: HashMap<String, Vec<u8>>,
}
impl Stat {
#[inline]
pub fn is_empty(&self) -> bool {
self.region_stats.is_empty()
}
pub fn stat_key(&self) -> DatanodeStatKey {
DatanodeStatKey {
cluster_id: self.cluster_id,
node_id: self.id,
}
}
/// Returns a tuple array containing [RegionId] and [RegionRole].
pub fn regions(&self) -> Vec<(RegionId, RegionRole)> {
self.region_stats.iter().map(|s| (s.id, s.role)).collect()
}
/// Returns all table ids in the region stats.
pub fn table_ids(&self) -> HashSet<TableId> {
self.region_stats.iter().map(|s| s.id.table_id()).collect()
}
pub fn retain_active_region_stats(&mut self, inactive_region_ids: &HashSet<RegionId>) {
if inactive_region_ids.is_empty() {
return;
}
self.region_stats
.retain(|r| !inactive_region_ids.contains(&r.id));
self.rcus = self.region_stats.iter().map(|s| s.rcus).sum();
self.wcus = self.region_stats.iter().map(|s| s.wcus).sum();
self.region_num = self.region_stats.len() as u64;
}
}
impl TryFrom<&HeartbeatRequest> for Stat {
type Error = Option<RequestHeader>;
fn try_from(value: &HeartbeatRequest) -> Result<Self, Self::Error> {
let HeartbeatRequest {
header,
peer,
region_stats,
node_epoch,
..
} = value;
match (header, peer) {
(Some(header), Some(peer)) => {
let region_stats = region_stats
.iter()
.map(RegionStat::from)
.collect::<Vec<_>>();
Ok(Self {
timestamp_millis: time_util::current_time_millis(),
cluster_id: header.cluster_id,
// datanode id
id: peer.id,
// datanode address
addr: peer.addr.clone(),
rcus: region_stats.iter().map(|s| s.rcus).sum(),
wcus: region_stats.iter().map(|s| s.wcus).sum(),
region_num: region_stats.len() as u64,
region_stats,
node_epoch: *node_epoch,
})
}
(header, _) => Err(header.clone()),
}
}
}
impl From<&api::v1::meta::RegionStat> for RegionStat {
fn from(value: &api::v1::meta::RegionStat) -> Self {
Self {
id: RegionId::from_u64(value.region_id),
rcus: value.rcus,
wcus: value.wcus,
approximate_bytes: value.approximate_bytes,
engine: value.engine.to_string(),
role: RegionRole::from(value.role()),
extensions: value.extensions.clone(),
}
}
}
#[cfg(test)]
mod tests {
use crate::handler::node_stat::Stat;
#[test]
fn test_stat_key() {
let stat = Stat {
cluster_id: 3,
id: 101,
region_num: 10,
..Default::default()
};
let stat_key = stat.stat_key();
assert_eq!(3, stat_key.cluster_id);
assert_eq!(101, stat_key.node_id);
}
}

Some files were not shown because too many files have changed in this diff.