Compare commits

...

61 Commits

Author SHA1 Message Date
Ning Sun
a362c4fb22 chore: use explicit geo_json version 2022-11-01 16:12:01 +08:00
Ning Sun
20fcc7819a feat: implement postgres read/write for geo data 2022-11-01 16:03:33 +08:00
Ning Sun
8119bbd7b4 feat: update sqlparser api 2022-10-20 10:57:11 +08:00
Ning Sun
48693cb12d fix: rename wkt conversion function 2022-10-19 14:52:48 +08:00
Ning Sun
8d938c3ac8 feat: use our forked sqlparser to parse Geometry(POINT) syntax 2022-10-19 14:13:26 +08:00
liangxingjian
3235436f60 fix: format 2022-10-19 10:31:02 +08:00
liangxingjian
3bf2c9840d feat: add impl of arrow array access 2022-10-19 10:25:58 +08:00
liangxingjian
35afa9dc74 fix: fix some error 2022-10-18 15:09:45 +08:00
liangxingjian
f4d8c2cef6 feat: implement simple sql demo 2022-10-18 11:54:38 +08:00
Ning Sun
788001b4bc fix: resolve lint warnings 2022-10-18 11:32:42 +08:00
Ning Sun
92ab3002c9 fix: resolve check warnings 2022-10-18 11:27:59 +08:00
Ning Sun
36ce08cb03 refactor: set inner subtype to ConcreteDataType::Geometry 2022-10-18 11:16:14 +08:00
liangxingjian
2f159dbe22 feat:add impl of geo-vec,add some unit test 2022-10-14 18:04:12 +08:00
liangxingjian
3d7d029cb5 feat:add some impl and test with a little refactor 2022-10-13 18:12:19 +08:00
liangxingjian
7aed777bc4 feat:add iter and ref of geo types 2022-10-12 17:06:42 +08:00
liangxingjian
ebcd18d3c4 feat:add some impl of geo type 2022-10-11 19:07:15 +08:00
liangxingjian
d44887bada feat:add some geo vec impl 2022-10-10 17:52:17 +08:00
liangxingjian
c0893ac19b fix:fix some error 2022-10-10 15:23:38 +08:00
liangxingjian
8a91e26020 feat:init to add new geo types 2022-10-09 17:46:06 +08:00
Ning Sun
178f8b64b5 fix: Update pgwire and fix buffer overflow issue (#293) 2022-09-29 17:58:03 +08:00
fys
fe8327fc78 feat: support write data via influxdb line protocol in frontend (#280)
* feat: support influxdb line protocol write
2022-09-29 17:08:08 +08:00
evenyag
ed89cc3e21 feat: Change signature of the Region::alter method (#287)
* feat: Change signature of the Region::alter method

* refactor: Add builders for ColumnsMetadata and ColumnFamiliesMetadata

* feat: Support altering the region metadata

Altering the region metadata is done in a copy-write fashion:
1. Convert the `RegionMetadata` into `RegionDescriptor` which is more
   convenient to mutate
2. Apply the `AlterOperation` to the `RegionDescriptor`. This would
   mutate the descriptor in-place
3. Create a `RegionMetadataBuilder` from the descriptor, bump the
   version and then build the new metadata

* feat: Implement altering table using the new Region::alter api

* refactor: Replaced wal name by region id

Region id is cheaper to clone than name

* chore: Remove pub(crate) of build_xxxx in engine mod

* style: fix clippy

* test: Add tests for AlterOperation and RegionMetadata::alter

* chore: ColumnsMetadataBuilder methods return &mut Self
2022-09-28 13:56:25 +08:00
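A minimal sketch of the three-step copy-and-rebuild alter flow described in the commit above. The types below are simplified, hypothetical stand-ins; the real RegionMetadata/RegionDescriptor carry far more state.

```rust
// Hypothetical, simplified types illustrating the copy-and-rebuild alter flow.
#[derive(Clone, Debug)]
struct RegionMetadata {
    version: u64,
    columns: Vec<String>,
}

#[derive(Debug)]
struct RegionDescriptor {
    columns: Vec<String>,
}

enum AlterOperation {
    AddColumn(String),
    DropColumn(String),
}

impl RegionMetadata {
    // Step 1: convert the metadata into a descriptor that is easy to mutate.
    fn to_descriptor(&self) -> RegionDescriptor {
        RegionDescriptor {
            columns: self.columns.clone(),
        }
    }

    fn alter(&self, op: &AlterOperation) -> RegionMetadata {
        let mut desc = self.to_descriptor();
        // Step 2: apply the operation to the descriptor in place.
        match op {
            AlterOperation::AddColumn(name) => desc.columns.push(name.clone()),
            AlterOperation::DropColumn(name) => desc.columns.retain(|c| c != name),
        }
        // Step 3: build new metadata from the descriptor and bump the version.
        RegionMetadata {
            version: self.version + 1,
            columns: desc.columns,
        }
    }
}
```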
Lei, Huang
25078e821b feat: type rewrite optimizer (#272)
* feat: add type conversion optimizer

* feat: add expr rewrite logical plan optimizer

* chore: add some doc

* fix: unit test

* fix: time zone issue in unit tests

* chore: add more tests

* fix: some CR comments

* chore: rebase develop

* chore: fix unit tests

* fix: unit test use timestamp with time zone

* chore: add more tests
2022-09-28 13:56:13 +08:00
LFC
ca732d45f9 feat: opentsdb support (#274)
* feat: opentsdb support

* fix: tests

* fix: resolve CR comments

* fix: resolve CR comments

* fix: resolve CR comments

* fix: resolve CR comments

* refactor: remove feature flags for opentsdb and pg

* fix: resolve CR comments

* fix: resolve CR comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-26 15:47:43 +08:00
dennis zhuang
0fa68ab7a5 feat: show databases and show tables (#276)
* feat: ensure time index column can't be included in primary key

* feat: sql parser supports show tables statement

* feat: impl show databases and show tables, #183

* feat: impl like expression for show databases/tables and add tests

* fix: typo

* fix: address CR problems
2022-09-26 14:05:49 +08:00
dennis zhuang
5f322ba16e feat: impl default constraint for column (#273)
* feat: impl default value for column in schema

* test: add test for column's default value

* refactor: rename ColumnDefaultValue to ColumnDefaultConstraint

* fix: timestamp column may be a constant vector

* fix: test_shutdown_pg_server

* fix: typo

Co-authored-by: LFC <bayinamine@gmail.com>

* fix: typo

Co-authored-by: LFC <bayinamine@gmail.com>

* fix: typo

Co-authored-by: LFC <bayinamine@gmail.com>

* chore: use table_info directly

Co-authored-by: LFC <bayinamine@gmail.com>

* refactor: by CR comments

Co-authored-by: LFC <bayinamine@gmail.com>
2022-09-22 10:43:21 +08:00
evenyag
a954ba862a feat: Implement dedup reader (#270)
* feat: Handle empty NullVector in replicate_null

* chore: Rename ChunkReaderImpl::sst_reader to batch_reader

* feat: dedup reader wip

* feat: Add BatchOp

Add BatchOp to support dedup/filter Batch and implement BatchOp for
ProjectedSchema.

Moves compare_row_of_batch to BatchOp::compare_row.

* feat: Allow Batch has empty columns

* feat: Implement DedupReader

Also add From<MutableBitmap> for BooleanVector

* test: Test dedup reader

Fix issue that compare_row compare by full key not row key

* chore: Add comments to BatchOp

* feat: Dedup results from merge reader

* test: Test merge read after flush

* test: Test merge read after flush and reopen

* test: Test replicate empty NullVector

* test: Add tests for `ProjectedSchema::dedup/filter`

* feat: Filter empty batch in DedupReader

Also fix clippy warnings and refactor some codes
2022-09-21 17:49:53 +08:00
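The DedupReader in the commit above drops duplicate rows from an already key-sorted stream by comparing row keys. A hypothetical, simplified sketch of that pass; real batches are columnar, while plain (key, value) tuples are used here.

```rust
// Keep only the first occurrence of each row key in a key-sorted stream.
fn dedup_sorted<K: PartialEq + Clone, V>(rows: Vec<(K, V)>) -> Vec<(K, V)> {
    let mut out: Vec<(K, V)> = Vec::with_capacity(rows.len());
    let mut last_key: Option<K> = None;
    for (key, value) in rows {
        if last_key.as_ref() != Some(&key) {
            last_key = Some(key.clone());
            out.push((key, value));
        }
        // Later rows with the same key are dropped; in the real reader the
        // retained row is decided by the merge order (newest first).
    }
    out
}

fn main() {
    let rows = vec![(1, "a"), (1, "a-old"), (2, "b"), (3, "c"), (3, "c-old")];
    assert_eq!(dedup_sorted(rows).len(), 3);
}
```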
evenyag
9489862417 fix: Fix sequence decrease after flush then reopen (#271)
The log store uses the start sequence instead of the file start id to filter
the log stream. Add more tests about flush, including flushing an empty memtable
and reopening after flush
2022-09-21 14:23:59 +08:00
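A tiny, hypothetical sketch of the idea above: replay filters WAL entries by their sequence rather than by the log file's start id, so the visible sequence cannot go backwards after a flush and reopen. The type and function names are illustrative only.

```rust
// Filter replayed entries by sequence, not by log file id.
struct LogEntry {
    sequence: u64,
    payload: Vec<u8>,
}

fn replay_after<'a>(
    entries: &'a [LogEntry],
    start_sequence: u64,
) -> impl Iterator<Item = &'a LogEntry> + 'a {
    entries.iter().filter(move |e| e.sequence >= start_sequence)
}
```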
Lei, Huang
35ba0868b5 feat: impl filter push down to parquet reader (#262)
* wip add predicate definition

* fix value move

* implement predicate and prune

* impl filter push down in chunk reader

* add more expr tests

* chore: rebase develop

* fix: unit test

* fix: field name/index lookup when building pruning stats

* chore: add some meaningless test

* fix: remove unnecessary extern crate

* fix: use datatypes::schema::SchemaRef
2022-09-21 11:47:55 +08:00
Ning Sun
8a400669aa feat: postgre wire protocol for frontend (#269) 2022-09-19 15:39:53 +08:00
evenyag
e697ba975b feat: Implement dedup and filter for vectors (#245)
* feat: Dedup vector

* refactor: Re-export Date/DateTime/Timestamp

* refactor: Named field for ListValueRef::Ref

Use field val instead of tuple for variant ListValueRef::Ref to keep
consistency with ListValueRef::Indexed

* feat: Implement ScalarVector for ListVector

Also implements ScalarVectorBuilder for ListVectorBuilder, Scalar for
ListValue and ScalarRef for ListValueRef

* test: Add tests for ScalarVector implementation of ListVector

* feat: Implement dedup using match_scalar_vector

* refactor: Move dedup func to individual mod

* chore: Update ListValueRef comments

* refactor: Move replicate to VectorOp

Move compute operations to the VectorOp trait, which acts as a super trait of
Vector. So we could later put dedup/filter methods in the VectorOp trait,
avoiding defining too many methods in the Vector trait.

* refactor: Move scalar bounds to PrimitiveElement

Move the Scalar and ScalarRef trait bounds to PrimitiveElement, so that for each
native type which implements PrimitiveElement, its PrimitiveVector
always implements ScalarVector, and we can use it as a ScalarVector
without adding additional trait bounds

* refactor: Move dedup to VectorOp

Remove compute mod and move dedup logic to operations::dedup

* feat: Implement VectorOp::filter

* test: Move replicate test of primitive to replicate.rs

* test: Add more replicate tests

* test: Add tests for dedup and filter

Also fix NullVector::dedup and ConstantVector::dedup

* style: fix clippy

* chore: Remove unused scalar.rs

* test: Add more tests for VectorOp and fix failed tests

Also fix TimestampVector eq not implemented.

* chore: Address CR comments

* chore: mention vector should be sorted in comment

* refactor: slice the vector directly in replicate_primitive_with_type
2022-09-19 14:05:02 +08:00
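A minimal sketch of the trait layout the commit above describes: compute operations live in a VectorOp trait that Vector requires as a supertrait, so dedup/filter do not bloat Vector itself. The types below are simplified stand-ins, not the real datatypes crate.

```rust
// Operations live in their own trait, required as a supertrait of Vector.
trait VectorOp {
    fn dedup(&self) -> Self
    where
        Self: Sized;
    fn filter(&self, mask: &[bool]) -> Self
    where
        Self: Sized;
}

trait Vector: VectorOp {
    fn len(&self) -> usize;
}

#[derive(Clone)]
struct PrimitiveVector<T: Copy + PartialEq> {
    values: Vec<T>,
}

impl<T: Copy + PartialEq> VectorOp for PrimitiveVector<T> {
    // Assumes the vector is sorted, mirroring the "vector should be sorted" note.
    fn dedup(&self) -> Self {
        let mut values = self.values.clone();
        values.dedup();
        PrimitiveVector { values }
    }

    fn filter(&self, mask: &[bool]) -> Self {
        let values = self
            .values
            .iter()
            .zip(mask)
            .filter_map(|(v, keep)| keep.then_some(*v))
            .collect();
        PrimitiveVector { values }
    }
}

impl<T: Copy + PartialEq> Vector for PrimitiveVector<T> {
    fn len(&self) -> usize {
        self.values.len()
    }
}
```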
LFC
a649f34832 fix: select empty table (#268)
* fix: select empty table

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-19 11:28:12 +08:00
Lei, Huang
1770079691 fix: slice implementation for DateVector/DateTimeVector… (#266)
* fix: replicate and slice implementation for DateVector/DateTimeVector/TimestampVector

* chore: rebase develop
2022-09-16 16:38:46 +08:00
Ning Sun
1639b6e7ce refactor: rename to_vec to take for RecordBatches (#264) 2022-09-16 14:04:04 +08:00
Ning Sun
e67b0eb259 feat: Initial support of postgresql wire protocol (#229)
* feat: initial commit of postgres protocol adapter

* initial commit of postgres server

* feat: use common_io runtime and correct testcase

* fix previous tests

* feat: adopt pgwire api changes and add support for text encoded data

* feat: initial integration with datanode

* test: add feature flag to test

* fix: resolve lint warnings

* feat: add postgres feature flags for datanode

* feat: add support for newly introduced timestamp type

* feat: adopt latest datanode changes

* fix: address clippy warning for flattern scenario

* fix: make clippy great again

* fix: address issues found in review

* chore: sort dependencies by name

* feat: adopt new Output api

* fix: return error on unsupported data types

* refactor: extract common code dealing with record batches

* fix: resolve clippy warnings

* test: adds some unit tests postgres handler

* test: correct test for cargo update

* fix: update query module name

* test: add assertion for error content
2022-09-15 21:39:05 +08:00
LFC
fb6153f7e0 feat: a new type for supplying Ord to Primitive (#255)
Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-15 18:32:55 +08:00
dennis zhuang
dfa3012396 feat: improve python coprocesssor parser (#260)
* feat: supports DateTime, Date and Timestamp column type to be returned by py scripts

* feat: improve coprocessor compiler, make it work better

* fix: comments

* fix: typo

* Update src/script/src/python/vector.rs

Co-authored-by: LFC <bayinamine@gmail.com>

* Update src/script/src/python/coprocessor.rs

Co-authored-by: LFC <bayinamine@gmail.com>

Co-authored-by: LFC <bayinamine@gmail.com>
2022-09-15 16:18:33 +08:00
dennis zhuang
c8cb705d9e ci: pre-commit configuration and hooks (#261)
* feat: adds pre-commit config and hooks

* refactor: sort all Cargo.toml by cargo-sort

* ci: adds conventional-pre-commit hook to pre-commit

* fix: remove .pre-commit-hooks.yaml

* fix: readme

* Update .pre-commit-config.yaml

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>

* ci: move clippy hook to push stage

* docs: install pre-push github hook

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
2022-09-15 11:30:08 +08:00
fys
8400f8dfd4 chore: move query::Output to common-query module (#259)
* chore: move query::Output to common-query module

* chore: remove “query” dependency in client module
2022-09-15 10:07:58 +08:00
fys
ef40b12749 chore: add optional for datatype, rename data_type to datatype (#258) 2022-09-14 18:07:22 +08:00
evenyag
0dce8946d4 ci: Add ci cache (#256) 2022-09-14 16:06:59 +08:00
LFC
7dee7199dc fix: set unittests dir to /tmp can explode grcov's disk (#253)
* fix: set unittests dir to /tmp can explode grcov's disk

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-14 15:10:10 +08:00
Lei, Huang
2dbaad9770 fix: forbid use int64 as timestamp column data type (#248)
* fix: forbid use int64 as timestamp column data type

* fix unit test

* fix unit tests

* change gmt_created and gmt_modified data type in system tables to timestamp

* also change data type in readme
2022-09-14 12:03:16 +08:00
discord9
20dcaa6897 feat: interval & None value for `prev` & `next` (#252)
* test: for builtin functions

* test: expect fail for `datetime()`

* feat: add `interval()` fn(WIP)

* feat: `interval()` fn in builtin(UNTEST)

* refactor: move `py_vec_obj_to_array` to util.rs

* style: fmt

* test: simple `interval()` cases

* test: `interval()` with `last()`&`first()`

* doc: `ts` param of `interval()`

* log: common_telemetry for logging in script crate

* doc: corresponding test fn for each .ron file

* feat: change to`mpsc` for schedule_job

* test: schedule_job

* dep: rm rustpython dep in common-function

* refactor: mv `schedule_job` into `Script` trait

* test: change to use `interval` to sample datapoint

* feat: add gen_none_array for generate None Array

* feat: impl Missing value for `prev`&`next`

* test: `sum(prev(values))`

* doc: add comment for why not support Float16 in `prev()`

* feat: add `interval` in py side mock module

* style: cargo fmt

* refactor: according to comments

* refactor: extract `apply_interval_function`

* style: cargo fmt

* refactor: remove `schedule()`

* style: cargo fmt
2022-09-14 10:48:27 +08:00
LFC
ec99eb0cd0 feat: frontend instance (#238)
* feat: frontend instance

* no need to carry column length in `Column` proto

* add more tests

* rebase develop

* create a new variant with already provisioned RecordBatches in Output

* resolve code review comments

* new frontend instance does not connect datanode grpc

* add more tests

* add more tests

* rebase develop

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-13 17:10:22 +08:00
LFC
bdd5bdd917 Set unittest's logging dir to "/tmp" to not pollute source codes' dir when running unittests from IDE. (#249)
Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-13 16:14:32 +08:00
dennis zhuang
03169c4a04 feat: impl scripts table and /run-script restful api (#230)
* feat: impl scripts table and /execute restful api

* fix: test failures

* fix: test failures

* feat: impl /run_script API

* refactor: rename run_script api to run-script and test script manager

* fix: remove println

* refactor: error mod

* refactor: by CR comments

* feat: rebase develop and change timestamp/gmt_crated/gmt_modified type to timestamp

* refactor: use assert_eq instead of assert

* doc: fix comment in Script#execute function
2022-09-13 15:09:00 +08:00
LFC
cad35fe82e ci: make grcov happy (#246)
* make grcov happy

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-13 14:19:49 +08:00
LFC
64b6b2afe1 feat: procedure macro to help writing UDAF (#242)
* feat: procedure macro to help writing UDAF

* resolve code review comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-13 10:39:44 +08:00
Morranto
628cdb89e8 feat: Add grpc implementation for alter table opeartions (#239)
* feat: grpc-alter impl

* fix: format

* fix cr

* Update src/datanode/src/error.rs

Co-authored-by: fys <40801205+Fengys123@users.noreply.github.com>

* Update src/datanode/src/server/grpc/ddl.rs

Co-authored-by: fys <40801205+Fengys123@users.noreply.github.com>

* fix bug

* Update src/datanode/src/server/grpc/ddl.rs

Co-authored-by: Ning Sun <sunng@protonmail.com>

* fix:format

* fix bug

Co-authored-by: fys <40801205+Fengys123@users.noreply.github.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
2022-09-10 21:50:21 +08:00
evenyag
d52d1eb122 fix: Only convert LogicalTypeId to ConcreteDataType in tests (#241)
Converting LogicalTypeId to ConcreteDataType is only allowed in tests, since some
additional info is not stored in LogicalTypeId now. It is just an id, or
kind, and does not contain the full type info.
2022-09-09 17:48:59 +08:00
evenyag
0290cdb5d6 test: Fix merge tests (#243)
* test: Fix merge tests

The merge tests still use Int64Vector for timestamp, which should
use TimestampVector instead.

* test: Test Debug format for Source::Reader

Mainly for improve code coverage
2022-09-09 16:10:57 +08:00
Lei, Huang
9366e77407 feat: impl timestamp type, value and vectors (#226)
* wip: impl timestamp data type

* add timestamp vectors

* adapt to recent changes to vector module

* fix all unit test

* rebase develop

* fix slice

* change default time unit to millisecond

* add more tests

* fix some CR comments

* fix some CR comments

* fix clippy

* fix some cr comments

* fix some CR comments

* fix some CR comments

* remove time unit in LogicalTypeId::Timestamp
2022-09-09 11:43:30 +08:00
evenyag
82dfe78321 feat: Implement merge reader (#225)
* feat: Check columns when constructing Batch

* feat: Merge reader skeleton

* test: Add tests for MergeReader

* feat: Use get_ref to compare row

* feat: Implement MergeReader

* test: Add more tests

* feat: Use MergeReader to implement ChunkReader

Now the ChunkReaderImpl uses MergeReader as the default reader. Also add more tests to MergeReader.

* docs: Describe the merge algo in merge.rs

Ports the doc comments from kudu to merge.rs to describe the idea of the
merge algorithm we used.

* test: Fix unit tests

* chore: Address CR comments

Panics if number of columns in batch is not equal to `BatchBuilder`'s

* chore: Address CR comments

* chore: Implement Debug and add test for Node
2022-09-09 11:35:51 +08:00
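The MergeReader described above merges several sorted sources into one sorted output. Below is a self-contained sketch of the underlying k-way merge using a binary heap; it is an illustration only, since real readers operate on columnar Batches rather than the plain (key, value) rows used here.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// K-way merge of key-sorted sources via a min-heap of (next_key, source, offset).
fn merge_sorted(sources: Vec<Vec<(u64, &'static str)>>) -> Vec<(u64, &'static str)> {
    let mut heap = BinaryHeap::new();
    for (i, src) in sources.iter().enumerate() {
        if let Some(&(key, _)) = src.first() {
            heap.push(Reverse((key, i, 0usize)));
        }
    }

    let mut out = Vec::new();
    while let Some(Reverse((_, i, off))) = heap.pop() {
        out.push(sources[i][off]);
        // Refill the heap with the next row from the same source, if any.
        if let Some(&(key, _)) = sources[i].get(off + 1) {
            heap.push(Reverse((key, i, off + 1)));
        }
    }
    out
}

fn main() {
    let merged = merge_sorted(vec![
        vec![(1, "a"), (4, "d")],
        vec![(2, "b"), (3, "c")],
    ]);
    assert_eq!(merged, vec![(1, "a"), (2, "b"), (3, "c"), (4, "d")]);
}
```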
fys
37a425658c chore: optimize status code (#235) 2022-09-08 15:34:44 +08:00
Morranto
cc0c883ee2 feat: add proto files for grpc-alter (#234)
* feat: add proto files for grpc-alter

* fix: format

Co-authored-by: liangxingjian <xingjianliang@proton.me>
2022-09-08 15:33:59 +08:00
evenyag
0ae99f7ac3 fix: Fix MitoTable::scan only returns one batch (#227)
* feat: Add tests for different batch

Add region scan test with different batch size

* fix: Fix table scan only returns one batch

* style: Fix clippy

* test: Add tests to scan table with rows more than batch size

* fix: Fix MockChunkReader never stop
2022-09-06 20:36:05 +08:00
evenyag
7f8195861e feat: Adds push_value_ref and extend_slice_of to MutableVector (#215)
* feat: Impl cmp_element() for Vector

* chore: Add doc comments to MutableVector

* feat: Add create_mutable() to DataType

Add `create_mutable()` to create a MutableVector for each DataType.
Implement ListVectorBuilder and NullVectorBuilder for ListType and
NullType.

* feat: Add ValueRef

ValueRef is a reference to a value and can be used to avoid some allocation
when getting data from a Vector. To support ValueRef, also implement a
ListValueRef for ListValue, but comparison of ListValueRef still
requires some allocation, due to the complexity of ListValue and
ListVector.

Impl some From trait for ValueRef

* feat: Implement get_ref for Vector

* feat: Remove cmp_element from Vector

`cmp_element` could be replaced by `get_ref` and then compare

* feat: Implement push/extend for PrimitiveVectorBuilder

Implement push_value_ref() and extend_slice_of() for
PrimitiveVectorBuilder.

Also refactor the DataTypeBuilder trait for
primitive types into the PrimitiveElement trait, adding the necessary cast helper
methods to it.
- Cast a reference to Vector to reference arrow's primitive array
- Cast a ValueRef to primitive type
- Also make PrimitiveElement super trait of Primitive

* feat: Implement push/extend for all vector builders

Implement push_value_ref() and extend_slice_of() for remaining vector
builders. Add some helpful cast method to ValueRef and a method to
cast Value to ValueRef.

Change the behavior of PrimitiveElement::cast_xxx to panic when unable
to cast, since push_value_ref() and extend_slice_of() always panic
when given invalid input data type.

* feat: MutableVector returns error if data type unmatch

* test: Add tests for ValueRef

* feat: Add tests for Vector::get_ref

* feat: NullVector returns error if data type unmatch

* test: Add tests for vector builders

* fix: Fix compile error in python coprocessor

* refactor: Add lifetime param to IntoValueRef

The Primitive trait just uses the `IntoValueRef<'static>` bound. Also
rename create_mutable to create_mutable_vector.

* chore: Address CR comments

* feat: Customize PartialOrd/Ord for Value/ValueRef

Panics if values/refs have different data type

* style: Fix clippy

* refactor: Use macro to generate body of ValueRef::as_xxx
2022-09-06 13:44:48 +08:00
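A minimal sketch of the ValueRef idea from the commit above: a borrowed view of a value so that reading an element out of a vector avoids allocation (no String copy for string columns). The types are simplified stand-ins, not the real datatypes crate.

```rust
// Owned value vs. borrowed view of a value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int64(i64),
    String(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueRef<'a> {
    Null,
    Int64(i64),
    String(&'a str),
}

impl Value {
    // Borrow the value instead of cloning it.
    fn as_value_ref(&self) -> ValueRef<'_> {
        match self {
            Value::Null => ValueRef::Null,
            Value::Int64(v) => ValueRef::Int64(*v),
            Value::String(s) => ValueRef::String(s.as_str()),
        }
    }
}

struct StringVector {
    data: Vec<String>,
}

impl StringVector {
    // `get_ref`-style accessor: no allocation, unlike returning an owned Value.
    fn get_ref(&self, i: usize) -> ValueRef<'_> {
        self.data
            .get(i)
            .map(|s| ValueRef::String(s.as_str()))
            .unwrap_or(ValueRef::Null)
    }
}
```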
LFC
5e67301c00 feat: implement alter table (#218)
* feat: implement alter table

* Currently we have no plans to support altering the primary keys (maybe never), so removed the related codes.

* make `alter` a trait function in table

* address other CR comments

* cleanup

* rebase develop

* resolve code review comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-06 13:44:34 +08:00
LFC
119ff2fc2e feat: create table through GRPC interface (#224)
* feat: create table through GRPC interface

* move `CreateExpr` `oneof` expr of `AdminExpr` in `admin.proto`, and implement the admin GRPC interface

* add `table_options` and `partition_options` to `CreateExpr`

* resolve code review comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-06 12:51:07 +08:00
Lei, Huang
3f9144a2e3 fix: StringVector use Utf8Array (#222) 2022-09-02 11:25:33 +08:00
298 changed files with 20847 additions and 3481 deletions


@@ -24,6 +24,12 @@ jobs:
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
profile: minimal
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- name: Cleanup disk
uses: curoky/cleanup-disk-action@v2.0
with:
retain: 'rust'
- name: Execute tests
uses: actions-rs/cargo@v1
with:
@@ -32,10 +38,11 @@ jobs:
env:
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
RUSTFLAGS: "-Zprofile -Ccodegen-units=1 -Cinline-threshold=0 -Clink-dead-code -Coverflow-checks=off -Cpanic=abort -Zpanic_abort_tests"
RUSTFLAGS: "-Zprofile -Ccodegen-units=1 -Cinline-threshold=0 -Clink-dead-code -Coverflow-checks=off -Cpanic=unwind -Zpanic_abort_tests"
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
UNITTEST_LOG_DIR: "__unittest_logs"
- name: Gather coverage data
id: coverage
uses: actions-rs/grcov@v0.1


@@ -20,6 +20,8 @@ jobs:
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- uses: actions-rs/cargo@v1
with:
command: check
@@ -37,6 +39,8 @@ jobs:
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- uses: actions-rs/cargo@v1
with:
command: test
@@ -46,6 +50,7 @@ jobs:
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
UNITTEST_LOG_DIR: "__unittest_logs"
fmt:
name: Rustfmt
@@ -59,6 +64,8 @@ jobs:
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- run: rustup component add rustfmt
- uses: actions-rs/cargo@v1
with:
@@ -77,6 +84,8 @@ jobs:
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- run: rustup component add clippy
- uses: actions-rs/cargo@v1
with:

.pre-commit-config.yaml (new file, 21 lines)

@@ -0,0 +1,21 @@
repos:
- repo: https://github.com/compilerla/conventional-pre-commit
rev: 47923ce11be4a936cd216d427d985dd342adb751
hooks:
- id: conventional-pre-commit
stages: [commit-msg]
- repo: https://github.com/DevinR528/cargo-sort
rev: e6a795bc6b2c0958f9ef52af4863bbd7cc17238f
hooks:
- id: cargo-sort
args: ["--workspace"]
- repo: https://github.com/doublify/pre-commit-rust
rev: v1.0
hooks:
- id: fmt
- id: clippy
args: ["--workspace", "--all-targets", "--", "-D", "warnings", "-D", "clippy::print_stdout", "-D", "clippy::print_stderr"]
stages: [push]
- id: cargo-check

Cargo.lock (generated, 1798 lines; diff suppressed because it is too large)


@@ -3,18 +3,20 @@ members = [
"src/api",
"src/catalog",
"src/client",
"src/cmd",
"src/common/base",
"src/common/error",
"src/common/function",
"src/common/function-macro",
"src/common/grpc",
"src/common/query",
"src/common/recordbatch",
"src/common/runtime",
"src/common/telemetry",
"src/common/time",
"src/cmd",
"src/datanode",
"src/datatypes",
"src/frontend",
"src/log-store",
"src/logical-plans",
"src/object-store",
@@ -28,3 +30,6 @@ members = [
"src/table-engine",
"test-util",
]
[patch.crates-io]
sqlparser = { git = "https://github.com/sunng87/sqlparser-rs.git", branch = "feature/argument-for-custom-type-for-v015" }


@@ -80,6 +80,33 @@ docker run -p 3000:3000 \
greptimedb
```
### Start Frontend
Frontend needs to connect to Datanode, so **Datanode must be started first**!
```
// Connects to local Datanode at its default GRPC port: 3001
// Start Frontend with default options.
cargo run -- frontend start
OR
// Start Frontend with `mysql-addr` option.
cargo run -- frontend start --mysql-addr=0.0.0.0:9999
OR
// Start Frontend with `log-dir` and `log-level` options.
cargo run -- --log-dir=logs --log-level=debug frontend start
```
Start Frontend with a config file:
```
cargo run -- --log-dir=logs --log-level=debug frontend start -c ./config/frontend.example.toml
```
### SQL Operations
1. Connect to the DB with the [mysql client](https://dev.mysql.com/downloads/mysql/):
@@ -94,7 +121,7 @@ greptimedb
```SQL
CREATE TABLE monitor (
host STRING,
ts BIGINT,
ts TIMESTAMP,
cpu DOUBLE DEFAULT 0,
memory DOUBLE,
TIME INDEX (ts),
@@ -123,3 +150,34 @@ greptimedb
3 rows in set (0.01 sec)
```
You can delete your data by removing `/tmp/greptimedb`.
## Contribute
1. [Install rust](https://www.rust-lang.org/tools/install)
2. [Install `pre-commit`](https://pre-commit.com/#plugins) to automatically run hooks such as `cargo fmt` on every commit.
```
$ pip install pre-commit
or
$ brew install pre-commit
$
```
3. Install the git hook scripts:
```
$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
$ pre-commit install --hook-type commit-msg
pre-commit installed at .git/hooks/commit-msg
$ pre-commit install --hook-type pre-push
pre-commit installed at .git/hooks/pre-push
```
now `pre-commit` will run automatically on `git commit`.
4. Check out a branch from `develop` and make your contribution. Follow the [style guide](https://github.com/GreptimeTeam/docs/blob/main/style-guide/zh.md). Create a PR when you are ready; feel free and have fun!


@@ -44,13 +44,15 @@ def as_table(kline: list):
"rv_60d",
"rv_90d",
"rv_180d"
],
sql="select open_time, close from k_line")
])
def calc_rvs(open_time, close):
from greptime import vector, log, prev, sqrt, datetime, pow, sum
from greptime import vector, log, prev, sqrt, datetime, pow, sum, last
import greptime as g
def calc_rv(close, open_time, time, interval):
mask = (open_time < time) & (open_time > time - interval)
close = close[mask]
open_time = open_time[mask]
close = g.interval(open_time, close, datetime("10m"), lambda x:last(x))
avg_time_interval = (open_time[-1] - open_time[0])/(len(open_time)-1)
ref = log(close/prev(close))
@@ -60,10 +62,10 @@ def calc_rvs(open_time, close):
# how to get env var,
# maybe through accessing scope and serde then send to remote?
timepoint = open_time[-1]
rv_7d = calc_rv(close, open_time, timepoint, datetime("7d"))
rv_15d = calc_rv(close, open_time, timepoint, datetime("15d"))
rv_30d = calc_rv(close, open_time, timepoint, datetime("30d"))
rv_60d = calc_rv(close, open_time, timepoint, datetime("60d"))
rv_90d = calc_rv(close, open_time, timepoint, datetime("90d"))
rv_180d = calc_rv(close, open_time, timepoint, datetime("180d"))
rv_7d = vector([calc_rv(close, open_time, timepoint, datetime("7d"))])
rv_15d = vector([calc_rv(close, open_time, timepoint, datetime("15d"))])
rv_30d = vector([calc_rv(close, open_time, timepoint, datetime("30d"))])
rv_60d = vector([calc_rv(close, open_time, timepoint, datetime("60d"))])
rv_90d = vector([calc_rv(close, open_time, timepoint, datetime("90d"))])
rv_180d = vector([calc_rv(close, open_time, timepoint, datetime("180d"))])
return rv_7d, rv_15d, rv_30d, rv_60d, rv_90d, rv_180d


@@ -7,7 +7,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231300,
"open_time": 300,
"open": "10107",
"high": "10109.34",
"low": "10106.71",
@@ -16,7 +16,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231360,
"open_time": 900,
"open": "10106.79",
"high": "10109.27",
"low": "10105.92",
@@ -25,7 +25,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231420,
"open_time": 1200,
"open": "10106.09",
"high": "10108.75",
"low": "10104.66",
@@ -34,7 +34,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231480,
"open_time": 1800,
"open": "10108.73",
"high": "10109.52",
"low": "10106.07",
@@ -43,7 +43,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231540,
"open_time": 2400,
"open": "10106.38",
"high": "10109.48",
"low": "10104.81",
@@ -52,7 +52,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231600,
"open_time": 3000,
"open": "10106.95",
"high": "10109.48",
"low": "10106.6",
@@ -61,7 +61,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231660,
"open_time": 3600,
"open": "10107.55",
"high": "10109.28",
"low": "10104.68",
@@ -70,7 +70,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231720,
"open_time": 4200,
"open": "10104.68",
"high": "10109.18",
"low": "10104.14",
@@ -79,7 +79,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231780,
"open_time": 4800,
"open": "10108.8",
"high": "10117.36",
"low": "10108.8",
@@ -88,7 +88,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231840,
"open_time": 5400,
"open": "10115.96",
"high": "10119.19",
"low": "10115.96",
@@ -97,7 +97,7 @@
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1581231900,
"open_time": 6000,
"open": "10117.08",
"high": "10120.73",
"low": "10116.96",


@@ -1,4 +1,4 @@
from .greptime import coprocessor, copr
from .greptime import vector, log, prev, sqrt, pow, datetime, sum
from .greptime import vector, log, prev, next, first, last, sqrt, pow, datetime, sum, interval
from .mock import mock_tester
from .cfg import set_conn_addr, get_conn_addr


@@ -89,6 +89,11 @@ class vector(np.ndarray):
def filter(self, lst_bool):
return self[lst_bool]
def last(lst):
return lst[-1]
def first(lst):
return lst[0]
def prev(lst):
ret = np.zeros(len(lst))
@@ -96,35 +101,22 @@ def prev(lst):
ret[0] = nan
return ret
def next(lst):
ret = np.zeros(len(lst))
ret[:-1] = lst[1:]
ret[-1] = nan
return ret
def query(sql: str):
pass
def interval(arr: list, duration: int, fill, step: None | int = None, explicitOffset=False):
def interval(ts: vector, arr: vector, duration: int, func):
"""
Note that this is a mock function with the same functionality as the actual Python Coprocessor
`arr` is a vector of integral or temporal type.
`duration` is the length of sliding window
`step` being the length when sliding window take a step
`fill` indicate how to fill missing value:
- "prev": use previous
- "post": next
- "linear": linear interpolation, if not possible to interpolate certain types, fallback to prev
- "null": use null
- "none": do not interpolate
"""
if step is None:
step = duration
tot_len = int(np.ceil(len(arr) // step))
slices = np.zeros((tot_len, int(duration)))
for idx, start in enumerate(range(0, len(arr), step)):
slices[idx] = arr[start:(start + duration)]
return slices
start = np.min(ts)
end = np.max(ts)
masks = [(ts >= i) & (ts <= (i+duration)) for i in range(start, end, duration)]
lst_res = [func(arr[mask]) for mask in masks]
return lst_res
def factor(unit: str) -> int:


@@ -4,7 +4,7 @@ it can only run on mock data and support by numpy
"""
from typing import Any
import numpy as np
from .greptime import i32,i64,f32,f64, vector, interval, query, prev, datetime, log, sum, sqrt, pow, nan, copr, coprocessor
from .greptime import i32,i64,f32,f64, vector, interval, prev, datetime, log, sum, sqrt, pow, nan, copr, coprocessor
import inspect
import functools


@@ -26,6 +26,16 @@ def get_db(req:str):
return requests.get("http://{}{}".format(get_conn_addr(), req))
if __name__ == "__main__":
with open("component/script/python/example/kline.json", "r") as kline_file:
kline = json.load(kline_file)
table = as_table(kline["result"])
close = table["close"]
open_time = table["open_time"]
env = {"close":close, "open_time": open_time}
res = mock_tester(calc_rvs, env=env)
print("Mock result:", [i[0] for i in res])
exit()
if len(sys.argv)!=2:
raise Exception("Expect only one address as cmd's args")
set_conn_addr(sys.argv[1])
@@ -42,11 +52,6 @@ if __name__ == "__main__":
open_time = table["open_time"]
init_table(close, open_time)
# print(repr(close), repr(open_time))
# print("calc_rv:", calc_rv(close, open_time, open_time[-1]+datetime("10m"), datetime("7d")))
env = {"close":close, "open_time": open_time}
# print("env:", env)
print("Mock result:", mock_tester(calc_rvs, env=env))
real = calc_rvs()
print(real)
try:


@@ -5,6 +5,10 @@ wal_dir = '/tmp/greptimedb/wal'
mysql_addr = '0.0.0.0:3306'
mysql_runtime_size = 4
# applied when the postgres feature is enabled
postgres_addr = '0.0.0.0:5432'
postgres_runtime_size = 4
[storage]
type = 'File'
data_dir = '/tmp/greptimedb/data/'


@@ -0,0 +1,4 @@
http_addr = '0.0.0.0:4000'
grpc_addr = '0.0.0.0:4001'
mysql_addr = '0.0.0.0:4003'
mysql_runtime_size = 4


@@ -8,27 +8,29 @@ So is there a way we can make an aggregate function that automatically match the
# 1. Impl `AggregateFunctionCreator` trait for your accumulator creator.
You must first define a struct that can store the input data's type. For example,
You must first define a struct that will be used to create your accumulator. For example,
```Rust
struct MySumAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, AggrFuncTypeStore)]
struct MySumAccumulatorCreator {}
```
The attribute macro `#[as_aggr_func_creator]` and the derive macro `#[derive(Debug, AggrFuncTypeStore)]` must both be annotated on the struct. They work together to provide storage for the aggregate function's input data types, which is needed for creating the generic accumulator later.
> Note that the `as_aggr_func_creator` macro will add fields to the struct, so the struct cannot be defined as a fieldless struct like `struct Foo;`, nor as a newtype like `struct Foo(bar)`.
Then impl `AggregateFunctionCreator` trait on it. The definition of the trait is:
```Rust
pub trait AggregateFunctionCreator: Send + Sync + Debug {
fn creator(&self) -> AccumulatorCreatorFunction;
fn input_types(&self) -> Vec<ConcreteDataType>;
fn set_input_types(&self, input_types: Vec<ConcreteDataType>);
fn output_type(&self) -> ConcreteDataType;
fn state_types(&self) -> Vec<ConcreteDataType>;
}
```
our query engine will call `set_input_types` the very first, so you can use input data's type in methods that return output type and state types.
You can use input data's type in methods that return output type and state types (just invoke `input_types()`).
The output type is the aggregate function's output data type. For example, the `SUM` aggregate function's output type is `u64` for a `u32` column. The state types are the accumulator's internal states' types. Take the `AVG` aggregate function on an `i32` column as an example: its state types are `i64` (for the sum) and `u64` (for the count).
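For illustration, here is a self-contained sketch of how a creator can derive its output and state types from the stored input types, as described above for `SUM`. The enum and trait below are simplified stand-ins rather than the real common-query types, and real code would also use `#[as_aggr_func_creator]` and `AggrFuncTypeStore` as shown earlier.

```rust
use std::fmt::Debug;

// Simplified stand-in for the real ConcreteDataType.
#[derive(Debug, Clone, PartialEq)]
enum ConcreteDataType {
    UInt32,
    UInt64,
    Int64,
    Float64,
}

trait AggregateFunctionCreator: Debug {
    fn input_types(&self) -> Vec<ConcreteDataType>;

    // Output type of the aggregate, e.g. SUM over a u32 column yields u64.
    fn output_type(&self) -> ConcreteDataType {
        match self.input_types().first() {
            Some(ConcreteDataType::UInt32) | Some(ConcreteDataType::UInt64) => {
                ConcreteDataType::UInt64
            }
            Some(ConcreteDataType::Float64) => ConcreteDataType::Float64,
            _ => ConcreteDataType::Int64,
        }
    }

    // Internal accumulator state; for a plain SUM it is just the output type.
    fn state_types(&self) -> Vec<ConcreteDataType> {
        vec![self.output_type()]
    }
}

#[derive(Debug)]
struct MySumAccumulatorCreator {
    input_types: Vec<ConcreteDataType>,
}

impl AggregateFunctionCreator for MySumAccumulatorCreator {
    fn input_types(&self) -> Vec<ConcreteDataType> {
        self.input_types.clone()
    }
}
```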


@@ -2,11 +2,12 @@
name = "api"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
datatypes = { path = "../datatypes" }
prost = "0.11"
snafu = { version = "0.7", features = ["backtraces"] }
tonic = "0.8"
[build-dependencies]


@@ -2,8 +2,54 @@ syntax = "proto3";
package greptime.v1;
// TODO(jiachun)
message AdminRequest {}
import "greptime/v1/column.proto";
import "greptime/v1/common.proto";
// TODO(jiachun)
message AdminResponse {}
message AdminRequest {
string name = 1;
repeated AdminExpr exprs = 2;
}
message AdminResponse {
repeated AdminResult results = 1;
}
message AdminExpr {
ExprHeader header = 1;
oneof expr {
CreateExpr create = 2;
AlterExpr alter = 3;
}
}
message AdminResult {
ResultHeader header = 1;
oneof result {
MutateResult mutate = 2;
}
}
message CreateExpr {
optional string catalog_name = 1;
optional string schema_name = 2;
string table_name = 3;
optional string desc = 4;
repeated ColumnDef column_defs = 5;
string time_index = 6;
repeated string primary_keys = 7;
bool create_if_not_exists = 8;
map<string, string> table_options = 9;
}
message AlterExpr {
optional string catalog_name = 1;
optional string schema_name = 2;
string table_name = 3;
oneof kind {
AddColumn add_column = 4;
}
}
message AddColumn {
ColumnDef column_def = 1;
}


@@ -29,6 +29,10 @@ message Column {
repeated bool bool_values = 11;
repeated bytes binary_values = 12;
repeated string string_values = 13;
repeated int32 date_values = 14;
repeated int64 datetime_values = 15;
repeated int64 ts_millis_values = 16;
}
// The array of non-null values in this column.
//
@@ -43,4 +47,33 @@ message Column {
// Mask maps the positions of null values.
// If a bit in null_mask is 1, it indicates that the column value at that position is null.
bytes null_mask = 4;
// Helpful in creating vector from column.
optional ColumnDataType datatype = 5;
}
message ColumnDef {
string name = 1;
ColumnDataType datatype = 2;
bool is_nullable = 3;
optional bytes default_constraint = 4;
}
enum ColumnDataType {
BOOLEAN = 0;
INT8 = 1;
INT16 = 2;
INT32 = 3;
INT64 = 4;
UINT8 = 5;
UINT16 = 6;
UINT32 = 7;
UINT64 = 8;
FLOAT32 = 9;
FLOAT64 = 10;
BINARY = 11;
STRING = 12;
DATE = 13;
DATETIME = 14;
TIMESTAMP = 15;
}


@@ -0,0 +1,18 @@
syntax = "proto3";
package greptime.v1;
message ExprHeader {
uint32 version = 1;
}
message ResultHeader {
uint32 version = 1;
uint32 code = 2;
string err_msg = 3;
}
message MutateResult {
uint32 success = 1;
uint32 failure = 2;
}


@@ -2,6 +2,8 @@ syntax = "proto3";
package greptime.v1;
import "greptime/v1/common.proto";
message DatabaseRequest {
string name = 1;
repeated ObjectExpr exprs = 2;
@@ -21,10 +23,6 @@ message ObjectExpr {
}
}
message ExprHeader {
uint32 version = 1;
}
// TODO(fys): Only support sql now, and will support promql etc in the future
message SelectExpr {
oneof expr {
@@ -40,7 +38,23 @@ message PhysicalPlan {
message InsertExpr {
string table_name = 1;
repeated bytes values = 2;
message Values {
repeated bytes values = 1;
}
oneof expr {
Values values = 2;
// TODO(LFC): Remove field "sql" in InsertExpr.
// When Frontend instance received an insertion SQL (`insert into ...`), it's anticipated to parse the SQL and
// assemble the values to insert to feed Datanode. In other words, inserting data through Datanode instance's GRPC
// interface shouldn't use SQL directly.
// Then why the "sql" field exists here? It's because the Frontend needs table schema to create the values to insert,
// which is currently not able to find anywhere. (Maybe the table schema is suppose to be fetched from Meta?)
// The "sql" field is meant to be removed in the future.
string sql = 3;
}
}
// TODO(jiachun)
@@ -59,14 +73,3 @@ message ObjectResult {
message SelectResult {
bytes raw_data = 1;
}
message ResultHeader {
uint32 version = 1;
uint32 code = 2;
string err_msg = 3;
}
message MutateResult {
uint32 success = 1;
uint32 failure = 2;
}

src/api/src/error.rs (new file, 18 lines)

@@ -0,0 +1,18 @@
use datatypes::prelude::ConcreteDataType;
use snafu::prelude::*;
use snafu::Backtrace;
pub type Result<T> = std::result::Result<T, Error>;
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
#[snafu(display("Unknown proto column datatype: {}", datatype))]
UnknownColumnDataType { datatype: i32, backtrace: Backtrace },
#[snafu(display("Failed to create column datatype from {:?}", from))]
IntoColumnDataType {
from: ConcreteDataType,
backtrace: Backtrace,
},
}

src/api/src/helper.rs (new file, 362 lines)

@@ -0,0 +1,362 @@
use datatypes::prelude::ConcreteDataType;
use snafu::prelude::*;
use crate::error::{self, Result};
use crate::v1::column::Values;
use crate::v1::ColumnDataType;
#[derive(Debug, PartialEq, Eq)]
pub struct ColumnDataTypeWrapper(ColumnDataType);
impl ColumnDataTypeWrapper {
pub fn try_new(datatype: i32) -> Result<Self> {
let datatype = ColumnDataType::from_i32(datatype)
.context(error::UnknownColumnDataTypeSnafu { datatype })?;
Ok(Self(datatype))
}
pub fn datatype(&self) -> ColumnDataType {
self.0
}
}
impl From<ColumnDataTypeWrapper> for ConcreteDataType {
fn from(datatype: ColumnDataTypeWrapper) -> Self {
match datatype.0 {
ColumnDataType::Boolean => ConcreteDataType::boolean_datatype(),
ColumnDataType::Int8 => ConcreteDataType::int8_datatype(),
ColumnDataType::Int16 => ConcreteDataType::int16_datatype(),
ColumnDataType::Int32 => ConcreteDataType::int32_datatype(),
ColumnDataType::Int64 => ConcreteDataType::int64_datatype(),
ColumnDataType::Uint8 => ConcreteDataType::uint8_datatype(),
ColumnDataType::Uint16 => ConcreteDataType::uint16_datatype(),
ColumnDataType::Uint32 => ConcreteDataType::uint32_datatype(),
ColumnDataType::Uint64 => ConcreteDataType::uint64_datatype(),
ColumnDataType::Float32 => ConcreteDataType::float32_datatype(),
ColumnDataType::Float64 => ConcreteDataType::float64_datatype(),
ColumnDataType::Binary => ConcreteDataType::binary_datatype(),
ColumnDataType::String => ConcreteDataType::string_datatype(),
ColumnDataType::Date => ConcreteDataType::date_datatype(),
ColumnDataType::Datetime => ConcreteDataType::datetime_datatype(),
ColumnDataType::Timestamp => ConcreteDataType::timestamp_millis_datatype(),
}
}
}
impl TryFrom<ConcreteDataType> for ColumnDataTypeWrapper {
type Error = error::Error;
fn try_from(datatype: ConcreteDataType) -> Result<Self> {
let datatype = ColumnDataTypeWrapper(match datatype {
ConcreteDataType::Boolean(_) => ColumnDataType::Boolean,
ConcreteDataType::Int8(_) => ColumnDataType::Int8,
ConcreteDataType::Int16(_) => ColumnDataType::Int16,
ConcreteDataType::Int32(_) => ColumnDataType::Int32,
ConcreteDataType::Int64(_) => ColumnDataType::Int64,
ConcreteDataType::UInt8(_) => ColumnDataType::Uint8,
ConcreteDataType::UInt16(_) => ColumnDataType::Uint16,
ConcreteDataType::UInt32(_) => ColumnDataType::Uint32,
ConcreteDataType::UInt64(_) => ColumnDataType::Uint64,
ConcreteDataType::Float32(_) => ColumnDataType::Float32,
ConcreteDataType::Float64(_) => ColumnDataType::Float64,
ConcreteDataType::Binary(_) => ColumnDataType::Binary,
ConcreteDataType::String(_) => ColumnDataType::String,
ConcreteDataType::Date(_) => ColumnDataType::Date,
ConcreteDataType::DateTime(_) => ColumnDataType::Datetime,
ConcreteDataType::Timestamp(_) => ColumnDataType::Timestamp,
ConcreteDataType::Null(_) | ConcreteDataType::List(_) => {
return error::IntoColumnDataTypeSnafu { from: datatype }.fail()
}
ConcreteDataType::Geometry(_) => todo!(),
});
Ok(datatype)
}
}
impl Values {
pub fn with_capacity(datatype: ColumnDataType, capacity: usize) -> Self {
match datatype {
ColumnDataType::Boolean => Values {
bool_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Int8 => Values {
i8_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Int16 => Values {
i16_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Int32 => Values {
i32_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Int64 => Values {
i64_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Uint8 => Values {
u8_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Uint16 => Values {
u16_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Uint32 => Values {
u32_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Uint64 => Values {
u64_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Float32 => Values {
f32_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Float64 => Values {
f64_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Binary => Values {
binary_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::String => Values {
string_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Date => Values {
date_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Datetime => Values {
datetime_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Timestamp => Values {
ts_millis_values: Vec::with_capacity(capacity),
..Default::default()
},
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_values_with_capacity() {
let values = Values::with_capacity(ColumnDataType::Int8, 2);
let values = values.i8_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Int32, 2);
let values = values.i32_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Int64, 2);
let values = values.i64_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Uint8, 2);
let values = values.u8_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Uint32, 2);
let values = values.u32_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Uint64, 2);
let values = values.u64_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Float32, 2);
let values = values.f32_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Float64, 2);
let values = values.f64_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Binary, 2);
let values = values.binary_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Boolean, 2);
let values = values.bool_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::String, 2);
let values = values.string_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Date, 2);
let values = values.date_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Datetime, 2);
let values = values.datetime_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Timestamp, 2);
let values = values.ts_millis_values;
assert_eq!(2, values.capacity());
}
#[test]
fn test_concrete_datatype_from_column_datatype() {
assert_eq!(
ConcreteDataType::boolean_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Boolean).into()
);
assert_eq!(
ConcreteDataType::int8_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Int8).into()
);
assert_eq!(
ConcreteDataType::int16_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Int16).into()
);
assert_eq!(
ConcreteDataType::int32_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Int32).into()
);
assert_eq!(
ConcreteDataType::int64_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Int64).into()
);
assert_eq!(
ConcreteDataType::uint8_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Uint8).into()
);
assert_eq!(
ConcreteDataType::uint16_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Uint16).into()
);
assert_eq!(
ConcreteDataType::uint32_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Uint32).into()
);
assert_eq!(
ConcreteDataType::uint64_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Uint64).into()
);
assert_eq!(
ConcreteDataType::float32_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Float32).into()
);
assert_eq!(
ConcreteDataType::float64_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Float64).into()
);
assert_eq!(
ConcreteDataType::binary_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Binary).into()
);
assert_eq!(
ConcreteDataType::string_datatype(),
ColumnDataTypeWrapper(ColumnDataType::String).into()
);
assert_eq!(
ConcreteDataType::date_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Date).into()
);
assert_eq!(
ConcreteDataType::datetime_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Datetime).into()
);
assert_eq!(
ConcreteDataType::timestamp_millis_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Timestamp).into()
);
}
#[test]
fn test_column_datatype_from_concrete_datatype() {
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Boolean),
ConcreteDataType::boolean_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Int8),
ConcreteDataType::int8_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Int16),
ConcreteDataType::int16_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Int32),
ConcreteDataType::int32_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Int64),
ConcreteDataType::int64_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Uint8),
ConcreteDataType::uint8_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Uint16),
ConcreteDataType::uint16_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Uint32),
ConcreteDataType::uint32_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Uint64),
ConcreteDataType::uint64_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Float32),
ConcreteDataType::float32_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Float64),
ConcreteDataType::float64_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Binary),
ConcreteDataType::binary_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::String),
ConcreteDataType::string_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Date),
ConcreteDataType::date_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Datetime),
ConcreteDataType::datetime_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Timestamp),
ConcreteDataType::timestamp_millis_datatype()
.try_into()
.unwrap()
);
let result: Result<ColumnDataTypeWrapper> = ConcreteDataType::null_datatype().try_into();
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
"Failed to create column datatype from Null(NullType)"
);
let result: Result<ColumnDataTypeWrapper> =
ConcreteDataType::list_datatype(ConcreteDataType::boolean_datatype()).try_into();
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
"Failed to create column datatype from List(ListType { inner: Boolean(BooleanType) })"
);
}
}


@@ -1,3 +1,5 @@
pub mod error;
pub mod helper;
pub mod serde;
pub mod v1;


@@ -138,6 +138,7 @@ mod tests {
semantic_type: SEMANTIC_TAG,
values: Some(values),
null_mask,
..Default::default()
};
InsertBatch {
columns: vec![column],
@@ -156,6 +157,7 @@ mod tests {
semantic_type: SEMANTIC_TAG,
values: Some(values),
null_mask,
..Default::default()
};
SelectResult {
columns: vec![column],


@@ -2,12 +2,11 @@
name = "catalog"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
async-trait = "0.1"
async-stream = "0.3"
async-trait = "0.1"
common-error = { path = "../common/error" }
common-query = { path = "../common/query" }
common-recordbatch = { path = "../common/recordbatch" }


@@ -1,6 +1,13 @@
pub const SYSTEM_CATALOG_NAME: &str = "system";
pub const INFORMATION_SCHEMA_NAME: &str = "information_schema";
pub const SYSTEM_CATALOG_TABLE_ID: u32 = 0;
pub const SYSTEM_CATALOG_TABLE_NAME: &str = "system_catalog";
pub const DEFAULT_CATALOG_NAME: &str = "greptime";
pub const DEFAULT_SCHEMA_NAME: &str = "public";
/// Reserves [0,MIN_USER_TABLE_ID) for internal usage.
/// User defined table id starts from this value.
pub const MIN_USER_TABLE_ID: u32 = 1024;
/// system_catalog table id
pub const SYSTEM_CATALOG_TABLE_ID: u32 = 0;
/// scripts table id
pub const SCRIPTS_TABLE_ID: u32 = 1;


@@ -21,6 +21,17 @@ pub enum Error {
source: table::error::Error,
},
#[snafu(display(
"Failed to create table, table info: {}, source: {}",
table_info,
source
))]
CreateTable {
table_info: String,
#[snafu(backtrace)]
source: table::error::Error,
},
#[snafu(display("System catalog is not valid: {}", msg))]
SystemCatalog { msg: String, backtrace: Backtrace },
@@ -89,6 +100,9 @@ pub enum Error {
#[snafu(backtrace)]
source: table::error::Error,
},
#[snafu(display("Illegal catalog manager state: {}", msg))]
IllegalManagerState { backtrace: Backtrace, msg: String },
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -97,21 +111,27 @@ impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::InvalidKey { .. }
| Error::OpenSystemCatalog { .. }
| Error::CreateSystemCatalog { .. }
| Error::SchemaNotFound { .. }
| Error::TableNotFound { .. }
| Error::InvalidEntryType { .. } => StatusCode::Unexpected,
Error::SystemCatalog { .. }
| Error::SystemCatalogTypeMismatch { .. }
| Error::EmptyValue
| Error::ValueDeserialize { .. }
| Error::IllegalManagerState { .. }
| Error::CatalogNotFound { .. }
| Error::OpenTable { .. }
| Error::ReadSystemCatalog { .. }
| Error::InsertTableRecord { .. } => StatusCode::StorageUnavailable,
| Error::InvalidEntryType { .. } => StatusCode::Unexpected,
Error::SystemCatalog { .. } | Error::EmptyValue | Error::ValueDeserialize { .. } => {
StatusCode::StorageUnavailable
}
Error::ReadSystemCatalog { source, .. } => source.status_code(),
Error::SystemCatalogTypeMismatch { source, .. } => source.status_code(),
Error::RegisterTable { .. } => StatusCode::Internal,
Error::TableExists { .. } => StatusCode::TableAlreadyExists,
Error::OpenSystemCatalog { source, .. }
| Error::CreateSystemCatalog { source, .. }
| Error::InsertTableRecord { source, .. }
| Error::OpenTable { source, .. }
| Error::CreateTable { source, .. } => source.status_code(),
}
}
@@ -155,7 +175,7 @@ mod tests {
);
assert_eq!(
StatusCode::Unexpected,
StatusCode::StorageUnavailable,
Error::OpenSystemCatalog {
source: table::error::Error::new(MockError::new(StatusCode::StorageUnavailable))
}
@@ -163,7 +183,7 @@ mod tests {
);
assert_eq!(
StatusCode::Unexpected,
StatusCode::StorageUnavailable,
Error::CreateSystemCatalog {
source: table::error::Error::new(MockError::new(StatusCode::StorageUnavailable))
}
@@ -180,7 +200,7 @@ mod tests {
);
assert_eq!(
StatusCode::StorageUnavailable,
StatusCode::Internal,
Error::SystemCatalogTypeMismatch {
data_type: DataType::Boolean,
source: datatypes::error::Error::UnsupportedArrowType {


@@ -4,19 +4,20 @@ use std::any::Any;
use std::sync::Arc;
use table::metadata::TableId;
use table::requests::CreateTableRequest;
use table::TableRef;
pub use crate::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
pub use crate::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, MIN_USER_TABLE_ID};
pub use crate::manager::LocalCatalogManager;
pub use crate::schema::{SchemaProvider, SchemaProviderRef};
mod consts;
pub mod consts;
pub mod error;
mod manager;
pub mod memory;
pub mod schema;
mod system;
mod tables;
pub mod tables;
/// Represent a list of named catalogs
pub trait CatalogList: Sync + Send {
@@ -70,10 +71,34 @@ pub trait CatalogManager: CatalogList {
/// Registers a table in the given catalog/schema to the catalog manager,
/// returns table registered.
async fn register_table(&self, request: RegisterTableRequest) -> error::Result<usize>;
/// Register a system table, should be called before starting the manager.
async fn register_system_table(&self, request: RegisterSystemTableRequest)
-> error::Result<()>;
/// Returns the table by catalog, schema and table name.
fn table(
&self,
catalog: Option<&str>,
schema: Option<&str>,
table_name: &str,
) -> error::Result<Option<TableRef>>;
}
pub type CatalogManagerRef = Arc<dyn CatalogManager>;
/// Hook called after system table opening.
pub type OpenSystemTableHook = Arc<dyn Fn(TableRef) -> error::Result<()> + Send + Sync>;
/// Register system table request:
/// - When system table is already created and registered, the hook will be called
/// with table ref after opening the system table
/// - When the system table does not exist, create and register the table via create_table_request and call open_hook with the created table.
pub struct RegisterSystemTableRequest {
pub create_table_request: CreateTableRequest,
pub open_hook: Option<OpenSystemTableHook>,
}
pub struct RegisterTableRequest {
pub catalog: Option<String>,
pub schema: Option<String>,


@@ -13,12 +13,16 @@ use table::engine::{EngineContext, TableEngineRef};
use table::metadata::TableId;
use table::requests::OpenTableRequest;
use table::table::numbers::NumbersTable;
use table::TableRef;
use super::error::Result;
use crate::consts::{INFORMATION_SCHEMA_NAME, SYSTEM_CATALOG_NAME, SYSTEM_CATALOG_TABLE_NAME};
use crate::consts::{
INFORMATION_SCHEMA_NAME, MIN_USER_TABLE_ID, SYSTEM_CATALOG_NAME, SYSTEM_CATALOG_TABLE_NAME,
};
use crate::error::{
CatalogNotFoundSnafu, OpenTableSnafu, ReadSystemCatalogSnafu, SchemaNotFoundSnafu,
SystemCatalogSnafu, SystemCatalogTypeMismatchSnafu, TableExistsSnafu, TableNotFoundSnafu,
CatalogNotFoundSnafu, CreateTableSnafu, IllegalManagerStateSnafu, OpenTableSnafu,
ReadSystemCatalogSnafu, SchemaNotFoundSnafu, SystemCatalogSnafu,
SystemCatalogTypeMismatchSnafu, TableExistsSnafu, TableNotFoundSnafu,
};
use crate::memory::{MemoryCatalogList, MemoryCatalogProvider, MemorySchemaProvider};
use crate::system::{
@@ -28,7 +32,8 @@ use crate::system::{
use crate::tables::SystemCatalog;
use crate::{
format_full_table_name, CatalogList, CatalogManager, CatalogProvider, CatalogProviderRef,
RegisterTableRequest, SchemaProvider, DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME,
RegisterSystemTableRequest, RegisterTableRequest, SchemaProvider, DEFAULT_CATALOG_NAME,
DEFAULT_SCHEMA_NAME,
};
/// A `CatalogManager` consists of a system catalog and a bunch of user catalogs.
@@ -37,7 +42,8 @@ pub struct LocalCatalogManager {
catalogs: Arc<MemoryCatalogList>,
engine: TableEngineRef,
next_table_id: AtomicU32,
lock: Mutex<()>,
init_lock: Mutex<bool>,
system_table_requests: Mutex<Vec<RegisterSystemTableRequest>>,
}
impl LocalCatalogManager {
@@ -54,8 +60,9 @@ impl LocalCatalogManager {
system: system_catalog,
catalogs: memory_catalog_list,
engine,
next_table_id: AtomicU32::new(0),
lock: Mutex::new(()),
next_table_id: AtomicU32::new(MIN_USER_TABLE_ID),
init_lock: Mutex::new(false),
system_table_requests: Mutex::new(Vec::default()),
})
}
@@ -78,7 +85,54 @@ impl LocalCatalogManager {
max_table_id
);
self.next_table_id
.store(max_table_id + 1, Ordering::Relaxed);
.store((max_table_id + 1).max(MIN_USER_TABLE_ID), Ordering::Relaxed);
*self.init_lock.lock().await = true;
// Processing system table hooks
let mut sys_table_requests = self.system_table_requests.lock().await;
for req in sys_table_requests.drain(..) {
let catalog_name = &req.create_table_request.catalog_name;
let schema_name = &req.create_table_request.schema_name;
let table_name = &req.create_table_request.table_name;
let table_id = req.create_table_request.id;
let table = if let Some(table) =
self.table(catalog_name.as_deref(), schema_name.as_deref(), table_name)?
{
table
} else {
let table = self
.engine
.create_table(&EngineContext::default(), req.create_table_request.clone())
.await
.with_context(|_| CreateTableSnafu {
table_info: format!(
"{}.{}.{}, id: {}",
catalog_name.as_deref().unwrap_or(DEFAULT_CATALOG_NAME),
schema_name.as_deref().unwrap_or(DEFAULT_SCHEMA_NAME),
table_name,
table_id,
),
})?;
self.register_table(RegisterTableRequest {
catalog: catalog_name.clone(),
schema: schema_name.clone(),
table_name: table_name.clone(),
table_id,
table: table.clone(),
})
.await?;
info!("Created and registered system table: {}", table_name);
table
};
if let Some(hook) = req.open_hook {
(hook)(table)?;
}
}
Ok(())
}
@@ -265,7 +319,15 @@ impl CatalogManager for LocalCatalogManager {
}
async fn register_table(&self, request: RegisterTableRequest) -> Result<usize> {
let _lock = self.lock.lock().await;
let started = self.init_lock.lock().await;
ensure!(
*started,
IllegalManagerStateSnafu {
msg: "Catalog manager not started",
}
);
let catalog_name = request
.catalog
.unwrap_or_else(|| DEFAULT_CATALOG_NAME.to_string());
@@ -304,4 +366,39 @@ impl CatalogManager for LocalCatalogManager {
schema.register_table(request.table_name, request.table)?;
Ok(1)
}
async fn register_system_table(&self, request: RegisterSystemTableRequest) -> Result<()> {
ensure!(
!*self.init_lock.lock().await,
IllegalManagerStateSnafu {
msg: "Catalog manager already started",
}
);
let mut sys_table_requests = self.system_table_requests.lock().await;
sys_table_requests.push(request);
Ok(())
}
fn table(
&self,
catalog: Option<&str>,
schema: Option<&str>,
table_name: &str,
) -> Result<Option<TableRef>> {
let catalog_name = catalog.unwrap_or(DEFAULT_CATALOG_NAME);
let schema_name = schema.unwrap_or(DEFAULT_SCHEMA_NAME);
let catalog = self
.catalogs
.catalog(catalog_name)
.context(CatalogNotFoundSnafu { catalog_name })?;
let schema = catalog
.schema(schema_name)
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{}.{}", catalog_name, schema_name),
})?;
Ok(schema.table(table_name))
}
}

View File

@@ -5,10 +5,11 @@ use std::sync::Arc;
use common_query::logical_plan::Expr;
use common_recordbatch::SendableRecordBatchStream;
use common_telemetry::debug;
use common_time::timestamp::Timestamp;
use common_time::util;
use datatypes::prelude::{ConcreteDataType, ScalarVector};
use datatypes::schema::{ColumnSchema, Schema, SchemaBuilder, SchemaRef};
use datatypes::vectors::{BinaryVector, Int64Vector, UInt8Vector};
use datatypes::vectors::{BinaryVector, TimestampVector, UInt8Vector};
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt, ResultExt};
use table::engine::{EngineContext, TableEngineRef};
@@ -88,6 +89,7 @@ impl SystemCatalogTable {
schema: schema.clone(),
primary_key_indices: vec![ENTRY_TYPE_INDEX, KEY_INDEX, TIMESTAMP_INDEX],
create_if_not_exists: true,
table_options: HashMap::new(),
};
let table = engine
@@ -128,7 +130,7 @@ fn build_system_catalog_schema() -> Schema {
),
ColumnSchema::new(
"timestamp".to_string(),
ConcreteDataType::int64_datatype(),
ConcreteDataType::timestamp_millis_datatype(),
false,
),
ColumnSchema::new(
@@ -138,18 +140,19 @@ fn build_system_catalog_schema() -> Schema {
),
ColumnSchema::new(
"gmt_created".to_string(),
ConcreteDataType::int64_datatype(),
ConcreteDataType::timestamp_millis_datatype(),
false,
),
ColumnSchema::new(
"gmt_modified".to_string(),
ConcreteDataType::int64_datatype(),
ConcreteDataType::timestamp_millis_datatype(),
false,
),
];
// The schema of this table must be valid.
SchemaBuilder::from(cols)
SchemaBuilder::try_from(cols)
.unwrap()
.timestamp_index(2)
.build()
.unwrap()
@@ -170,7 +173,7 @@ pub fn build_table_insert_request(full_table_name: String, table_id: TableId) ->
// The timestamp in the key part is intentionally left as 0
columns_values.insert(
"timestamp".to_string(),
Arc::new(Int64Vector::from_slice(&[0])) as _,
Arc::new(TimestampVector::from_slice(&[Timestamp::from_millis(0)])) as _,
);
columns_values.insert(
@@ -184,12 +187,16 @@ pub fn build_table_insert_request(full_table_name: String, table_id: TableId) ->
columns_values.insert(
"gmt_created".to_string(),
Arc::new(Int64Vector::from_slice(&[util::current_time_millis()])) as _,
Arc::new(TimestampVector::from_slice(&[Timestamp::from_millis(
util::current_time_millis(),
)])) as _,
);
columns_values.insert(
"gmt_modified".to_string(),
Arc::new(Int64Vector::from_slice(&[util::current_time_millis()])) as _,
Arc::new(TimestampVector::from_slice(&[Timestamp::from_millis(
util::current_time_millis(),
)])) as _,
);
InsertRequest {

View File

@@ -299,16 +299,16 @@ mod tests {
let batch = t.unwrap().df_recordbatch;
assert_eq!(1, batch.num_rows());
assert_eq!(4, batch.num_columns());
assert_eq!(&DataType::LargeUtf8, batch.column(0).data_type());
assert_eq!(&DataType::LargeUtf8, batch.column(1).data_type());
assert_eq!(&DataType::LargeUtf8, batch.column(2).data_type());
assert_eq!(&DataType::LargeUtf8, batch.column(3).data_type());
assert_eq!(&DataType::Utf8, batch.column(0).data_type());
assert_eq!(&DataType::Utf8, batch.column(1).data_type());
assert_eq!(&DataType::Utf8, batch.column(2).data_type());
assert_eq!(&DataType::Utf8, batch.column(3).data_type());
assert_eq!(
"test_catalog",
batch
.column(0)
.as_any()
.downcast_ref::<Utf8Array<i64>>()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
);
@@ -318,7 +318,7 @@ mod tests {
batch
.column(1)
.as_any()
.downcast_ref::<Utf8Array<i64>>()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
);
@@ -328,7 +328,7 @@ mod tests {
batch
.column(2)
.as_any()
.downcast_ref::<Utf8Array<i64>>()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
);
@@ -338,7 +338,7 @@ mod tests {
batch
.column(3)
.as_any()
.downcast_ref::<Utf8Array<i64>>()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
);

View File

@@ -2,18 +2,25 @@
name = "client"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
api = { path = "../api" }
async-stream = "0.3"
catalog = { path = "../catalog" }
common-base = { path = "../common/base" }
common-error = { path = "../common/error" }
common-grpc = { path = "../common/grpc" }
common-query = { path = "../common/query" }
common-recordbatch = { path = "../common/recordbatch" }
common-time = { path = "../common/time" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = ["simd"] }
datatypes = { path = "../datatypes" }
snafu = { version = "0.7", features = ["backtraces"] }
tonic = "0.8"
[dev-dependencies]
datanode = { path = "../datanode" }
tokio = { version = "1.0", features = ["full"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

View File

@@ -13,7 +13,13 @@ async fn run() {
let client = Client::connect("http://127.0.0.1:3001").await.unwrap();
let db = Database::new("greptime", client);
db.insert("demo", insert_batches()).await.unwrap();
let expr = InsertExpr {
table_name: "demo".to_string(),
expr: Some(insert_expr::Expr::Values(insert_expr::Values {
values: insert_batches(),
})),
};
db.insert(expr).await.unwrap();
}
fn insert_batches() -> Vec<Vec<u8>> {
@@ -37,6 +43,7 @@ fn insert_batches() -> Vec<Vec<u8>> {
semantic_type: SEMANTIC_TAG,
values: Some(host_vals),
null_mask: vec![0],
..Default::default()
};
let cpu_vals = column::Values {
@@ -48,6 +55,7 @@ fn insert_batches() -> Vec<Vec<u8>> {
semantic_type: SEMANTIC_FEILD,
values: Some(cpu_vals),
null_mask: vec![2],
..Default::default()
};
let mem_vals = column::Values {
@@ -59,6 +67,7 @@ fn insert_batches() -> Vec<Vec<u8>> {
semantic_type: SEMANTIC_FEILD,
values: Some(mem_vals),
null_mask: vec![4],
..Default::default()
};
let ts_vals = column::Values {
@@ -70,6 +79,7 @@ fn insert_batches() -> Vec<Vec<u8>> {
semantic_type: SEMANTIC_TS,
values: Some(ts_vals),
null_mask: vec![0],
..Default::default()
};
let insert_batch = InsertBatch {

View File

@@ -1,14 +1,106 @@
use api::v1::*;
use common_error::prelude::StatusCode;
use common_query::Output;
use snafu::prelude::*;
use crate::database::PROTOCOL_VERSION;
use crate::error;
use crate::Client;
use crate::Result;
#[derive(Clone, Debug)]
pub struct Admin {
name: String,
client: Client,
}
impl Admin {
pub fn new(client: Client) -> Self {
Self { client }
pub fn new(name: impl Into<String>, client: Client) -> Self {
Self {
name: name.into(),
client,
}
}
// TODO(jiachun): admin api
pub async fn start(&mut self, url: impl Into<String>) -> Result<()> {
self.client.start(url).await
}
pub async fn create(&self, expr: CreateExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::Create(expr)),
};
self.do_request(expr).await
}
pub async fn do_request(&self, expr: AdminExpr) -> Result<AdminResult> {
// `remove(0)` is safe because of `do_requests`'s invariants.
Ok(self.do_requests(vec![expr]).await?.remove(0))
}
pub async fn alter(&self, expr: AlterExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::Alter(expr)),
};
Ok(self.do_requests(vec![expr]).await?.remove(0))
}
/// Invariants: the lengths of input vec (`Vec<AdminExpr>`) and output vec (`Vec<AdminResult>`) are equal.
async fn do_requests(&self, exprs: Vec<AdminExpr>) -> Result<Vec<AdminResult>> {
let expr_count = exprs.len();
let req = AdminRequest {
name: self.name.clone(),
exprs,
};
let resp = self.client.admin(req).await?;
let results = resp.results;
ensure!(
results.len() == expr_count,
error::MissingResultSnafu {
name: "admin_results",
expected: expr_count,
actual: results.len(),
}
);
Ok(results)
}
}
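As a rough usage sketch of the new `Admin` handle (assuming the `client` crate re-exports `Client` and `Result` at its root, as the internal `use crate::{Client, Result}` imports suggest; the URL, instance name, and the `CreateExpr` are placeholders):

use api::v1::CreateExpr;
use client::admin::{admin_result_to_output, Admin};
use client::Client;

async fn create_table_via_admin(expr: CreateExpr) -> client::Result<()> {
    // Connect eagerly, then wrap the connection in an Admin handle.
    let client = Client::connect("http://127.0.0.1:3001").await?;
    let admin = Admin::new("greptime", client);

    // Issue the create request and convert the AdminResult into an Output.
    let result = admin.create(expr).await?;
    let _output = admin_result_to_output(result)?;
    Ok(())
}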
pub fn admin_result_to_output(admin_result: AdminResult) -> Result<Output> {
let header = admin_result.header.context(error::MissingHeaderSnafu)?;
if !StatusCode::is_success(header.code) {
return error::DatanodeSnafu {
code: header.code,
msg: header.err_msg,
}
.fail();
}
let result = admin_result.result.context(error::MissingResultSnafu {
name: "result".to_string(),
expected: 1_usize,
actual: 0_usize,
})?;
let output = match result {
admin_result::Result::Mutate(mutate) => {
if mutate.failure != 0 {
return error::MutateFailureSnafu {
failure: mutate.failure,
}
.fail();
}
Output::AffectedRows(mutate.success as usize)
}
};
Ok(output)
}

View File

@@ -5,18 +5,43 @@ use tonic::transport::Channel;
use crate::error;
use crate::Result;
#[derive(Clone, Debug)]
#[derive(Clone, Debug, Default)]
pub struct Client {
client: GreptimeClient<Channel>,
client: Option<GreptimeClient<Channel>>,
}
impl Client {
pub async fn start(&mut self, url: impl Into<String>) -> Result<()> {
match self.client.as_ref() {
None => {
let url = url.into();
let client = GreptimeClient::connect(url.clone())
.await
.context(error::ConnectFailedSnafu { url })?;
self.client = Some(client);
Ok(())
}
Some(_) => error::IllegalGrpcClientStateSnafu {
err_msg: "already started",
}
.fail(),
}
}
pub fn with_client(client: GreptimeClient<Channel>) -> Self {
Self {
client: Some(client),
}
}
pub async fn connect(url: impl Into<String>) -> Result<Self> {
let url = url.into();
let client = GreptimeClient::connect(url.clone())
.await
.context(error::ConnectFailedSnafu { url })?;
Ok(Self { client })
Ok(Self {
client: Some(client),
})
}
pub async fn admin(&self, req: AdminRequest) -> Result<AdminResponse> {
@@ -48,12 +73,18 @@ impl Client {
}
pub async fn batch(&self, req: BatchRequest) -> Result<BatchResponse> {
let res = self
.client
.clone()
.batch(req)
.await
.context(error::TonicStatusSnafu)?;
Ok(res.into_inner())
if let Some(client) = self.client.as_ref() {
let res = client
.clone()
.batch(req)
.await
.context(error::TonicStatusSnafu)?;
Ok(res.into_inner())
} else {
error::IllegalGrpcClientStateSnafu {
err_msg: "not started",
}
.fail()
}
}
}
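A small sketch of the two initialization paths the now-optional inner client allows (the URL is a placeholder, and the crate-root re-export of `Client`/`Result` is assumed as above):

use client::Client;

async fn init_clients() -> client::Result<(Client, Client)> {
    // Path 1: connect eagerly.
    let eager = Client::connect("http://127.0.0.1:3001").await?;

    // Path 2: build a default (unconnected) client and start it later.
    // Calling `start` twice, or issuing requests before it, surfaces an
    // IllegalGrpcClientState error instead of panicking.
    let mut lazy = Client::default();
    lazy.start("http://127.0.0.1:3001").await?;

    Ok((eager, lazy))
}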

View File

@@ -1,28 +1,37 @@
use std::ops::Deref;
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::codec::SelectResult as GrpcSelectResult;
use api::v1::{
object_expr, object_result, select_expr, DatabaseRequest, ExprHeader, InsertExpr,
MutateResult as GrpcMutateResult, ObjectExpr, ObjectResult as GrpcObjectResult, PhysicalPlan,
SelectExpr,
column::Values, object_expr, object_result, select_expr, Column, ColumnDataType,
DatabaseRequest, ExprHeader, InsertExpr, MutateResult as GrpcMutateResult, ObjectExpr,
ObjectResult as GrpcObjectResult, PhysicalPlan, SelectExpr,
};
use common_base::BitVec;
use common_error::status_code::StatusCode;
use common_grpc::AsExcutionPlan;
use common_grpc::DefaultAsPlanImpl;
use common_query::Output;
use common_recordbatch::{RecordBatch, RecordBatches};
use common_time::date::Date;
use common_time::datetime::DateTime;
use common_time::timestamp::Timestamp;
use datafusion::physical_plan::ExecutionPlan;
use datatypes::prelude::*;
use datatypes::schema::{ColumnSchema, Schema};
use snafu::{ensure, OptionExt, ResultExt};
use crate::error::{self, MissingResultSnafu};
use crate::error;
use crate::{
error::DatanodeSnafu, error::DecodeSelectSnafu, error::EncodePhysicalSnafu,
error::MissingHeaderSnafu, Client, Result,
error::{
ConvertSchemaSnafu, DatanodeSnafu, DecodeSelectSnafu, EncodePhysicalSnafu,
MissingFieldSnafu,
},
Client, Result,
};
pub const PROTOCOL_VERSION: u32 = 1;
pub type Bytes = Vec<u8>;
#[derive(Clone, Debug)]
pub struct Database {
name: String,
@@ -37,26 +46,23 @@ impl Database {
}
}
pub async fn start(&mut self, url: impl Into<String>) -> Result<()> {
self.client.start(url).await
}
pub fn name(&self) -> &str {
&self.name
}
pub async fn insert(&self, table: impl Into<String>, values: Vec<Bytes>) -> Result<()> {
pub async fn insert(&self, insert: InsertExpr) -> Result<ObjectResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let insert = InsertExpr {
table_name: table.into(),
values,
};
let expr = ObjectExpr {
header: Some(header),
expr: Some(object_expr::Expr::Insert(insert)),
};
self.object(expr).await?;
Ok(())
self.object(expr).await?.try_into()
}
pub async fn select(&self, expr: Select) -> Result<ObjectResult> {
@@ -97,41 +103,12 @@ impl Database {
};
let obj_result = self.object(expr).await?;
let header = obj_result.header.context(MissingHeaderSnafu)?;
if !StatusCode::is_success(header.code) {
return DatanodeSnafu {
code: header.code,
msg: header.err_msg,
}
.fail();
}
let obj_result = obj_result.result.context(MissingResultSnafu {
name: "select_result".to_string(),
expected: 1_usize,
actual: 0_usize,
})?;
let result = match obj_result {
object_result::Result::Select(select) => {
let result = select
.raw_data
.deref()
.try_into()
.context(DecodeSelectSnafu)?;
ObjectResult::Select(result)
}
object_result::Result::Mutate(mutate) => ObjectResult::Mutate(mutate),
};
Ok(result)
obj_result.try_into()
}
// TODO(jiachun) update/delete
async fn object(&self, expr: ObjectExpr) -> Result<GrpcObjectResult> {
pub async fn object(&self, expr: ObjectExpr) -> Result<GrpcObjectResult> {
let res = self.objects(vec![expr]).await?.pop().unwrap();
Ok(res)
}
@@ -165,6 +142,273 @@ pub enum ObjectResult {
Mutate(GrpcMutateResult),
}
impl TryFrom<api::v1::ObjectResult> for ObjectResult {
type Error = error::Error;
fn try_from(object_result: api::v1::ObjectResult) -> std::result::Result<Self, Self::Error> {
let header = object_result.header.context(error::MissingHeaderSnafu)?;
if !StatusCode::is_success(header.code) {
return DatanodeSnafu {
code: header.code,
msg: header.err_msg,
}
.fail();
}
let obj_result = object_result.result.context(error::MissingResultSnafu {
name: "result".to_string(),
expected: 1_usize,
actual: 0_usize,
})?;
Ok(match obj_result {
object_result::Result::Select(select) => {
let result = (*select.raw_data).try_into().context(DecodeSelectSnafu)?;
ObjectResult::Select(result)
}
object_result::Result::Mutate(mutate) => ObjectResult::Mutate(mutate),
})
}
}
pub enum Select {
Sql(String),
}
impl TryFrom<ObjectResult> for Output {
type Error = error::Error;
fn try_from(value: ObjectResult) -> Result<Self> {
let output = match value {
ObjectResult::Select(select) => {
let vectors = select
.columns
.iter()
.map(|column| column_to_vector(column, select.row_count))
.collect::<Result<Vec<VectorRef>>>()?;
let column_schemas = select
.columns
.iter()
.zip(vectors.iter())
.map(|(column, vector)| {
let datatype = vector.data_type();
// whether the column is nullable does not affect the output
ColumnSchema::new(&column.column_name, datatype, true)
})
.collect::<Vec<ColumnSchema>>();
let schema = Arc::new(Schema::try_new(column_schemas).context(ConvertSchemaSnafu)?);
let recordbatches = if vectors.is_empty() {
RecordBatches::try_new(schema, vec![])
} else {
RecordBatch::new(schema, vectors)
.and_then(|batch| RecordBatches::try_new(batch.schema.clone(), vec![batch]))
}
.context(error::CreateRecordBatchesSnafu)?;
Output::RecordBatches(recordbatches)
}
ObjectResult::Mutate(mutate) => {
if mutate.failure != 0 {
return error::MutateFailureSnafu {
failure: mutate.failure,
}
.fail();
}
Output::AffectedRows(mutate.success as usize)
}
};
Ok(output)
}
}
fn column_to_vector(column: &Column, rows: u32) -> Result<VectorRef> {
let wrapper = ColumnDataTypeWrapper::try_new(
column
.datatype
.context(MissingFieldSnafu { field: "datatype" })?,
)
.context(error::ColumnDataTypeSnafu)?;
let column_datatype = wrapper.datatype();
let rows = rows as usize;
let mut vector = VectorBuilder::with_capacity(wrapper.into(), rows);
if let Some(values) = &column.values {
let values = collect_column_values(column_datatype, values);
let mut values_iter = values.into_iter();
let null_mask = BitVec::from_slice(&column.null_mask);
let mut nulls_iter = null_mask.iter().by_vals().fuse();
for i in 0..rows {
if let Some(true) = nulls_iter.next() {
vector.push_null();
} else {
let value_ref = values_iter.next().context(error::InvalidColumnProtoSnafu {
err_msg: format!(
"value not found at position {} of column {}",
i, &column.column_name
),
})?;
vector
.try_push_ref(value_ref)
.context(error::CreateVectorSnafu)?;
}
}
} else {
(0..rows).for_each(|_| vector.push_null());
}
Ok(vector.finish())
}
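A standalone illustration of the null-mask decoding performed above, using only std types (the helper is hypothetical, limited to a single mask byte for brevity, and the bit order is assumed least-significant-bit first, matching the `BitVec` default):

fn decode_with_null_mask(values: &[i32], null_mask_byte: u8, rows: usize) -> Vec<Option<i32>> {
    // Null positions consume no entry from `values`; non-null positions take the next one.
    let mut values = values.iter().copied();
    (0..rows.min(8))
        .map(|i| {
            if (null_mask_byte >> i) & 1 == 1 {
                None
            } else {
                values.next()
            }
        })
        .collect()
}

// decode_with_null_mask(&[1, 7], 0b0000_0010, 3) yields [Some(1), None, Some(7)].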
fn collect_column_values(column_datatype: ColumnDataType, values: &Values) -> Vec<ValueRef> {
macro_rules! collect_values {
($value: expr, $mapper: expr) => {
$value.iter().map($mapper).collect::<Vec<ValueRef>>()
};
}
match column_datatype {
ColumnDataType::Boolean => collect_values!(values.bool_values, |v| ValueRef::from(*v)),
ColumnDataType::Int8 => collect_values!(values.i8_values, |v| ValueRef::from(*v as i8)),
ColumnDataType::Int16 => {
collect_values!(values.i16_values, |v| ValueRef::from(*v as i16))
}
ColumnDataType::Int32 => {
collect_values!(values.i32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Int64 => {
collect_values!(values.i64_values, |v| ValueRef::from(*v as i64))
}
ColumnDataType::Uint8 => {
collect_values!(values.u8_values, |v| ValueRef::from(*v as u8))
}
ColumnDataType::Uint16 => {
collect_values!(values.u16_values, |v| ValueRef::from(*v as u16))
}
ColumnDataType::Uint32 => {
collect_values!(values.u32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Uint64 => {
collect_values!(values.u64_values, |v| ValueRef::from(*v as u64))
}
ColumnDataType::Float32 => collect_values!(values.f32_values, |v| ValueRef::from(*v)),
ColumnDataType::Float64 => collect_values!(values.f64_values, |v| ValueRef::from(*v)),
ColumnDataType::Binary => {
collect_values!(values.binary_values, |v| ValueRef::from(v.as_slice()))
}
ColumnDataType::String => {
collect_values!(values.string_values, |v| ValueRef::from(v.as_str()))
}
ColumnDataType::Date => {
collect_values!(values.date_values, |v| ValueRef::Date(Date::new(*v)))
}
ColumnDataType::Datetime => {
collect_values!(values.datetime_values, |v| ValueRef::DateTime(
DateTime::new(*v)
))
}
ColumnDataType::Timestamp => {
collect_values!(values.ts_millis_values, |v| ValueRef::Timestamp(
Timestamp::from_millis(*v)
))
}
}
}
#[cfg(test)]
mod tests {
use datanode::server::grpc::select::{null_mask, values};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
Int16Vector, Int32Vector, Int64Vector, Int8Vector, StringVector, UInt16Vector,
UInt32Vector, UInt64Vector, UInt8Vector,
};
use super::*;
#[test]
fn test_column_to_vector() {
let mut column = create_test_column(Arc::new(BooleanVector::from(vec![true])));
column.datatype = Some(-100);
let result = column_to_vector(&column, 1);
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
"Column datatype error, source: Unknown proto column datatype: -100"
);
macro_rules! test_with_vector {
($vector: expr) => {
let vector = Arc::new($vector);
let column = create_test_column(vector.clone());
let result = column_to_vector(&column, vector.len() as u32).unwrap();
assert_eq!(result, vector as VectorRef);
};
}
test_with_vector!(BooleanVector::from(vec![Some(true), None, Some(false)]));
test_with_vector!(Int8Vector::from(vec![Some(i8::MIN), None, Some(i8::MAX)]));
test_with_vector!(Int16Vector::from(vec![
Some(i16::MIN),
None,
Some(i16::MAX)
]));
test_with_vector!(Int32Vector::from(vec![
Some(i32::MIN),
None,
Some(i32::MAX)
]));
test_with_vector!(Int64Vector::from(vec![
Some(i64::MIN),
None,
Some(i64::MAX)
]));
test_with_vector!(UInt8Vector::from(vec![Some(u8::MIN), None, Some(u8::MAX)]));
test_with_vector!(UInt16Vector::from(vec![
Some(u16::MIN),
None,
Some(u16::MAX)
]));
test_with_vector!(UInt32Vector::from(vec![
Some(u32::MIN),
None,
Some(u32::MAX)
]));
test_with_vector!(UInt64Vector::from(vec![
Some(u64::MIN),
None,
Some(u64::MAX)
]));
test_with_vector!(Float32Vector::from(vec![
Some(f32::MIN),
None,
Some(f32::MAX)
]));
test_with_vector!(Float64Vector::from(vec![
Some(f64::MIN),
None,
Some(f64::MAX)
]));
test_with_vector!(BinaryVector::from(vec![
Some(b"".to_vec()),
None,
Some(b"hello".to_vec())
]));
test_with_vector!(StringVector::from(vec![Some(""), None, Some("foo"),]));
test_with_vector!(DateVector::from(vec![Some(1), None, Some(3)]));
test_with_vector!(DateTimeVector::from(vec![Some(4), None, Some(6)]));
}
fn create_test_column(vector: VectorRef) -> Column {
let wrapper: ColumnDataTypeWrapper = vector.data_type().try_into().unwrap();
let array = vector.to_arrow_array();
Column {
column_name: "test".to_string(),
semantic_type: 1,
values: Some(values(&[array.clone()]).unwrap()),
null_mask: null_mask(&vec![array], vector.len()),
datatype: Some(wrapper.datatype() as i32),
}
}
}

View File

@@ -1,3 +1,4 @@
use std::any::Any;
use std::sync::Arc;
use api::serde::DecodeError;
@@ -42,6 +43,79 @@ pub enum Error {
#[snafu(backtrace)]
source: common_grpc::Error,
},
#[snafu(display("Mutate result has failure {}", failure))]
MutateFailure { failure: u32, backtrace: Backtrace },
#[snafu(display("Invalid column proto: {}", err_msg))]
InvalidColumnProto {
err_msg: String,
backtrace: Backtrace,
},
#[snafu(display("Column datatype error, source: {}", source))]
ColumnDataType {
#[snafu(backtrace)]
source: api::error::Error,
},
#[snafu(display("Failed to create vector, source: {}", source))]
CreateVector {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
#[snafu(display("Failed to create RecordBatches, source: {}", source))]
CreateRecordBatches {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
#[snafu(display("Illegal GRPC client state: {}", err_msg))]
IllegalGrpcClientState {
err_msg: String,
backtrace: Backtrace,
},
#[snafu(display("Missing required field in protobuf, field: {}", field))]
MissingField { field: String, backtrace: Backtrace },
#[snafu(display("Failed to convert schema, source: {}", source))]
ConvertSchema {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
}
pub type Result<T> = std::result::Result<T, Error>;
impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::ConnectFailed { .. }
| Error::MissingResult { .. }
| Error::MissingHeader { .. }
| Error::TonicStatus { .. }
| Error::DecodeSelect { .. }
| Error::Datanode { .. }
| Error::EncodePhysical { .. }
| Error::MutateFailure { .. }
| Error::InvalidColumnProto { .. }
| Error::ColumnDataType { .. }
| Error::MissingField { .. } => StatusCode::Internal,
Error::ConvertSchema { source } | Error::CreateVector { source } => {
source.status_code()
}
Error::CreateRecordBatches { source } => source.status_code(),
Error::IllegalGrpcClientState { .. } => StatusCode::Unexpected,
}
}
fn backtrace_opt(&self) -> Option<&Backtrace> {
ErrorCompat::backtrace(self)
}
fn as_any(&self) -> &dyn Any {
self
}
}

View File

@@ -1,3 +1,4 @@
pub mod admin;
mod client;
mod database;
mod error;

View File

@@ -10,8 +10,10 @@ path = "src/bin/greptime.rs"
[dependencies]
clap = { version = "3.1", features = ["derive"] }
common-error = { path = "../common/error" }
common-telemetry = { path = "../common/telemetry", features = ["deadlock_detection"]}
common-telemetry = { path = "../common/telemetry", features = ["deadlock_detection"] }
datanode = { path = "../datanode" }
frontend = { path = "../frontend" }
futures = "0.3"
snafu = { version = "0.7", features = ["backtraces"] }
tokio = { version = "1.18", features = ["full"] }
toml = "0.5"

View File

@@ -3,6 +3,7 @@ use std::fmt;
use clap::Parser;
use cmd::datanode;
use cmd::error::Result;
use cmd::frontend;
use common_telemetry::{self, logging::error, logging::info};
#[derive(Parser)]
@@ -26,12 +27,15 @@ impl Command {
enum SubCommand {
#[clap(name = "datanode")]
Datanode(datanode::Command),
#[clap(name = "frontend")]
Frontend(frontend::Command),
}
impl SubCommand {
async fn run(self) -> Result<()> {
match self {
SubCommand::Datanode(cmd) => cmd.run().await,
SubCommand::Frontend(cmd) => cmd.run().await,
}
}
}
@@ -40,6 +44,7 @@ impl fmt::Display for SubCommand {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
SubCommand::Datanode(..) => write!(f, "greptime-datanode"),
SubCommand::Frontend(..) => write!(f, "greptime-frontend"),
}
}
}

View File

@@ -39,6 +39,8 @@ struct StartCommand {
rpc_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
#[clap(short, long)]
config_file: Option<String>,
}
@@ -78,6 +80,9 @@ impl TryFrom<StartCommand> for DatanodeOptions {
if let Some(addr) = cmd.mysql_addr {
opts.mysql_addr = addr;
}
if let Some(addr) = cmd.postgres_addr {
opts.postgres_addr = addr;
}
Ok(opts)
}
@@ -95,6 +100,7 @@ mod tests {
http_addr: None,
rpc_addr: None,
mysql_addr: None,
postgres_addr: None,
config_file: Some(format!(
"{}/../../config/datanode.example.toml",
std::env::current_dir().unwrap().as_path().to_str().unwrap()
@@ -106,6 +112,10 @@ mod tests {
assert_eq!("/tmp/greptimedb/wal".to_string(), options.wal_dir);
assert_eq!("0.0.0.0:3306".to_string(), options.mysql_addr);
assert_eq!(4, options.mysql_runtime_size);
assert_eq!("0.0.0.0:5432".to_string(), options.postgres_addr);
assert_eq!(4, options.postgres_runtime_size);
match options.storage {
ObjectStoreConfig::File { data_dir } => {
assert_eq!("/tmp/greptimedb/data/".to_string(), data_dir)

View File

@@ -11,6 +11,12 @@ pub enum Error {
source: datanode::error::Error,
},
#[snafu(display("Failed to start frontend, source: {}", source))]
StartFrontend {
#[snafu(backtrace)]
source: frontend::error::Error,
},
#[snafu(display("Failed to read config file: {}, source: {}", path, source))]
ReadConfig {
source: std::io::Error,
@@ -27,6 +33,7 @@ impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::StartDatanode { source } => source.status_code(),
Error::StartFrontend { source } => source.status_code(),
Error::ReadConfig { .. } | Error::ParseConfig { .. } => StatusCode::InvalidArguments,
}
}

src/cmd/src/frontend.rs (new file, 149 lines)
View File

@@ -0,0 +1,149 @@
use clap::Parser;
use frontend::frontend::{Frontend, FrontendOptions};
use frontend::influxdb::InfluxdbOptions;
use frontend::mysql::MysqlOptions;
use frontend::opentsdb::OpentsdbOptions;
use frontend::postgres::PostgresOptions;
use snafu::ResultExt;
use crate::error::{self, Result};
use crate::toml_loader;
#[derive(Parser)]
pub struct Command {
#[clap(subcommand)]
subcmd: SubCommand,
}
impl Command {
pub async fn run(self) -> Result<()> {
self.subcmd.run().await
}
}
#[derive(Parser)]
enum SubCommand {
Start(StartCommand),
}
impl SubCommand {
async fn run(self) -> Result<()> {
match self {
SubCommand::Start(cmd) => cmd.run().await,
}
}
}
#[derive(Debug, Parser)]
struct StartCommand {
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
grpc_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
#[clap(long)]
opentsdb_addr: Option<String>,
#[clap(short, long)]
config_file: Option<String>,
#[clap(short, long)]
influxdb_enable: Option<bool>,
}
impl StartCommand {
async fn run(self) -> Result<()> {
let opts = self.try_into()?;
let mut frontend = Frontend::new(opts);
frontend.start().await.context(error::StartFrontendSnafu)
}
}
impl TryFrom<StartCommand> for FrontendOptions {
type Error = error::Error;
fn try_from(cmd: StartCommand) -> Result<Self> {
let mut opts: FrontendOptions = if let Some(path) = cmd.config_file {
toml_loader::from_file!(&path)?
} else {
FrontendOptions::default()
};
if let Some(addr) = cmd.http_addr {
opts.http_addr = Some(addr);
}
if let Some(addr) = cmd.grpc_addr {
opts.grpc_addr = Some(addr);
}
if let Some(addr) = cmd.mysql_addr {
opts.mysql_options = Some(MysqlOptions {
addr,
..Default::default()
});
}
if let Some(addr) = cmd.postgres_addr {
opts.postgres_options = Some(PostgresOptions {
addr,
..Default::default()
});
}
if let Some(addr) = cmd.opentsdb_addr {
opts.opentsdb_options = Some(OpentsdbOptions {
addr,
..Default::default()
});
}
if let Some(enable) = cmd.influxdb_enable {
opts.influxdb_options = Some(InfluxdbOptions { enable });
}
Ok(opts)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_try_from_start_command() {
let command = StartCommand {
http_addr: Some("127.0.0.1:1234".to_string()),
grpc_addr: None,
mysql_addr: Some("127.0.0.1:5678".to_string()),
postgres_addr: Some("127.0.0.1:5432".to_string()),
opentsdb_addr: Some("127.0.0.1:4321".to_string()),
influxdb_enable: Some(false),
config_file: None,
};
let opts: FrontendOptions = command.try_into().unwrap();
assert_eq!(opts.http_addr, Some("127.0.0.1:1234".to_string()));
assert_eq!(opts.mysql_options.as_ref().unwrap().addr, "127.0.0.1:5678");
assert_eq!(
opts.postgres_options.as_ref().unwrap().addr,
"127.0.0.1:5432"
);
assert_eq!(
opts.opentsdb_options.as_ref().unwrap().addr,
"127.0.0.1:4321"
);
let default_opts = FrontendOptions::default();
assert_eq!(opts.grpc_addr, default_opts.grpc_addr);
assert_eq!(
opts.mysql_options.as_ref().unwrap().runtime_size,
default_opts.mysql_options.as_ref().unwrap().runtime_size
);
assert_eq!(
opts.postgres_options.as_ref().unwrap().runtime_size,
default_opts.postgres_options.as_ref().unwrap().runtime_size
);
assert_eq!(
opts.opentsdb_options.as_ref().unwrap().runtime_size,
default_opts.opentsdb_options.as_ref().unwrap().runtime_size
);
assert!(!opts.influxdb_options.unwrap().enable);
}
}

View File

@@ -1,3 +1,4 @@
pub mod datanode;
pub mod error;
pub mod frontend;
mod toml_loader;

View File

@@ -2,7 +2,6 @@
name = "common-error"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]

View File

@@ -6,45 +6,47 @@ pub enum StatusCode {
// ====== Begin of common status code ==============
/// Success.
Success = 0,
/// Unknown error.
Unknown = 1,
Unknown = 1000,
/// Unsupported operation.
Unsupported = 2,
Unsupported = 1001,
/// Unexpected error, maybe there is a BUG.
Unexpected = 3,
Unexpected = 1002,
/// Internal server error.
Internal = 4,
Internal = 1003,
/// Invalid arguments.
InvalidArguments = 5,
InvalidArguments = 1004,
// ====== End of common status code ================
// ====== Begin of SQL related status code =========
/// SQL Syntax error.
InvalidSyntax = 6,
InvalidSyntax = 2000,
// ====== End of SQL related status code ===========
// ====== Begin of query related status code =======
/// Fail to create a plan for the query.
PlanQuery = 7,
PlanQuery = 3000,
/// The query engine fail to execute query.
EngineExecuteQuery = 8,
EngineExecuteQuery = 3001,
// ====== End of query related status code =========
// ====== Begin of catalog related status code =====
/// Table already exists.
TableAlreadyExists = 9,
TableNotFound = 10,
TableColumnNotFound = 11,
TableAlreadyExists = 4000,
TableNotFound = 4001,
TableColumnNotFound = 4002,
TableColumnExists = 4003,
// ====== End of catalog related status code =======
// ====== Begin of storage related status code =====
/// Storage is temporarily unable to handle the request
StorageUnavailable = 12,
StorageUnavailable = 5000,
// ====== End of storage related status code =======
// ====== Begin of server related status code =====
/// Runtime resources exhausted, like creating threads failed.
RuntimeResourcesExhausted = 13,
RuntimeResourcesExhausted = 6000,
// ====== End of server related status code =======
}
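The regrouped numbers put each error family into its own thousand-range, presumably so new codes can be added without renumbering their neighbors. A hedged illustration of what that layout enables; the `category` helper and its labels are invented for this sketch and are not part of the change:

fn category(code: u32) -> &'static str {
    match code {
        0 => "success",
        1000..=1999 => "common",
        2000..=2999 => "sql",
        3000..=3999 => "query",
        4000..=4999 => "catalog",
        5000..=5999 => "storage",
        6000..=6999 => "server",
        _ => "unknown",
    }
}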

View File

@@ -0,0 +1,18 @@
[package]
name = "common-function-macro"
version = "0.1.0"
edition = "2021"
[lib]
proc-macro = true
[dependencies]
common-query = { path = "../query" }
datatypes = { path = "../../datatypes" }
quote = "1.0"
snafu = { version = "0.7", features = ["backtraces"] }
syn = "1.0"
[dev-dependencies]
arc-swap = "1.0"
static_assertions = "1.1.0"

View File

@@ -0,0 +1,71 @@
use proc_macro::TokenStream;
use quote::{quote, quote_spanned};
use syn::parse::Parser;
use syn::spanned::Spanned;
use syn::{parse_macro_input, DeriveInput, ItemStruct};
/// Makes the struct implement trait [AggrFuncTypeStore], which is necessary when writing a UDAF.
/// This derive macro is expected to be used along with the attribute macro [as_aggr_func_creator].
#[proc_macro_derive(AggrFuncTypeStore)]
pub fn aggr_func_type_store_derive(input: TokenStream) -> TokenStream {
let ast = parse_macro_input!(input as DeriveInput);
impl_aggr_func_type_store(&ast)
}
fn impl_aggr_func_type_store(ast: &DeriveInput) -> TokenStream {
let name = &ast.ident;
let gen = quote! {
use common_query::logical_plan::accumulator::AggrFuncTypeStore;
use common_query::error::{InvalidInputStateSnafu, Error as QueryError};
use datatypes::prelude::ConcreteDataType;
impl AggrFuncTypeStore for #name {
fn input_types(&self) -> std::result::Result<Vec<ConcreteDataType>, QueryError> {
let input_types = self.input_types.load();
snafu::ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> std::result::Result<(), QueryError> {
let old = self.input_types.swap(Some(std::sync::Arc::new(input_types.clone())));
if let Some(old) = old {
snafu::ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
snafu::ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
}
};
gen.into()
}
/// A struct can be used as a creator for an aggregate function if it has been annotated with this
/// attribute first. The attribute adds a necessary field to the struct, intended to store the
/// types of the input data.
/// This attribute is expected to be used along with the derive macro [AggrFuncTypeStore].
#[proc_macro_attribute]
pub fn as_aggr_func_creator(_args: TokenStream, input: TokenStream) -> TokenStream {
let mut item_struct = parse_macro_input!(input as ItemStruct);
if let syn::Fields::Named(ref mut fields) = item_struct.fields {
let result = syn::Field::parse_named.parse2(quote! {
input_types: arc_swap::ArcSwapOption<Vec<ConcreteDataType>>
});
match result {
Ok(field) => fields.named.push(field),
Err(e) => return e.into_compile_error().into(),
}
} else {
return quote_spanned!(
item_struct.fields.span() => compile_error!(
"This attribute macro needs to add fields to the its annotated struct, \
so the struct must have \"{}\".")
)
.into();
}
quote! {
#item_struct
}
.into()
}

View File

@@ -0,0 +1,14 @@
use common_function_macro::as_aggr_func_creator;
use common_function_macro::AggrFuncTypeStore;
use static_assertions::{assert_fields, assert_impl_all};
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
struct Foo {}
#[test]
fn test_derive() {
Foo::default();
assert_fields!(Foo: input_types);
assert_impl_all!(Foo: std::fmt::Debug, Default, AggrFuncTypeStore);
}

View File

@@ -3,32 +3,27 @@ edition = "2021"
name = "common-function"
version = "0.1.0"
[dependencies.arrow]
features = ["io_csv", "io_json", "io_parquet", "io_parquet_compression", "io_ipc", "ahash", "compute", "serde_types"]
package = "arrow2"
version = "0.10"
[dependencies]
arc-swap = "1.0"
chrono-tz = "0.6"
common-error = { path = "../error" }
common-function-macro = { path = "../function-macro" }
common-query = { path = "../query" }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2" }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2" }
datatypes = { path = "../../datatypes" }
libc = "0.2"
num = "0.4"
num-traits = "0.2"
once_cell = "1.10"
paste = "1.0"
rustpython-ast = {git = "https://github.com/RustPython/RustPython", optional = true, rev = "02a1d1d"}
rustpython-bytecode = {git = "https://github.com/RustPython/RustPython", optional = true, rev = "02a1d1d"}
rustpython-compiler = {git = "https://github.com/RustPython/RustPython", optional = true, rev = "02a1d1d"}
rustpython-compiler-core = {git = "https://github.com/RustPython/RustPython", optional = true, rev = "02a1d1d"}
rustpython-parser = {git = "https://github.com/RustPython/RustPython", optional = true, rev = "02a1d1d"}
rustpython-vm = {git = "https://github.com/RustPython/RustPython", optional = true, rev = "02a1d1d"}
snafu = { version = "0.7", features = ["backtraces"] }
statrs = "0.15"
[dependencies.arrow]
features = ["io_csv", "io_json", "io_parquet", "io_parquet_compression", "io_ipc", "ahash", "compute", "serde_types"]
package = "arrow2"
version = "0.10"
[dev-dependencies]
ron = "0.7"
serde = {version = "1.0", features = ["derive"]}
serde = { version = "1.0", features = ["derive"] }

View File

@@ -1,10 +1,8 @@
use std::cmp::Ordering;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_query::error::{
BadAccumulatorImplSnafu, CreateAccumulatorSnafu, InvalidInputStateSnafu, Result,
};
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{BadAccumulatorImplSnafu, CreateAccumulatorSnafu, Result};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
use datatypes::vectors::ConstantVector;
@@ -98,10 +96,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct ArgmaxAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct ArgmaxAccumulatorCreator {}
impl AggregateFunctionCreator for ArgmaxAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -124,23 +121,6 @@ impl AggregateFunctionCreator for ArgmaxAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint64_datatype())
}

View File

@@ -1,10 +1,8 @@
use std::cmp::Ordering;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_query::error::{
BadAccumulatorImplSnafu, CreateAccumulatorSnafu, InvalidInputStateSnafu, Result,
};
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{BadAccumulatorImplSnafu, CreateAccumulatorSnafu, Result};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
use datatypes::vectors::ConstantVector;
@@ -107,10 +105,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct ArgminAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct ArgminAccumulatorCreator {}
impl AggregateFunctionCreator for ArgminAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -133,23 +130,6 @@ impl AggregateFunctionCreator for ArgminAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
Ok(ConcreteDataType::uint32_datatype())
}

View File

@@ -1,10 +1,9 @@
use std::marker::PhantomData;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
CreateAccumulatorSnafu, DowncastVectorSnafu, FromScalarValueSnafu, InvalidInputStateSnafu,
Result,
CreateAccumulatorSnafu, DowncastVectorSnafu, FromScalarValueSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
@@ -118,10 +117,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct DiffAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct DiffAccumulatorCreator {}
impl AggregateFunctionCreator for DiffAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -144,31 +142,13 @@ impl AggregateFunctionCreator for DiffAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 1, InvalidInputStateSnafu);
with_match_primitive_type_id!(
input_types[0].logical_type_id(),
|$S| {
Ok(ConcreteDataType::list_datatype(PrimitiveType::<<$S as Primitive>::LargestType>::default().logical_type_id().data_type()))
Ok(ConcreteDataType::list_datatype(PrimitiveType::<<$S as Primitive>::LargestType>::default().into()))
},
{
unreachable!()
@@ -182,7 +162,7 @@ impl AggregateFunctionCreator for DiffAccumulatorCreator {
with_match_primitive_type_id!(
input_types[0].logical_type_id(),
|$S| {
Ok(vec![ConcreteDataType::list_datatype(PrimitiveType::<$S>::default().logical_type_id().data_type())])
Ok(vec![ConcreteDataType::list_datatype(PrimitiveType::<$S>::default().into())])
},
{
unreachable!()

View File

@@ -1,10 +1,9 @@
use std::marker::PhantomData;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu, InvalidInputStateSnafu,
Result,
BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
@@ -125,10 +124,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct MeanAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct MeanAccumulatorCreator {}
impl AggregateFunctionCreator for MeanAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -151,23 +149,6 @@ impl AggregateFunctionCreator for MeanAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 1, InvalidInputStateSnafu);

View File

@@ -2,17 +2,17 @@ use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
CreateAccumulatorSnafu, DowncastVectorSnafu, FromScalarValueSnafu, InvalidInputStateSnafu,
Result,
CreateAccumulatorSnafu, DowncastVectorSnafu, FromScalarValueSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
use datatypes::prelude::*;
use datatypes::types::OrdPrimitive;
use datatypes::value::ListValue;
use datatypes::vectors::{ConstantVector, ListVector};
use datatypes::with_match_ordered_primitive_type_id;
use datatypes::with_match_primitive_type_id;
use num::NumCast;
use snafu::{ensure, OptionExt, ResultExt};
@@ -37,17 +37,19 @@ use snafu::{ensure, OptionExt, ResultExt};
#[derive(Debug, Default)]
pub struct Median<T>
where
T: Primitive + Ord,
T: Primitive,
{
greater: BinaryHeap<Reverse<T>>,
not_greater: BinaryHeap<T>,
greater: BinaryHeap<Reverse<OrdPrimitive<T>>>,
not_greater: BinaryHeap<OrdPrimitive<T>>,
}
impl<T> Median<T>
where
T: Primitive + Ord,
T: Primitive,
{
fn push(&mut self, value: T) {
let value = OrdPrimitive::<T>(value);
if self.not_greater.is_empty() {
self.not_greater.push(value);
return;
@@ -71,7 +73,7 @@ where
// to use them.
impl<T> Accumulator for Median<T>
where
T: Primitive + Ord,
T: Primitive,
for<'a> T: Scalar<RefType<'a> = T>,
{
// This function serializes our state to `ScalarValue`, which DataFusion uses to pass this
@@ -166,8 +168,8 @@ where
let greater = self.greater.peek().unwrap();
// the following three NumCast's `unwrap`s are safe because T is primitive
let not_greater_v: f64 = NumCast::from(not_greater).unwrap();
let greater_v: f64 = NumCast::from(greater.0).unwrap();
let not_greater_v: f64 = NumCast::from(not_greater.as_primitive()).unwrap();
let greater_v: f64 = NumCast::from(greater.0.as_primitive()).unwrap();
let median: T = NumCast::from((not_greater_v + greater_v) / 2.0).unwrap();
median.into()
};
@@ -175,16 +177,15 @@ where
}
}
#[derive(Debug, Default)]
pub struct MedianAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct MedianAccumulatorCreator {}
impl AggregateFunctionCreator for MedianAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
let creator: AccumulatorCreatorFunction = Arc::new(move |types: &[ConcreteDataType]| {
let input_type = &types[0];
with_match_ordered_primitive_type_id!(
with_match_primitive_type_id!(
input_type.logical_type_id(),
|$S| {
Ok(Box::new(Median::<$S>::default()))
@@ -201,23 +202,6 @@ impl AggregateFunctionCreator for MedianAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 1, InvalidInputStateSnafu);

View File
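For readers unfamiliar with the structure the `Median` accumulator above relies on, here is a standalone sketch of the two-heap running-median technique over plain `i64`, independent of `OrdPrimitive` and the accumulator plumbing (the struct and method names are invented for illustration):

use std::cmp::Reverse;
use std::collections::BinaryHeap;

#[derive(Default)]
struct TwoHeapMedian {
    not_greater: BinaryHeap<i64>,      // max-heap holding the lower half
    greater: BinaryHeap<Reverse<i64>>, // min-heap holding the upper half
}

impl TwoHeapMedian {
    fn push(&mut self, value: i64) {
        if self.not_greater.peek().map_or(true, |&top| value <= top) {
            self.not_greater.push(value);
        } else {
            self.greater.push(Reverse(value));
        }
        // Rebalance so the halves never differ in size by more than one.
        if self.not_greater.len() > self.greater.len() + 1 {
            self.greater.push(Reverse(self.not_greater.pop().unwrap()));
        } else if self.greater.len() > self.not_greater.len() {
            self.not_greater.push(self.greater.pop().unwrap().0);
        }
    }

    fn median(&self) -> Option<f64> {
        let low = *self.not_greater.peek()?;
        if self.not_greater.len() > self.greater.len() {
            Some(low as f64)
        } else {
            let high = self.greater.peek()?.0;
            Some((low as f64 + high as f64) / 2.0)
        }
    }
}

Keeping the lower half in a max-heap and the upper half in a min-heap makes the median readable from the heap tops in O(1) after an O(log n) insert.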

@@ -64,55 +64,24 @@ pub(crate) struct AggregateFunctions;
impl AggregateFunctions {
pub fn register(registry: &FunctionRegistry) {
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"median",
1,
Arc::new(|| Arc::new(MedianAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"diff",
1,
Arc::new(|| Arc::new(DiffAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"mean",
1,
Arc::new(|| Arc::new(MeanAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"polyval",
2,
Arc::new(|| Arc::new(PolyvalAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"argmax",
1,
Arc::new(|| Arc::new(ArgmaxAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"argmin",
1,
Arc::new(|| Arc::new(ArgminAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"diff",
1,
Arc::new(|| Arc::new(DiffAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"percentile",
2,
Arc::new(|| Arc::new(PercentileAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"scipystatsnormcdf",
2,
Arc::new(|| Arc::new(ScipyStatsNormCdfAccumulatorCreator::default())),
)));
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
"scipystatsnormpdf",
2,
Arc::new(|| Arc::new(ScipyStatsNormPdfAccumulatorCreator::default())),
)));
macro_rules! register_aggr_func {
($name :expr, $arg_count :expr, $creator :ty) => {
registry.register_aggregate_function(Arc::new(AggregateFunctionMeta::new(
$name,
$arg_count,
Arc::new(|| Arc::new(<$creator>::default())),
)));
};
}
register_aggr_func!("median", 1, MedianAccumulatorCreator);
register_aggr_func!("diff", 1, DiffAccumulatorCreator);
register_aggr_func!("mean", 1, MeanAccumulatorCreator);
register_aggr_func!("polyval", 2, PolyvalAccumulatorCreator);
register_aggr_func!("argmax", 1, ArgmaxAccumulatorCreator);
register_aggr_func!("argmin", 1, ArgminAccumulatorCreator);
register_aggr_func!("percentile", 2, PercentileAccumulatorCreator);
register_aggr_func!("scipystatsnormcdf", 2, ScipyStatsNormCdfAccumulatorCreator);
register_aggr_func!("scipystatsnormpdf", 2, ScipyStatsNormPdfAccumulatorCreator);
}
}

View File

@@ -2,17 +2,18 @@ use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
self, BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu,
FromScalarValueSnafu, InvalidInputColSnafu, InvalidInputStateSnafu, Result,
FromScalarValueSnafu, InvalidInputColSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
use datatypes::prelude::*;
use datatypes::types::OrdPrimitive;
use datatypes::value::{ListValue, OrderedFloat};
use datatypes::vectors::{ConstantVector, Float64Vector, ListVector};
use datatypes::with_match_ordered_primitive_type_id;
use datatypes::with_match_primitive_type_id;
use num::NumCast;
use snafu::{ensure, OptionExt, ResultExt};
@@ -37,19 +38,21 @@ use snafu::{ensure, OptionExt, ResultExt};
#[derive(Debug, Default)]
pub struct Percentile<T>
where
T: Primitive + Ord,
T: Primitive,
{
greater: BinaryHeap<Reverse<T>>,
not_greater: BinaryHeap<T>,
greater: BinaryHeap<Reverse<OrdPrimitive<T>>>,
not_greater: BinaryHeap<OrdPrimitive<T>>,
n: u64,
p: Option<f64>,
}
impl<T> Percentile<T>
where
T: Primitive + Ord,
T: Primitive,
{
fn push(&mut self, value: T) {
let value = OrdPrimitive::<T>(value);
self.n += 1;
if self.not_greater.is_empty() {
self.not_greater.push(value);
@@ -76,7 +79,7 @@ where
impl<T> Accumulator for Percentile<T>
where
T: Primitive + Ord,
T: Primitive,
for<'a> T: Scalar<RefType<'a> = T>,
{
fn state(&self) -> Result<Vec<Value>> {
@@ -212,7 +215,7 @@ where
if not_greater.is_none() {
return Ok(Value::Null);
}
let not_greater = *self.not_greater.peek().unwrap();
let not_greater = (*self.not_greater.peek().unwrap()).as_primitive();
let percentile = if self.greater.is_empty() {
NumCast::from(not_greater).unwrap()
} else {
@@ -224,23 +227,22 @@ where
};
let fract = (((self.n - 1) as f64) * p / 100_f64).fract();
let not_greater_v: f64 = NumCast::from(not_greater).unwrap();
let greater_v: f64 = NumCast::from(greater.0).unwrap();
let greater_v: f64 = NumCast::from(greater.0.as_primitive()).unwrap();
not_greater_v * (1.0 - fract) + greater_v * fract
};
Ok(Value::from(percentile))
}
}
#[derive(Debug, Default)]
pub struct PercentileAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct PercentileAccumulatorCreator {}
impl AggregateFunctionCreator for PercentileAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
let creator: AccumulatorCreatorFunction = Arc::new(move |types: &[ConcreteDataType]| {
let input_type = &types[0];
with_match_ordered_primitive_type_id!(
with_match_primitive_type_id!(
input_type.logical_type_id(),
|$S| {
Ok(Box::new(Percentile::<$S>::default()))
@@ -257,23 +259,6 @@ impl AggregateFunctionCreator for PercentileAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 2, InvalidInputStateSnafu);
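The interpolation above is the standard linear-interpolation percentile: the value just below the target rank is weighted by `1 - fract` and the next value by `fract`. A self-contained sketch of the same arithmetic on a plain `f64` slice (a hypothetical helper, not part of this crate):

/// Hypothetical standalone sketch of the linear-interpolation percentile used above.
/// Assumes `values` is non-empty and `p` is within [0, 100].
fn percentile_by_interpolation(values: &mut [f64], p: f64) -> f64 {
    values.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let n = values.len();
    let rank = ((n - 1) as f64) * p / 100.0;
    let lower = rank.floor() as usize; // the "not greater" side
    let fract = rank.fract();          // fractional part used as the interpolation weight
    if lower + 1 >= n {
        values[n - 1]
    } else {
        // not_greater_v * (1.0 - fract) + greater_v * fract, as in the accumulator above
        values[lower] * (1.0 - fract) + values[lower + 1] * fract
    }
}

fn main() {
    // The 50th percentile of [1, 2, 3, 4] is 2.5 under this interpolation scheme.
    assert_eq!(2.5, percentile_by_interpolation(&mut [1.0, 2.0, 3.0, 4.0], 50.0));
}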

View File

@@ -1,10 +1,10 @@
use std::marker::PhantomData;
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
self, BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu,
FromScalarValueSnafu, InvalidInputColSnafu, InvalidInputStateSnafu, Result,
FromScalarValueSnafu, InvalidInputColSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
@@ -187,10 +187,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct PolyvalAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct PolyvalAccumulatorCreator {}
impl AggregateFunctionCreator for PolyvalAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -213,23 +212,6 @@ impl AggregateFunctionCreator for PolyvalAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 2, InvalidInputStateSnafu);
@@ -237,7 +219,7 @@ impl AggregateFunctionCreator for PolyvalAccumulatorCreator {
with_match_primitive_type_id!(
input_type,
|$S| {
Ok(PrimitiveType::<<$S as Primitive>::LargestType>::default().logical_type_id().data_type())
Ok(PrimitiveType::<<$S as Primitive>::LargestType>::default().into())
},
{
unreachable!()

View File

@@ -1,10 +1,9 @@
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
self, BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu,
FromScalarValueSnafu, GenerateFunctionSnafu, InvalidInputColSnafu, InvalidInputStateSnafu,
Result,
FromScalarValueSnafu, GenerateFunctionSnafu, InvalidInputColSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
@@ -174,10 +173,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct ScipyStatsNormCdfAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct ScipyStatsNormCdfAccumulatorCreator {}
impl AggregateFunctionCreator for ScipyStatsNormCdfAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -200,23 +198,6 @@ impl AggregateFunctionCreator for ScipyStatsNormCdfAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 2, InvalidInputStateSnafu);

View File

@@ -1,10 +1,9 @@
use std::sync::Arc;
use arc_swap::ArcSwapOption;
use common_function_macro::{as_aggr_func_creator, AggrFuncTypeStore};
use common_query::error::{
self, BadAccumulatorImplSnafu, CreateAccumulatorSnafu, DowncastVectorSnafu,
FromScalarValueSnafu, GenerateFunctionSnafu, InvalidInputColSnafu, InvalidInputStateSnafu,
Result,
FromScalarValueSnafu, GenerateFunctionSnafu, InvalidInputColSnafu, Result,
};
use common_query::logical_plan::{Accumulator, AggregateFunctionCreator};
use common_query::prelude::*;
@@ -174,10 +173,9 @@ where
}
}
#[derive(Debug, Default)]
pub struct ScipyStatsNormPdfAccumulatorCreator {
input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}
#[as_aggr_func_creator]
#[derive(Debug, Default, AggrFuncTypeStore)]
pub struct ScipyStatsNormPdfAccumulatorCreator {}
impl AggregateFunctionCreator for ScipyStatsNormPdfAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
@@ -200,23 +198,6 @@ impl AggregateFunctionCreator for ScipyStatsNormPdfAccumulatorCreator {
creator
}
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
let input_types = self.input_types.load();
ensure!(input_types.is_some(), InvalidInputStateSnafu);
Ok(input_types.as_ref().unwrap().as_ref().clone())
}
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
if let Some(old) = old {
ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
for (x, y) in old.iter().zip(input_types.iter()) {
ensure!(x == y, InvalidInputStateSnafu);
}
}
Ok(())
}
fn output_type(&self) -> Result<ConcreteDataType> {
let input_types = self.input_types()?;
ensure!(input_types.len() == 2, InvalidInputStateSnafu);

View File

@@ -6,6 +6,8 @@ edition = "2021"
[dependencies]
api = { path = "../../api" }
async-trait = "0.1"
common-base = { path = "../base" }
common-error = { path = "../error" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = ["simd"] }
snafu = { version = "0.7", features = ["backtraces"] }

View File

@@ -1,6 +1,11 @@
use std::any::Any;
use api::DecodeError;
use common_error::prelude::{ErrorExt, StatusCode};
use datafusion::error::DataFusionError;
use snafu::{Backtrace, Snafu};
use snafu::{Backtrace, ErrorCompat, Snafu};
pub type Result<T> = std::result::Result<T, Error>;
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
@@ -31,4 +36,42 @@ pub enum Error {
source: DecodeError,
backtrace: Backtrace,
},
#[snafu(display(
"Write type mismatch, column name: {}, expected: {}, actual: {}",
column_name,
expected,
actual
))]
TypeMismatch {
column_name: String,
expected: String,
actual: String,
backtrace: Backtrace,
},
}
impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::EmptyPhysicalPlan { .. }
| Error::EmptyPhysicalExpr { .. }
| Error::MissingField { .. }
| Error::TypeMismatch { .. } => StatusCode::InvalidArguments,
Error::UnsupportedDfPlan { .. } | Error::UnsupportedDfExpr { .. } => {
StatusCode::Unsupported
}
Error::NewProjection { .. } | Error::DecodePhysicalPlanNode { .. } => {
StatusCode::Internal
}
}
}
fn backtrace_opt(&self) -> Option<&Backtrace> {
ErrorCompat::backtrace(self)
}
fn as_any(&self) -> &dyn Any {
self
}
}

View File

@@ -1,5 +1,6 @@
pub mod error;
pub mod physical;
pub mod writer;
pub use error::Error;
pub use physical::{

View File

@@ -152,7 +152,7 @@ impl ExecutionPlan for MockExecution {
fn schema(&self) -> arrow::datatypes::SchemaRef {
let field1 = Field::new("id", DataType::UInt32, false);
let field2 = Field::new("name", DataType::LargeUtf8, false);
let field2 = Field::new("name", DataType::Utf8, false);
let field3 = Field::new("age", DataType::UInt32, false);
Arc::new(arrow::datatypes::Schema::new(vec![field1, field2, field3]))
}
@@ -190,7 +190,7 @@ impl ExecutionPlan for MockExecution {
let age_array = Arc::new(PrimitiveArray::from_slice([25u32, 28, 27, 35, 25]));
let schema = Arc::new(Schema::new(vec![
Field::new("id", DataType::UInt32, false),
Field::new("name", DataType::LargeUtf8, false),
Field::new("name", DataType::Utf8, false),
Field::new("age", DataType::UInt32, false),
]));
let record_batch =

View File

@@ -0,0 +1,367 @@
use std::collections::HashMap;
use api::v1::{
codec::InsertBatch,
column::{SemanticType, Values},
Column, ColumnDataType,
};
use common_base::BitVec;
use snafu::ensure;
use crate::error::{Result, TypeMismatchSnafu};
type ColumnName = String;
#[derive(Default)]
pub struct LinesWriter {
column_name_index: HashMap<ColumnName, usize>,
null_masks: Vec<BitVec>,
batch: InsertBatch,
lines: usize,
}
impl LinesWriter {
pub fn with_lines(lines: usize) -> Self {
Self {
lines,
..Default::default()
}
}
pub fn write_ts(&mut self, column_name: &str, value: (i64, Precision)) -> Result<()> {
let (idx, column) = self.mut_column(
column_name,
ColumnDataType::Timestamp,
SemanticType::Timestamp,
);
ensure!(
column.datatype == Some(ColumnDataType::Timestamp.into()),
TypeMismatchSnafu {
column_name,
expected: "timestamp",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.ts_millis_values.push(to_ms_ts(value.1, value.0));
self.null_masks[idx].push(false);
Ok(())
}
pub fn write_tag(&mut self, column_name: &str, value: &str) -> Result<()> {
let (idx, column) = self.mut_column(column_name, ColumnDataType::String, SemanticType::Tag);
ensure!(
column.datatype == Some(ColumnDataType::String.into()),
TypeMismatchSnafu {
column_name,
expected: "string",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.string_values.push(value.to_string());
self.null_masks[idx].push(false);
Ok(())
}
pub fn write_u64(&mut self, column_name: &str, value: u64) -> Result<()> {
let (idx, column) =
self.mut_column(column_name, ColumnDataType::Uint64, SemanticType::Field);
ensure!(
column.datatype == Some(ColumnDataType::Uint64.into()),
TypeMismatchSnafu {
column_name,
expected: "u64",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.u64_values.push(value);
self.null_masks[idx].push(false);
Ok(())
}
pub fn write_i64(&mut self, column_name: &str, value: i64) -> Result<()> {
let (idx, column) =
self.mut_column(column_name, ColumnDataType::Int64, SemanticType::Field);
ensure!(
column.datatype == Some(ColumnDataType::Int64.into()),
TypeMismatchSnafu {
column_name,
expected: "i64",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.i64_values.push(value);
self.null_masks[idx].push(false);
Ok(())
}
pub fn write_f64(&mut self, column_name: &str, value: f64) -> Result<()> {
let (idx, column) =
self.mut_column(column_name, ColumnDataType::Float64, SemanticType::Field);
ensure!(
column.datatype == Some(ColumnDataType::Float64.into()),
TypeMismatchSnafu {
column_name,
expected: "f64",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.f64_values.push(value);
self.null_masks[idx].push(false);
Ok(())
}
pub fn write_string(&mut self, column_name: &str, value: &str) -> Result<()> {
let (idx, column) =
self.mut_column(column_name, ColumnDataType::String, SemanticType::Field);
ensure!(
column.datatype == Some(ColumnDataType::String.into()),
TypeMismatchSnafu {
column_name,
expected: "string",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.string_values.push(value.to_string());
self.null_masks[idx].push(false);
Ok(())
}
pub fn write_bool(&mut self, column_name: &str, value: bool) -> Result<()> {
let (idx, column) =
self.mut_column(column_name, ColumnDataType::Boolean, SemanticType::Field);
ensure!(
column.datatype == Some(ColumnDataType::Boolean.into()),
TypeMismatchSnafu {
column_name,
expected: "boolean",
actual: format!("{:?}", column.datatype)
}
);
// It is safe to use unwrap here, because values has been initialized in mut_column()
let values = column.values.as_mut().unwrap();
values.bool_values.push(value);
self.null_masks[idx].push(false);
Ok(())
}
pub fn commit(&mut self) {
let batch = &mut self.batch;
batch.row_count += 1;
for i in 0..batch.columns.len() {
let null_mask = &mut self.null_masks[i];
if batch.row_count as usize > null_mask.len() {
null_mask.push(true);
}
}
}
pub fn finish(mut self) -> InsertBatch {
let null_masks = self.null_masks;
for (i, null_mask) in null_masks.into_iter().enumerate() {
let columns = &mut self.batch.columns;
columns[i].null_mask = null_mask.into_vec();
}
self.batch
}
fn mut_column(
&mut self,
column_name: &str,
datatype: ColumnDataType,
semantic_type: SemanticType,
) -> (usize, &mut Column) {
let column_names = &mut self.column_name_index;
let column_idx = match column_names.get(column_name) {
Some(i) => *i,
None => {
let new_idx = column_names.len();
let batch = &mut self.batch;
let to_insert = self.lines;
let mut null_mask = BitVec::with_capacity(to_insert);
null_mask.extend(BitVec::repeat(true, batch.row_count as usize));
self.null_masks.push(null_mask);
batch.columns.push(Column {
column_name: column_name.to_string(),
semantic_type: semantic_type.into(),
values: Some(Values::with_capacity(datatype, to_insert)),
datatype: Some(datatype.into()),
null_mask: Vec::default(),
});
column_names.insert(column_name.to_string(), new_idx);
new_idx
}
};
(column_idx, &mut self.batch.columns[column_idx])
}
}
fn to_ms_ts(p: Precision, ts: i64) -> i64 {
match p {
Precision::NANOSECOND => ts / 1_000_000,
Precision::MICROSECOND => ts / 1000,
Precision::MILLISECOND => ts,
Precision::SECOND => ts * 1000,
Precision::MINUTE => ts * 1000 * 60,
Precision::HOUR => ts * 1000 * 60 * 60,
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Precision {
NANOSECOND,
MICROSECOND,
MILLISECOND,
SECOND,
MINUTE,
HOUR,
}
#[cfg(test)]
mod tests {
use api::v1::{column::SemanticType, ColumnDataType};
use common_base::BitVec;
use super::LinesWriter;
use crate::writer::{to_ms_ts, Precision};
#[test]
fn test_lines_writer() {
let mut writer = LinesWriter::with_lines(3);
writer.write_tag("host", "host1").unwrap();
writer.write_f64("cpu", 0.5).unwrap();
writer.write_f64("memory", 0.4).unwrap();
writer.write_string("name", "name1").unwrap();
writer
.write_ts("ts", (101011000, Precision::MILLISECOND))
.unwrap();
writer.commit();
writer.write_tag("host", "host2").unwrap();
writer
.write_ts("ts", (102011001, Precision::MILLISECOND))
.unwrap();
writer.write_bool("enable_reboot", true).unwrap();
writer.write_u64("year_of_service", 2).unwrap();
writer.write_i64("temperature", 4).unwrap();
writer.commit();
writer.write_tag("host", "host3").unwrap();
writer.write_f64("cpu", 0.4).unwrap();
writer.write_u64("cpu_core_num", 16).unwrap();
writer
.write_ts("ts", (103011002, Precision::MILLISECOND))
.unwrap();
writer.commit();
let insert_batch = writer.finish();
assert_eq!(3, insert_batch.row_count);
let columns = insert_batch.columns;
assert_eq!(9, columns.len());
let column = &columns[0];
assert_eq!("host", columns[0].column_name);
assert_eq!(Some(ColumnDataType::String as i32), column.datatype);
assert_eq!(SemanticType::Tag as i32, column.semantic_type);
assert_eq!(
vec!["host1", "host2", "host3"],
column.values.as_ref().unwrap().string_values
);
verify_null_mask(&column.null_mask, vec![false, false, false]);
let column = &columns[1];
assert_eq!("cpu", column.column_name);
assert_eq!(Some(ColumnDataType::Float64 as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec![0.5, 0.4], column.values.as_ref().unwrap().f64_values);
verify_null_mask(&column.null_mask, vec![false, true, false]);
let column = &columns[2];
assert_eq!("memory", column.column_name);
assert_eq!(Some(ColumnDataType::Float64 as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec![0.4], column.values.as_ref().unwrap().f64_values);
verify_null_mask(&column.null_mask, vec![false, true, true]);
let column = &columns[3];
assert_eq!("name", column.column_name);
assert_eq!(Some(ColumnDataType::String as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec!["name1"], column.values.as_ref().unwrap().string_values);
verify_null_mask(&column.null_mask, vec![false, true, true]);
let column = &columns[4];
assert_eq!("ts", column.column_name);
assert_eq!(Some(ColumnDataType::Timestamp as i32), column.datatype);
assert_eq!(SemanticType::Timestamp as i32, column.semantic_type);
assert_eq!(
vec![101011000, 102011001, 103011002],
column.values.as_ref().unwrap().ts_millis_values
);
verify_null_mask(&column.null_mask, vec![false, false, false]);
let column = &columns[5];
assert_eq!("enable_reboot", column.column_name);
assert_eq!(Some(ColumnDataType::Boolean as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec![true], column.values.as_ref().unwrap().bool_values);
verify_null_mask(&column.null_mask, vec![true, false, true]);
let column = &columns[6];
assert_eq!("year_of_service", column.column_name);
assert_eq!(Some(ColumnDataType::Uint64 as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec![2], column.values.as_ref().unwrap().u64_values);
verify_null_mask(&column.null_mask, vec![true, false, true]);
let column = &columns[7];
assert_eq!("temperature", column.column_name);
assert_eq!(Some(ColumnDataType::Int64 as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec![4], column.values.as_ref().unwrap().i64_values);
verify_null_mask(&column.null_mask, vec![true, false, true]);
let column = &columns[8];
assert_eq!("cpu_core_num", column.column_name);
assert_eq!(Some(ColumnDataType::Uint64 as i32), column.datatype);
assert_eq!(SemanticType::Field as i32, column.semantic_type);
assert_eq!(vec![16], column.values.as_ref().unwrap().u64_values);
verify_null_mask(&column.null_mask, vec![true, true, false]);
}
fn verify_null_mask(data: &[u8], expected: Vec<bool>) {
let bitvec = BitVec::from_slice(data);
for (idx, b) in expected.iter().enumerate() {
assert_eq!(b, bitvec.get(idx).unwrap())
}
}
#[test]
fn test_to_ms() {
assert_eq!(100, to_ms_ts(Precision::NANOSECOND, 100110000));
assert_eq!(100110, to_ms_ts(Precision::MICROSECOND, 100110000));
assert_eq!(100110000, to_ms_ts(Precision::MILLISECOND, 100110000));
assert_eq!(
100110000 * 1000 * 60,
to_ms_ts(Precision::MINUTE, 100110000)
);
assert_eq!(
100110000 * 1000 * 60 * 60,
to_ms_ts(Precision::HOUR, 100110000)
);
}
}
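The null-mask bookkeeping in `LinesWriter` works per committed row: `commit()` bumps `row_count` and appends a `true` (null) bit for every column that was not written on that line, while `mut_column()` backfills nulls for columns that first appear after earlier rows. A minimal standalone sketch of the same idea with plain `Vec<bool>` masks (hypothetical types, independent of the protobuf `Column`):

use std::collections::HashMap;

// Hypothetical miniature of the LinesWriter null-mask bookkeeping: one bool per row and
// column, `true` meaning "this row has no value for this column".
#[derive(Default)]
struct MiniWriter {
    row_count: usize,
    masks: HashMap<String, Vec<bool>>,
}

impl MiniWriter {
    fn write(&mut self, column: &str) {
        let rows = self.row_count;
        let mask = self
            .masks
            .entry(column.to_string())
            // A column seen for the first time is null for all previously committed rows.
            .or_insert_with(|| vec![true; rows]);
        mask.push(false);
    }

    fn commit(&mut self) {
        self.row_count += 1;
        for mask in self.masks.values_mut() {
            // Columns that were not written in this row get a null bit.
            if mask.len() < self.row_count {
                mask.push(true);
            }
        }
    }
}

fn main() {
    let mut w = MiniWriter::default();
    w.write("cpu");
    w.commit();
    w.write("memory");
    w.commit();
    assert_eq!(vec![false, true], w.masks["cpu"]);
    assert_eq!(vec![true, false], w.masks["memory"]);
}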

View File

@@ -3,19 +3,21 @@ name = "common-query"
version = "0.1.0"
edition = "2021"
[dependencies.arrow]
package = "arrow2"
version="0.10"
[dependencies]
common-error = { path = "../error" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2", features = ["simd"]}
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2"}
datafusion-expr = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2"}
datatypes = { path = "../../datatypes"}
common-recordbatch = { path = "../recordbatch" }
common-time = { path = "../time" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = ["simd"] }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2" }
datafusion-expr = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2" }
datatypes = { path = "../../datatypes" }
snafu = { version = "0.7", features = ["backtraces"] }
statrs = "0.15"
[dependencies.arrow]
package = "arrow2"
version = "0.10"
[dev-dependencies]
common-base = { path = "../base" }
tokio = { version = "1.0", features = ["full"] }
common-base = {path = "../base"}

View File

@@ -1,6 +1,15 @@
use common_recordbatch::{RecordBatches, SendableRecordBatchStream};
pub mod columnar_value;
pub mod error;
mod function;
pub mod logical_plan;
pub mod prelude;
mod signature;
// sql output
pub enum Output {
AffectedRows(usize),
RecordBatches(RecordBatches),
Stream(SendableRecordBatchStream),
}
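A hedged sketch of how a caller might branch on the three `Output` variants; the handling bodies are illustrative only and assume `use common_query::Output;`:

// Illustrative only: branching on the Output enum introduced above.
fn handle_output(output: Output) {
    match output {
        Output::AffectedRows(n) => println!("affected rows: {}", n),
        Output::RecordBatches(batches) => {
            // RecordBatches::take() hands back the owned Vec<RecordBatch>.
            for _batch in batches.take() { /* render each batch */ }
        }
        Output::Stream(_stream) => {
            // A SendableRecordBatchStream is polled asynchronously by the serving layer.
        }
    }
}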

View File

@@ -4,6 +4,7 @@ use std::fmt::Debug;
use std::sync::Arc;
use arrow::array::ArrayRef;
use common_time::timestamp::TimeUnit;
use datafusion_common::Result as DfResult;
use datafusion_expr::Accumulator as DfAccumulator;
use datatypes::prelude::*;
@@ -44,20 +45,14 @@ pub trait Accumulator: Send + Sync + Debug {
}
/// An `AggregateFunctionCreator` dynamically creates `Accumulator`.
/// DataFusion does not provide the input data's types when creating an Accumulator, so we have
/// to store them somewhere else ourselves. An `AggregateFunctionCreator` therefore often has a
/// companion struct that can store the input data types and knows the output and state types of
/// an Accumulator. That's how we create the Accumulator generically.
pub trait AggregateFunctionCreator: Send + Sync + Debug {
///
/// An `AggregateFunctionCreator` often has a companion struct that
/// can store the input data types (impl [AggrFuncTypeStore]) and knows the output and state
/// types of an Accumulator.
pub trait AggregateFunctionCreator: AggrFuncTypeStore {
/// Create a function that can create a new accumulator with some input data type.
fn creator(&self) -> AccumulatorCreatorFunction;
/// Get the input data type of the Accumulator.
fn input_types(&self) -> Result<Vec<ConcreteDataType>>;
/// Store the input data type that is provided by DataFusion at runtime.
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()>;
/// Get the Accumulator's output data type.
fn output_type(&self) -> Result<ConcreteDataType>;
@@ -65,6 +60,20 @@ pub trait AggregateFunctionCreator: Send + Sync + Debug {
fn state_types(&self) -> Result<Vec<ConcreteDataType>>;
}
/// `AggrFuncTypeStore` stores the aggregate function's input data's types.
///
/// When creating an Accumulator generically, we have to know the input data's types.
/// However, DataFusion does not provide these types at the time the Accumulator is created.
/// To solve the problem, we store the datatypes upfront here.
pub trait AggrFuncTypeStore: Send + Sync + Debug {
/// Get the input data types of the Accumulator.
fn input_types(&self) -> Result<Vec<ConcreteDataType>>;
/// Store the input data types that are provided by DataFusion at runtime (when it is evaluating
/// the return type function).
fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()>;
}
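The `#[as_aggr_func_creator]` / `AggrFuncTypeStore` macro pair is assumed to generate roughly the storage that each creator previously wrote by hand: an `ArcSwapOption` field plus the `input_types`/`set_input_types` methods visible in the blocks removed above. A sketch of such an implementation, reusing the crate's `ConcreteDataType`, `Result`, `ensure!` and `InvalidInputStateSnafu` (so it is not standalone):

use std::sync::Arc;
use arc_swap::ArcSwapOption;

#[derive(Debug, Default)]
struct MyCreator {
    input_types: ArcSwapOption<Vec<ConcreteDataType>>,
}

impl AggrFuncTypeStore for MyCreator {
    fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
        let input_types = self.input_types.load();
        ensure!(input_types.is_some(), InvalidInputStateSnafu);
        Ok(input_types.as_ref().unwrap().as_ref().clone())
    }

    fn set_input_types(&self, input_types: Vec<ConcreteDataType>) -> Result<()> {
        // Setting the types twice must not change them.
        let old = self.input_types.swap(Some(Arc::new(input_types.clone())));
        if let Some(old) = old {
            ensure!(old.len() == input_types.len(), InvalidInputStateSnafu);
            for (x, y) in old.iter().zip(input_types.iter()) {
                ensure!(x == y, InvalidInputStateSnafu);
            }
        }
        Ok(())
    }
}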
pub fn make_accumulator_function(
creator: Arc<dyn AggregateFunctionCreator>,
) -> AccumulatorFunctionImpl {
@@ -178,15 +187,26 @@ fn try_into_scalar_value(value: Value, datatype: &ConcreteDataType) -> Result<Sc
Value::Int64(v) => ScalarValue::Int64(Some(v)),
Value::Float32(v) => ScalarValue::Float32(Some(v.0)),
Value::Float64(v) => ScalarValue::Float64(Some(v.0)),
Value::String(v) => ScalarValue::LargeUtf8(Some(v.as_utf8().to_string())),
Value::String(v) => ScalarValue::Utf8(Some(v.as_utf8().to_string())),
Value::Binary(v) => ScalarValue::LargeBinary(Some(v.to_vec())),
Value::Date(v) => ScalarValue::Date32(Some(v.val())),
Value::DateTime(v) => ScalarValue::Date64(Some(v.val())),
Value::Null => try_convert_null_value(datatype)?,
Value::List(list) => try_convert_list_value(list)?,
Value::Timestamp(t) => timestamp_to_scalar_value(t.unit(), Some(t.value())),
Value::Geometry(_) => todo!(),
})
}
fn timestamp_to_scalar_value(unit: TimeUnit, val: Option<i64>) -> ScalarValue {
match unit {
TimeUnit::Second => ScalarValue::TimestampSecond(val, None),
TimeUnit::Millisecond => ScalarValue::TimestampMillisecond(val, None),
TimeUnit::Microsecond => ScalarValue::TimestampMicrosecond(val, None),
TimeUnit::Nanosecond => ScalarValue::TimestampNanosecond(val, None),
}
}
fn try_convert_null_value(datatype: &ConcreteDataType) -> Result<ScalarValue> {
Ok(match datatype {
ConcreteDataType::Boolean(_) => ScalarValue::Boolean(None),
@@ -201,7 +221,8 @@ fn try_convert_null_value(datatype: &ConcreteDataType) -> Result<ScalarValue> {
ConcreteDataType::Float32(_) => ScalarValue::Float32(None),
ConcreteDataType::Float64(_) => ScalarValue::Float64(None),
ConcreteDataType::Binary(_) => ScalarValue::LargeBinary(None),
ConcreteDataType::String(_) => ScalarValue::LargeUtf8(None),
ConcreteDataType::String(_) => ScalarValue::Utf8(None),
ConcreteDataType::Timestamp(t) => timestamp_to_scalar_value(t.unit, None),
_ => {
return error::BadAccumulatorImplSnafu {
err_msg: format!(
@@ -330,10 +351,10 @@ mod tests {
.unwrap()
);
assert_eq!(
ScalarValue::LargeUtf8(Some("hello".to_string())),
ScalarValue::Utf8(Some("hello".to_string())),
try_into_scalar_value(
Value::String(StringBytes::from("hello")),
&ConcreteDataType::string_datatype()
&ConcreteDataType::string_datatype(),
)
.unwrap()
);
@@ -394,7 +415,7 @@ mod tests {
try_into_scalar_value(Value::Null, &ConcreteDataType::float64_datatype()).unwrap()
);
assert_eq!(
ScalarValue::LargeUtf8(None),
ScalarValue::Utf8(None),
try_into_scalar_value(Value::Null, &ConcreteDataType::string_datatype()).unwrap()
);
assert_eq!(
@@ -427,4 +448,24 @@ mod tests {
_ => unreachable!(),
}
}
#[test]
pub fn test_timestamp_to_scalar_value() {
assert_eq!(
ScalarValue::TimestampSecond(Some(1), None),
timestamp_to_scalar_value(TimeUnit::Second, Some(1))
);
assert_eq!(
ScalarValue::TimestampMillisecond(Some(1), None),
timestamp_to_scalar_value(TimeUnit::Millisecond, Some(1))
);
assert_eq!(
ScalarValue::TimestampMicrosecond(Some(1), None),
timestamp_to_scalar_value(TimeUnit::Microsecond, Some(1))
);
assert_eq!(
ScalarValue::TimestampNanosecond(Some(1), None),
timestamp_to_scalar_value(TimeUnit::Nanosecond, Some(1))
);
}
}

View File

@@ -1,4 +1,4 @@
mod accumulator;
pub mod accumulator;
mod expr;
mod udaf;
mod udf;
@@ -177,11 +177,7 @@ mod tests {
#[derive(Debug)]
struct DummyAccumulatorCreator;
impl AggregateFunctionCreator for DummyAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
Arc::new(|_| Ok(Box::new(DummyAccumulator)))
}
impl AggrFuncTypeStore for DummyAccumulatorCreator {
fn input_types(&self) -> Result<Vec<ConcreteDataType>> {
Ok(vec![ConcreteDataType::float64_datatype()])
}
@@ -189,6 +185,12 @@ mod tests {
fn set_input_types(&self, _: Vec<ConcreteDataType>) -> Result<()> {
Ok(())
}
}
impl AggregateFunctionCreator for DummyAccumulatorCreator {
fn creator(&self) -> AccumulatorCreatorFunction {
Arc::new(|_| Ok(Box::new(DummyAccumulator)))
}
fn output_type(&self) -> Result<ConcreteDataType> {
Ok(self.input_types()?.into_iter().next().unwrap())

View File

@@ -5,8 +5,8 @@ edition = "2021"
[dependencies]
common-error = { path = "../error" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2", features = ["simd"]}
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2"}
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = ["simd"] }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2" }
datatypes = { path = "../../datatypes" }
futures = "0.3"
paste = "1.0"

View File

@@ -27,13 +27,21 @@ pub enum InnerError {
#[snafu(backtrace)]
source: BoxedError,
},
#[snafu(display("Failed to create RecordBatches, reason: {}", reason))]
CreateRecordBatches {
reason: String,
backtrace: Backtrace,
},
}
impl ErrorExt for InnerError {
fn status_code(&self) -> StatusCode {
match self {
InnerError::NewDfRecordBatch { .. } => StatusCode::InvalidArguments,
InnerError::DataTypes { .. } => StatusCode::Internal,
InnerError::DataTypes { .. } | InnerError::CreateRecordBatches { .. } => {
StatusCode::Internal
}
InnerError::External { source } => source.status_code(),
}
}

View File

@@ -9,6 +9,7 @@ use error::Result;
use futures::task::{Context, Poll};
use futures::Stream;
pub use recordbatch::RecordBatch;
use snafu::ensure;
pub trait RecordBatchStream: Stream<Item = Result<RecordBatch>> {
fn schema(&self) -> SchemaRef;
@@ -43,3 +44,76 @@ impl Stream for EmptyRecordBatchStream {
Poll::Ready(None)
}
}
#[derive(Debug)]
pub struct RecordBatches {
schema: SchemaRef,
batches: Vec<RecordBatch>,
}
impl RecordBatches {
pub fn try_new(schema: SchemaRef, batches: Vec<RecordBatch>) -> Result<Self> {
for batch in batches.iter() {
ensure!(
batch.schema == schema,
error::CreateRecordBatchesSnafu {
reason: format!(
"expect RecordBatch schema equals {:?}, actual: {:?}",
schema, batch.schema
)
}
)
}
Ok(Self { schema, batches })
}
pub fn schema(&self) -> SchemaRef {
self.schema.clone()
}
pub fn take(self) -> Vec<RecordBatch> {
self.batches
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use datatypes::prelude::{ConcreteDataType, VectorRef};
use datatypes::schema::{ColumnSchema, Schema};
use datatypes::vectors::{BooleanVector, Int32Vector, StringVector};
use super::*;
#[test]
fn test_recordbatches() {
let column_a = ColumnSchema::new("a", ConcreteDataType::int32_datatype(), false);
let column_b = ColumnSchema::new("b", ConcreteDataType::string_datatype(), false);
let column_c = ColumnSchema::new("c", ConcreteDataType::boolean_datatype(), false);
let va: VectorRef = Arc::new(Int32Vector::from_slice(&[1, 2]));
let vb: VectorRef = Arc::new(StringVector::from(vec!["hello", "world"]));
let vc: VectorRef = Arc::new(BooleanVector::from(vec![true, false]));
let schema1 = Arc::new(Schema::new(vec![column_a.clone(), column_b]));
let batch1 = RecordBatch::new(schema1.clone(), vec![va.clone(), vb]).unwrap();
let schema2 = Arc::new(Schema::new(vec![column_a, column_c]));
let batch2 = RecordBatch::new(schema2.clone(), vec![va, vc]).unwrap();
let result = RecordBatches::try_new(schema1.clone(), vec![batch1.clone(), batch2]);
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
format!(
"Failed to create RecordBatches, reason: expect RecordBatch schema equals {:?}, actual: {:?}",
schema1, schema2
)
);
let batches = RecordBatches::try_new(schema1.clone(), vec![batch1.clone()]).unwrap();
assert_eq!(schema1, batches.schema());
assert_eq!(vec![batch1], batches.take());
}
}

View File

@@ -3,7 +3,6 @@ name = "common-runtime"
version = "0.1.0"
edition = "2021"
[dependencies]
common-error = { path = "../error" }
common-telemetry = { path = "../telemetry" }

View File

@@ -2,12 +2,11 @@
name = "common-telemetry"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[features]
console = ["console-subscriber"]
deadlock_detection=["parking_lot"]
deadlock_detection = ["parking_lot"]
[dependencies]
backtrace = "0.3"

View File

@@ -29,12 +29,17 @@ pub fn init_default_ut_logging() {
START.call_once(|| {
let mut g = GLOBAL_UT_LOG_GUARD.as_ref().lock().unwrap();
*g = Some(init_global_logging(
"unittest",
"__unittest_logs",
"DEBUG",
false,
));
// When running in GitHub Actions, the env var "UNITTEST_LOG_DIR" is set to a directory other
// than "/tmp".
// This works around the small "/tmp" disk on the action runners: if we wrote testing logs
// there, actions would fail with a disk-out-of-space error.
let dir =
env::var("UNITTEST_LOG_DIR").unwrap_or_else(|_| "/tmp/__unittest_logs".to_string());
*g = Some(init_global_logging("unittest", &dir, "DEBUG", false));
info!("logs dir = {}", dir);
});
}
@@ -74,6 +79,7 @@ pub fn init_global_logging(
.with_target("datafusion", Level::WARN)
.with_target("reqwest", Level::WARN)
.with_target("sqlparser", Level::WARN)
.with_target("h2", Level::INFO)
.with_default(
directives
.parse::<filter::LevelFilter>()

View File

@@ -2,7 +2,6 @@
name = "common-time"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]

View File

@@ -33,6 +33,12 @@ impl FromStr for Date {
}
}
impl From<i32> for Date {
fn from(v: i32) -> Self {
Self(v)
}
}
impl Display for Date {
/// [Date] is formatted according to ISO-8601 standard.
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
@@ -86,4 +92,10 @@ mod tests {
date.0 += 1000;
assert_eq!(date, Date::from_str(&date.to_string()).unwrap());
}
#[test]
pub fn test_from() {
let d: Date = 42.into();
assert_eq!(42, d.val());
}
}

View File

@@ -38,6 +38,12 @@ impl FromStr for DateTime {
}
}
impl From<i64> for DateTime {
fn from(v: i64) -> Self {
Self(v)
}
}
impl DateTime {
pub fn new(val: i64) -> Self {
Self(val)
@@ -65,4 +71,10 @@ mod tests {
let dt = DateTime::from_str(time).unwrap();
assert_eq!(time, &dt.to_string());
}
#[test]
pub fn test_from() {
let d: DateTime = 42.into();
assert_eq!(42, d.val());
}
}

View File

@@ -1,11 +1,33 @@
use chrono::ParseError;
use snafu::Snafu;
use snafu::{Backtrace, Snafu};
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
#[snafu(display("Failed to parse string to date, raw: {}, source: {}", raw, source))]
ParseDateStr { raw: String, source: ParseError },
#[snafu(display("Failed to parse a string into Timestamp, raw string: {}", raw))]
ParseTimestamp { raw: String, backtrace: Backtrace },
}
pub type Result<T> = std::result::Result<T, Error>;
#[cfg(test)]
mod tests {
use chrono::NaiveDateTime;
use snafu::ResultExt;
use super::*;
#[test]
fn test_errors() {
let raw = "2020-09-08T13:42:29.190855Z";
let result = NaiveDateTime::parse_from_str(raw, "%F").context(ParseDateStrSnafu { raw });
assert!(matches!(result.err().unwrap(), Error::ParseDateStr { .. }));
assert_eq!(
"Failed to parse a string into Timestamp, raw string: 2020-09-08T13:42:29.190855Z",
ParseTimestampSnafu { raw }.build().to_string()
);
}
}

View File

@@ -3,7 +3,11 @@ pub mod datetime;
pub mod error;
pub mod range;
pub mod timestamp;
pub mod timestamp_millis;
pub mod util;
pub use date::Date;
pub use datetime::DateTime;
pub use range::RangeMillis;
pub use timestamp::TimestampMillis;
pub use timestamp::Timestamp;
pub use timestamp_millis::TimestampMillis;

View File

@@ -1,4 +1,4 @@
use crate::timestamp::TimestampMillis;
use crate::timestamp_millis::TimestampMillis;
/// A half-open time range.
///

View File

@@ -1,123 +1,268 @@
use core::default::Default;
use std::cmp::Ordering;
use std::hash::{Hash, Hasher};
use std::str::FromStr;
/// Unix timestamp in millisecond resolution.
///
/// Negative timestamps are allowed; they represent times before '1970-01-01T00:00:00'.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct TimestampMillis(i64);
use chrono::{offset::Local, DateTime, LocalResult, NaiveDateTime, TimeZone, Utc};
use serde::{Deserialize, Serialize};
impl TimestampMillis {
/// Positive infinity.
pub const INF: TimestampMillis = TimestampMillis::new(i64::MAX);
/// Maximum value of a timestamp.
///
/// The maximum value of i64 is reserved for infinity.
pub const MAX: TimestampMillis = TimestampMillis::new(i64::MAX - 1);
/// Minimum value of a timestamp.
pub const MIN: TimestampMillis = TimestampMillis::new(i64::MIN);
use crate::error::{Error, ParseTimestampSnafu};
/// Create a new timestamp from unix timestamp in milliseconds.
pub const fn new(ms: i64) -> TimestampMillis {
TimestampMillis(ms)
#[derive(Debug, Clone, Default, Copy, Serialize, Deserialize)]
pub struct Timestamp {
value: i64,
unit: TimeUnit,
}
impl Timestamp {
pub fn new(value: i64, unit: TimeUnit) -> Self {
Self { unit, value }
}
/// Returns the timestamp aligned by `bucket_duration` in milliseconds or
/// `None` if overflow occurred.
///
/// # Panics
/// Panics if `bucket_duration <= 0`.
pub fn align_by_bucket(self, bucket_duration: i64) -> Option<TimestampMillis> {
assert!(bucket_duration > 0);
let ts = if self.0 >= 0 {
self.0
} else {
// `bucket_duration > 0` implies `bucket_duration - 1` won't overflow.
self.0.checked_sub(bucket_duration - 1)?
};
Some(TimestampMillis(ts / bucket_duration * bucket_duration))
pub fn from_millis(value: i64) -> Self {
Self {
value,
unit: TimeUnit::Millisecond,
}
}
/// Returns the timestamp value as i64.
pub fn as_i64(&self) -> i64 {
self.0
pub fn unit(&self) -> TimeUnit {
self.unit
}
pub fn value(&self) -> i64 {
self.value
}
pub fn convert_to(&self, unit: TimeUnit) -> i64 {
// TODO(hl): may result in overflow
self.value * self.unit.factor() / unit.factor()
}
}
impl From<i64> for TimestampMillis {
fn from(ms: i64) -> TimestampMillis {
TimestampMillis::new(ms)
impl FromStr for Timestamp {
type Err = Error;
/// Accepts a string in RFC 3339 / ISO 8601 standard format (and some variants) and converts it to a nanosecond-precision timestamp.
/// This code is copied from [arrow-datafusion](https://github.com/apache/arrow-datafusion/blob/arrow2/datafusion-physical-expr/src/arrow_temporal_util.rs#L71)
/// with some bugfixes.
/// Supported format:
/// - `2022-09-20T14:16:43.012345Z` (Zulu timezone)
/// - `2022-09-20T14:16:43.012345+08:00` (Explicit offset)
/// - `2022-09-20T14:16:43.012345` (local timezone, with T)
/// - `2022-09-20T14:16:43` (local timezone, no fractional seconds, with T)
/// - `2022-09-20 14:16:43.012345Z` (Zulu timezone, without T)
/// - `2022-09-20 14:16:43` (local timezone, without T)
/// - `2022-09-20 14:16:43.012345` (local timezone, without T)
fn from_str(s: &str) -> Result<Self, Self::Err> {
// RFC3339 timestamp (with a T)
if let Ok(ts) = DateTime::parse_from_rfc3339(s) {
return Ok(Timestamp::new(ts.timestamp_nanos(), TimeUnit::Nanosecond));
}
if let Ok(ts) = DateTime::parse_from_str(s, "%Y-%m-%d %H:%M:%S%.f%:z") {
return Ok(Timestamp::new(ts.timestamp_nanos(), TimeUnit::Nanosecond));
}
if let Ok(ts) = Utc.datetime_from_str(s, "%Y-%m-%d %H:%M:%S%.fZ") {
return Ok(Timestamp::new(ts.timestamp_nanos(), TimeUnit::Nanosecond));
}
if let Ok(ts) = NaiveDateTime::parse_from_str(s, "%Y-%m-%dT%H:%M:%S") {
return naive_datetime_to_timestamp(s, ts);
}
if let Ok(ts) = NaiveDateTime::parse_from_str(s, "%Y-%m-%dT%H:%M:%S%.f") {
return naive_datetime_to_timestamp(s, ts);
}
if let Ok(ts) = NaiveDateTime::parse_from_str(s, "%Y-%m-%d %H:%M:%S") {
return naive_datetime_to_timestamp(s, ts);
}
if let Ok(ts) = NaiveDateTime::parse_from_str(s, "%Y-%m-%d %H:%M:%S%.f") {
return naive_datetime_to_timestamp(s, ts);
}
ParseTimestampSnafu { raw: s }.fail()
}
}
impl PartialEq<i64> for TimestampMillis {
fn eq(&self, other: &i64) -> bool {
self.0 == *other
/// Converts the naive datetime (which has no specific timezone) to a
/// nanosecond epoch timestamp relative to UTC.
/// This code is copied from [arrow-datafusion](https://github.com/apache/arrow-datafusion/blob/arrow2/datafusion-physical-expr/src/arrow_temporal_util.rs#L137).
fn naive_datetime_to_timestamp(
s: &str,
datetime: NaiveDateTime,
) -> crate::error::Result<Timestamp> {
let l = Local {};
match l.from_local_datetime(&datetime) {
LocalResult::None => ParseTimestampSnafu { raw: s }.fail(),
LocalResult::Single(local_datetime) => Ok(Timestamp::new(
local_datetime.with_timezone(&Utc).timestamp_nanos(),
TimeUnit::Nanosecond,
)),
LocalResult::Ambiguous(local_datetime, _) => Ok(Timestamp::new(
local_datetime.with_timezone(&Utc).timestamp_nanos(),
TimeUnit::Nanosecond,
)),
}
}
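For strings without an explicit offset, `naive_datetime_to_timestamp` interprets the wall-clock time in the machine's local timezone before converting to UTC nanoseconds, so the parsed value depends on the local offset. A small chrono-only sketch of that conversion (assuming the same chrono APIs used above):

use chrono::{Local, LocalResult, NaiveDateTime, TimeZone, Utc};

fn main() {
    // Parse a timestamp string that carries no timezone information.
    let naive =
        NaiveDateTime::parse_from_str("2020-09-08 13:42:29", "%Y-%m-%d %H:%M:%S").unwrap();
    // Attach the local timezone, then convert to UTC nanoseconds, mirroring
    // naive_datetime_to_timestamp above. The result depends on the local offset.
    if let LocalResult::Single(local) = Local.from_local_datetime(&naive) {
        println!("nanos since epoch (UTC): {}", local.with_timezone(&Utc).timestamp_nanos());
    }
}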
impl PartialEq<TimestampMillis> for i64 {
fn eq(&self, other: &TimestampMillis) -> bool {
*self == other.0
impl From<i64> for Timestamp {
fn from(v: i64) -> Self {
Self {
value: v,
unit: TimeUnit::Millisecond,
}
}
}
impl PartialOrd<i64> for TimestampMillis {
fn partial_cmp(&self, other: &i64) -> Option<Ordering> {
Some(self.0.cmp(other))
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum TimeUnit {
Second,
#[default]
Millisecond,
Microsecond,
Nanosecond,
}
impl TimeUnit {
pub fn factor(&self) -> i64 {
match self {
TimeUnit::Second => 1_000_000_000,
TimeUnit::Millisecond => 1_000_000,
TimeUnit::Microsecond => 1_000,
TimeUnit::Nanosecond => 1,
}
}
}
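`TimeUnit::factor()` is the number of nanoseconds per unit, so `convert_to` (and the ordering impls below) scale a value by the ratio of two factors. A standalone sketch of the same arithmetic with a hypothetical `Unit` enum mirroring `TimeUnit`:

// Standalone illustration of the factor-based conversion above.
#[derive(Clone, Copy)]
enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

impl Unit {
    // Nanoseconds per unit, matching TimeUnit::factor().
    fn factor(self) -> i64 {
        match self {
            Unit::Second => 1_000_000_000,
            Unit::Millisecond => 1_000_000,
            Unit::Microsecond => 1_000,
            Unit::Nanosecond => 1,
        }
    }
}

// Same arithmetic as Timestamp::convert_to; as the TODO above notes, the intermediate
// multiplication can overflow i64 for very large values.
fn convert(value: i64, from: Unit, to: Unit) -> i64 {
    value * from.factor() / to.factor()
}

fn main() {
    assert_eq!(1_000, convert(1, Unit::Second, Unit::Millisecond));
    assert_eq!(1_000, convert(1, Unit::Millisecond, Unit::Microsecond));
    assert_eq!(1, convert(1_000_000, Unit::Nanosecond, Unit::Millisecond));
}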
impl PartialOrd<TimestampMillis> for i64 {
fn partial_cmp(&self, other: &TimestampMillis) -> Option<Ordering> {
Some(self.cmp(&other.0))
impl PartialOrd for Timestamp {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
(self.value * self.unit.factor()).partial_cmp(&(other.value * other.unit.factor()))
}
}
impl Ord for Timestamp {
fn cmp(&self, other: &Self) -> Ordering {
(self.value * self.unit.factor()).cmp(&(other.value * other.unit.factor()))
}
}
impl PartialEq for Timestamp {
fn eq(&self, other: &Self) -> bool {
self.convert_to(TimeUnit::Nanosecond) == other.convert_to(TimeUnit::Nanosecond)
}
}
impl Eq for Timestamp {}
impl Hash for Timestamp {
fn hash<H: Hasher>(&self, state: &mut H) {
state.write_i64(self.convert_to(TimeUnit::Nanosecond));
state.finish();
}
}
#[cfg(test)]
mod tests {
use chrono::Offset;
use super::*;
#[test]
fn test_timestamp() {
let ts = 123456;
let timestamp = TimestampMillis::from(ts);
assert_eq!(timestamp, ts);
assert_eq!(ts, timestamp);
assert_eq!(ts, timestamp.as_i64());
assert_ne!(TimestampMillis::new(0), timestamp);
assert!(TimestampMillis::new(-123) < TimestampMillis::new(0));
assert!(TimestampMillis::new(10) < 20);
assert!(10 < TimestampMillis::new(20));
assert_eq!(i64::MAX, TimestampMillis::INF);
assert_eq!(i64::MAX - 1, TimestampMillis::MAX);
assert_eq!(i64::MIN, TimestampMillis::MIN);
pub fn test_time_unit() {
assert_eq!(
TimeUnit::Millisecond.factor() * 1000,
TimeUnit::Second.factor()
);
assert_eq!(
TimeUnit::Microsecond.factor() * 1000000,
TimeUnit::Second.factor()
);
assert_eq!(
TimeUnit::Nanosecond.factor() * 1000000000,
TimeUnit::Second.factor()
);
}
#[test]
fn test_align_by_bucket() {
let bucket = 100;
assert_eq!(0, TimestampMillis::new(0).align_by_bucket(bucket).unwrap());
assert_eq!(0, TimestampMillis::new(1).align_by_bucket(bucket).unwrap());
assert_eq!(0, TimestampMillis::new(99).align_by_bucket(bucket).unwrap());
assert_eq!(
100,
TimestampMillis::new(100).align_by_bucket(bucket).unwrap()
pub fn test_timestamp() {
let t = Timestamp::new(1, TimeUnit::Millisecond);
assert_eq!(TimeUnit::Millisecond, t.unit());
assert_eq!(1, t.value());
assert_eq!(Timestamp::new(1000, TimeUnit::Microsecond), t);
assert!(t > Timestamp::new(999, TimeUnit::Microsecond));
}
#[test]
pub fn test_from_i64() {
let t: Timestamp = 42.into();
assert_eq!(42, t.value());
assert_eq!(TimeUnit::Millisecond, t.unit());
}
// The input timestamp string is interpreted in the local timezone when no timezone is specified,
// but the expected timestamp is in the UTC timezone.
fn check_from_str(s: &str, expect: &str) {
let ts = Timestamp::from_str(s).unwrap();
let time = NaiveDateTime::from_timestamp(
ts.value / 1_000_000_000,
(ts.value % 1_000_000_000) as u32,
);
assert_eq!(
100,
TimestampMillis::new(199).align_by_bucket(bucket).unwrap()
assert_eq!(expect, time.to_string());
}
#[test]
fn test_from_str() {
// Explicit Z means timestamp in UTC
check_from_str("2020-09-08 13:42:29Z", "2020-09-08 13:42:29");
check_from_str("2020-09-08T13:42:29+08:00", "2020-09-08 05:42:29");
check_from_str(
"2020-09-08 13:42:29",
&NaiveDateTime::from_timestamp_opt(
1599572549 - Local.timestamp(0, 0).offset().fix().local_minus_utc() as i64,
0,
)
.unwrap()
.to_string(),
);
assert_eq!(0, TimestampMillis::MAX.align_by_bucket(i64::MAX).unwrap());
assert_eq!(
i64::MAX,
TimestampMillis::INF.align_by_bucket(i64::MAX).unwrap()
check_from_str(
"2020-09-08T13:42:29",
&NaiveDateTime::from_timestamp_opt(
1599572549 - Local.timestamp(0, 0).offset().fix().local_minus_utc() as i64,
0,
)
.unwrap()
.to_string(),
);
assert_eq!(None, TimestampMillis::MIN.align_by_bucket(bucket));
check_from_str(
"2020-09-08 13:42:29.042",
&NaiveDateTime::from_timestamp_opt(
1599572549 - Local.timestamp(0, 0).offset().fix().local_minus_utc() as i64,
42000000,
)
.unwrap()
.to_string(),
);
check_from_str("2020-09-08 13:42:29.042Z", "2020-09-08 13:42:29.042");
check_from_str("2020-09-08 13:42:29.042+08:00", "2020-09-08 05:42:29.042");
check_from_str(
"2020-09-08T13:42:29.042",
&NaiveDateTime::from_timestamp_opt(
1599572549 - Local.timestamp(0, 0).offset().fix().local_minus_utc() as i64,
42000000,
)
.unwrap()
.to_string(),
);
check_from_str("2020-09-08T13:42:29+08:00", "2020-09-08 05:42:29");
check_from_str(
"2020-09-08T13:42:29.0042+08:00",
"2020-09-08 05:42:29.004200",
);
}
}

View File

@@ -0,0 +1,123 @@
use std::cmp::Ordering;
/// Unix timestamp in millisecond resolution.
///
/// Negative timestamps are allowed; they represent times before '1970-01-01T00:00:00'.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct TimestampMillis(i64);
impl TimestampMillis {
/// Positive infinity.
pub const INF: TimestampMillis = TimestampMillis::new(i64::MAX);
/// Maximum value of a timestamp.
///
/// The maximum value of i64 is reserved for infinity.
pub const MAX: TimestampMillis = TimestampMillis::new(i64::MAX - 1);
/// Minimum value of a timestamp.
pub const MIN: TimestampMillis = TimestampMillis::new(i64::MIN);
/// Create a new timestamp from unix timestamp in milliseconds.
pub const fn new(ms: i64) -> TimestampMillis {
TimestampMillis(ms)
}
/// Returns the timestamp aligned by `bucket_duration` in milliseconds or
/// `None` if overflow occurred.
///
/// # Panics
/// Panics if `bucket_duration <= 0`.
pub fn align_by_bucket(self, bucket_duration: i64) -> Option<TimestampMillis> {
assert!(bucket_duration > 0);
let ts = if self.0 >= 0 {
self.0
} else {
// `bucket_duration > 0` implies `bucket_duration - 1` won't overflow.
self.0.checked_sub(bucket_duration - 1)?
};
Some(TimestampMillis(ts / bucket_duration * bucket_duration))
}
/// Returns the timestamp value as i64.
pub fn as_i64(&self) -> i64 {
self.0
}
}
impl From<i64> for TimestampMillis {
fn from(ms: i64) -> TimestampMillis {
TimestampMillis::new(ms)
}
}
impl PartialEq<i64> for TimestampMillis {
fn eq(&self, other: &i64) -> bool {
self.0 == *other
}
}
impl PartialEq<TimestampMillis> for i64 {
fn eq(&self, other: &TimestampMillis) -> bool {
*self == other.0
}
}
impl PartialOrd<i64> for TimestampMillis {
fn partial_cmp(&self, other: &i64) -> Option<Ordering> {
Some(self.0.cmp(other))
}
}
impl PartialOrd<TimestampMillis> for i64 {
fn partial_cmp(&self, other: &TimestampMillis) -> Option<Ordering> {
Some(self.cmp(&other.0))
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_timestamp() {
let ts = 123456;
let timestamp = TimestampMillis::from(ts);
assert_eq!(timestamp, ts);
assert_eq!(ts, timestamp);
assert_eq!(ts, timestamp.as_i64());
assert_ne!(TimestampMillis::new(0), timestamp);
assert!(TimestampMillis::new(-123) < TimestampMillis::new(0));
assert!(TimestampMillis::new(10) < 20);
assert!(10 < TimestampMillis::new(20));
assert_eq!(i64::MAX, TimestampMillis::INF);
assert_eq!(i64::MAX - 1, TimestampMillis::MAX);
assert_eq!(i64::MIN, TimestampMillis::MIN);
}
#[test]
fn test_align_by_bucket() {
let bucket = 100;
assert_eq!(0, TimestampMillis::new(0).align_by_bucket(bucket).unwrap());
assert_eq!(0, TimestampMillis::new(1).align_by_bucket(bucket).unwrap());
assert_eq!(0, TimestampMillis::new(99).align_by_bucket(bucket).unwrap());
assert_eq!(
100,
TimestampMillis::new(100).align_by_bucket(bucket).unwrap()
);
assert_eq!(
100,
TimestampMillis::new(199).align_by_bucket(bucket).unwrap()
);
assert_eq!(0, TimestampMillis::MAX.align_by_bucket(i64::MAX).unwrap());
assert_eq!(
i64::MAX,
TimestampMillis::INF.align_by_bucket(i64::MAX).unwrap()
);
assert_eq!(None, TimestampMillis::MIN.align_by_bucket(bucket));
}
}

View File

@@ -3,13 +3,6 @@ name = "datanode"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies.arrow]
package = "arrow2"
version = "0.10"
features = ["io_csv", "io_json", "io_parquet", "io_parquet_compression", "io_ipc", "ahash", "compute", "serde_types"]
[features]
default = ["python"]
python = [
@@ -19,18 +12,20 @@ python = [
[dependencies]
api = { path = "../api" }
async-trait = "0.1"
axum = "0.5"
axum-macros = "0.2"
axum = "0.6.0-rc.2"
axum-macros = "0.3.0-rc.1"
catalog = { path = "../catalog" }
common-base = { path = "../common/base" }
common-error = { path = "../common/error" }
common-grpc = { path = "../common/grpc" }
common-query = { path = "../common/query" }
common-recordbatch = { path = "../common/recordbatch" }
common-runtime = { path = "../common/runtime" }
common-telemetry = { path = "../common/telemetry" }
common-time = { path = "../common/time" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = ["simd"] }
datatypes = { path = "../datatypes" }
futures = "0.3"
hyper = { version = "0.14", features = ["full"] }
log-store = { path = "../log-store" }
metrics = "0.20"
@@ -47,19 +42,26 @@ store-api = { path = "../store-api" }
table = { path = "../table" }
table-engine = { path = "../table-engine", features = ["test"] }
tokio = { version = "1.18", features = ["full"] }
tonic = "0.8"
tokio-stream = { version = "0.1.8", features = ["net"] }
tower = { version = "0.4", features = ["full"]}
tower-http = { version ="0.3", features = ["full"]}
tonic = "0.8"
tower = { version = "0.4", features = ["full"] }
tower-http = { version = "0.3", features = ["full"] }
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies.arrow]
package = "arrow2"
version = "0.10"
features = ["io_csv", "io_json", "io_parquet", "io_parquet_compression", "io_ipc", "ahash", "compute", "serde_types"]
[dev-dependencies]
axum-test-helper = "0.1"
client = { path = "../client" }
common-query = { path = "../common/query" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git" , branch = "arrow2", features = ["simd"]}
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = ["simd"] }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2" }
tempdir = "0.3"
[dev-dependencies.arrow]
package = "arrow2"
version="0.10"
version = "0.10"
features = ["io_csv", "io_json", "io_parquet", "io_parquet_compression", "io_ipc", "ahash", "compute", "serde_types"]

View File

@@ -26,6 +26,8 @@ pub struct DatanodeOptions {
pub rpc_addr: String,
pub mysql_addr: String,
pub mysql_runtime_size: u32,
pub postgres_addr: String,
pub postgres_runtime_size: u32,
pub wal_dir: String,
pub storage: ObjectStoreConfig,
}
@@ -37,6 +39,8 @@ impl Default for DatanodeOptions {
rpc_addr: "0.0.0.0:3001".to_string(),
mysql_addr: "0.0.0.0:3306".to_string(),
mysql_runtime_size: 2,
postgres_addr: "0.0.0.0:5432".to_string(),
postgres_runtime_size: 2,
wal_dir: "/tmp/greptimedb/wal".to_string(),
storage: ObjectStoreConfig::default(),
}

View File

@@ -1,9 +1,8 @@
use std::any::Any;
use api::serde::DecodeError;
use common_error::ext::BoxedError;
use common_error::prelude::*;
use datatypes::prelude::ConcreteDataType;
use datatypes::arrow::error::ArrowError;
use storage::error::Error as StorageError;
use table::error::Error as TableError;
@@ -29,6 +28,12 @@ pub enum Error {
source: catalog::error::Error,
},
#[snafu(display("Catalog not found: {}", name))]
CatalogNotFound { name: String, backtrace: Backtrace },
#[snafu(display("Schema not found: {}", name))]
SchemaNotFound { name: String, backtrace: Backtrace },
#[snafu(display("Failed to create table: {}, source: {}", table_name, source))]
CreateTable {
table_name: String,
@@ -40,7 +45,14 @@ pub enum Error {
GetTable {
table_name: String,
#[snafu(backtrace)]
source: BoxedError,
source: TableError,
},
#[snafu(display("Failed to alter table {}, source: {}", table_name, source))]
AlterTable {
table_name: String,
#[snafu(backtrace)]
source: TableError,
},
#[snafu(display("Table not found: {}", table_name))]
@@ -52,6 +64,9 @@ pub enum Error {
table_name: String,
},
#[snafu(display("Missing required field in protobuf, field: {}", field))]
MissingField { field: String, backtrace: Backtrace },
#[snafu(display(
"Columns and values number mismatch, columns: {}, values: {}",
columns,
@@ -59,19 +74,10 @@ pub enum Error {
))]
ColumnValuesNumberMismatch { columns: usize, values: usize },
#[snafu(display("Failed to parse value: {}, {}", msg, backtrace))]
ParseSqlValue { msg: String, backtrace: Backtrace },
#[snafu(display(
"Column {} expect type: {:?}, actual: {:?}",
column_name,
expect,
actual
))]
ColumnTypeMismatch {
column_name: String,
expect: ConcreteDataType,
actual: ConcreteDataType,
#[snafu(display("Failed to parse sql value, source: {}", source))]
ParseSqlValue {
#[snafu(backtrace)]
source: sql::error::Error,
},
#[snafu(display("Failed to insert value to table: {}, source: {}", table_name, source))]
@@ -135,8 +141,8 @@ pub enum Error {
source: common_runtime::error::Error,
},
#[snafu(display("Invalid CREATE TABLE sql statement, cause: {}", msg))]
InvalidCreateTableSql { msg: String, backtrace: Backtrace },
#[snafu(display("Invalid SQL, error: {}", msg))]
InvalidSql { msg: String, backtrace: Backtrace },
#[snafu(display("Failed to create schema when creating table, source: {}", source))]
CreateSchema {
@@ -150,15 +156,12 @@ pub enum Error {
source: datatypes::error::Error,
},
#[snafu(display("SQL data type not supported yet: {:?}", t))]
SqlTypeNotSupported {
t: sql::ast::DataType,
backtrace: Backtrace,
},
#[snafu(display("Specified timestamp key or primary key column not found: {}", name))]
KeyColumnNotFound { name: String, backtrace: Backtrace },
#[snafu(display("Invalid primary key: {}", msg))]
InvalidPrimaryKey { msg: String, backtrace: Backtrace },
#[snafu(display(
"Constraint in CREATE TABLE statement is not supported yet: {}",
constraint
@@ -179,6 +182,71 @@ pub enum Error {
#[snafu(backtrace)]
source: common_grpc::Error,
},
#[snafu(display("Column datatype error, source: {}", source))]
ColumnDataType {
#[snafu(backtrace)]
source: api::error::Error,
},
#[snafu(display("Column default constraint error, source: {}", source))]
ColumnDefaultConstraint {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
#[snafu(display("Failed to parse SQL, source: {}", source))]
ParseSql {
#[snafu(backtrace)]
source: sql::error::Error,
},
#[snafu(display("Failed to start script manager, source: {}", source))]
StartScriptManager {
#[snafu(backtrace)]
source: script::error::Error,
},
#[snafu(display("Failed to collect RecordBatches, source: {}", source))]
CollectRecordBatches {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
#[snafu(display(
"Failed to parse string to timestamp, string: {}, source: {}",
raw,
source
))]
ParseTimestamp {
raw: String,
#[snafu(backtrace)]
source: common_time::error::Error,
},
#[snafu(display("Failed to create a new RecordBatch, source: {}", source))]
NewRecordBatch {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
#[snafu(display("Failed to create a new RecordBatches, source: {}", source))]
NewRecordBatches {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
#[snafu(display("Arrow computation error, source: {}", source))]
ArrowComputation {
backtrace: Backtrace,
source: ArrowError,
},
#[snafu(display("Failed to cast an arrow array into vector, source: {}", source))]
CastVector {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -189,22 +257,36 @@ impl ErrorExt for Error {
Error::ExecuteSql { source } => source.status_code(),
Error::ExecutePhysicalPlan { source } => source.status_code(),
Error::NewCatalog { source } => source.status_code(),
Error::CreateTable { source, .. } => source.status_code(),
Error::GetTable { source, .. } => source.status_code(),
Error::CreateTable { source, .. }
| Error::GetTable { source, .. }
| Error::AlterTable { source, .. } => source.status_code(),
Error::Insert { source, .. } => source.status_code(),
Error::ConvertSchema { source, .. } => source.status_code(),
Error::TableNotFound { .. } => StatusCode::TableNotFound,
Error::ColumnNotFound { .. } => StatusCode::TableColumnNotFound,
Error::ParseSqlValue { source, .. } | Error::ParseSql { source, .. } => {
source.status_code()
}
Error::CastVector { source, .. }
| Error::ColumnDefaultConstraint { source, .. }
| Error::CreateSchema { source, .. }
| Error::ConvertSchema { source, .. } => source.status_code(),
Error::ColumnValuesNumberMismatch { .. }
| Error::ParseSqlValue { .. }
| Error::ColumnTypeMismatch { .. }
| Error::IllegalInsertData { .. }
| Error::DecodeInsert { .. }
| Error::InvalidCreateTableSql { .. }
| Error::SqlTypeNotSupported { .. }
| Error::CreateSchema { .. }
| Error::InvalidSql { .. }
| Error::KeyColumnNotFound { .. }
| Error::ConstraintNotSupported { .. } => StatusCode::InvalidArguments,
| Error::InvalidPrimaryKey { .. }
| Error::MissingField { .. }
| Error::CatalogNotFound { .. }
| Error::SchemaNotFound { .. }
| Error::ConstraintNotSupported { .. }
| Error::ParseTimestamp { .. } => StatusCode::InvalidArguments,
// TODO(yingwen): Further categorize http error.
Error::StartServer { .. }
| Error::ParseAddr { .. }
@@ -214,11 +296,19 @@ impl ErrorExt for Error {
| Error::InsertSystemCatalog { .. }
| Error::Conversion { .. }
| Error::IntoPhysicalPlan { .. }
| Error::UnsupportedExpr { .. } => StatusCode::Internal,
| Error::UnsupportedExpr { .. }
| Error::ColumnDataType { .. } => StatusCode::Internal,
Error::InitBackend { .. } => StatusCode::StorageUnavailable,
Error::OpenLogStore { source } => source.status_code(),
Error::StartScriptManager { source } => source.status_code(),
Error::OpenStorageEngine { source } => source.status_code(),
Error::RuntimeResource { .. } => StatusCode::RuntimeResourcesExhausted,
Error::NewRecordBatch { source }
| Error::NewRecordBatches { source }
| Error::CollectRecordBatches { source } => source.status_code(),
Error::ArrowComputation { .. } => StatusCode::Unexpected,
}
}
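For context, variants that carry a #[snafu(backtrace)] source are produced through snafu's ResultExt::context; a minimal sketch (the helper function is hypothetical) of how a ColumnDataType error would be raised and then categorized by the match above:

use api::helper::ColumnDataTypeWrapper;
use snafu::ResultExt;

// Hypothetical helper: a failure from the api crate is wrapped into
// Error::ColumnDataType, which the match above maps to StatusCode::Internal.
fn decode_datatype(datatype: i32) -> Result<ColumnDataTypeWrapper> {
    ColumnDataTypeWrapper::try_new(datatype).context(ColumnDataTypeSnafu)
}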
@@ -239,6 +329,9 @@ impl From<Error> for tonic::Status {
#[cfg(test)]
mod tests {
use std::str::FromStr;
use common_error::ext::BoxedError;
use common_error::mock::MockError;
use super::*;
@@ -255,6 +348,10 @@ mod tests {
})
}
fn throw_arrow_error() -> std::result::Result<(), ArrowError> {
Err(ArrowError::NotYetImplemented("test".to_string()))
}
fn assert_internal_error(err: &Error) {
assert!(err.backtrace_opt().is_some());
assert_eq!(StatusCode::Internal, err.status_code());
@@ -265,6 +362,17 @@ mod tests {
assert_eq!(s.code(), tonic::Code::Internal);
}
#[test]
fn test_arrow_computation_error() {
let err = throw_arrow_error()
.context(ArrowComputationSnafu)
.unwrap_err();
assert!(matches!(err, Error::ArrowComputation { .. }));
assert!(err.backtrace_opt().is_some());
assert_eq!(StatusCode::Unexpected, err.status_code());
}
#[test]
fn test_error() {
let err = throw_query_error().context(ExecuteSqlSnafu).err().unwrap();
@@ -277,4 +385,12 @@ mod tests {
assert_internal_error(&err);
assert_tonic_internal_error(err);
}
#[test]
fn test_parse_timestamp() {
let err = common_time::timestamp::Timestamp::from_str("test")
.context(ParseTimestampSnafu { raw: "test" })
.unwrap_err();
assert_eq!(StatusCode::InvalidArguments, err.status_code());
}
}


@@ -1,16 +1,20 @@
use std::{fs, path, sync::Arc};
use api::v1::{object_expr, select_expr, InsertExpr, ObjectExpr, ObjectResult, SelectExpr};
use api::v1::{
admin_expr, insert_expr, object_expr, select_expr, AdminExpr, AdminResult, ObjectExpr,
ObjectResult, SelectExpr,
};
use async_trait::async_trait;
use catalog::{CatalogManagerRef, DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use common_error::prelude::BoxedError;
use common_error::status_code::StatusCode;
use common_query::Output;
use common_telemetry::logging::{error, info};
use common_telemetry::timer;
use log_store::fs::{config::LogConfig, log::LocalFileLogStore};
use object_store::{backend::fs::Backend, util, ObjectStore};
use query::query_engine::{Output, QueryEngineFactory, QueryEngineRef};
use servers::query_handler::{GrpcQueryHandler, SqlQueryHandler};
use query::query_engine::{QueryEngineFactory, QueryEngineRef};
use servers::query_handler::{GrpcAdminHandler, GrpcQueryHandler, SqlQueryHandler};
use snafu::prelude::*;
use sql::statements::statement::Statement;
use storage::{config::EngineConfig as StorageEngineConfig, EngineImpl};
@@ -36,7 +40,7 @@ type DefaultEngine = MitoEngine<EngineImpl<LocalFileLogStore>>;
pub struct Instance {
// Query service
query_engine: QueryEngineRef,
sql_handler: SqlHandler<DefaultEngine>,
sql_handler: SqlHandler,
// Catalog list
catalog_manager: CatalogManagerRef,
physical_planner: PhysicalPlanner,
@@ -50,7 +54,7 @@ impl Instance {
let object_store = new_object_store(&opts.storage).await?;
let log_store = create_local_file_log_store(opts).await?;
let table_engine = DefaultEngine::new(
let table_engine = Arc::new(DefaultEngine::new(
TableEngineConfig::default(),
EngineImpl::new(
StorageEngineConfig::default(),
@@ -58,15 +62,16 @@ impl Instance {
object_store.clone(),
),
object_store,
);
));
let catalog_manager = Arc::new(
catalog::LocalCatalogManager::try_new(Arc::new(table_engine.clone()))
catalog::LocalCatalogManager::try_new(table_engine.clone())
.await
.context(NewCatalogSnafu)?,
);
let factory = QueryEngineFactory::new(catalog_manager.clone());
let query_engine = factory.query_engine().clone();
let script_executor = ScriptExecutor::new(query_engine.clone());
let script_executor =
ScriptExecutor::new(catalog_manager.clone(), query_engine.clone()).await?;
Ok(Self {
query_engine: query_engine.clone(),
@@ -77,7 +82,11 @@ impl Instance {
})
}
pub async fn execute_grpc_insert(&self, insert_expr: InsertExpr) -> Result<Output> {
pub async fn execute_grpc_insert(
&self,
table_name: &str,
values: insert_expr::Values,
) -> Result<Output> {
let schema_provider = self
.catalog_manager
.catalog(DEFAULT_CATALOG_NAME)
@@ -85,12 +94,11 @@ impl Instance {
.schema(DEFAULT_SCHEMA_NAME)
.unwrap();
let table_name = &insert_expr.table_name.clone();
let table = schema_provider
.table(table_name)
.context(TableNotFoundSnafu { table_name })?;
let insert = insertion_expr_to_request(insert_expr, table.clone())?;
let insert = insertion_expr_to_request(table_name, values, table.clone())?;
let affected_rows = table
.insert(insert)
@@ -147,8 +155,18 @@ impl Instance {
self.sql_handler.execute(SqlRequest::Create(request)).await
}
_ => unimplemented!(),
Statement::Alter(alter_table) => {
let req = self.sql_handler.alter_to_request(alter_table)?;
self.sql_handler.execute(SqlRequest::Alter(req)).await
}
Statement::ShowDatabases(stmt) => {
self.sql_handler
.execute(SqlRequest::ShowDatabases(stmt))
.await
}
Statement::ShowTables(stmt) => {
self.sql_handler.execute(SqlRequest::ShowTables(stmt)).await
}
}
}
@@ -160,8 +178,8 @@ impl Instance {
Ok(())
}
async fn handle_insert(&self, insert_expr: InsertExpr) -> ObjectResult {
match self.execute_grpc_insert(insert_expr).await {
async fn handle_insert(&self, table_name: &str, values: insert_expr::Values) -> ObjectResult {
match self.execute_grpc_insert(table_name, values).await {
Ok(Output::AffectedRows(rows)) => ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
@@ -195,13 +213,51 @@ impl Instance {
}
}
pub fn sql_handler(&self) -> &SqlHandler<DefaultEngine> {
pub fn sql_handler(&self) -> &SqlHandler {
&self.sql_handler
}
pub fn catalog_manager(&self) -> &CatalogManagerRef {
&self.catalog_manager
}
// This method is used by other crates' testing code, so it is kept out of "cfg(test)".
// TODO(LFC): Delete it when callers no longer need it.
pub async fn new_mock() -> Result<Self> {
use table_engine::table::test_util::new_test_object_store;
use table_engine::table::test_util::MockEngine;
use table_engine::table::test_util::MockMitoEngine;
let (_dir, object_store) = new_test_object_store("setup_mock_engine_and_table").await;
let mock_engine = Arc::new(MockMitoEngine::new(
TableEngineConfig::default(),
MockEngine::default(),
object_store,
));
let catalog_manager = Arc::new(
catalog::LocalCatalogManager::try_new(mock_engine.clone())
.await
.unwrap(),
);
let factory = QueryEngineFactory::new(catalog_manager.clone());
let query_engine = factory.query_engine().clone();
let sql_handler = SqlHandler::new(mock_engine.clone(), catalog_manager.clone());
let physical_planner = PhysicalPlanner::new(query_engine.clone());
let script_executor = ScriptExecutor::new(catalog_manager.clone(), query_engine.clone())
.await
.unwrap();
Ok(Self {
query_engine,
sql_handler,
catalog_manager,
physical_planner,
script_executor,
})
}
}
async fn new_object_store(store_config: &ObjectStoreConfig) -> Result<ObjectStore> {
@@ -243,6 +299,7 @@ async fn create_local_file_log_store(opts: &DatanodeOptions) -> Result<LocalFile
Ok(log_store)
}
// TODO(LFC): Refactor datanode and frontend instances, separate impl for each query handler.
#[async_trait]
impl SqlQueryHandler for Instance {
async fn do_query(&self, query: &str) -> servers::error::Result<Output> {
@@ -256,8 +313,12 @@ impl SqlQueryHandler for Instance {
.context(servers::error::ExecuteQuerySnafu { query })
}
async fn execute_script(&self, script: &str) -> servers::error::Result<Output> {
self.script_executor.execute_script(script).await
async fn insert_script(&self, name: &str, script: &str) -> servers::error::Result<()> {
self.script_executor.insert_script(name, script).await
}
async fn execute_script(&self, name: &str) -> servers::error::Result<Output> {
self.script_executor.execute_script(name).await
}
}
@@ -265,7 +326,23 @@ impl SqlQueryHandler for Instance {
impl GrpcQueryHandler for Instance {
async fn do_query(&self, query: ObjectExpr) -> servers::error::Result<ObjectResult> {
let object_resp = match query.expr {
Some(object_expr::Expr::Insert(insert_expr)) => self.handle_insert(insert_expr).await,
Some(object_expr::Expr::Insert(insert_expr)) => {
let table_name = &insert_expr.table_name;
let expr = insert_expr
.expr
.context(servers::error::InvalidQuerySnafu {
reason: "missing `expr` in `InsertExpr`",
})?;
match expr {
insert_expr::Expr::Values(values) => {
self.handle_insert(table_name, values).await
}
insert_expr::Expr::Sql(sql) => {
let output = self.execute_sql(&sql).await;
to_object_result(output).await
}
}
}
Some(object_expr::Expr::Select(select_expr)) => self.handle_select(select_expr).await,
other => {
return servers::error::NotSupportedSnafu {
@@ -277,3 +354,20 @@ impl GrpcQueryHandler for Instance {
Ok(object_resp)
}
}
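A minimal usage sketch of the insert dispatch above. The exact field set of the prost messages and their Default impls are assumptions, not taken from this patch; the table name and SQL string are only illustrative.

use api::v1::{insert_expr, object_expr, InsertExpr, ObjectExpr, ObjectResult};
use servers::query_handler::GrpcQueryHandler;

// Sends the SQL variant of InsertExpr through the gRPC query handler,
// which routes it to execute_sql() per the match above.
async fn insert_via_sql(instance: &Instance) -> servers::error::Result<ObjectResult> {
    let expr = ObjectExpr {
        expr: Some(object_expr::Expr::Insert(InsertExpr {
            table_name: "demo".to_string(),
            expr: Some(insert_expr::Expr::Sql(
                "INSERT INTO demo(host, cpu, memory, ts) VALUES ('host1', 66.6, 1024.0, 1655276557000)".to_string(),
            )),
            ..Default::default()
        })),
        ..Default::default()
    };
    instance.do_query(expr).await
}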
#[async_trait]
impl GrpcAdminHandler for Instance {
async fn exec_admin_request(&self, expr: AdminExpr) -> servers::error::Result<AdminResult> {
let admin_resp = match expr.expr {
Some(admin_expr::Expr::Create(create_expr)) => self.handle_create(create_expr).await,
Some(admin_expr::Expr::Alter(alter_expr)) => self.handle_alter(alter_expr).await,
other => {
return servers::error::NotSupportedSnafu {
feat: format!("{:?}", other),
}
.fail();
}
};
Ok(admin_resp)
}
}


@@ -1,6 +1,9 @@
use query::Output;
use catalog::CatalogManagerRef;
use common_query::Output;
use query::QueryEngineRef;
use crate::error::Result;
#[cfg(not(feature = "python"))]
mod dummy {
use super::*;
@@ -8,8 +11,19 @@ mod dummy {
pub struct ScriptExecutor;
impl ScriptExecutor {
pub fn new(_query_engine: QueryEngineRef) -> Self {
Self {}
pub async fn new(
_catalog_manager: CatalogManagerRef,
_query_engine: QueryEngineRef,
) -> Result<Self> {
Ok(Self {})
}
pub async fn insert_script(
&self,
_name: &str,
_script: &str,
) -> servers::error::Result<()> {
servers::error::NotSupportedSnafu { feat: "script" }.fail()
}
pub async fn execute_script(&self, _script: &str) -> servers::error::Result<Output> {
@@ -22,44 +36,50 @@ mod dummy {
mod python {
use common_error::prelude::BoxedError;
use common_telemetry::logging::error;
use script::{
engine::{CompileContext, EvalContext, Script, ScriptEngine},
python::PyEngine,
};
use script::manager::ScriptManager;
use snafu::ResultExt;
use super::*;
pub struct ScriptExecutor {
py_engine: PyEngine,
script_manager: ScriptManager,
}
impl ScriptExecutor {
pub fn new(query_engine: QueryEngineRef) -> Self {
Self {
py_engine: PyEngine::new(query_engine),
}
pub async fn new(
catalog_manager: CatalogManagerRef,
query_engine: QueryEngineRef,
) -> Result<Self> {
Ok(Self {
script_manager: ScriptManager::new(catalog_manager, query_engine)
.await
.context(crate::error::StartScriptManagerSnafu)?,
})
}
pub async fn execute_script(&self, script: &str) -> servers::error::Result<Output> {
let py_script = self
.py_engine
.compile(script, CompileContext::default())
pub async fn insert_script(&self, name: &str, script: &str) -> servers::error::Result<()> {
let _s = self
.script_manager
.insert_and_compile(name, script)
.await
.map_err(|e| {
error!(e; "Instance failed to execute script");
error!(e; "Instance failed to insert script");
BoxedError::new(e)
})
.context(servers::error::ExecuteScriptSnafu { script })?;
.context(servers::error::InsertScriptSnafu { name })?;
py_script
.evaluate(EvalContext::default())
Ok(())
}
pub async fn execute_script(&self, name: &str) -> servers::error::Result<Output> {
self.script_manager
.execute(name)
.await
.map_err(|e| {
error!(e; "Instance failed to execute script");
BoxedError::new(e)
})
.context(servers::error::ExecuteScriptSnafu { script })
.context(servers::error::ExecuteScriptSnafu { name })
}
}
}
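A minimal usage sketch of the reworked script API (the script body is only illustrative): scripts are now registered under a name via insert_script and executed later by that name.

use common_query::Output;
use servers::query_handler::SqlQueryHandler;

use crate::instance::Instance;

// Register a named script, then run it by name through the handler impl above.
async fn run_named_script(instance: &Instance) -> servers::error::Result<Output> {
    instance.insert_script("hello", "print('hello')").await?;
    instance.execute_script("hello").await
}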


@@ -7,6 +7,7 @@ use common_runtime::Builder as RuntimeBuilder;
use servers::grpc::GrpcServer;
use servers::http::HttpServer;
use servers::mysql::server::MysqlServer;
use servers::postgres::PostgresServer;
use servers::server::Server;
use snafu::ResultExt;
use tokio::try_join;
@@ -20,6 +21,7 @@ pub struct Services {
http_server: HttpServer,
grpc_server: GrpcServer,
mysql_server: Box<dyn Server>,
postgres_server: Box<dyn Server>,
}
impl Services {
@@ -31,34 +33,45 @@ impl Services {
.build()
.context(error::RuntimeResourceSnafu)?,
);
let postgres_io_runtime = Arc::new(
RuntimeBuilder::default()
.worker_threads(opts.postgres_runtime_size as usize)
.thread_name("postgres-io-handlers")
.build()
.context(error::RuntimeResourceSnafu)?,
);
Ok(Self {
http_server: HttpServer::new(instance.clone()),
grpc_server: GrpcServer::new(instance.clone()),
mysql_server: MysqlServer::create_server(instance, mysql_io_runtime),
grpc_server: GrpcServer::new(instance.clone(), instance.clone()),
mysql_server: MysqlServer::create_server(instance.clone(), mysql_io_runtime),
postgres_server: Box::new(PostgresServer::new(instance, postgres_io_runtime)),
})
}
// TODO(LFC): start servers on demand (for example, do not start the MySQL server when it is not needed)
pub async fn start(&mut self, opts: &DatanodeOptions) -> Result<()> {
let http_addr = &opts.http_addr;
let http_addr: SocketAddr = http_addr
.parse()
.context(error::ParseAddrSnafu { addr: http_addr })?;
let http_addr: SocketAddr = opts.http_addr.parse().context(error::ParseAddrSnafu {
addr: &opts.http_addr,
})?;
let grpc_addr = &opts.rpc_addr;
let grpc_addr: SocketAddr = grpc_addr
.parse()
.context(error::ParseAddrSnafu { addr: grpc_addr })?;
let grpc_addr: SocketAddr = opts.rpc_addr.parse().context(error::ParseAddrSnafu {
addr: &opts.rpc_addr,
})?;
let mysql_addr = &opts.mysql_addr;
let mysql_addr: SocketAddr = mysql_addr
.parse()
.context(error::ParseAddrSnafu { addr: mysql_addr })?;
let mysql_addr: SocketAddr = opts.mysql_addr.parse().context(error::ParseAddrSnafu {
addr: &opts.mysql_addr,
})?;
let postgres_addr: SocketAddr =
opts.postgres_addr.parse().context(error::ParseAddrSnafu {
addr: &opts.postgres_addr,
})?;
try_join!(
self.http_server.start(http_addr),
self.grpc_server.start(grpc_addr),
self.mysql_server.start(mysql_addr),
self.postgres_server.start(postgres_addr),
)
.context(error::StartServerSnafu)?;
Ok(())


@@ -1,4 +1,5 @@
mod ddl;
pub(crate) mod handler;
pub(crate) mod insert;
pub(crate) mod plan;
pub(crate) mod select;
pub mod select;


@@ -0,0 +1,330 @@
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::{alter_expr::Kind, AdminResult, AlterExpr, ColumnDef, CreateExpr};
use common_error::prelude::{ErrorExt, StatusCode};
use common_query::Output;
use datatypes::schema::ColumnDefaultConstraint;
use datatypes::schema::{ColumnSchema, SchemaBuilder, SchemaRef};
use futures::TryFutureExt;
use snafu::prelude::*;
use table::requests::{AlterKind, AlterTableRequest, CreateTableRequest};
use crate::error::{self, ColumnDefaultConstraintSnafu, MissingFieldSnafu, Result};
use crate::instance::Instance;
use crate::server::grpc::handler::AdminResultBuilder;
use crate::sql::SqlRequest;
impl Instance {
pub(crate) async fn handle_create(&self, expr: CreateExpr) -> AdminResult {
let request = self.create_expr_to_request(expr);
let result = futures::future::ready(request)
.and_then(|request| self.sql_handler().execute(SqlRequest::Create(request)))
.await;
match result {
Ok(Output::AffectedRows(rows)) => AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
// Unreachable because we are executing "CREATE TABLE"; otherwise it's an internal bug.
Ok(Output::Stream(_)) | Ok(Output::RecordBatches(_)) => unreachable!(),
Err(err) => AdminResultBuilder::default()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
pub(crate) async fn handle_alter(&self, expr: AlterExpr) -> AdminResult {
let request = match self.alter_expr_to_request(expr).transpose() {
Some(req) => req,
None => {
return AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(0, 0)
.build()
}
};
let result = futures::future::ready(request)
.and_then(|request| self.sql_handler().execute(SqlRequest::Alter(request)))
.await;
match result {
Ok(Output::AffectedRows(rows)) => AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
Ok(Output::Stream(_)) | Ok(Output::RecordBatches(_)) => unreachable!(),
Err(err) => AdminResultBuilder::default()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
fn create_expr_to_request(&self, expr: CreateExpr) -> Result<CreateTableRequest> {
let schema = create_table_schema(&expr)?;
let primary_key_indices = expr
.primary_keys
.iter()
.map(|key| {
schema
.column_index_by_name(key)
.context(error::KeyColumnNotFoundSnafu { name: key })
})
.collect::<Result<Vec<usize>>>()?;
let table_id = self.catalog_manager().next_table_id();
Ok(CreateTableRequest {
id: table_id,
catalog_name: expr.catalog_name,
schema_name: expr.schema_name,
table_name: expr.table_name,
desc: expr.desc,
schema,
primary_key_indices,
create_if_not_exists: expr.create_if_not_exists,
table_options: expr.table_options,
})
}
fn alter_expr_to_request(&self, expr: AlterExpr) -> Result<Option<AlterTableRequest>> {
match expr.kind {
Some(Kind::AddColumn(add_column)) => {
let column_def = add_column.column_def.context(MissingFieldSnafu {
field: "column_def",
})?;
let alter_kind = AlterKind::AddColumn {
new_column: create_column_schema(&column_def)?,
};
let request = AlterTableRequest {
catalog_name: expr.catalog_name,
schema_name: expr.schema_name,
table_name: expr.table_name,
alter_kind,
};
Ok(Some(request))
}
None => Ok(None),
}
}
}
fn create_table_schema(expr: &CreateExpr) -> Result<SchemaRef> {
let column_schemas = expr
.column_defs
.iter()
.map(create_column_schema)
.collect::<Result<Vec<ColumnSchema>>>()?;
let ts_index = column_schemas
.iter()
.enumerate()
.find_map(|(i, column)| {
if column.name == expr.time_index {
Some(i)
} else {
None
}
})
.context(error::KeyColumnNotFoundSnafu {
name: &expr.time_index,
})?;
Ok(Arc::new(
SchemaBuilder::try_from(column_schemas)
.context(error::CreateSchemaSnafu)?
.timestamp_index(ts_index)
.build()
.context(error::CreateSchemaSnafu)?,
))
}
fn create_column_schema(column_def: &ColumnDef) -> Result<ColumnSchema> {
let data_type =
ColumnDataTypeWrapper::try_new(column_def.datatype).context(error::ColumnDataTypeSnafu)?;
Ok(ColumnSchema {
name: column_def.name.clone(),
data_type: data_type.into(),
is_nullable: column_def.is_nullable,
default_constraint: match &column_def.default_constraint {
None => None,
Some(v) => Some(
ColumnDefaultConstraint::try_from(&v[..]).context(ColumnDefaultConstraintSnafu)?,
),
},
})
}
#[cfg(test)]
mod tests {
use std::collections::HashMap;
use catalog::MIN_USER_TABLE_ID;
use datatypes::prelude::ConcreteDataType;
use datatypes::value::Value;
use super::*;
use crate::tests::test_util;
#[tokio::test]
async fn test_create_expr_to_request() {
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts();
let instance = Instance::new(&opts).await.unwrap();
instance.start().await.unwrap();
let expr = testing_create_expr();
let request = instance.create_expr_to_request(expr).unwrap();
assert_eq!(request.id, MIN_USER_TABLE_ID);
assert_eq!(request.catalog_name, None);
assert_eq!(request.schema_name, None);
assert_eq!(request.table_name, "my-metrics");
assert_eq!(request.desc, Some("blabla".to_string()));
assert_eq!(request.schema, expected_table_schema());
assert_eq!(request.primary_key_indices, vec![1, 0]);
assert!(request.create_if_not_exists);
let mut expr = testing_create_expr();
expr.primary_keys = vec!["host".to_string(), "not-exist-column".to_string()];
let result = instance.create_expr_to_request(expr);
assert!(result.is_err());
assert!(result
.unwrap_err()
.to_string()
.contains("Specified timestamp key or primary key column not found: not-exist-column"));
}
#[test]
fn test_create_table_schema() {
let mut expr = testing_create_expr();
let schema = create_table_schema(&expr).unwrap();
assert_eq!(schema, expected_table_schema());
expr.time_index = "not-exist-column".to_string();
let result = create_table_schema(&expr);
assert!(result.is_err());
assert!(result
.unwrap_err()
.to_string()
.contains("Specified timestamp key or primary key column not found: not-exist-column"));
}
#[test]
fn test_create_column_schema() {
let column_def = ColumnDef {
name: "a".to_string(),
datatype: 1024,
is_nullable: true,
default_constraint: None,
};
let result = create_column_schema(&column_def);
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
"Column datatype error, source: Unknown proto column datatype: 1024"
);
let column_def = ColumnDef {
name: "a".to_string(),
datatype: 12, // string
is_nullable: true,
default_constraint: None,
};
let column_schema = create_column_schema(&column_def).unwrap();
assert_eq!(column_schema.name, "a");
assert_eq!(column_schema.data_type, ConcreteDataType::string_datatype());
assert!(column_schema.is_nullable);
let default_constraint = ColumnDefaultConstraint::Value(Value::from("defaut value"));
let column_def = ColumnDef {
name: "a".to_string(),
datatype: 12, // string
is_nullable: true,
default_constraint: Some(default_constraint.clone().try_into().unwrap()),
};
let column_schema = create_column_schema(&column_def).unwrap();
assert_eq!(column_schema.name, "a");
assert_eq!(column_schema.data_type, ConcreteDataType::string_datatype());
assert!(column_schema.is_nullable);
assert_eq!(
default_constraint,
column_schema.default_constraint.unwrap()
);
}
fn testing_create_expr() -> CreateExpr {
let column_defs = vec![
ColumnDef {
name: "host".to_string(),
datatype: 12, // string
is_nullable: false,
default_constraint: None,
},
ColumnDef {
name: "ts".to_string(),
datatype: 15, // timestamp
is_nullable: false,
default_constraint: None,
},
ColumnDef {
name: "cpu".to_string(),
datatype: 9, // float32
is_nullable: true,
default_constraint: None,
},
ColumnDef {
name: "memory".to_string(),
datatype: 10, // float64
is_nullable: true,
default_constraint: None,
},
];
CreateExpr {
catalog_name: None,
schema_name: None,
table_name: "my-metrics".to_string(),
desc: Some("blabla".to_string()),
column_defs,
time_index: "ts".to_string(),
primary_keys: vec!["ts".to_string(), "host".to_string()],
create_if_not_exists: true,
table_options: HashMap::new(),
}
}
fn expected_table_schema() -> SchemaRef {
let column_schemas = vec![
ColumnSchema {
name: "host".to_string(),
data_type: ConcreteDataType::string_datatype(),
is_nullable: false,
default_constraint: None,
},
ColumnSchema {
name: "ts".to_string(),
data_type: ConcreteDataType::timestamp_millis_datatype(),
is_nullable: false,
default_constraint: None,
},
ColumnSchema {
name: "cpu".to_string(),
data_type: ConcreteDataType::float32_datatype(),
is_nullable: true,
default_constraint: None,
},
ColumnSchema {
name: "memory".to_string(),
data_type: ConcreteDataType::float64_datatype(),
is_nullable: true,
default_constraint: None,
},
];
Arc::new(
SchemaBuilder::try_from(column_schemas)
.unwrap()
.timestamp_index(1)
.build()
.unwrap(),
)
}
}


@@ -1,6 +1,6 @@
use api::v1::{
codec::SelectResult, object_result, MutateResult, ObjectResult, ResultHeader,
SelectResult as SelectResultRaw,
admin_result, codec::SelectResult, object_result, AdminResult, MutateResult, ObjectResult,
ResultHeader, SelectResult as SelectResultRaw,
};
use common_error::prelude::ErrorExt;
@@ -87,6 +87,61 @@ pub(crate) fn build_err_result(err: &impl ErrorExt) -> ObjectResult {
.build()
}
#[derive(Debug)]
pub(crate) struct AdminResultBuilder {
version: u32,
code: u32,
err_msg: Option<String>,
mutate: Option<(Success, Failure)>,
}
impl AdminResultBuilder {
pub fn status_code(mut self, code: u32) -> Self {
self.code = code;
self
}
pub fn err_msg(mut self, err_msg: String) -> Self {
self.err_msg = Some(err_msg);
self
}
pub fn mutate_result(mut self, success: u32, failure: u32) -> Self {
self.mutate = Some((success, failure));
self
}
pub fn build(self) -> AdminResult {
let header = Some(ResultHeader {
version: self.version,
code: self.code,
err_msg: self.err_msg.unwrap_or_default(),
});
let result = if let Some((success, failure)) = self.mutate {
Some(admin_result::Result::Mutate(MutateResult {
success,
failure,
}))
} else {
None
};
AdminResult { header, result }
}
}
impl Default for AdminResultBuilder {
fn default() -> Self {
Self {
version: PROTOCOL_VERSION,
code: 0,
err_msg: None,
mutate: None,
}
}
}
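A minimal sketch of how the builder above is used, mirroring the handle_create/handle_alter paths shown earlier in this patch; the helper function itself is hypothetical.

use common_error::status_code::StatusCode;

// Report a successful mutation of `rows` rows.
fn success_result(rows: u32) -> AdminResult {
    AdminResultBuilder::default()
        .status_code(StatusCode::Success as u32)
        .mutate_result(rows, 0)
        .build()
}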
#[cfg(test)]
mod tests {
use api::v1::{object_result, MutateResult};


@@ -4,8 +4,9 @@ use std::{
sync::Arc,
};
use api::v1::{codec::InsertBatch, column::Values, Column, InsertExpr};
use api::v1::{codec::InsertBatch, column::Values, insert_expr, Column};
use common_base::BitVec;
use common_time::timestamp::Timestamp;
use datatypes::{data_type::ConcreteDataType, value::Value, vectors::VectorBuilder};
use snafu::{ensure, OptionExt, ResultExt};
use table::{requests::InsertRequest, Table};
@@ -13,13 +14,13 @@ use table::{requests::InsertRequest, Table};
use crate::error::{ColumnNotFoundSnafu, DecodeInsertSnafu, IllegalInsertDataSnafu, Result};
pub fn insertion_expr_to_request(
insert: InsertExpr,
table_name: &str,
values: insert_expr::Values,
table: Arc<dyn Table>,
) -> Result<InsertRequest> {
let schema = table.schema();
let table_name = &insert.table_name;
let mut columns_builders = HashMap::with_capacity(schema.column_schemas().len());
let insert_batches = insert_batches(insert.values)?;
let insert_batches = insert_batches(values.values)?;
for InsertBatch { columns, row_count } in insert_batches {
for Column {
@@ -66,12 +67,10 @@ pub fn insertion_expr_to_request(
}
fn insert_batches(bytes_vec: Vec<Vec<u8>>) -> Result<Vec<InsertBatch>> {
let mut insert_batches = Vec::with_capacity(bytes_vec.len());
for bytes in bytes_vec {
insert_batches.push(bytes.deref().try_into().context(DecodeInsertSnafu)?);
}
Ok(insert_batches)
bytes_vec
.iter()
.map(|bytes| bytes.deref().try_into().context(DecodeInsertSnafu))
.collect()
}
fn add_values_to_builder(
@@ -170,7 +169,24 @@ fn convert_values(data_type: &ConcreteDataType, values: Values) -> Vec<Value> {
.into_iter()
.map(|val| val.into())
.collect(),
_ => unimplemented!(),
ConcreteDataType::DateTime(_) => values
.i64_values
.into_iter()
.map(|v| Value::DateTime(v.into()))
.collect(),
ConcreteDataType::Date(_) => values
.i32_values
.into_iter()
.map(|v| Value::Date(v.into()))
.collect(),
ConcreteDataType::Timestamp(_) => values
.ts_millis_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::from_millis(v)))
.collect(),
ConcreteDataType::Null(_) => unreachable!(),
ConcreteDataType::List(_) => unreachable!(),
ConcreteDataType::Geometry(_) => todo!(),
}
}
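A minimal sketch of the new timestamp arm (relying on the Default impl of the prost Values type, as the tests below already do): millisecond values arrive in ts_millis_values and come out as Value::Timestamp.

#[test]
fn convert_timestamp_values_sketch() {
    let values = Values {
        ts_millis_values: vec![100, 101],
        ..Default::default()
    };
    // Millisecond integers are wrapped into Value::Timestamp by the arm above.
    let converted = convert_values(&ConcreteDataType::timestamp_millis_datatype(), values);
    assert_eq!(Value::Timestamp(Timestamp::from_millis(100)), converted[0]);
    assert_eq!(Value::Timestamp(Timestamp::from_millis(101)), converted[1]);
}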
@@ -185,11 +201,12 @@ mod tests {
use api::v1::{
codec::InsertBatch,
column::{self, Values},
Column, InsertExpr,
insert_expr, Column,
};
use common_base::BitVec;
use common_query::prelude::Expr;
use common_recordbatch::SendableRecordBatchStream;
use common_time::timestamp::Timestamp;
use datatypes::{
data_type::ConcreteDataType,
schema::{ColumnSchema, SchemaBuilder, SchemaRef},
@@ -202,13 +219,12 @@ mod tests {
#[test]
fn test_insertion_expr_to_request() {
let insert_expr = InsertExpr {
table_name: "demo".to_string(),
values: mock_insert_batches(),
};
let table: Arc<dyn Table> = Arc::new(DemoTable {});
let insert_req = insertion_expr_to_request(insert_expr, table).unwrap();
let values = insert_expr::Values {
values: mock_insert_batches(),
};
let insert_req = insertion_expr_to_request("demo", values, table).unwrap();
assert_eq!("demo", insert_req.table_name);
@@ -225,8 +241,8 @@ mod tests {
assert_eq!(Value::Float64(0.1.into()), memory.get(1));
let ts = insert_req.columns_values.get("ts").unwrap();
assert_eq!(Value::Int64(100), ts.get(0));
assert_eq!(Value::Int64(101), ts.get(1));
assert_eq!(Value::Timestamp(Timestamp::from_millis(100)), ts.get(0));
assert_eq!(Value::Timestamp(Timestamp::from_millis(101)), ts.get(1));
}
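The null_mask bytes in the mock_insert_batches helper further down encode one bit per row; a small sketch of that convention as the expected values above imply it (the least-significant-bit-first ordering is an assumption inferred from those assertions, not stated by this patch):

#[test]
fn null_mask_convention_sketch() {
    // Hypothetical helper mirroring how the mock data is read back:
    // bit i of the mask marks row i as NULL.
    fn is_null(null_mask: &[u8], row: usize) -> bool {
        null_mask[row / 8] & (1 << (row % 8)) != 0
    }
    assert!(is_null(&[0b10], 1)); // cpu column: second row is NULL
    assert!(is_null(&[0b01], 0)); // memory column: first row is NULL
    assert!(!is_null(&[0b00], 0)); // host and ts columns: no NULLs
}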
#[test]
@@ -276,11 +292,12 @@ mod tests {
ColumnSchema::new("host", ConcreteDataType::string_datatype(), false),
ColumnSchema::new("cpu", ConcreteDataType::float64_datatype(), true),
ColumnSchema::new("memory", ConcreteDataType::float64_datatype(), true),
ColumnSchema::new("ts", ConcreteDataType::int64_datatype(), true),
ColumnSchema::new("ts", ConcreteDataType::timestamp_millis_datatype(), true),
];
Arc::new(
SchemaBuilder::from(column_schemas)
SchemaBuilder::try_from(column_schemas)
.unwrap()
.timestamp_index(3)
.build()
.unwrap(),
@@ -298,7 +315,7 @@ mod tests {
fn mock_insert_batches() -> Vec<Vec<u8>> {
const SEMANTIC_TAG: i32 = 0;
const SEMANTIC_FEILD: i32 = 1;
const SEMANTIC_FIELD: i32 = 1;
const SEMANTIC_TS: i32 = 2;
let row_count = 2;
@@ -312,6 +329,7 @@ mod tests {
semantic_type: SEMANTIC_TAG,
values: Some(host_vals),
null_mask: vec![0],
..Default::default()
};
let cpu_vals = column::Values {
@@ -320,9 +338,10 @@ mod tests {
};
let cpu_column = Column {
column_name: "cpu".to_string(),
semantic_type: SEMANTIC_FEILD,
semantic_type: SEMANTIC_FIELD,
values: Some(cpu_vals),
null_mask: vec![2],
..Default::default()
};
let mem_vals = column::Values {
@@ -331,13 +350,14 @@ mod tests {
};
let mem_column = Column {
column_name: "memory".to_string(),
semantic_type: SEMANTIC_FEILD,
semantic_type: SEMANTIC_FIELD,
values: Some(mem_vals),
null_mask: vec![1],
..Default::default()
};
let ts_vals = column::Values {
i64_values: vec![100, 101],
ts_millis_values: vec![100, 101],
..Default::default()
};
let ts_column = Column {
@@ -345,6 +365,7 @@ mod tests {
semantic_type: SEMANTIC_TS,
values: Some(ts_vals),
null_mask: vec![0],
datatype: Some(15),
};
let insert_batch = InsertBatch {


@@ -2,9 +2,10 @@ use std::sync::Arc;
use common_grpc::AsExcutionPlan;
use common_grpc::DefaultAsPlanImpl;
use common_query::Output;
use datatypes::schema::Schema;
use query::PhysicalPlanAdapter;
use query::{plan::PhysicalPlan, Output, QueryEngineRef};
use query::{plan::PhysicalPlan, QueryEngineRef};
use snafu::ResultExt;
use crate::error::Result;


@@ -1,60 +1,64 @@
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::{codec::SelectResult, column::Values, Column, ObjectResult};
use arrow::array::{Array, BooleanArray, PrimitiveArray};
use common_base::BitVec;
use common_error::prelude::ErrorExt;
use common_error::status_code::StatusCode;
use common_recordbatch::{util, RecordBatch, SendableRecordBatchStream};
use common_query::Output;
use common_recordbatch::{util, RecordBatches, SendableRecordBatchStream};
use datatypes::arrow_array::{BinaryArray, StringArray};
use query::Output;
use snafu::OptionExt;
use snafu::{OptionExt, ResultExt};
use crate::error::{ConversionSnafu, Result};
use crate::error::{self, ConversionSnafu, Result};
use crate::server::grpc::handler::{build_err_result, ObjectResultBuilder};
pub async fn to_object_result(result: Result<Output>) -> ObjectResult {
match result {
Ok(Output::AffectedRows(rows)) => ObjectResultBuilder::new()
pub async fn to_object_result(output: Result<Output>) -> ObjectResult {
let result = match output {
Ok(Output::AffectedRows(rows)) => Ok(ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
Ok(Output::RecordBatch(stream)) => record_batchs(stream).await,
Err(err) => ObjectResultBuilder::new()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
async fn record_batchs(stream: SendableRecordBatchStream) -> ObjectResult {
let builder = ObjectResultBuilder::new();
match util::collect(stream).await {
Ok(record_batches) => match try_convert(record_batches) {
Ok(select_result) => builder
.status_code(StatusCode::Success as u32)
.select_result(select_result)
.build(),
Err(err) => build_err_result(&err),
},
Err(err) => build_err_result(&err),
}
}
// All schemas of record_batches must be the same.
fn try_convert(record_batches: Vec<RecordBatch>) -> Result<SelectResult> {
let first = if let Some(r) = record_batches.get(0) {
r
} else {
return Ok(SelectResult::default());
.build()),
Ok(Output::Stream(stream)) => collect(stream).await,
Ok(Output::RecordBatches(recordbatches)) => build_result(recordbatches),
Err(e) => return build_err_result(&e),
};
match result {
Ok(r) => r,
Err(e) => build_err_result(&e),
}
}
async fn collect(stream: SendableRecordBatchStream) -> Result<ObjectResult> {
let schema = stream.schema();
let recordbatches = util::collect(stream)
.await
.and_then(|batches| RecordBatches::try_new(schema, batches))
.context(error::CollectRecordBatchesSnafu)?;
let object_result = build_result(recordbatches)?;
Ok(object_result)
}
fn build_result(recordbatches: RecordBatches) -> Result<ObjectResult> {
let select_result = try_convert(recordbatches)?;
let object_result = ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.select_result(select_result)
.build();
Ok(object_result)
}
fn try_convert(record_batches: RecordBatches) -> Result<SelectResult> {
let schema = record_batches.schema();
let record_batches = record_batches.take();
let row_count: usize = record_batches
.iter()
.map(|r| r.df_recordbatch.num_rows())
.sum();
let schemas = first.schema.column_schemas();
let schemas = schema.column_schemas();
let mut columns = Vec::with_capacity(schemas.len());
for (idx, schema) in schemas.iter().enumerate() {
@@ -69,6 +73,11 @@ fn try_convert(record_batches: Vec<RecordBatch>) -> Result<SelectResult> {
column_name,
values: Some(values(&arrays)?),
null_mask: null_mask(&arrays, row_count),
datatype: Some(
ColumnDataTypeWrapper::try_from(schema.data_type.clone())
.context(error::ColumnDataTypeSnafu)?
.datatype() as i32,
),
..Default::default()
};
columns.push(column);
@@ -80,7 +89,7 @@ fn try_convert(record_batches: Vec<RecordBatch>) -> Result<SelectResult> {
})
}
fn null_mask(arrays: &Vec<Arc<dyn Array>>, row_count: usize) -> Vec<u8> {
pub fn null_mask(arrays: &Vec<Arc<dyn Array>>, row_count: usize) -> Vec<u8> {
let null_count: usize = arrays.iter().map(|a| a.null_count()).sum();
if null_count == 0 {
@@ -100,10 +109,10 @@ fn null_mask(arrays: &Vec<Arc<dyn Array>>, row_count: usize) -> Vec<u8> {
}
macro_rules! convert_arrow_array_to_grpc_vals {
($data_type: expr, $arrays: ident, $(($Type: ident, $CastType: ty, $field: ident, $MapFunction: expr)), +) => {
($data_type: expr, $arrays: ident, $(($Type: pat, $CastType: ty, $field: ident, $MapFunction: expr)), +) => {
match $data_type {
$(
arrow::datatypes::DataType::$Type => {
$Type => {
let mut vals = Values::default();
for array in $arrays {
let array = array.as_any().downcast_ref::<$CastType>().with_context(|| ConversionSnafu {
@@ -119,39 +128,44 @@ macro_rules! convert_arrow_array_to_grpc_vals {
)+
_ => unimplemented!(),
}
};
}
fn values(arrays: &[Arc<dyn Array>]) -> Result<Values> {
pub fn values(arrays: &[Arc<dyn Array>]) -> Result<Values> {
if arrays.is_empty() {
return Ok(Values::default());
}
let data_type = arrays[0].data_type();
use arrow::datatypes::DataType;
convert_arrow_array_to_grpc_vals!(
data_type, arrays,
(Boolean, BooleanArray, bool_values, |x| {x}),
(DataType::Boolean, BooleanArray, bool_values, |x| {x}),
(Int8, PrimitiveArray<i8>, i8_values, |x| {*x as i32}),
(Int16, PrimitiveArray<i16>, i16_values, |x| {*x as i32}),
(Int32, PrimitiveArray<i32>, i32_values, |x| {*x}),
(Int64, PrimitiveArray<i64>, i64_values, |x| {*x}),
(DataType::Int8, PrimitiveArray<i8>, i8_values, |x| {*x as i32}),
(DataType::Int16, PrimitiveArray<i16>, i16_values, |x| {*x as i32}),
(DataType::Int32, PrimitiveArray<i32>, i32_values, |x| {*x}),
(DataType::Int64, PrimitiveArray<i64>, i64_values, |x| {*x}),
(UInt8, PrimitiveArray<u8>, u8_values, |x| {*x as u32}),
(UInt16, PrimitiveArray<u16>, u16_values, |x| {*x as u32}),
(UInt32, PrimitiveArray<u32>, u32_values, |x| {*x}),
(UInt64, PrimitiveArray<u64>, u64_values, |x| {*x}),
(DataType::UInt8, PrimitiveArray<u8>, u8_values, |x| {*x as u32}),
(DataType::UInt16, PrimitiveArray<u16>, u16_values, |x| {*x as u32}),
(DataType::UInt32, PrimitiveArray<u32>, u32_values, |x| {*x}),
(DataType::UInt64, PrimitiveArray<u64>, u64_values, |x| {*x}),
(Float32, PrimitiveArray<f32>, f32_values, |x| {*x}),
(Float64, PrimitiveArray<f64>, f64_values, |x| {*x}),
(DataType::Float32, PrimitiveArray<f32>, f32_values, |x| {*x}),
(DataType::Float64, PrimitiveArray<f64>, f64_values, |x| {*x}),
(Binary, BinaryArray, binary_values, |x| {x.into()}),
(LargeBinary, BinaryArray, binary_values, |x| {x.into()}),
(DataType::Binary, BinaryArray, binary_values, |x| {x.into()}),
(DataType::LargeBinary, BinaryArray, binary_values, |x| {x.into()}),
(Utf8, StringArray, string_values, |x| {x.into()}),
(LargeUtf8, StringArray, string_values, |x| {x.into()})
(DataType::Utf8, StringArray, string_values, |x| {x.into()}),
(DataType::LargeUtf8, StringArray, string_values, |x| {x.into()}),
(DataType::Date32, PrimitiveArray<i32>, date_values, |x| {*x as i32}),
(DataType::Date64, PrimitiveArray<i64>, datetime_values,|x| {*x as i64}),
(DataType::Timestamp(arrow::datatypes::TimeUnit::Millisecond, _), PrimitiveArray<i64>, ts_millis_values, |x| {*x})
)
}
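A minimal sketch of what values() produces for a single array, assuming the arrow crate re-exported here is arrow2, whose PrimitiveArray::from_slice constructor the sketch uses.

#[test]
fn values_f64_sketch() {
    let array: Arc<dyn Array> = Arc::new(PrimitiveArray::<f64>::from_slice(&[0.1, 0.2]));
    let vals = values(&[array]).unwrap();
    // DataType::Float64 is routed to the f64_values field by the macro above.
    assert_eq!(vec![0.1, 0.2], vals.f64_values);
}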
@@ -163,7 +177,7 @@ mod tests {
array::{Array, BooleanArray, PrimitiveArray},
datatypes::{DataType, Field},
};
use common_recordbatch::RecordBatch;
use common_recordbatch::{RecordBatch, RecordBatches};
use datafusion::field_util::SchemaExt;
use datatypes::arrow::datatypes::Schema as ArrowSchema;
use datatypes::{
@@ -177,8 +191,10 @@ mod tests {
#[test]
fn test_convert_record_batches_to_select_result() {
let r1 = mock_record_batch();
let schema = r1.schema.clone();
let r2 = mock_record_batch();
let record_batches = vec![r1, r2];
let record_batches = RecordBatches::try_new(schema, record_batches).unwrap();
let s = try_convert(record_batches).unwrap();


@@ -1,36 +1,44 @@
//! sql handler
use std::sync::Arc;
use catalog::CatalogManagerRef;
use common_error::ext::BoxedError;
use query::query_engine::Output;
use catalog::{
schema::SchemaProviderRef, CatalogManagerRef, CatalogProviderRef, DEFAULT_CATALOG_NAME,
DEFAULT_SCHEMA_NAME,
};
use common_query::Output;
use snafu::{OptionExt, ResultExt};
use table::engine::{EngineContext, TableEngine};
use sql::statements::show::{ShowDatabases, ShowTables};
use table::engine::{EngineContext, TableEngineRef};
use table::requests::*;
use table::TableRef;
use crate::error::{GetTableSnafu, Result, TableNotFoundSnafu};
use crate::error::{
CatalogNotFoundSnafu, GetTableSnafu, Result, SchemaNotFoundSnafu, TableNotFoundSnafu,
};
mod alter;
mod create;
mod insert;
mod show;
#[derive(Debug)]
pub enum SqlRequest {
Insert(InsertRequest),
Create(CreateTableRequest),
Alter(AlterTableRequest),
ShowDatabases(ShowDatabases),
ShowTables(ShowTables),
}
// Handler to execute SQL except query
pub struct SqlHandler<Engine: TableEngine> {
table_engine: Arc<Engine>,
pub struct SqlHandler {
table_engine: TableEngineRef,
catalog_manager: CatalogManagerRef,
}
impl<Engine: TableEngine> SqlHandler<Engine> {
pub fn new(table_engine: Engine, catalog_manager: CatalogManagerRef) -> Self {
impl SqlHandler {
pub fn new(table_engine: TableEngineRef, catalog_manager: CatalogManagerRef) -> Self {
Self {
table_engine: Arc::new(table_engine),
table_engine,
catalog_manager,
}
}
@@ -39,18 +47,40 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
match request {
SqlRequest::Insert(req) => self.insert(req).await,
SqlRequest::Create(req) => self.create(req).await,
SqlRequest::Alter(req) => self.alter(req).await,
SqlRequest::ShowDatabases(stmt) => self.show_databases(stmt).await,
SqlRequest::ShowTables(stmt) => self.show_tables(stmt).await,
}
}
pub(crate) fn get_table(&self, table_name: &str) -> Result<TableRef> {
self.table_engine
.get_table(&EngineContext::default(), table_name)
.map_err(BoxedError::new)
.context(GetTableSnafu { table_name })?
.context(TableNotFoundSnafu { table_name })
}
pub fn table_engine(&self) -> Arc<Engine> {
pub(crate) fn get_default_catalog(&self) -> Result<CatalogProviderRef> {
self.catalog_manager
.catalog(DEFAULT_CATALOG_NAME)
.context(CatalogNotFoundSnafu {
name: DEFAULT_CATALOG_NAME,
})
}
pub(crate) fn get_default_schema(&self) -> Result<SchemaProviderRef> {
self.catalog_manager
.catalog(DEFAULT_CATALOG_NAME)
.context(CatalogNotFoundSnafu {
name: DEFAULT_CATALOG_NAME,
})?
.schema(DEFAULT_SCHEMA_NAME)
.context(SchemaNotFoundSnafu {
name: DEFAULT_SCHEMA_NAME,
})
}
pub fn table_engine(&self) -> TableEngineRef {
self.table_engine.clone()
}
}
@@ -63,6 +93,7 @@ mod tests {
use catalog::SchemaProvider;
use common_query::logical_plan::Expr;
use common_recordbatch::SendableRecordBatchStream;
use common_time::timestamp::Timestamp;
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder, SchemaRef};
use datatypes::value::Value;
@@ -93,11 +124,12 @@ mod tests {
ColumnSchema::new("host", ConcreteDataType::string_datatype(), false),
ColumnSchema::new("cpu", ConcreteDataType::float64_datatype(), true),
ColumnSchema::new("memory", ConcreteDataType::float64_datatype(), true),
ColumnSchema::new("ts", ConcreteDataType::int64_datatype(), true),
ColumnSchema::new("ts", ConcreteDataType::timestamp_millis_datatype(), true),
];
Arc::new(
SchemaBuilder::from(column_schemas)
SchemaBuilder::try_from(column_schemas)
.unwrap()
.timestamp_index(3)
.build()
.unwrap(),
@@ -156,7 +188,7 @@ mod tests {
('host2', 88.8, 333.3, 1655276558000)
"#;
let table_engine = MitoEngine::<EngineImpl<NoopLogStore>>::new(
let table_engine = Arc::new(MitoEngine::<EngineImpl<NoopLogStore>>::new(
TableEngineConfig::default(),
EngineImpl::new(
StorageEngineConfig::default(),
@@ -164,10 +196,10 @@ mod tests {
object_store.clone(),
),
object_store,
);
));
let catalog_list = Arc::new(
catalog::LocalCatalogManager::try_new(Arc::new(table_engine.clone()))
catalog::LocalCatalogManager::try_new(table_engine.clone())
.await
.unwrap(),
);
@@ -209,8 +241,14 @@ mod tests {
let ts = &columns_values["ts"];
assert_eq!(2, ts.len());
assert_eq!(Value::from(1655276557000i64), ts.get(0));
assert_eq!(Value::from(1655276558000i64), ts.get(1));
assert_eq!(
Value::from(Timestamp::from_millis(1655276557000i64)),
ts.get(0)
);
assert_eq!(
Value::from(Timestamp::from_millis(1655276558000i64)),
ts.get(1)
);
}
_ => {
panic!("Not supposed to reach here")


@@ -0,0 +1,92 @@
use common_query::Output;
use snafu::prelude::*;
use sql::statements::alter::{AlterTable, AlterTableOperation};
use sql::statements::{column_def_to_schema, table_idents_to_full_name};
use table::engine::EngineContext;
use table::requests::{AlterKind, AlterTableRequest};
use crate::error::{self, Result};
use crate::sql::SqlHandler;
impl SqlHandler {
pub(crate) async fn alter(&self, req: AlterTableRequest) -> Result<Output> {
let ctx = EngineContext {};
let table_name = &req.table_name.clone();
if !self.table_engine.table_exists(&ctx, table_name) {
return error::TableNotFoundSnafu { table_name }.fail();
}
self.table_engine
.alter_table(&ctx, req)
.await
.context(error::AlterTableSnafu { table_name })?;
// Tried in MySQL: it really prints "Affected Rows: 0".
Ok(Output::AffectedRows(0))
}
pub(crate) fn alter_to_request(&self, alter_table: AlterTable) -> Result<AlterTableRequest> {
let (catalog_name, schema_name, table_name) =
table_idents_to_full_name(alter_table.table_name()).context(error::ParseSqlSnafu)?;
let alter_kind = match alter_table.alter_operation() {
AlterTableOperation::AddConstraint(table_constraint) => {
return error::InvalidSqlSnafu {
msg: format!("unsupported table constraint {}", table_constraint),
}
.fail()
}
AlterTableOperation::AddColumn { column_def } => AlterKind::AddColumn {
new_column: column_def_to_schema(column_def).context(error::ParseSqlSnafu)?,
},
};
Ok(AlterTableRequest {
catalog_name,
schema_name,
table_name,
alter_kind,
})
}
}
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use datatypes::prelude::ConcreteDataType;
use sql::dialect::GenericDialect;
use sql::parser::ParserContext;
use sql::statements::statement::Statement;
use super::*;
use crate::tests::test_util::create_mock_sql_handler;
fn parse_sql(sql: &str) -> AlterTable {
let mut stmt = ParserContext::create_with_dialect(sql, &GenericDialect {}).unwrap();
assert_eq!(1, stmt.len());
let stmt = stmt.remove(0);
assert_matches!(stmt, Statement::Alter(_));
match stmt {
Statement::Alter(alter_table) => alter_table,
_ => unreachable!(),
}
}
#[tokio::test]
async fn test_alter_to_request_with_adding_column() {
let handler = create_mock_sql_handler().await;
let alter_table = parse_sql("ALTER TABLE my_metric_1 ADD tagk_i STRING Null;");
let req = handler.alter_to_request(alter_table).unwrap();
assert_eq!(req.catalog_name, None);
assert_eq!(req.schema_name, None);
assert_eq!(req.table_name, "my_metric_1");
let alter_kind = req.alter_kind;
assert_matches!(alter_kind, AlterKind::AddColumn { .. });
match alter_kind {
AlterKind::AddColumn { new_column } => {
assert_eq!(new_column.name, "tagk_i");
assert!(new_column.is_nullable);
assert_eq!(new_column.data_type, ConcreteDataType::string_datatype());
}
}
}
}


@@ -2,26 +2,25 @@ use std::collections::HashMap;
use std::sync::Arc;
use catalog::RegisterTableRequest;
use common_query::Output;
use common_telemetry::tracing::info;
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
use datatypes::types::DateTimeType;
use query::query_engine::Output;
use snafu::{OptionExt, ResultExt};
use sql::ast::{ColumnDef, ColumnOption, DataType as SqlDataType, ObjectName, TableConstraint};
use datatypes::schema::SchemaBuilder;
use snafu::{ensure, OptionExt, ResultExt};
use sql::ast::TableConstraint;
use sql::statements::create_table::CreateTable;
use sql::statements::{column_def_to_schema, table_idents_to_full_name};
use store_api::storage::consts::TIME_INDEX_NAME;
use table::engine::{EngineContext, TableEngine};
use table::engine::EngineContext;
use table::metadata::TableId;
use table::requests::*;
use crate::error::{
ConstraintNotSupportedSnafu, CreateSchemaSnafu, CreateTableSnafu, InsertSystemCatalogSnafu,
InvalidCreateTableSqlSnafu, KeyColumnNotFoundSnafu, Result, SqlTypeNotSupportedSnafu,
self, ConstraintNotSupportedSnafu, CreateSchemaSnafu, CreateTableSnafu,
InsertSystemCatalogSnafu, InvalidPrimaryKeySnafu, KeyColumnNotFoundSnafu, Result,
};
use crate::sql::SqlHandler;
impl<Engine: TableEngine> SqlHandler<Engine> {
impl SqlHandler {
pub(crate) async fn create(&self, req: CreateTableRequest) -> Result<Output> {
let ctx = EngineContext {};
let catalog_name = req.catalog_name.clone();
@@ -63,7 +62,8 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
let mut ts_index = usize::MAX;
let mut primary_keys = vec![];
let (catalog_name, schema_name, table_name) = table_idents_to_full_name(stmt.name)?;
let (catalog_name, schema_name, table_name) =
table_idents_to_full_name(&stmt.name).context(error::ParseSqlSnafu)?;
let col_map = stmt
.columns
@@ -88,7 +88,7 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
},
)?;
} else {
return InvalidCreateTableSqlSnafu {
return error::InvalidSqlSnafu {
msg: format!("Cannot recognize named UNIQUE constraint: {}", name),
}
.fail();
@@ -102,7 +102,7 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
)?);
}
} else {
return InvalidCreateTableSqlSnafu {
return error::InvalidSqlSnafu {
msg: format!(
"Unrecognized non-primary unnamed UNIQUE constraint: {:?}",
name
@@ -120,6 +120,13 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
}
}
ensure!(
!primary_keys.iter().any(|index| *index == ts_index),
InvalidPrimaryKeySnafu {
msg: "time index column can't be included in primary key"
}
);
if primary_keys.is_empty() {
info!(
"Creating table: {:?}.{:?}.{} but primary key not set, use time index column: {}",
@@ -131,11 +138,12 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
let columns_schemas: Vec<_> = stmt
.columns
.iter()
.map(column_def_to_schema)
.map(|column| column_def_to_schema(column).context(error::ParseSqlSnafu))
.collect::<Result<Vec<_>>>()?;
let schema = Arc::new(
SchemaBuilder::from(columns_schemas)
SchemaBuilder::try_from(columns_schemas)
.context(CreateSchemaSnafu)?
.timestamp_index(ts_index)
.build()
.context(CreateSchemaSnafu)?,
@@ -150,98 +158,24 @@ impl<Engine: TableEngine> SqlHandler<Engine> {
schema,
primary_key_indices: primary_keys,
create_if_not_exists: stmt.if_not_exists,
table_options: HashMap::new(),
};
Ok(request)
}
}
/// Converts maybe fully-qualified table name (`<catalog>.<schema>.<table>` or `<table>` when catalog and schema are default)
/// to tuples
fn table_idents_to_full_name(
obj_name: ObjectName,
) -> Result<(Option<String>, Option<String>, String)> {
match &obj_name.0[..] {
[table] => Ok((None, None, table.value.clone())),
[catalog, schema, table] => Ok((
Some(catalog.value.clone()),
Some(schema.value.clone()),
table.value.clone(),
)),
_ => InvalidCreateTableSqlSnafu {
msg: format!(
"table name can only be <catalog>.<schema>.<table> or <table>, but found: {}",
obj_name
),
}
.fail(),
}
}
fn column_def_to_schema(def: &ColumnDef) -> Result<ColumnSchema> {
let is_nullable = def
.options
.iter()
.any(|o| matches!(o.option, ColumnOption::Null));
Ok(ColumnSchema {
name: def.name.value.clone(),
data_type: sql_data_type_to_concrete_data_type(&def.data_type)?,
is_nullable,
})
}
fn sql_data_type_to_concrete_data_type(t: &SqlDataType) -> Result<ConcreteDataType> {
match t {
SqlDataType::BigInt(_) => Ok(ConcreteDataType::int64_datatype()),
SqlDataType::Int(_) => Ok(ConcreteDataType::int32_datatype()),
SqlDataType::SmallInt(_) => Ok(ConcreteDataType::int16_datatype()),
SqlDataType::Char(_)
| SqlDataType::Varchar(_)
| SqlDataType::Text
| SqlDataType::String => Ok(ConcreteDataType::string_datatype()),
SqlDataType::Float(_) => Ok(ConcreteDataType::float32_datatype()),
SqlDataType::Double => Ok(ConcreteDataType::float64_datatype()),
SqlDataType::Boolean => Ok(ConcreteDataType::boolean_datatype()),
SqlDataType::Date => Ok(ConcreteDataType::date_datatype()),
SqlDataType::Custom(obj_name) => match &obj_name.0[..] {
[type_name] => {
if type_name.value.eq_ignore_ascii_case(DateTimeType::name()) {
Ok(ConcreteDataType::datetime_datatype())
} else {
SqlTypeNotSupportedSnafu { t: t.clone() }.fail()
}
}
_ => SqlTypeNotSupportedSnafu { t: t.clone() }.fail(),
},
_ => SqlTypeNotSupportedSnafu { t: t.clone() }.fail(),
}
}
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use sql::ast::Ident;
use datatypes::prelude::ConcreteDataType;
use sql::dialect::GenericDialect;
use sql::parser::ParserContext;
use sql::statements::statement::Statement;
use table_engine::config::EngineConfig;
use table_engine::engine::MitoEngine;
use table_engine::table::test_util::{new_test_object_store, MockEngine, MockMitoEngine};
use super::*;
use crate::error::Error;
async fn create_mock_sql_handler() -> SqlHandler<MitoEngine<MockEngine>> {
let (_dir, object_store) = new_test_object_store("setup_mock_engine_and_table").await;
let mock_engine =
MockMitoEngine::new(EngineConfig::default(), MockEngine::default(), object_store);
let catalog_manager = Arc::new(
catalog::LocalCatalogManager::try_new(Arc::new(mock_engine.clone()))
.await
.unwrap(),
);
SqlHandler::new(mock_engine, catalog_manager)
}
use crate::tests::test_util::create_mock_sql_handler;
fn sql_to_statement(sql: &str) -> CreateTable {
let mut res = ParserContext::create_with_dialect(sql, &GenericDialect {}).unwrap();
@@ -257,14 +191,22 @@ mod tests {
#[tokio::test]
pub async fn test_create_to_request() {
let handler = create_mock_sql_handler().await;
let parsed_stmt = sql_to_statement("create table demo_table( host string, ts bigint, cpu double default 0, memory double, TIME INDEX (ts), PRIMARY KEY(ts, host)) engine=mito with(regions=1);");
let parsed_stmt = sql_to_statement(
r#"create table demo_table(
host string,
ts timestamp,
cpu double default 0,
memory double,
TIME INDEX (ts),
PRIMARY KEY(host)) engine=mito with(regions=1);"#,
);
let c = handler.create_to_request(42, parsed_stmt).unwrap();
assert_eq!("demo_table", c.table_name);
assert_eq!(42, c.id);
assert!(c.schema_name.is_none());
assert!(c.catalog_name.is_none());
assert!(!c.create_if_not_exists);
assert_eq!(vec![1, 0], c.primary_key_indices);
assert_eq!(vec![0], c.primary_key_indices);
assert_eq!(1, c.schema.timestamp_index().unwrap());
assert_eq!(4, c.schema.column_schemas().len());
}
@@ -273,18 +215,31 @@ mod tests {
#[tokio::test]
pub async fn test_time_index_not_specified() {
let handler = create_mock_sql_handler().await;
let parsed_stmt = sql_to_statement("create table demo_table( host string, ts bigint, cpu double default 0, memory double, PRIMARY KEY(ts, host)) engine=mito with(regions=1);");
let parsed_stmt = sql_to_statement(
r#"create table demo_table(
host string,
ts bigint,
cpu double default 0,
memory double,
PRIMARY KEY(host)) engine=mito with(regions=1);"#,
);
let error = handler.create_to_request(42, parsed_stmt).unwrap_err();
assert_matches!(error, Error::CreateSchema { .. });
}
/// If primary key is not specified, time index should be used as primary key.
#[tokio::test]
pub async fn test_primary_key_not_specified() {
let handler = create_mock_sql_handler().await;
let parsed_stmt = sql_to_statement("create table demo_table( host string, ts bigint, cpu double default 0, memory double, TIME INDEX (ts)) engine=mito with(regions=1);");
let parsed_stmt = sql_to_statement(
r#"create table demo_table(
host string,
ts timestamp,
cpu double default 0,
memory double,
TIME INDEX (ts)) engine=mito with(regions=1);"#,
);
let c = handler.create_to_request(42, parsed_stmt).unwrap();
assert_eq!(1, c.primary_key_indices.len());
assert_eq!(
@@ -299,23 +254,45 @@ mod tests {
let handler = create_mock_sql_handler().await;
let parsed_stmt = sql_to_statement(
"create table demo_table( host string, TIME INDEX (ts)) engine=mito with(regions=1);",
r#"create table demo_table(
host string,
TIME INDEX (ts)) engine=mito with(regions=1);"#,
);
let error = handler.create_to_request(42, parsed_stmt).unwrap_err();
assert_matches!(error, Error::KeyColumnNotFound { .. });
}
#[tokio::test]
pub async fn test_invalid_primary_key() {
let create_table = sql_to_statement(
r"create table c.s.demo(
host string,
ts timestamp,
cpu double default 0,
memory double,
TIME INDEX (ts),
PRIMARY KEY(host, cpu, ts)) engine=mito
with(regions=1);
",
);
let handler = create_mock_sql_handler().await;
let error = handler.create_to_request(42, create_table).unwrap_err();
assert_matches!(error, Error::InvalidPrimaryKey { .. });
}
#[tokio::test]
pub async fn test_parse_create_sql() {
let create_table = sql_to_statement(
r"create table c.s.demo(
host string,
ts bigint,
ts timestamp,
cpu double default 0,
memory double,
TIME INDEX (ts),
PRIMARY KEY(ts, host)) engine=mito
PRIMARY KEY(host)) engine=mito
with(regions=1);
",
);
@@ -331,7 +308,7 @@ mod tests {
assert!(!request.create_if_not_exists);
assert_eq!(4, request.schema.column_schemas().len());
assert_eq!(vec![1, 0], request.primary_key_indices);
assert_eq!(vec![0], request.primary_key_indices);
assert_eq!(
ConcreteDataType::string_datatype(),
request
@@ -341,7 +318,7 @@ mod tests {
.data_type
);
assert_eq!(
ConcreteDataType::int64_datatype(),
ConcreteDataType::timestamp_millis_datatype(),
request
.schema
.column_schema_by_name("ts")
@@ -365,42 +342,4 @@ mod tests {
.data_type
);
}
fn check_type(sql_type: SqlDataType, data_type: ConcreteDataType) {
assert_eq!(
data_type,
sql_data_type_to_concrete_data_type(&sql_type).unwrap()
);
}
#[test]
pub fn test_sql_data_type_to_concrete_data_type() {
check_type(
SqlDataType::BigInt(None),
ConcreteDataType::int64_datatype(),
);
check_type(SqlDataType::Int(None), ConcreteDataType::int32_datatype());
check_type(
SqlDataType::SmallInt(None),
ConcreteDataType::int16_datatype(),
);
check_type(SqlDataType::Char(None), ConcreteDataType::string_datatype());
check_type(
SqlDataType::Varchar(None),
ConcreteDataType::string_datatype(),
);
check_type(SqlDataType::Text, ConcreteDataType::string_datatype());
check_type(SqlDataType::String, ConcreteDataType::string_datatype());
check_type(
SqlDataType::Float(None),
ConcreteDataType::float32_datatype(),
);
check_type(SqlDataType::Double, ConcreteDataType::float64_datatype());
check_type(SqlDataType::Boolean, ConcreteDataType::boolean_datatype());
check_type(SqlDataType::Date, ConcreteDataType::date_datatype());
check_type(
SqlDataType::Custom(ObjectName(vec![Ident::new("datetime")])),
ConcreteDataType::datetime_datatype(),
);
}
}


@@ -1,25 +1,21 @@
use std::str::FromStr;
use catalog::SchemaProviderRef;
use common_query::Output;
use datatypes::prelude::ConcreteDataType;
use datatypes::prelude::VectorBuilder;
use datatypes::value::Value;
use snafu::ensure;
use snafu::OptionExt;
use snafu::ResultExt;
use sql::ast::Value as SqlValue;
use sql::statements::{self, insert::Insert};
use table::requests::*;
use crate::error::{
ColumnNotFoundSnafu, ColumnTypeMismatchSnafu, ColumnValuesNumberMismatchSnafu, InsertSnafu,
ParseSqlValueSnafu, Result, TableNotFoundSnafu,
};
use crate::sql::{SqlHandler, SqlRequest};
impl SqlHandler {
pub(crate) async fn insert(&self, req: InsertRequest) -> Result<Output> {
let table_name = &req.table_name.to_string();
let table = self.get_table(table_name)?;
@@ -119,216 +115,9 @@ fn add_row_to_vector(
sql_val: &SqlValue,
builder: &mut VectorBuilder,
) -> Result<()> {
let value = statements::sql_value_to_value(column_name, data_type, sql_val)
.context(ParseSqlValueSnafu)?;
builder.push(&value);
Ok(())
}
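// A rough usage sketch, assuming `statements::sql_value_to_value` mirrors the removed
// `parse_sql_value` shown below, and assuming `builder` is a `VectorBuilder` that was
// already created for a Float64 column:
//
//     add_row_to_vector(
//         "cpu",
//         &ConcreteDataType::float64_datatype(),
//         &SqlValue::Number("3.0".to_string(), false),
//         &mut builder,
//     )?;
//     // builder should now hold Value::Float64(OrderedFloat(3.0)), matching the
//     // parsing tests further down in this file.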
fn parse_sql_value(
column_name: &str,
data_type: &ConcreteDataType,
sql_val: &SqlValue,
) -> Result<Value> {
Ok(match sql_val {
SqlValue::Number(n, _) => sql_number_to_value(data_type, n)?,
SqlValue::Null => Value::Null,
SqlValue::Boolean(b) => {
ensure!(
data_type.is_boolean(),
ColumnTypeMismatchSnafu {
column_name,
expect: data_type.clone(),
actual: ConcreteDataType::boolean_datatype(),
}
);
(*b).into()
}
SqlValue::DoubleQuotedString(s) | SqlValue::SingleQuotedString(s) => {
ensure!(
data_type.is_string(),
ColumnTypeMismatchSnafu {
column_name,
expect: data_type.clone(),
actual: ConcreteDataType::string_datatype(),
}
);
parse_string_to_value(s.to_owned(), data_type)?
}
_ => todo!("Other sql value"),
})
}
fn parse_string_to_value(s: String, data_type: &ConcreteDataType) -> Result<Value> {
match data_type {
ConcreteDataType::String(_) => Ok(Value::String(s.into())),
ConcreteDataType::Date(_) => {
if let Ok(date) = common_time::date::Date::from_str(&s) {
Ok(Value::Date(date))
} else {
ParseSqlValueSnafu {
msg: format!("Failed to parse {} to Date value", s),
}
.fail()
}
}
ConcreteDataType::DateTime(_) => {
if let Ok(datetime) = common_time::datetime::DateTime::from_str(&s) {
Ok(Value::DateTime(datetime))
} else {
ParseSqlValueSnafu {
msg: format!("Failed to parse {} to DateTime value", s),
}
.fail()
}
}
_ => {
unreachable!()
}
}
}
macro_rules! parse_number_to_value {
($data_type: expr, $n: ident, $(($Type: ident, $PrimitiveType: ident)), +) => {
match $data_type {
$(
ConcreteDataType::$Type(_) => {
let n = parse_sql_number::<$PrimitiveType>($n)?;
Ok(Value::from(n))
},
)+
_ => ParseSqlValueSnafu {
msg: format!("Fail to parse number {}, invalid column type: {:?}",
$n, $data_type
)}.fail(),
}
}
}
fn sql_number_to_value(data_type: &ConcreteDataType, n: &str) -> Result<Value> {
parse_number_to_value!(
data_type,
n,
(UInt8, u8),
(UInt16, u16),
(UInt32, u32),
(UInt64, u64),
(Int8, i8),
(Int16, i16),
(Int32, i32),
(Int64, i64),
(Float64, f64),
(Float32, f32)
)
}
fn parse_sql_number<R: FromStr + std::fmt::Debug>(n: &str) -> Result<R>
where
<R as FromStr>::Err: std::fmt::Debug,
{
match n.parse::<R>() {
Ok(n) => Ok(n),
Err(e) => ParseSqlValueSnafu {
msg: format!("Fail to parse number {}, {:?}", n, e),
}
.fail(),
}
}
#[cfg(test)]
mod tests {
use datatypes::value::OrderedFloat;
use super::*;
#[test]
fn test_sql_number_to_value() {
let v = sql_number_to_value(&ConcreteDataType::float64_datatype(), "3.0").unwrap();
assert_eq!(Value::Float64(OrderedFloat(3.0)), v);
let v = sql_number_to_value(&ConcreteDataType::int32_datatype(), "999").unwrap();
assert_eq!(Value::Int32(999), v);
let v = sql_number_to_value(&ConcreteDataType::string_datatype(), "999");
assert!(v.is_err(), "parse value error is: {:?}", v);
}
#[test]
fn test_parse_sql_value() {
let sql_val = SqlValue::Null;
assert_eq!(
Value::Null,
parse_sql_value("a", &ConcreteDataType::float64_datatype(), &sql_val).unwrap()
);
let sql_val = SqlValue::Boolean(true);
assert_eq!(
Value::Boolean(true),
parse_sql_value("a", &ConcreteDataType::boolean_datatype(), &sql_val).unwrap()
);
let sql_val = SqlValue::Number("3.0".to_string(), false);
assert_eq!(
Value::Float64(OrderedFloat(3.0)),
parse_sql_value("a", &ConcreteDataType::float64_datatype(), &sql_val).unwrap()
);
let sql_val = SqlValue::Number("3.0".to_string(), false);
let v = parse_sql_value("a", &ConcreteDataType::boolean_datatype(), &sql_val);
assert!(v.is_err());
assert!(format!("{:?}", v)
.contains("Fail to parse number 3.0, invalid column type: Boolean(BooleanType)"));
let sql_val = SqlValue::Boolean(true);
let v = parse_sql_value("a", &ConcreteDataType::float64_datatype(), &sql_val);
assert!(v.is_err());
assert!(format!("{:?}", v).contains(
"column_name: \"a\", expect: Float64(Float64), actual: Boolean(BooleanType)"
));
}
#[test]
pub fn test_parse_date_literal() {
let value = parse_sql_value(
"date",
&ConcreteDataType::date_datatype(),
&SqlValue::DoubleQuotedString("2022-02-22".to_string()),
)
.unwrap();
assert_eq!(ConcreteDataType::date_datatype(), value.data_type());
if let Value::Date(d) = value {
assert_eq!("2022-02-22", d.to_string());
} else {
unreachable!()
}
}
#[test]
pub fn test_parse_datetime_literal() {
let value = parse_sql_value(
"datetime_col",
&ConcreteDataType::datetime_datatype(),
&SqlValue::DoubleQuotedString("2022-02-22 00:01:03".to_string()),
)
.unwrap();
assert_eq!(ConcreteDataType::datetime_datatype(), value.data_type());
if let Value::DateTime(d) = value {
assert_eq!("2022-02-22 00:01:03", d.to_string());
} else {
unreachable!()
}
}
#[test]
pub fn test_parse_illegal_datetime_literal() {
assert!(parse_sql_value(
"datetime_col",
&ConcreteDataType::datetime_datatype(),
&SqlValue::DoubleQuotedString("2022-02-22 00:01:61".to_string()),
)
.is_err());
}
}


@@ -0,0 +1,137 @@
use std::sync::Arc;
use common_query::Output;
use common_recordbatch::{RecordBatch, RecordBatches};
use datatypes::arrow::compute;
use datatypes::arrow_array::StringArray;
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, Schema};
use datatypes::vectors::{Helper, StringVector, VectorRef};
use snafu::{ensure, OptionExt, ResultExt};
use sql::statements::show::{ShowDatabases, ShowKind, ShowTables};
use crate::error::{
ArrowComputationSnafu, CastVectorSnafu, NewRecordBatchSnafu, NewRecordBatchesSnafu, Result,
SchemaNotFoundSnafu, UnsupportedExprSnafu,
};
use crate::sql::SqlHandler;
const TABLES_COLUMN: &str = "Tables";
const SCHEMAS_COLUMN: &str = "Schemas";
impl SqlHandler {
fn like_utf8(names: Vec<String>, s: &str) -> Result<VectorRef> {
let array = StringArray::from_slice(&names);
let boolean_array =
compute::like::like_utf8_scalar(&array, s).context(ArrowComputationSnafu)?;
Helper::try_into_vector(
compute::filter::filter(&array, &boolean_array).context(ArrowComputationSnafu)?,
)
.context(CastVectorSnafu)
}
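// A minimal usage sketch, mirroring the unit tests at the bottom of this file:
// filtering `["greptime", "hello", "public", "world"]` with the SQL LIKE pattern
// "%ll%" keeps only "hello", while "%" keeps every name. The same helper backs both
// `SHOW DATABASES LIKE ...` and `SHOW TABLES LIKE ...` below.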
pub(crate) async fn show_databases(&self, stmt: ShowDatabases) -> Result<Output> {
// TODO(dennis): support WHERE
ensure!(
matches!(stmt.kind, ShowKind::All | ShowKind::Like(_)),
UnsupportedExprSnafu {
name: stmt.kind.to_string(),
}
);
let catalog = self.get_default_catalog()?;
// TODO(dennis): returning an iterator or a stream would be better.
let schemas = catalog.schema_names();
let column_schemas = vec![ColumnSchema::new(
SCHEMAS_COLUMN,
ConcreteDataType::string_datatype(),
false,
)];
let schema = Arc::new(Schema::new(column_schemas));
let schemas_vector = if let ShowKind::Like(ident) = stmt.kind {
Self::like_utf8(schemas, &ident.value)?
} else {
Arc::new(StringVector::from(schemas))
};
let columns: Vec<VectorRef> = vec![schemas_vector];
let recordbatch = RecordBatch::new(schema.clone(), columns).context(NewRecordBatchSnafu)?;
Ok(Output::RecordBatches(
RecordBatches::try_new(schema, vec![recordbatch]).context(NewRecordBatchesSnafu)?,
))
}
pub(crate) async fn show_tables(&self, stmt: ShowTables) -> Result<Output> {
// TODO(dennis): support WHERE
ensure!(
matches!(stmt.kind, ShowKind::All | ShowKind::Like(_)),
UnsupportedExprSnafu {
name: stmt.kind.to_string(),
}
);
let schema = if let Some(name) = &stmt.database {
let catalog = self.get_default_catalog()?;
catalog.schema(name).context(SchemaNotFoundSnafu { name })?
} else {
self.get_default_schema()?
};
let tables = schema.table_names();
let column_schemas = vec![ColumnSchema::new(
TABLES_COLUMN,
ConcreteDataType::string_datatype(),
false,
)];
let schema = Arc::new(Schema::new(column_schemas));
let tables_vector = if let ShowKind::Like(ident) = stmt.kind {
Self::like_utf8(tables, &ident.value)?
} else {
Arc::new(StringVector::from(tables))
};
let columns: Vec<VectorRef> = vec![tables_vector];
let recordbatch = RecordBatch::new(schema.clone(), columns).context(NewRecordBatchSnafu)?;
Ok(Output::RecordBatches(
RecordBatches::try_new(schema, vec![recordbatch]).context(NewRecordBatchesSnafu)?,
))
}
}
#[cfg(test)]
mod tests {
use super::*;
fn assert_vector(expected: Vec<&str>, actual: &VectorRef) {
let actual = actual.as_any().downcast_ref::<StringVector>().unwrap();
assert_eq!(*actual, StringVector::from(expected));
}
#[test]
fn test_like_utf8() {
let names: Vec<String> = vec!["greptime", "hello", "public", "world"]
.into_iter()
.map(|x| x.to_string())
.collect();
let ret = SqlHandler::like_utf8(names.clone(), "%ll%").unwrap();
assert_vector(vec!["hello"], &ret);
let ret = SqlHandler::like_utf8(names.clone(), "%time").unwrap();
assert_vector(vec!["greptime"], &ret);
let ret = SqlHandler::like_utf8(names.clone(), "%ld").unwrap();
assert_vector(vec!["world"], &ret);
let ret = SqlHandler::like_utf8(names, "%").unwrap();
assert_vector(vec!["greptime", "hello", "public", "world"], &ret);
}
}

Some files were not shown because too many files have changed in this diff.