Compare commits

178 Commits

Author SHA1 Message Date
zyy17
0ffe640f7d build: install ca-certificates in docker image building (#807)
refactor: install ca-certificates in docker image building

Signed-off-by: zyy17 <zyylsxm@gmail.com>

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2023-01-09 17:39:03 +08:00
Lei, HUANG
0d660e45cf feat: wal config 2023-01-09 13:02:30 +08:00
Lei, HUANG
a640872cda fix: parquet native row group pruning support 2023-01-07 21:34:08 +08:00
Lei, HUANG
7e3c59fb51 fix: remove start from LogStore; fix error message (#837)
(cherry picked from commit 627d444723)
2023-01-06 15:20:04 +08:00
Lei, HUANG
7bbc679c76 fix: revert script dependencies 2023-01-06 15:15:41 +08:00
Lei, HUANG
0b3a2cbcda fix: revert cargo workspace dependencies 2023-01-06 15:10:04 +08:00
Lei, HUANG
53ee85cdad feat: use raft-engine crate to reimplement logstore (#799)
(cherry picked from commit 8f5ecefc90)
2023-01-06 15:05:55 +08:00
Mike Yang
bc9a46dbb7 feat: support varbinary (#767)
feat: support varbinary for table creation and record insertion
2022-12-26 13:14:12 +08:00
Ruihang Xia
a61e96477b docs: RFC of promql (#779)
* docs: RFC of promql

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* docs: change styles, list drawback of misusing arrow

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-26 13:12:24 +08:00
Yingwen
f8500e54c1 refactor: Remove PutOperation and Simplify WriteRequest API (#775)
* chore: Remove unused MutationExtra

* refactor(storage): Refactor Mutation and Payload

Change Mutation from an enum to a struct that holds the op type and record
batches, so the encoder doesn't need to convert the mutation into a record
batch. The Payload is no longer an enum; it just holds the data of the
WriteBatch that is serialized to the WAL. The encoder and decoder
now deal with the Payload instead of the WriteBatch, so we can hold
more information that doesn't need to be stored in the WAL in the
WriteBatch.

This commit also merges variants in write_batch::Error into storage::Error,
as some of them denote the same error.

* test(storage): Pass all tests in storage

* chore: Remove unused codes then format codes

* test(storage): Fix test_put_unknown_column test

* style(storage): Fix clippy

* chore: Remove some unused codes

* chore: Rebase upstream and fix clippy

* chore(storage): Remove unused codes

* chore(storage): Update comments

* feat: Remove PayloadType from wal.proto

* chore: Address CR comments

* chore: Remove unused write_batch.proto
2022-12-26 13:11:24 +08:00
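The enum-to-struct change described in the commit above can be pictured with a minimal Rust sketch. All names below (`RecordBatch`, `OpType`, `num_rows_hint`, `encode`) are illustrative assumptions, not the actual storage crate API; the byte layout is a toy format chosen only to show that the encoder works on the payload alone.

```rust
/// Stand-in for a columnar record batch: one i64 vector per column.
/// (Hypothetical type, not the real storage crate definition.)
#[derive(Debug, Clone, PartialEq)]
pub struct RecordBatch {
    pub columns: Vec<Vec<i64>>,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OpType {
    Put,
    Delete,
}

/// A mutation as a plain struct: the op type plus a ready-made record batch,
/// so an encoder never has to convert enum variants into batches.
#[derive(Debug, Clone, PartialEq)]
pub struct Mutation {
    pub op_type: OpType,
    pub record_batch: RecordBatch,
}

/// The part of a write batch that gets serialized to the WAL.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Payload {
    pub mutations: Vec<Mutation>,
}

/// The write batch can carry extra information next to the payload.
#[derive(Debug, Default)]
pub struct WriteBatch {
    pub payload: Payload,
    /// Illustrative field that is never written to the WAL.
    pub num_rows_hint: usize,
}

/// Toy encoder: one tag byte per mutation, then every value as 8
/// little-endian bytes. It only ever sees the payload.
pub fn encode(payload: &Payload) -> Vec<u8> {
    let mut buf = Vec::new();
    for mutation in &payload.mutations {
        buf.push(match mutation.op_type {
            OpType::Put => 0u8,
            OpType::Delete => 1u8,
        });
        for column in &mutation.record_batch.columns {
            for value in column {
                buf.extend_from_slice(&value.to_le_bytes());
            }
        }
    }
    buf
}
```

Because `encode` takes `&Payload`, fields such as the hypothetical `num_rows_hint` can live in `WriteBatch` without affecting what is persisted.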
discord9
e85780b5e4 refactor: rename some mod.rs to <MOD_NAME>.rs (#784)
* refactor: rename `mod.rs` to <MOD_NAME>.rs

* refactor: not rename mod.rs in benches/
2022-12-26 12:48:34 +08:00
Ning Sun
11bdb33d37 feat: sql query interceptor and plugin refactoring (#773)
* feat: let instance hold plugins

* feat: add sql query interceptor definition

* docs: add comments to key apis

* feat: add implementation for pre-parsing and post-parsing

* feat: add post_execute hook

* test: add tests for interceptor

* chore: add license header

* fix: clippy error

* Update src/cmd/src/frontend.rs

Co-authored-by: LFC <bayinamine@gmail.com>

* refactor: batching post_parsing calls

* refactor: rename AnyMap2 to Plugins

* feat: call pre_execute with logical plan empty at the moment

Co-authored-by: LFC <bayinamine@gmail.com>
2022-12-23 15:22:12 +08:00
LFC
1daba75e7b refactor: use "USE" keyword (#785)
Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-23 14:29:47 +08:00
LFC
dc52a51576 chore: upgrade to Arrow 29.0 and use workspace package and dependencies (#782)
* chore: upgrade to Arrow 29.0 and use workspace package and dependencies

* fix: resolve PR comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-23 14:28:37 +08:00
Ruihang Xia
26af9e6214 ci: setup secrets for setup-protoc job (#783)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-23 11:36:39 +08:00
fys
e07791c5e8 chore: make election mod public (#781) 2022-12-22 17:32:35 +08:00
Yingwen
b6d29afcd1 ci: Use lld for coverage (#778)
* ci: Use lld for coverage

* style: Fix clippy
2022-12-22 16:10:37 +08:00
LFC
ea9af42091 chore: upgrade Rust to nightly 2022-12-20 (#772)
* chore: upgrade Rust to nightly 2022-12-20

* chore: upgrade Rust to nightly 2022-12-20

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-21 19:32:30 +08:00
shuiyisong
d0ebcc3b5a chore: open userinfo constructor (#774) 2022-12-21 17:58:43 +08:00
LFC
77182f5024 chore: upgrade Arrow to version 28, and DataFusion to 15 (#771)
Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-21 17:02:11 +08:00
Ning Sun
539ead5460 feat: check database existence on http api (#764)
* feat: check database existence on http api

* Update src/servers/src/http/handler.rs

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* feat: use database not found status code

* test: add assertion for status code

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-21 10:28:45 +08:00
Ruihang Xia
bc0e4e2cb0 fix: fill NULL based on row_count (#765)
* fix: fill NULL based on row_count

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* simplify code

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: replace set_len with resize

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-20 12:12:48 +08:00
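For readers unfamiliar with the `set_len` versus `resize` distinction mentioned in the commit above, here is a minimal sketch. The helper `fill_null` and its column representation are assumptions made for illustration; the actual fix in the commit is not shown here.

```rust
/// Pads a nullable column with `None` until it holds `row_count` entries.
/// Hypothetical helper, not taken from the commit.
pub fn fill_null(column: &mut Vec<Option<i64>>, row_count: usize) {
    if column.len() < row_count {
        // `resize` writes an initialized `None` into every new slot.
        // `unsafe { column.set_len(row_count) }` would only change the
        // recorded length and leave the new slots uninitialized.
        column.resize(row_count, None);
    }
}
```

The length guard keeps the helper from truncating a column that is already longer than `row_count`, since `Vec::resize` shrinks as well as grows.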
Ruihang Xia
7d29670c86 fix: consider null mask in sqlness display util (#763)
* fix: consider null mask in sqlness display util

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add test case

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix test case

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change placeholder to null

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-19 14:20:28 +08:00
LFC
afd88dd53a fix: test_dist_table_scan block (#761)
* fix: `test_dist_table_scan` block

* fix: resolve PR comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-19 11:20:51 +08:00
Ning Sun
efd85df6be feat: add schema check on postgres startup (#758)
* feat: add schema check on postgres startup

* chore: update pgwire to 0.6.3

* test: add test for unspecified db
2022-12-19 10:53:44 +08:00
Ning Sun
ea1896493b feat: allow multiple sql statements in query string (#699)
* feat: allow multiple sql statement in query string

* test: add a test for multiple statement call

* feat: add temporary workaround for standalone mode

* fix: resolve sql parser issue temporarily

* Update src/datanode/src/instance/sql.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* fix: adopt new sql handler

* refactor: revert changes in query engine

* refactor: assume sql-statement 1-1 on datanode

* test: use frontend for integration test

* refactor: add statement execution api for explicit single statement call

* fix: typo

* refactor: rename query method

* test: add test case for error

* test: data type change adoption

* chore: add todo from review

* chore: remove obsolete comments

* fix: resolve issues

Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-12-16 19:50:20 +08:00
Jiachun Feng
66bca11401 refactor: remove optional from the protos (#756) 2022-12-16 15:47:51 +08:00
Yingwen
7c16a4a17b refactor(storage): Move write_batch::codec to a separate file (#757)
* refactor(storage): Move write_batch::codec to a separate file

* chore: move new_test_batch to write_batch mod
2022-12-16 15:32:59 +08:00
dennis zhuang
28bd7404ad feat: change column's default property to nullable (#751)
* feat: change column's default property to nullable

* chore: use all instead of any

* fix: compile error

* fix: dependencies order in cargo
2022-12-16 11:17:01 +08:00
Lei, HUANG
0653301754 feat: replace arrow2 with official implementation 🎉 (#753)
* chore: kick off. change datafusion/arrow/parquet to target version

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: replace one last datafusion dep

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: arrow_array switch to arrow

* chore: update dep of binary vector

* chore: fix wrong merge commit

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: Switch to datatypes2

* feat: Make recordbatch compile

* chore: sort Cargo.toml

* feat: Fix common::recordbatch compiler errors

* feat: Fix recordbatch test compiling issue

* fix: api crate (#708)

* fix: rename ConcreteDataType::timestamp_millis_type to ConcreteDataType::timestamp_millisecond_type. fix other warnings regarding timestamp

* fix: revert changes in datatypes2

* fix: helper

* chore: delete datatypes based on arrow2

* feat: Fix some compiler errors in common::query (#710)

* feat: Fix some compiler errors in common::query

* feat: test_collect use vectors api

* fix: common-query subcrate (#712)

* fix: record batch adapter

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix error enum

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: Fix common::query compiler errors (#713)

* feat: Move conversion to ScalarValue to value.rs

* fix: Fix common::query compiler errors

This commit also makes InnerError pub(crate)

* feat: Implements diff accumulator using WrapperType (#715)

* feat: Remove usage of opaque error from common::recordbatch

* feat: Remove opaque error from common::query

* feat: Fix diff compiler errors

Now common_function just use common_query's Error and Result. Adds
a LargestType associated type to LogicalPrimitiveType to get the largest
type a logical primitive type can cast to.

* feat: Remove LargestType from NativeType trait

* chore: Update comments

* feat: Restrict Scalar::RefType of WrapperType to itself

Add trait bound `for<'a> Scalar<RefType<'a> = Self>` to WrapperType

* chore: Address CR comments

* chore: Format codes

* fix: fix compile error for mean/polyval/pow/interp ops

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Revert "fix: fix compile error for mean/polyval/pow/interp ops"

This reverts commit fb0b4eb826.

* fix: Fix compiler errors in argmax/rate/median/norm_cdf (#716)

* fix: Fix compiler errors in argmax/rate/median/norm_cdf

* chore: Address CR comments

* fix: fix compile error for mean/polyval/pow/interp ops (#717)

* fix: fix compile error for mean/polyval/pow/interp ops

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* simplify type bounds

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: fix argmin/percentile/clip/interp/scipy_stats_norm_pdf errors (#718)

fix: fix argmin/percentile/clip/interp/scipy_stats_norm_pdf compiler errors

* fix: fix other compile error in common-function (#719)

* further fixing

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix all compile errors in common function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: Fix tests and clippy for common-function subcrate (#726)

* further fixing

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix all compile errors in common function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* revert test changes

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: row group pruning (#725)

* fix: row group pruning

* chore: use macro to simplify stats implementation

* fix: CR comments

* fix: row group metadata length mismatch

* fix: simplify code

* fix: Fix common::grpc compiler errors (#722)

* fix: Fix common::grpc compiler errors

This commit refactors RecordBatch and holds vectors in the RecordBatch
struct, so we don't need to cast the array to vector when doing
serialization or iterating the batch.

Now we use the vector API instead of the arrow API in grpc crate.

* chore: Address CR comments

* fix common record batch

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: Fix compile error in server subcrate (#727)

* fix: Fix compile error in server subcrate

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unused type alias

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* explicitly panic

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/storage/src/sst/parquet.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>

* fix: Fix common grpc expr (#730)

* fix compile errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename fn names

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix styles

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix warnings in common-time

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: pre-cast to avoid tremendous match arms (#734)

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: upgrade storage crate to arrow and parquet official impl (#738)

* fix: compile errors

* fix: parquet reader and writer

* fix: parquet reader and writer

* fix: WriteBatch IPC encode/decode

* fix: clippy errors in storage subcrate

* chore: remove suspicious unwrap

* fix: some cr comments

* fix: CR comments

* fix: CR comments

* fix: Fix compiler errors in catalog and mito crates (#742)

* fix: Fix compiler errors in mito

* fix: Fix compiler errors in catalog crate

* style: Fix clippy

* chore: Fix use

* Merge pull request #745

* fix nyc-taxi and util

* Merge branch 'replace-arrow2' into fix-others

* fix substrait

* fix warnings and error in test

* fix: Fix imports in optimizer.rs

* fix: errors in optimizer

* fix: remove unwrap

* fix: Fix compiler errors in query crate (#746)

* fix: Fix compiler errors in state.rs

* fix: fix compiler errors in state

* feat: upgrade sqlparser to 0.26

* fix: fix datafusion engine compiler errors

* fix: Fix some tests in query crate

* fix: Fix all warnings in tests

* feat: Remove `Type` from timestamp's type name

* fix: fix query tests

Now datafusion already supports median, so this commit also removes the
median function

* style: Fix clippy

* feat: Remove RecordBatch::pretty_print

* chore: Address CR comments

* Update src/query/src/query_engine/state.rs

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* fix: frontend compile errors (#747)

fix: fix compile errors in frontend

* fix: Fix compiler errors in script crate (#749)

* fix: Fix compiler errors in state.rs

* fix: fix compiler errors in state

* feat: upgrade sqlparser to 0.26

* fix: fix datafusion engine compiler errors

* fix: Fix some tests in query crate

* fix: Fix all warnings in tests

* feat: Remove `Type` from timestamp's type name

* fix: fix query tests

Now datafusion already supports median, so this commit also removes the
median function

* style: Fix clippy

* feat: Remove RecordBatch::pretty_print

* chore: Address CR comments

* feat: Add column_by_name to RecordBatch

* feat: modify select_from_rb

* feat: Fix some compiler errors in vector.rs

* feat: Fix more compiler errors in vector.rs

* fix: fix table.rs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: Fix compiler errors in coprocessor

* fix: Fix some compiler errors

* fix: Fix compiler errors in script

* chore: Remove unused imports and format code

* test: disable interval tests

* test: Fix test_compile_execute test

* style: Fix clippy

* feat: Support interval

* feat: Add RecordBatch::columns and fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* fix: Fix All The Tests! (#752)

* fix: Fix several tests compile errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: some compile errors in tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: compile errors in frontend tests

* fix: compile errors in frontend tests

* test: Fix tests in api and common-query

* test: Fix test in sql crate

* fix: resolve substrait error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: add more test

* test: Fix tests in servers

* fix instance_test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* test: Fix tests in tests-integration

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>

* fix: clippy errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2022-12-15 18:49:12 +08:00
LFC
61d8bc2ea1 refactor(frontend): minor changes around FrontendInstance constructor (#748)
* refactor: minor changes in some testing codes

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-15 14:34:40 +08:00
Ruihang Xia
e3785fca70 docs: change logo in readme automatically based on github theme (#743)
* docs: adaptive logo on theme

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* switch logos

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* align center

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* adjust style

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* use new logo image

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-14 19:32:51 +08:00
shuiyisong
fda9e80cbf feat: impl static_user_provider (#739)
* feat: add MemUserProvider and impl auth

* feat: impl user_provider option in fe and standalone mode

* chore: add file impl for mem provider

* chore: remove mem opts

* chore: minor change

* chore: refac pg server to use user_provider as indicator for using pwd auth

* chore: fix test

* chore: extract common code

* chore: add unit test

* chore: rebase develop

* chore: add user provider to http server

* chore: minor rename

* chore: change to ref when convert to anymap

* chore: fix according to clippy

* chore: remove clone on startcommand

* chore: fix cr issue

* chore: update tempdir use

* chore: change TryFrom to normal func while parsing anymap

* chore: minor change

* chore: remove to_lowercase
2022-12-14 16:38:29 +08:00
Lei, HUANG
756c068166 feat: logstore compaction (#740)
* feat: add benchmark for wal

* add bin

* feat: impl wal compaction

* chore: This reverts commit ef9f2326

* chore: This reverts commit 9142ec0e

* fix: remove empty files

* fix: failing tests

* fix: CR comments

* fix: Mark log as stable after writer applies manifest

* fix: some cr comments and namings

* chore: rename all stable_xxx to obsolete_xxx

* chore: error message
2022-12-14 16:15:29 +08:00
dennis zhuang
6a4e2e5975 feat: promql crate and skeleton (#720)
* feat: adds promql crate

* feat: adds promql-parser dependency and rfc doc

* fix: dependencies order in servers crate

* fix: forgot error.rs

* fix: comment

* fix: license header

* fix: remove docs/rfc/20221207_promql.md
2022-12-13 17:08:22 +08:00
Lei, HUANG
9ad6ddb26e fix: remove useless metaclient field from datanode Instance (#744) 2022-12-13 14:26:26 +08:00
fys
c5661ee362 feat: support http basic authentication (#733)
* feat: support http auth

* add some unit test and log

* fix

* cr

* remove unused #[derive(Clone)]
2022-12-13 10:44:33 +08:00
zyy17
9b093463cc feat: add Makefile to aggregate the commands that developers always use (#736)
* feat: add Makefile to aggregate the commands that developers always use

* refactor: add 'clean' and 'unit-test' target

* refactor: add sqlness-test target and modify some descriptions format

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2022-12-12 13:03:49 +08:00
zyy17
61e0f1a11c refactor: add tls option in frontend cli options (#735)
* refactor: add tls option in frontend cli options

* fix: add 'Eq' trait for fixing clippy error

* fix: remove redundant clone

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2022-12-12 10:02:17 +08:00
Ning Sun
249ebc6937 feat: update pgwire and refactor pg auth handler (#732) 2022-12-09 17:01:55 +08:00
elijah
c1b8981f61 refactor(mito): change the table path to schema/table_id (#728)
refactor: change the table path to `schema/table_id`
2022-12-09 12:59:16 +08:00
Jiachun Feng
949cd3e3af feat: move_value & delete_route (#707)
* feat: move_value & delete_route

* chore: minor refactor

* chore: refactor unit test of metaclient

* chore: map to kv

* Update src/meta-srv/src/service/router.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* Update src/meta-srv/src/service/router.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* chore: by code review

Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-12-09 11:07:48 +08:00
SSebo
b26982c5d7 feat: support timestamp new syntax (#697)
* feat: support timestamp new syntax

* fix: not null at end of new time stamp index syntax

* chore: simplify code
2022-12-09 10:52:14 +08:00
fys
4fdf26810c feat: support auth in frontend (#688)
* feat: add UserProvider trait

* chore: minor fix

* support pg mysql

* refactor and add some logs

* chore: add license

Co-authored-by: shuiyisong <xixing.sys@gmail.com>
2022-12-08 11:51:52 +08:00
dennis zhuang
7f59758e69 feat: bump opendal version to 0.22 (#721)
* feat: bump opendal version to 0.22

* fix: LoggingLayer
2022-12-08 11:19:21 +08:00
Zheming Li
a521ab5041 fix: set default value when fail to get git info instead of panic (#696)
fix: set default value when fail to get git info instead of panic
2022-12-07 13:16:27 +08:00
LFC
833216d317 refactor: directly invoke Datanode methods in standalone mode (part 1) (#694)
* refactor: directly invoke Datanode methods in standalone mode

* test: add more unit tests

* fix: get rid of `println` in testing codes

* fix: resolve PR comments

* fix: resolve PR comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-07 11:37:59 +08:00
Ruihang Xia
90c832b33d refactor: drop support of physical plan query interface (#714)
* refactor: drop support of physical plan query interface

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor: collapse server/grpc sub-module

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refactor: remove unused errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-06 19:23:32 +08:00
LFC
8959dbcef8 feat: Substrait logical plan (#704)
* feat: use Substrait logical plan to query data from Datanode in Frontend in distributed mode

* fix: resolve PR comments

* fix: resolve PR comments

* fix: resolve PR comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-06 19:21:57 +08:00
discord9
2034b40f33 chore: update RustPython dependency (with a tweaked fork) (#655)
* refactor: update RsPy

* depend: add `rustpython-pylib`

* feat: add_frozen stdlib for every vm init

* feat: limit stdlib to a selected few

* chore: use `rev` instead of `branch` in dependency

* refactor: rename to allow_list

* feat: use opt level one

* doc: add username for TODO&change optimize to 0

* style: fmt .toml
2022-12-06 14:15:00 +08:00
SSebo
55e6be7af1 fix: test_server_require_secure_client_secure (#701) 2022-12-06 10:38:54 +08:00
discord9
f9bfb121db feat: add rate() udf (#508)
* feat: rewrite `rate` UDF

* feat: rename to `prom_rate`

* refactor: solve conflict&add license

* refactor: import arrow
2022-12-06 10:30:13 +08:00
Ruihang Xia
6fb413ae50 ci: add toml format linter (#706)
* chore: run taplo format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* ci: add workflow to check toml

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rerun formatter with indent to 4 spaces

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update check command

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-05 20:03:10 +08:00
Ruihang Xia
beb07fc895 feat: new datatypes subcrate based on the official arrow (#705)
* feat: Init datatypes2 crate

* chore: Remove some unimplemented types

* feat: Implements PrimitiveType and PrimitiveVector for datatypes2 (#633)

* feat: Implement primitive types and vectors

* feat: Implement a wrapper type

* feat: Remove VectorType from ScalarRef

* feat: Move some trait bound from NativeType to WrapperType

* feat: pub use primitive vectors and builders

* feat: Returns error in try_from when type mismatch

* feat: Impl PartialEq for some vectors

* test: Pass vector tests

* chore: Add license header

* test: Pass more vector tests

* feat: Implement some methods of vector Helper

* test: Pass more tests

* style: Fix clippy

* chore: Add license header

* feat: Remove IntoValueRef trait

* feat: Add NativeType trait bound to WrapperType::Native

* docs: Explain what is wrapper type

* chore: Fix typos

* refactor: LogicalPrimitiveType::type_name returns str

* feat: Implements DateType and DateVector (#651)

* feat: Implement DateType and DateVector

* test: Pass more value and data type tests

* chore: Address CR comments

* test: Skip list value test

* feat: datatypes2 datetime (#661)

* feat: impl DateTime type and vector

* fix: add license header

* fix: CR comments and add more tests

* fix: customized serialization for wrapper type

* feat: Implements NullType and NullVector (#658)

* feat: Implements NullType and NullVector

* chore: Address CR comment

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* chore: Address CR comment

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* feat: Implements StringType and StringVector (#659)

* feat: implement string vector

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add more test and from

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* cover NUL

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: impl datatypes2/timestamp (#686)

* feat: add timestamp datatype and vectors

* fix: cr comments and reformat code

* chore: add some tests

* feat: Implements ListType and ListVector (#681)

* feat: Implement ListType and ListVector

* test: Pass more tests

* style: Fix clippy

* chore: Fix comment

* chore: Address CR comments

* feat: impl constant vector (#680)

* feat: impl constant vector

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Apply suggestions from code review

Co-authored-by: Yingwen <realevenyag@gmail.com>

* rename fn names

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove println

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>

* feat: Implements Validity (#684)

* feat: Implements Validity

* chore: remove pub from sub mod in vectors

* feat: Implements schema for datatypes2 (#695)

* feat: Add is_timestamp_compatible to DataType

* feat: Implement ColumnSchema and Schema

* feat: Impl RawSchema

* chore: Remove useless codes and run more tests

* chore: Fix clippy

* feat: Impl from_arrow_time_unit and pass schema tests

* chore: add more tests for timestamp (#702)

* chore: add more tests for timestamp

* chore: add replicate test for timestamps

* feat: Implements helper methods for vectors/values (#703)

* feat: Implement helper methods for vectors/values

* chore: Address CR comments

* chore: add more test for timestamp

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
2022-12-05 19:59:23 +08:00
Ning Sun
4275e47bdb refactor: use updated mysql_async client (#698) 2022-12-05 11:18:32 +08:00
dennis zhuang
6720bc5f7c fix: validate create table request in mito engine (#690)
* fix: validate create table request in mito engine

* fix: comment

* chore: remove TIMESTAMP_INDEX in system.rs
2022-12-05 11:01:43 +08:00
Lei, HUANG
4052563248 fix: pr template task default state (#687) 2022-12-02 20:39:53 +08:00
dennis zhuang
952e1bd626 test: update dummy result (#693) 2022-12-02 19:22:37 +08:00
shuiyisong
8232015998 fix: cargo sort in pre-commit (#689) 2022-12-02 16:19:31 +08:00
Ruihang Xia
d82a3a7d58 feat: implement most of scalar function and selection conversion in substrait (#678)
* impl to_df_scalar_function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* part of scalar functions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* conjunction over filters

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change the ser/de target to substrait::Plan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* basic test coverage

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix typos and license header

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix CR comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* logs unsupported extension

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/common/substrait/src/df_expr.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* address review comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* replace context with with_context

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-12-02 14:46:05 +08:00
Ning Sun
0599465685 feat: inject current database/schema into query context for postgres protocol (#685)
* feat: inject current database/schema into query context

* test: avoid duplicate server setup
2022-12-02 11:49:39 +08:00
Mofeng
13d51250ba feat: add http /health api (#676)
* feat: add http `/health` api

* feat: add `/health` api test suite in http integration test
2022-12-01 19:11:58 +08:00
LFC
6127706b5b feat: support "use" stmt part 1 (#672)
* feat: a bare sketch of session; support "use" in MySQL server; modify insertion and selection related codes in Datanode
2022-12-01 17:05:32 +08:00
dennis zhuang
2e17e9c4b5 feat: supports s3 storage (#656)
* feat: adds s3 object storage configuration

* feat: adds s3 integration test

* chore: use map

* fix: forgot license header

* fix: checking if bucket is empty in test_on

* chore: address CR issues

* refactor: run s3 test with dotenv

* chore: randomize grpc port for test

* fix: README in tests-integration

* chore: remove redundant comments
2022-12-01 10:59:14 +08:00
xiaomin tang
b0cbfa7ffb docs: add a roadmap link in README (#673)
* docs: add roadmap to README

* docs: missing period
2022-11-30 21:25:27 +08:00
Ruihang Xia
20172338e8 ci: Revert "ci: change CI unit test trigger" (#674)
Revert "ci: change CI unit test trigger (#671)"

This reverts commit 9c53f9b24c.
2022-11-30 21:23:40 +08:00
Ruihang Xia
9c53f9b24c ci: change CI unit test trigger (#671)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-30 20:19:35 +08:00
Dongxu Wang
6d24f7ebb6 refactor: bump axum 0.6, use recommended way to nest routes (#668) 2022-11-30 20:04:33 +08:00
SSebo
68c2de8e45 feat: mysql and pg server support tls (#641)
* feat: mysql and pg server support tls

* chore: replace opensrv-mysql to original

* chore: TlsOption is required but supply default value

* feat: mysql server support force tls

* chore: move TlsOption to servers

* test: mysql server disable / prefer / required tls mode

* test: pg server disable / prefer / required tls mode

* chore: add doc and remove unused code

* chore: add TODO and restore cargo linker config
2022-11-30 12:46:15 +08:00
Yingwen
a17dcbc511 chore: fix SequenceNotMonotonic error message (#664)
* chore: fix SequenceNotMonotonic error message

previous sequence should greater than or equal to given sequence

* Apply suggestions from code review

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-30 11:58:43 +08:00
Ning Sun
53ab19ea5a ci: remove assignees which is causing error (#663) 2022-11-30 11:36:35 +08:00
Ning Sun
84c44cf540 ci: fix doc label task on forked repo (#654) 2022-11-30 11:23:15 +08:00
LFC
020b9936cd fix: correctly detach spawned mysql listener task (#657)
Co-authored-by: luofucong <luofucong@greptime.com>
2022-11-29 18:39:18 +08:00
Ning Sun
75dcf2467b refactor: add tests-integration module (#590)
* refactor: add integration-tests module

* Apply suggestions from code review

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* test: move grpc module to tests-integration

* test: adapt new standalone mode

* test: improve http assertion

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-11-29 16:28:58 +08:00
Ruihang Xia
eea5393f96 feat: UI improvement for integration test runner (#645)
* improve dir resolving and start up ordering

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix orphan process

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update tests/runner/src/util.rs, fix typo

Co-authored-by: Dongxu Wang <dongxu@apache.org>

* simplify logic via tokio timeout

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Dongxu Wang <dongxu@apache.org>
2022-11-29 15:32:39 +08:00
Ning Sun
3d312d389d ci: add doc label support for pr too (#650) 2022-11-29 15:21:12 +08:00
dennis zhuang
fdc73fb52f perf: cache python interpreter in TLS (#649)
* perf: cache python interpreter when executing coprocessors

* test: speedup test_execute_script by reusing interpreter

* fix: remove comment

* chore: use get_or_insert_with instead
2022-11-29 14:41:37 +08:00
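
The commit above (#649) caches the Python interpreter in thread-local storage and switches to `get_or_insert_with`. A minimal sketch of that pattern follows. It is not the project's code: `Interpreter`, `run`, and `with_interpreter` are hypothetical names used only for illustration, and the real interpreter type comes from RustPython.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for an interpreter that is expensive to construct.
struct Interpreter {
    scripts_run: u32,
}

impl Interpreter {
    fn new() -> Self {
        Interpreter { scripts_run: 0 }
    }

    /// Pretends to run a script and returns how many scripts this
    /// instance has executed so far.
    fn run(&mut self, _script: &str) -> u32 {
        self.scripts_run += 1;
        self.scripts_run
    }
}

thread_local! {
    // One lazily created interpreter per thread; `None` until first use.
    static INTERPRETER: RefCell<Option<Interpreter>> = RefCell::new(None);
}

/// Runs `f` with this thread's cached interpreter, creating it on first call.
fn with_interpreter<R>(f: impl FnOnce(&mut Interpreter) -> R) -> R {
    INTERPRETER.with(|slot| {
        let mut slot = slot.borrow_mut();
        // `get_or_insert_with` only calls `Interpreter::new` when the slot is empty.
        let interpreter = slot.get_or_insert_with(Interpreter::new);
        f(interpreter)
    })
}
```

Two consecutive calls to `with_interpreter` on the same thread reuse one instance, so the counter returned by `run` keeps increasing. Calling `with_interpreter` from inside its own closure would panic on the second `borrow_mut`, so the sketch assumes non-reentrant use.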
Ning Sun
2a36e26d19 ci: add action to create doc issue when change labelled (#648)
ci: add action to create doc issue when change labeled
2022-11-29 14:25:57 +08:00
Zheming Li
baef640fe3 feat: add --version command line option (#632)
* add version command line option

* use concat!

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-28 17:07:17 +08:00
dennis zhuang
5fddb799f7 feat: enable atomic write for file object storage (#643)
* fix: remove opendal from catalog dependencies

* feat: enable atomic writing for fs service
2022-11-28 16:01:32 +08:00
Dongxu Wang
f372229b18 fix: append table id to table data dir (#640) 2022-11-28 10:53:13 +08:00
Xuanwo
4085fc7899 chore: Bump OpenDAL to v0.21.1 (#639)
* deps: Bump OpenDAL to v0.21.1

Signed-off-by: Xuanwo <github@xuanwo.io>

* Avoid using raw types when not needed

Signed-off-by: Xuanwo <github@xuanwo.io>

Signed-off-by: Xuanwo <github@xuanwo.io>
2022-11-27 10:18:39 +08:00
Ruihang Xia
30940e692a feat: impl DROP TABLE on memory catalog based standalone mode (#630)
* feat: implement drop table for standalone mode

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update integration test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* enhancement test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-25 11:53:46 +08:00
Mike Yang
b371ce0f48 test: added tests for statements methods (#622)
* test: added tests for parse_column_default_constraint

* test: added test for sql_column_def_to_grpc_column_def

* refactor: remove hardcode in test
2022-11-25 11:35:06 +08:00
Lei, HUANG
ac7f52d303 fix: start datanode instance before frontend services (#634) 2022-11-25 11:25:57 +08:00
Dongxu Wang
051768b735 ci: add spell check with typos (#627) 2022-11-24 14:46:50 +08:00
fys
c5b0d2431f feat: remove InsertBatch in gRPC message (#570) 2022-11-24 14:04:48 +08:00
Lei, HUANG
4038dd4067 fix: add concurrency control for catalog manager (#619) 2022-11-24 11:10:33 +08:00
Dongxu Wang
8be0f05570 chore: able to config axum timeout in toml (#624) 2022-11-24 11:09:21 +08:00
zyy17
69f06eec8b ci: change scheduled release from nightly to weekly (#623)
Signed-off-by: zyy17 <zyylsxm@gmail.com>
2022-11-24 11:05:35 +08:00
Ruihang Xia
7b37e99a45 feat: deregister table for MemoryCatalogManager (#620)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-24 09:36:27 +08:00
dennis zhuang
c09775d17f feat: adds metrics, tracing and retry layer to object-store (#621) 2022-11-23 11:40:03 +08:00
Francis Du
4a9cf49637 feat: support explain syntax (#546) 2022-11-22 21:22:32 +08:00
Ruihang Xia
9f865b50ab test: add dummy select case (#618)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-22 16:47:45 +08:00
Ruihang Xia
b407ebf6bb feat: integration test suite (#487)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-22 15:34:13 +08:00
Lei, HUANG
c144a1b20e feat: impl alter table in distributed mode (#572) 2022-11-22 15:17:25 +08:00
Yingwen
0791c65149 refactor: replace some usage of MutableBitmap by BitVec (#610) 2022-11-21 17:36:53 +08:00
LFC
62fcb54258 fix: correctly open table when distributed datanode restart (#576)
Co-authored-by: luofucong <luofucong@greptime.com>
2022-11-21 15:15:14 +08:00
Lei, HUANG
2b6b979d5a fix: remove datanode mysql options in standalone mode (#595) 2022-11-21 14:15:47 +08:00
Dongxu Wang
b6fa316c65 chore: correct typos (#589) (#592) 2022-11-21 14:07:45 +08:00
Lei, HUANG
ca5734edb3 feat: disable mysql server on datanode when running standalone mode (#593) 2022-11-21 12:12:26 +08:00
Mike Yang
5428ad364e fix: make nullable as default when alter table (#591) 2022-11-21 12:11:19 +08:00
zyy17
663c725838 fix: fix nightly build error and fix typo (#588)
Signed-off-by: zyy17 <zyylsxm@gmail.com>
2022-11-21 11:49:36 +08:00
zyy17
c94b544e4a ci: modify image registry in release.yml (#582)
Signed-off-by: zyy17 <zyylsxm@gmail.com>
2022-11-19 09:19:54 +08:00
Ruihang Xia
f465040acc feat: lazy evaluated record batch stream (#573)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-18 21:42:10 +08:00
Yingwen
22ae983280 refactor: Use re-exported arrow mod from datatypes crate (#571) 2022-11-18 18:38:07 +08:00
Igor Morozov
e1f326295f feat: implement DESCRIBE TABLE (#558)
Also need to support describe table in other catalog/schema
2022-11-18 16:34:00 +08:00
aievl
6d762aa9dc feat: update mysql default listen port to 4406 (#568)
Co-authored-by: zhaozhenhang <zhaozhenhang@kuaishou.com>
2022-11-18 14:55:11 +08:00
Ruihang Xia
d4b09f69ab docs: specify protoc version requirement (#564)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-11-18 14:36:25 +08:00
Xuanwo
1f0b39cc8d chore: Bump OpenDAL to v0.20 (#569)
Signed-off-by: Xuanwo <github@xuanwo.io>
2022-11-18 14:17:38 +08:00
zyy17
dee5ccec9e ci: add nightly build job (#565) 2022-11-18 11:48:29 +08:00
dennis zhuang
f8788273d5 feat: drop column for alter table (#562)
* feat: drop column for alter table

* refactor: rename RemoveColumns to DropColumns

* test: alter table

* chore: error msg

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* fix: test_parse_alter_drop_column

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-17 23:00:16 +08:00
jay
df465308cc current blog url response as 404, should be https://greptime.com/blogs/index (#561) 2022-11-17 21:24:04 +08:00
LFC
e7b4d2b9cd feat: Implement table_info() for DistTable (#536) (#557)
* feat: Implement `table_info()` for `DistTable` (#536)

* Update src/catalog/src/error.rs

Co-authored-by: Yingwen <1405012107@qq.com>

Co-authored-by: luofucong <luofucong@greptime.com>
Co-authored-by: Yingwen <1405012107@qq.com>
2022-11-17 18:40:58 +08:00
discord9
bf408e3b96 Update README.md (#552)
Add RustPython's Acknowledgement
2022-11-17 14:15:43 +08:00
dennis zhuang
73e6e2e01b fix: split code and output in README (#549) 2022-11-17 12:54:02 +08:00
Lei, Huang
8faa6b0f09 refactor: start options (#545)
* refactor: config options for frontend/datanode/standalone

* chore: rename MetaClientOpts::metasrv_addr to MetaClientOpts::metasrv_addrs

* fix: clippy

* fix: change default meta-srv addr to 127.0.0.1:3002
2022-11-17 11:47:39 +08:00
Yingwen
55f18b5a0b refactor: Rename table-engine to mito (#539)
* refactor: Rename table-engine to mito

* style: Format codes

* docs: Update mito engine comment

* docs: Explain what is mito in README
2022-11-16 18:19:29 +08:00
Lei, Huang
7b43f027f9 fix: respect node id and metasrv addr in config file (#542)
* fix: respect node id and metasrv addr in config file

* fix: fmt

* fix: unit test
2022-11-16 18:16:11 +08:00
Ruihang Xia
08cc775d7c chore: remove clean disk job (#543)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-16 18:07:17 +08:00
fys
5e42eb5ec6 fix: field number of proto (#541) 2022-11-16 17:41:34 +08:00
Ruihang Xia
5979dcfc17 chore: remove issue title prefix from template (#533)
* chore: remove issue title prefix from template

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change feature request's label name

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-16 15:46:52 +08:00
LFC
872ac8058f feat: distributed execute gRPC and Prometheus query in Frontend (#520)
* feat: distributed execute GRPC and Prometheus query in Frontend

* feat: distributed execute GRPC and Prometheus query in Frontend

* Apply suggestions from code review

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>

* feat: distributed execute GRPC and Prometheus query in Frontend

* fix: do not convert timestamp to string when converting logical plan to SQL

* fix: tests

* refactor: no mock

* refactor: 0.0.0.0 -> 127.0.0.1

* refactor: 0.0.0.0 -> 127.0.0.1

* refactor: 0.0.0.0 -> 127.0.0.1

Co-authored-by: luofucong <luofucong@greptime.com>
Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
2022-11-16 14:59:48 +08:00
xiaomin tang
ce11a64fe2 docs: move Docs section under Resources (#530) 2022-11-16 12:05:15 +08:00
SSebo
29ad16d048 chore: fix typo (#524) 2022-11-16 11:53:25 +08:00
Ning Sun
173a8f67a1 test: ignore empty s3 bucket (#529) 2022-11-16 11:35:12 +08:00
xiaomin tang
e823cde6ff fix: task list syntax error in pull_request_template (#528) 2022-11-15 23:53:16 +08:00
Ruihang Xia
eeacfe9f73 fix: move ISSUE_TEMPLATE into .github dir (#525)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 23:34:13 +08:00
xiaomin tang
43c4189a8e chore: add issue&pr template (#523)
* chore: add pull request template

* chore: add issue template

* chore: apply suggestions from code review

Co-authored-by: Ning Sun <sunng@protonmail.com>

Co-authored-by: Ning Sun <sunng@protonmail.com>
2022-11-15 23:06:22 +08:00
Yingwen
57979c9d3d docs: Add acknowledgment to README (#522)
* docs: Add acknowledgment to README

* docs: Address review comment
2022-11-15 19:06:17 +08:00
Ning Sun
e6768a3dd3 docs: correct link to docs again (#521) 2022-11-15 18:26:14 +08:00
Yingwen
e073fea443 ci: Ignore some files (#519) 2022-11-15 18:22:22 +08:00
Ruihang Xia
7ba512980a chore: add APACHE-2.0 license header (#518)
* feat: add license checker workflow

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix existing header

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* specify license for internal sub-crate

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix rustfmt

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 18:05:46 +08:00
zyy17
b93c084666 Update install.sh (#517) 2022-11-15 17:52:43 +08:00
dennis zhuang
6c6eeda429 refactor: options and sample configurations (#514)
* refactor: options and sample configurations

* chore: newline at end of file

* chore: format code

* chore: remove comment and set sample configurations to default values

* chore: use single quoted string in sample configuration files
2022-11-15 17:39:22 +08:00
dennis zhuang
ba27e0d058 chore: remove component temporally (#516) 2022-11-15 17:37:46 +08:00
Jiachun Feng
cabb55322b fix: meta minor fix (#513)
* chore: fix metaclient example

* chore: initial sequence value
2022-11-15 16:38:05 +08:00
Ning Sun
b34f26ee07 docs: fix docs site link in readme (#512) 2022-11-15 16:37:52 +08:00
Ruihang Xia
1565c8d236 chore: specify import style in rustfmt (#460)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 15:58:54 +08:00
sarahlau0415
ecb2d7692f docs: Add guidelines, issue process, community (#432)
* docs: Add guidelines, issue process, community

* Update CONTRIBUTING.md

Co-authored-by: Ning Sun <sunng@protonmail.com>

* Update CONTRIBUTING .md

add missing links, grammar check

* Apply suggestions from code review

* docs: apply suggests from code review

Co-authored-by: Ning Sun <sunng@protonmail.com>
Co-authored-by: xiaomin tang <xtang@users.noreply.github.com>
2022-11-15 15:20:08 +08:00
greenapril
acd8970f15 docs: fix spelling grammar and provide new suggs (#494)
* doc: fix spelling, minor grammar mistakes

also provided alternatives for "with transparent experience from users' perspective"
alternatives: 
1. provide users with transparency
2. provide a transparent experience for all users
3. transparent to users from all perspectives

* docs: apply suggestions from code review

Co-authored-by: xiaomin tang <xtang@users.noreply.github.com>
2022-11-15 15:10:03 +08:00
dennis zhuang
102e512a0a feat: enable freeze-stdlib feature in rust-python (#511) 2022-11-15 15:06:58 +08:00
Jiachun Feng
a0144ffa61 fix: leader checker (#510)
* fix: leader checker bug

* chore: rm  of test_dist_table_scan
2022-11-15 14:52:47 +08:00
Lei, Huang
934c18b914 feat: dist create database (#495)
* feat: create database in distribute mode

* rebase develop

Co-authored-by: luofucong <luofucong@greptime.com>
2022-11-15 14:52:35 +08:00
LFC
2c0d2da5a7 feat: Frontend show tables and databases (#504)
* feat: Frontend show tables and databases

Co-authored-by: luofucong <luofucong@greptime.com>
2022-11-15 14:21:50 +08:00
dennis zhuang
6e93c5e1de fix: make scripts API work again (#507) 2022-11-15 11:39:53 +08:00
Ruihang Xia
a88c649088 fix: force set gRPC create request's table ID to None (#502)
* fix: force set gRPC create request's table ID to None

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: fix style

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 11:17:42 +08:00
Lei, Huang
deb7d5fc2c fix: opentsdb/influxdb tags are not put to primary key indices (#506) 2022-11-15 11:06:51 +08:00
Jiachun Feng
3f12f5443d feat: meta election (#492)
* feat: meta election

* feat: election by etcd

* chore: redirect on re-election

* chore: by cr

* chore: by cr

* chore: by cr

* chore: rename CI
2022-11-15 11:04:15 +08:00
Ruihang Xia
a7d311e480 chore: enlarge CI and disable test job (#503)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 10:53:07 +08:00
Ning Sun
57304ec091 docs: remove database creation (#500)
* docs: remove database creation

* docs: add project status
2022-11-15 08:22:29 +08:00
dennis zhuang
448e8f139e fix: table and database conflicts (#491)
* fix: table conflicts in different database, #483

* feat: support db query param in prometheus remoting read/write

* feat: support db query param in influxdb line protocol

* fix: make schema_name work in gRPC

* fix: table data path

* fix: table manifest dir

* feat: adds opendal logging layer to object store

* Update src/frontend/src/instance.rs

Co-authored-by: LFC <bayinamine@gmail.com>

* Update src/frontend/src/instance.rs

Co-authored-by: LFC <bayinamine@gmail.com>

* Update src/servers/src/line_writer.rs

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>

* Update src/servers/src/line_writer.rs

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>

* fix: compile error

* ci: use larger runner for running coverage

* fix: address already in use in test

Co-authored-by: LFC <bayinamine@gmail.com>
Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
2022-11-14 23:16:52 +08:00
Ning Sun
76732d6506 fix: add more parameters to postgresql for python client (#493) 2022-11-14 21:55:26 +08:00
Ning Sun
74c236a308 feat: stream write for postgresql query results (#472) 2022-11-14 21:50:11 +08:00
Ning Sun
c673debc89 feat: Update Http SQL api for dashboard requirements (#474)
* feat: make sql api output a vector to support multi-statement

* feat: add execution_time_ms to http sql and script api

* fix: use u128 for execution time

* Apply suggestions from code review

Co-authored-by: Yingwen <realevenyag@gmail.com>

* fix: lint error

Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-11-14 21:40:31 +08:00
Yingwen
281eae9f44 fix: Fix filtering out rows incorrectly during dedup phase (#484)
* fix: dedup should not mark element as unneeded

It should only mark element as selected, because some column of
different rows may have same value.

* refactor: Rename dedup to find_unique

As the original `dedup` method only mark bitmap to true when it finds
the element is unique, so `find_unique` is more appropriate for its
name.

* test: Renew bitmap in test_batch_find_unique

* chore: Update comments
2022-11-14 21:40:17 +08:00
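
The fix described in the commit above (#484) can be illustrated with a small sketch. It assumes rows are sorted by key and that a row is kept when at least one key column differs from the previous row. The function names, the `bool` slice standing in for the bitmap, and the choice to always select the first row are illustrative assumptions, not the storage crate's actual API.

```rust
/// Marks `selected[i]` when `column[i]` differs from `column[i - 1]`.
/// It never clears a bit: a row whose value repeats in this column may
/// still be unique because of another column.
fn find_unique<T: PartialEq>(column: &[T], selected: &mut [bool]) {
    assert_eq!(column.len(), selected.len());
    if column.is_empty() {
        return;
    }
    // Assumption for this sketch: the first row has no predecessor, so keep it.
    selected[0] = true;
    for i in 1..column.len() {
        if column[i] != column[i - 1] {
            selected[i] = true;
        }
    }
}

/// Combines two key columns: a row is selected if either column changed.
fn unique_rows(hosts: &[&str], timestamps: &[i64]) -> Vec<bool> {
    let mut selected = vec![false; hosts.len()];
    find_unique(hosts, &mut selected);
    find_unique(timestamps, &mut selected);
    selected
}
```

With hosts `["a", "a", "b", "b"]` and timestamps `[1, 1, 1, 2]`, only the second row duplicates its predecessor. A variant that also wrote `false` on equal values would wrongly drop the third row, because its timestamp matches the previous row even though its host differs.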
Ning Sun
fdae67b43e docs: Simplify code in readme (#488)
* docs: simplify readme

* docs: update content

* docs: add start docker section

* docs: add c/c++ toolchain description

* docs: minor tweak

* docs: minor tweak again

* docs: address review comments
2022-11-14 21:18:23 +08:00
Ruihang Xia
ab9b1a91d4 chore: turn-off codecov's patch comment (#498)
* chore: turn-off codecov's patch comment

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: fix style

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-14 21:18:14 +08:00
Lei, Huang
4e7efbbe7e fix: insert batch missing semantic type (#499) 2022-11-14 21:18:01 +08:00
Yingwen
508f4cdfd0 fix: Fix test_insert_and_select hangs occasionally (#496)
* fix: Also handles admin request in another runtime

* chore: Describe why executes admin request in another runtime

* test: Enable test_insert_and_select
2022-11-14 21:11:25 +08:00
dennis zhuang
68b299e04a fix: apply recovered metadata after last WAL entry (#461)
* fix: apply recovered metadata after last WAL entry

* fix: condition error
2022-11-14 20:43:47 +08:00
Lei, Huang
c90832ea6c feat: distribute mode support auto create table (#489) 2022-11-14 19:53:35 +08:00
LFC
d10e45f4aa feat: distributed query in Frontend (#486)
* feat: distributed query in Frontend

Co-authored-by: luofucong <luofucong@greptime.com>
2022-11-14 18:15:49 +08:00
shuiyisong
dcd5e34dbd feat: generating context in http middleware & mysql auth method (#453) 2022-11-14 17:24:11 +08:00
xiaomin tang
7e49493e34 docs: add more sections to readme (#478)
* docs: add badges & logo

* docs: add What_is_GreptimeDB section

* docs: add Community&Documentation&License section

* docs: simplify name of CI badge
2022-11-14 16:46:09 +08:00
LFC
e7b4a00ef0 feat: create distributed table in Frontend (#475)
* feat: create distributed table in Frontend

* fix: some table creation issues (#482)

Co-authored-by: luofucong <luofucong@greptime.com>
Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
2022-11-14 15:49:25 +08:00
Yingwen
ef12bb7f24 ci: Fix codecov.yml syntax (#464) 2022-11-14 14:21:09 +08:00
Lei, Huang
70442f6810 feat: add mysql protocol handler back to datanode for debugging (#479) 2022-11-14 13:15:44 +08:00
Lei, Huang
fae331d2ba feat: Move create table logic to frontend (#455)
* refactor: dependency, from frontend depends on datanode to datanode depends on frontend

* wip: start frontend in datanode

* wip: migrate create database to frontend

* wip: impl alter table

* fix: CR comments

* feat: add table id and region ids field to CreateExpr

* chore: rebase develop

* refactor: frontend catalog should set from datanode

* feat: gRPC AddColumn request support add multi columns

* wip: move create table and create-on-insertion to frontend

* wip: error handling

* fix: some unit tests

* fix: all unit tests

* chore: merge develop

* feat: add create/alter-on-insertion to dist_insert/sql_dist_insert

* fix: add region number/catalog/schema to InsertExpr

* feat: add handle_create_table/handle_create_database...

* fix: remove catalog from insert expr

* fix: CR comments

* fix: when running in standalone mode, mysql opts and postgres opts should pass to frontend so that the actually running service can change the port to listen on

* refactor: add a standalone subcommand, move frontend start stuff to cmd package

* chore: optimize create table failure logs

* docs: change readme

* docs: update readme
2022-11-14 10:54:35 +08:00
fys
488eabce4a feat: support standalone and distributed insert in frontend (#473)
* feat: support standalone and distributed insert in frontend

* cr
2022-11-13 11:57:23 +08:00
Lei, Huang
2d869e1e43 refactor: datanode starts frontend (#471)
* refactor: dependency, from frontend depends on datanode to datanode depends on frontend

* wip: start frontend in datanode

* wip: migrate create database to frontend

* wip: impl alter table

* fix: CR comments
2022-11-12 21:07:18 +08:00
Ning Sun
0d4c191a06 fix: improve postgresql protocol implementation and fix time/date format (#452)
* feat: add server_version as postgresql jdbc connector requires

* refactor: do not require password at the moment

* fix: correct datetime output as required by postgresql

* docs: corrected timestamp on our readme

* refactor: simplify import

* fix: address review issues
2022-11-11 21:28:28 +08:00
zyy17
1d78f8db1f ci: use larger runner in release building (#467) 2022-11-11 19:04:04 +08:00
LFC
f375e18a76 feat: table route cache (#462)
* feat: table route cache

Co-authored-by: luofucong <luofucong@greptime.com>
2022-11-11 18:54:56 +08:00
Ruihang Xia
e30879f638 feat: Remove memtable's time bucket (#442)
* refactor: partially replace MemtableSet with Memtable

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove MemtableWithMeta and MemtableSet in non-test mod

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove dead code

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* make test compile 🤣

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix broken tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* make all tests pass

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippys

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove redundant clone

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update comment

Co-authored-by: Yingwen <realevenyag@gmail.com>

* resolve review comment

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-11-11 18:02:34 +08:00
dennis zhuang
74ea529d1a feat: move time index metadata from schema into field (#444)
* feat: move time index metadata from schema into field

* chore: remove useless code

* test: test select with column alias

* fix: conflicts with develop branch

* test: add test

* test: order by timestamp to ensure query results order

* fix: comment
2022-11-11 15:36:27 +08:00
dennis zhuang
e7b4d24df5 feat: create database (#451)
* feat: parsing create database statement

* feat: impl create database in datanode

* feat: supports insert into catalog.schema.table

* fix: conflicts with develop branch

* test: create database then insert and query

* fix: grpc schema provider

* feat: use CatalogManager::register_schema instead of CatalogProvide::register_schema

* refactor: revert InsertExpr catalog_name and schema_name

* fix: revert database.proto

* fix: revert client cargo

* feat: accepts schema.table as table name in sql

Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
2022-11-11 14:15:38 +08:00
Yingwen
d5ae5e6afa fix: Ignore test_insert_and_select (#459)
It sometimes hangs in CI
2022-11-11 12:02:29 +08:00
685 changed files with 39636 additions and 22119 deletions

4
.env.example Normal file

@@ -0,0 +1,4 @@
# Settings for s3 test
GT_S3_BUCKET=S3 bucket
GT_S3_ACCESS_KEY_ID=S3 access key id
GT_S3_ACCESS_KEY=S3 secret access key
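
As a rough illustration of how a test could consume these settings, the sketch below reads the three variables and reports no configuration when any of them is missing or blank, so an S3 test can be skipped (compare the "ignore empty s3 bucket" commit in the log). `S3TestConfig` and both functions are hypothetical names, and the sketch uses only `std::env` rather than the dotenv loading mentioned in the commit log.

```rust
use std::env;

/// Hypothetical holder for the three settings listed in `.env.example`.
#[derive(Debug, PartialEq)]
struct S3TestConfig {
    bucket: String,
    access_key_id: String,
    secret_access_key: String,
}

/// Builds the config from any key lookup, returning `None` when a value
/// is missing or blank so the caller can skip the S3 test.
fn s3_test_config_from(lookup: impl Fn(&str) -> Option<String>) -> Option<S3TestConfig> {
    let non_empty = |key: &str| lookup(key).filter(|value| !value.trim().is_empty());
    Some(S3TestConfig {
        bucket: non_empty("GT_S3_BUCKET")?,
        access_key_id: non_empty("GT_S3_ACCESS_KEY_ID")?,
        secret_access_key: non_empty("GT_S3_ACCESS_KEY")?,
    })
}

/// Reads the settings from the process environment.
fn s3_test_config() -> Option<S3TestConfig> {
    s3_test_config_from(|key| env::var(key).ok())
}
```

Taking the lookup as a parameter keeps the blank-value check testable without mutating the process environment.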

86
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file

@@ -0,0 +1,86 @@
---
name: Bug report
description: Is something not working? Help us fix it!
labels: [ "bug" ]
body:
- type: markdown
attributes:
value: |
Take some time to fill out this bug report. Thank you!
- type: dropdown
id: type
attributes:
label: What type of bug is this?
multiple: true
options:
- Configuration
- Crash
- Data corruption
- Incorrect result
- Locking issue
- Performance issue
- Unexpected error
- Other
validations:
required: true
- type: dropdown
id: subsystem
attributes:
label: What subsystems are affected?
description: You can pick multiple subsystems.
multiple: true
options:
- Standalone mode
- Frontend
- Datanode
- Meta
- Other
validations:
required: true
- type: textarea
id: what-happened
attributes:
label: What happened?
description: |
Tell us what happened and also what you would have expected to
happen instead.
placeholder: "Describe the bug"
validations:
required: true
- type: input
id: os
attributes:
label: What operating system did you use?
description: |
Please provide OS, version, and architecture. For example:
Windows 10 x64, Ubuntu 21.04 x64, Mac OS X 10.5 ARM, Raspberry
Pi i386, etc.
placeholder: "Ubuntu 21.04 x64"
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant log output and stack trace
description: |
Please copy and paste any relevant log output or a stack
trace. This will be automatically formatted into code, so no
need for backticks.
render: bash
- type: textarea
id: reproduce
attributes:
label: How can we reproduce the bug?
description: |
Please walk us through and provide steps and details on how
to reproduce the issue. If possible, provide scripts that we
can run to trigger the bug.
render: bash
validations:
required: true

8
.github/ISSUE_TEMPLATE/config.yml vendored Normal file

@@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
- name: Greptime Community Slack
url: https://greptime.com/slack
about: Get free help from the Greptime community
- name: Greptime Community Discussion
url: https://github.com/greptimeTeam/greptimedb/discussions
about: Get free help from the Greptime community

39
.github/ISSUE_TEMPLATE/enhancement.yml vendored Normal file

@@ -0,0 +1,39 @@
---
name: Enhancement
description: Suggest an enhancement to existing functionality
labels: [ "enhancement" ]
body:
- type: dropdown
id: type
attributes:
label: What type of enhancement is this?
multiple: true
options:
- API improvement
- Configuration
- Performance
- Refactor
- Tech debt reduction
- User experience
- Other
validations:
required: true
- type: textarea
id: what
attributes:
label: What does the enhancement do?
description: |
Give a high-level overview of how you
suggest improving an existing feature or functionality.
validations:
required: true
- type: textarea
id: implementation
attributes:
label: Implementation challenges
description: |
Share any ideas of how to implement the enhancement.
validations:
required: false

View File

@@ -0,0 +1,42 @@
---
name: Feature request
description: Suggest a new feature for GreptimeDB
labels: [ "feature request" ]
body:
- type: markdown
id: info
attributes:
value: |
Only use this template to suggest a new feature that doesn't already exist in GreptimeDB.
For enhancements to existing features, use the "Enhancement" issue template. For bugs,
use the bug report template.
- type: textarea
id: what
attributes:
label: What problem does the new feature solve?
description: |
Describe the problem and why it is important to solve. Did you consider alternative
solutions, perhaps outside the database? Why is it better to add the feature to
GreptimeDB?
validations:
required: true
- type: textarea
id: how
attributes:
label: What does the feature do?
description: |
Give a high-level overview of what the feature does and how it would work.
validations:
required: true
- type: textarea
id: implementation
attributes:
label: Implementation challenges
description: |
If you have ideas of how to implement the feature, and any particularly
challenging issues to overcome, then provide them here.
validations:
required: false

19
.github/pull_request_template.md vendored Normal file

@@ -0,0 +1,19 @@
I hereby agree to the terms of the [GreptimeDB CLA](https://gist.github.com/xtang/6378857777706e568c1949c7578592cc)
## What's changed and what's your intention?
_PLEASE DO NOT LEAVE THIS EMPTY !!!_
Please explain IN DETAIL what the changes are in this PR and why they are needed:
- Summarize your change (**mandatory**)
- How does this PR work? Need a brief introduction for the changed logic (optional)
- Describe clearly one logical change and avoid lazy messages (optional)
- Describe any limitations of the current code (optional)
## Checklist
- [ ] I have written the necessary rustdoc comments.
- [ ] I have added the necessary unit tests and integration tests.
## Refer to a related PR or issue link (optional)

View File

@@ -1,25 +1,44 @@
on:
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
paths-ignore:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
push:
branches:
- "main"
- "develop"
paths-ignore:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
workflow_dispatch:
name: Code coverage
env:
-RUST_TOOLCHAIN: nightly-2022-07-14
+RUST_TOOLCHAIN: nightly-2022-12-20
jobs:
coverage:
if: github.event.pull_request.draft == false
-runs-on: ubuntu-latest
+runs-on: ubuntu-latest-8-cores
timeout-minutes: 60
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: KyleMayes/install-llvm-action@v1
with:
version: "14.0"
- name: Install toolchain
uses: dtolnay/rust-toolchain@master
with:
@@ -27,10 +46,6 @@ jobs:
components: llvm-tools-preview
- name: Rust Cache
uses: Swatinem/rust-cache@v2
- name: Cleanup disk
uses: curoky/cleanup-disk-action@v2.0
with:
retain: 'rust'
- name: Install latest nextest release
uses: taiki-e/install-action@nextest
- name: Install cargo-llvm-cov
@@ -38,6 +53,7 @@ jobs:
- name: Collect coverage data
run: cargo llvm-cov nextest --workspace --lcov --output-path lcov.info
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=lld"
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}

View File

@@ -1,6 +1,12 @@
on:
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
paths-ignore:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
push:
branches:
- develop
@@ -8,19 +14,25 @@ on:
paths-ignore:
- 'docs/**'
- 'config/**'
- '.github/**'
- '**.md'
- '**.yml'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
workflow_dispatch:
-name: Continuous integration for developing
+name: CI
env:
-RUST_TOOLCHAIN: nightly-2022-07-14
+RUST_TOOLCHAIN: nightly-2022-12-20
jobs:
typos:
name: Spell Check with Typos
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: crate-ci/typos@v1.0.4
check:
name: Check
if: github.event.pull_request.draft == false
@@ -29,6 +41,8 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
@@ -37,44 +51,64 @@ jobs:
- name: Run cargo check
run: cargo check --workspace --all-targets
test:
name: Test Suite
toml:
name: Toml Check
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v3
- name: Cache LLVM and Clang
id: cache-llvm
uses: actions/cache@v3
with:
path: ./llvm
key: llvm
- uses: arduino/setup-protoc@v1
- uses: KyleMayes/install-llvm-action@v1
with:
version: "14.0"
cached: ${{ steps.cache-llvm.outputs.cache-hit }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
- name: Rust Cache
uses: Swatinem/rust-cache@v2
- name: Cleanup disk
uses: curoky/cleanup-disk-action@v2.0
with:
retain: 'rust,llvm'
- name: Install latest nextest release
uses: taiki-e/install-action@nextest
- name: Run tests
run: cargo nextest run
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=lld"
RUST_BACKTRACE: 1
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
UNITTEST_LOG_DIR: "__unittest_logs"
- name: Install taplo
run: cargo install taplo-cli --version ^0.8 --locked
- name: Run taplo
run: taplo format --check --option "indent_string= "
# Use coverage to run test.
# test:
# name: Test Suite
# if: github.event.pull_request.draft == false
# runs-on: ubuntu-latest
# timeout-minutes: 60
# steps:
# - uses: actions/checkout@v3
# - name: Cache LLVM and Clang
# id: cache-llvm
# uses: actions/cache@v3
# with:
# path: ./llvm
# key: llvm
# - uses: arduino/setup-protoc@v1
# with:
# repo-token: ${{ secrets.GITHUB_TOKEN }}
# - uses: KyleMayes/install-llvm-action@v1
# with:
# version: "14.0"
# cached: ${{ steps.cache-llvm.outputs.cache-hit }}
# - uses: dtolnay/rust-toolchain@master
# with:
# toolchain: ${{ env.RUST_TOOLCHAIN }}
# - name: Rust Cache
# uses: Swatinem/rust-cache@v2
# - name: Cleanup disk
# uses: curoky/cleanup-disk-action@v2.0
# with:
# retain: 'rust,llvm'
# - name: Install latest nextest release
# uses: taiki-e/install-action@nextest
# - name: Run tests
# run: cargo nextest run
# env:
# CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=lld"
# RUST_BACKTRACE: 1
# GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
# GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
# GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
# UNITTEST_LOG_DIR: "__unittest_logs"
fmt:
name: Rustfmt
@@ -84,6 +118,8 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
@@ -101,6 +137,8 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}

25
.github/workflows/doc-issue.yml vendored Normal file

@@ -0,0 +1,25 @@
name: Create Issue in docs repo on doc related changes
on:
issues:
types:
- labeled
pull_request_target:
types:
- labeled
jobs:
doc_issue:
if: github.event.label.name == 'doc update required'
runs-on: ubuntu-latest
steps:
- name: create an issue in doc repo
uses: dacbd/create-issue-action@main
with:
owner: GreptimeTeam
repo: docs
token: ${{ secrets.DOCS_REPO_TOKEN }}
title: Update docs for ${{ github.event.issue.title || github.event.pull_request.title }}
body: |
A document change request is generated from
${{ github.event.issue.html_url || github.event.pull_request.html_url }}

16
.github/workflows/license.yaml vendored Normal file

@@ -0,0 +1,16 @@
name: License checker
on:
push:
branches:
- develop
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
jobs:
license-header-check:
runs-on: ubuntu-latest
name: license-header-check
steps:
- uses: actions/checkout@v2
- name: Check License Header
uses: apache/skywalking-eyes/header@main

View File

@@ -2,12 +2,21 @@ on:
push:
tags:
- "v*.*.*"
schedule:
# At 00:00 on Monday.
- cron: '0 0 * * 1'
workflow_dispatch:
name: Release
env:
RUST_TOOLCHAIN: nightly-2022-07-14
RUST_TOOLCHAIN: nightly-2022-12-20
# FIXME(zyy17): Would be better to use `gh release list -L 1 | cut -f 3` to get the latest release version tag, but for a long time, we will stay at 'v0.1.0-alpha-*'.
SCHEDULED_BUILD_VERSION_PREFIX: v0.1.0-alpha
# In the future, we can change SCHEDULED_PERIOD to nightly.
SCHEDULED_PERIOD: weekly
jobs:
build:
@@ -17,10 +26,10 @@ jobs:
# The file format is greptime-<os>-<arch>
include:
- arch: x86_64-unknown-linux-gnu
os: ubuntu-latest
os: ubuntu-latest-16-cores
file: greptime-linux-amd64
- arch: aarch64-unknown-linux-gnu
os: ubuntu-latest
os: ubuntu-latest-16-cores
file: greptime-linux-arm64
- arch: aarch64-apple-darwin
os: macos-latest
@@ -106,8 +115,32 @@ jobs:
- name: Download artifacts
uses: actions/download-artifact@v3
- name: Configure scheduled build version # the version would be ${SCHEDULED_BUILD_VERSION_PREFIX}-YYYYMMDD-${SCHEDULED_PERIOD}, like v0.1.0-alpha-20221119-weekly.
shell: bash
if: github.event_name == 'schedule'
run: |
buildTime=`date "+%Y%m%d"`
SCHEDULED_BUILD_VERSION=${{ env.SCHEDULED_BUILD_VERSION_PREFIX }}-$buildTime-${{ env.SCHEDULED_PERIOD }}
echo "SCHEDULED_BUILD_VERSION=${SCHEDULED_BUILD_VERSION}" >> $GITHUB_ENV
- name: Create scheduled build git tag
if: github.event_name == 'schedule'
run: |
git tag ${{ env.SCHEDULED_BUILD_VERSION }}
- name: Publish scheduled release # configure the different release title and tags.
uses: softprops/action-gh-release@v1
if: github.event_name == 'schedule'
with:
name: "Release ${{ env.SCHEDULED_BUILD_VERSION }}"
tag_name: ${{ env.SCHEDULED_BUILD_VERSION }}
generate_release_notes: true
files: |
**/greptime-*
- name: Publish release
uses: softprops/action-gh-release@v1
if: github.event_name != 'schedule'
with:
name: "Release ${{ github.ref_name }}"
files: |
@@ -145,12 +178,12 @@ jobs:
tar xvf greptime-linux-arm64.tgz
rm greptime-linux-arm64.tgz
- name: Login to GitHub Container Registry
- name: Login to UCloud Container Registry
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
registry: uhub.service.ucloud.cn
username: ${{ secrets.UCLOUD_USERNAME }}
password: ${{ secrets.UCLOUD_PASSWORD }}
- name: Login to Dockerhub
uses: docker/login-action@v2
@@ -158,11 +191,20 @@ jobs:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Configure scheduled build image tag # the tag would be ${SCHEDULED_BUILD_VERSION_PREFIX}-YYYYMMDD-${SCHEDULED_PERIOD}
shell: bash
if: github.event_name == 'schedule'
run: |
buildTime=`date "+%Y%m%d"`
SCHEDULED_BUILD_VERSION=${{ env.SCHEDULED_BUILD_VERSION_PREFIX }}-$buildTime-${{ env.SCHEDULED_PERIOD }}
echo "IMAGE_TAG=${SCHEDULED_BUILD_VERSION:1}" >> $GITHUB_ENV
- name: Configure tag # If the release tag is v0.1.0, then the image version tag will be 0.1.0.
shell: bash
if: github.event_name != 'schedule'
run: |
VERSION=${{ github.ref_name }}
echo "VERSION=${VERSION:1}" >> $GITHUB_ENV
echo "IMAGE_TAG=${VERSION:1}" >> $GITHUB_ENV
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
@@ -179,5 +221,6 @@ jobs:
platforms: linux/amd64,linux/arm64
tags: |
greptime/greptimedb:latest
greptime/greptimedb:${{ env.VERSION }}
ghcr.io/greptimeteam/greptimedb:${{ env.VERSION }}
greptime/greptimedb:${{ env.IMAGE_TAG }}
uhub.service.ucloud.cn/greptime/greptimedb:latest
uhub.service.ucloud.cn/greptime/greptimedb:${{ env.IMAGE_TAG }}
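
Note on the release workflow hunks above: the scheduled-build steps assemble the version as `<prefix>-YYYYMMDD-<period>` and then drop the leading `v` to get the image tag. A minimal sketch of that derivation follows. The function names are hypothetical (the workflow inlines this logic in `run:` steps), and the optional date argument exists only to make the example reproducible.

```shell
#!/usr/bin/env bash
# Sketch of the tag derivation used by the scheduled release steps.
# Function names are illustrative; they do not exist in the workflow.

# Build "<prefix>-<YYYYMMDD>-<period>"; the date defaults to today.
scheduled_build_version() {
  local prefix="$1" period="$2" build_date="${3:-$(date "+%Y%m%d")}"
  echo "${prefix}-${build_date}-${period}"
}

# Strip the first character (the leading "v") with bash substring expansion,
# the same `${VAR:1}` form the workflow uses.
image_tag() {
  local version="$1"
  echo "${version:1}"
}

scheduled_build_version v0.1.0-alpha weekly 20221119
image_tag v0.1.0
```

The `${VAR:1}` expansion is a bash feature, which is presumably why those steps set `shell: bash` explicitly.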

4
.gitignore vendored

@@ -18,6 +18,7 @@ debug/
# JetBrains IDE config directory
.idea/
*.iml
# VSCode IDE config directory
.vscode/
@@ -31,3 +32,6 @@ logs/
# Benchmark dataset
benchmarks/data
# dotenv
.env

14
.licenserc.yaml Normal file

@@ -0,0 +1,14 @@
header:
license:
spdx-id: Apache-2.0
copyright-owner: Greptime Team
paths:
- "**/*.rs"
- "**/*.py"
comment: on-failure
dependency:
files:
- Cargo.toml

View File

@@ -9,7 +9,7 @@ repos:
rev: e6a795bc6b2c0958f9ef52af4863bbd7cc17238f
hooks:
- id: cargo-sort
args: ["--workspace", "--print"]
args: ["--workspace"]
- repo: https://github.com/doublify/pre-commit-rust
rev: v1.0

View File

@@ -1,12 +1,55 @@
# Contributing to GreptimeDB
# Welcome!
Much appreciate for your interest in contributing to GreptimeDB! This document list some guidelines for contributing to our code base.
Thanks a lot for considering contributing to GreptimeDB. We believe people like you would make GreptimeDB a great product. We intend to build a community where individuals can have open talks, show respect for one another, and speak with true ❤️. Meanwhile, we are to keep transparency and make your effort count here.
To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
Read the guidelines, and they can help you get started. Communicate with respect to developers maintaining and developing the project. In return, they should reciprocate that respect by addressing your issue, reviewing changes, as well as helping finalize and merge your pull requests.
## Pull Requests
Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get the whole picture of the project. To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
## Your First Contribution
It can feel intimidating to contribute to a complex project, but it can also be exciting and fun. These general notes will help everyone participate in this communal activity.
- Follow the [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/develop/CODE_OF_CONDUCT.md)
- Small changes make huge differences. We will happily accept a PR making a single character change if it helps move forward. Don't wait to have everything working.
- Check the closed issues before opening your issue.
- Try to follow the existing style of the code.
- More importantly, when in doubt, ask away.
Pull requests are great, but we accept all kinds of other help if you like. Such as
- Write tutorials or blog posts. Blog, speak about, or create tutorials about one of GreptimeDB's many features. Mention [@greptime](https://twitter.com/greptime) on Twitter and email info@greptime.com so we can give pointers and tips and help you spread the word by promoting your content on Greptime communication channels.
- Improve the documentation. [Submit documentation](http://github.com/greptimeTeam/docs/) updates, enhancements, designs, or bug fixes, and fixing any spelling or grammar errors will be very much appreciated.
- Present at meetups and conferences about your GreptimeDB projects. Your unique challenges and successes in building things with GreptimeDB can provide great speaking material. We'd love to review your talk abstract, so get in touch with us if you'd like some help!
- Submit bug reports. To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- Speak up with feature requests. Sending feedback is a great way for us to better understand your use cases for GreptimeDB. If you want to share your experience with GreptimeDB, or if you want to discuss any ideas, you can start a discussion on [GitHub discussions](https://github.com/GreptimeTeam/greptimedb/discussions), chat with the Greptime team on [Slack](https://greptime.com/slack), or you can tweet [@greptime](https://twitter.com/greptime) on Twitter.
## Code of Conduct
Also, there are things that we are not looking for because they don't match the goals of the product or benefit the community. Please read [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/develop/CODE_OF_CONDUCT.md); we hope everyone can keep good manners and become an honored member.
## License
GreptimeDB uses the [Apache 2.0 license](https://github.com/GreptimeTeam/greptimedb/blob/master/LICENSE) to strike a balance between open contributions and allowing you to use the software however you want.
## Getting Started
### Submitting Issues
- Check if an issue already exists. Before filing an issue report, see whether it's already covered. Use the search bar and check out existing issues.
- File an issue:
- To report a bug, a security issue, or anything that you think is a problem and that isn't under the radar, go ahead and [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- In the given templates, look for the one that suits you.
- If you bump into anything, reach out to our [Slack](https://greptime.com/slack) for a wider audience and ask for help.
- What happens after:
- Once we spot a new issue, we identify and categorize it as soon as possible.
- Usually, it gets assigned to other developers. Follow up and see what folks are talking about and how they take care of it.
- Please be patient and offer as much information as you can to help reach a solution or a consensus. You are not alone and embrace team power.
### Before PR
- To ensure that community is free and confident in its ability to use your contributions, please sign the Contributor License Agreement (CLA) which will be incorporated in the pull request process.
- Make sure all your codes are formatted and follow the [coding style](https://pingcap.github.io/style-guide/rust/).
- Make sure all unit tests are passed.
- Make sure all clippy warnings are fixed (you can check it locally by running `cargo clippy --workspace --all-targets -- -D warnings -D clippy::print_stdout -D clippy::print_stderr`).
@@ -38,19 +81,31 @@ now `pre-commit` will run automatically on `git commit`.
### Title
The titles of pull requests should be prefixed with one of the change types listed in [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0)
like `feat`/`fix`/`docs`, with a concise summary of code change follows. The following scope field is optional, you can fill it with the name of sub-crate if the pull request only changes one, or just leave it blank.
The titles of pull requests should be prefixed with category names listed in [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0)
like `feat`/`fix`/`docs`, with a concise summary of code change following. DO NOT use last commit message as pull request title.
### Description
- If your pull request is small, like a typo fix, feel free to go brief.
- Feel free to go brief if your pull request is small, like a typo fix.
- But if it contains large code change, make sure to state the motivation/design details of this PR so that reviewers can understand what you're trying to do.
- If the PR contains any breaking change or API change, make sure that is clearly listed in your description.
## Getting help
### Commit Messages
All commit messages SHOULD adhere to the [Conventional Commits specification](https://conventionalcommits.org/).
## Getting Help
There are many ways to get help when you're stuck. It is recommended to ask for help by opening an issue, with a detailed description
of what you were trying to do and what went wrong. You can also reach for help in our Slack channel.
of what you were trying to do and what went wrong. You can also reach for help in our [Slack channel](https://greptime.com/slack).
## Bug report
To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
## Community
The core team will be thrilled if you participate in any way you like. When you are stuck, try asking for help by filing an issue, with a detailed description of what you were trying to do and what went wrong. If you have any questions or if you would like to get involved in our community, please check out:
- [GreptimeDB Community Slack](https://greptime.com/slack)
- [GreptimeDB Github Discussions](https://github.com/GreptimeTeam/greptimedb/discussions)
Also, see some extra GreptimeDB content:
- [GreptimeDB Docs](https://greptime.com/docs)
- [Learn GreptimeDB](https://greptime.com/products/db)
- [Greptime Inc. Website](https://greptime.com)

3570
Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -11,11 +11,11 @@ members = [
"src/common/function",
"src/common/function-macro",
"src/common/grpc",
"src/common/grpc-expr",
"src/common/query",
"src/common/recordbatch",
"src/common/runtime",
"src/common/substrait",
"src/common/insert",
"src/common/telemetry",
"src/common/time",
"src/datanode",
@@ -24,16 +24,38 @@ members = [
"src/log-store",
"src/meta-client",
"src/meta-srv",
"src/mito",
"src/object-store",
"src/promql",
"src/query",
"src/script",
"src/servers",
"src/session",
"src/sql",
"src/storage",
"src/store-api",
"src/table",
"src/table-engine",
"tests-integration",
"tests/runner",
]
[workspace.package]
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[workspace.dependencies]
arrow = "29.0"
arrow-schema = { version = "29.0", features = ["serde"] }
# TODO(LFC): Use released Datafusion when it officially depends on Arrow 29.0
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-expr = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-optimizer = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-physical-expr = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-sql = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
parquet = "29.0"
sqlparser = "0.28"
[profile.release]
debug = true

67
Makefile Normal file

@@ -0,0 +1,67 @@
IMAGE_REGISTRY ?= greptimedb
IMAGE_TAG ?= latest
##@ Build
.PHONY: build
build: ## Build debug version greptime.
cargo build
.PHONY: release
release: ## Build release version greptime.
cargo build --release
.PHONY: clean
clean: ## Clean the project.
cargo clean
.PHONY: fmt
fmt: ## Format all the Rust code.
cargo fmt --all
.PHONY: docker-image
docker-image: ## Build docker image.
docker build --network host -f docker/Dockerfile -t ${IMAGE_REGISTRY}:${IMAGE_TAG} .
##@ Test
.PHONY: unit-test
unit-test: ## Run unit test.
cargo test --workspace
.PHONY: integration-test
integration-test: ## Run integration test.
cargo test integration
.PHONY: sqlness-test
sqlness-test: ## Run sqlness test.
cargo run --bin sqlness-runner
.PHONY: check
check: ## Cargo check all the targets.
cargo check --workspace --all-targets
.PHONY: clippy
clippy: ## Check clippy rules.
cargo clippy --workspace --all-targets -- -D warnings -D clippy::print_stdout -D clippy::print_stderr
.PHONY: fmt-check
fmt-check: ## Check code format.
cargo fmt --all -- --check
##@ General
# The help target prints out all targets with their descriptions organized
# beneath their categories. The categories are represented by '##@' and the
# target descriptions by '##'. The awk command is responsible for reading the
# entire set of makefiles included in this invocation, looking for lines of the
# file as xyz: ## something, and then pretty-format the target and help. Then,
# if there's a line with ##@ something, that gets pretty-printed as a category.
# More info on the usage of ANSI control characters for terminal formatting:
# https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_parameters
# More info on the awk command:
# https://linuxcommand.org/lc3_adv_awk.php
.PHONY: help
help: ## Display help messages.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-20s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
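
Note on the `help` target above: it works by using `:.*##` as the awk field separator, so `$1` is the target name and `$2` is its description. A stripped-down sketch of the same idea as a shell function follows. The name `list_targets` is hypothetical, the ANSI colours and `##@` category handling are omitted, and the pattern is written as `.*##` rather than `.*?##` to stay within POSIX awk regex syntax.

```shell
#!/usr/bin/env bash
# Sketch: list "target: ## description" lines from a Makefile.
# `list_targets` is an illustrative name, not part of the repository.
list_targets() {
  # Inside a Makefile the awk fields are written $$1/$$2; in plain shell they are $1/$2.
  awk 'BEGIN {FS = ":.*##"} /^[a-zA-Z_0-9-]+:.*##/ { printf "%-20s%s\n", $1, $2 }' "$1"
}

# Usage: run it against a small sample Makefile.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
.PHONY: build
build: ## Build debug version greptime.
	cargo build
EOF
list_targets "$sample"
rm -f "$sample"
```

Lines such as `.PHONY: build` and recipe lines are skipped because they either do not start with a target-name character or carry no `##` marker.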

233
README.md

@@ -1,101 +1,100 @@
# GreptimeDB
<p align="center">
<picture>
<source media="(prefers-color-scheme: light)" srcset="/docs/logo-text-padding.png">
<source media="(prefers-color-scheme: dark)" srcset="/docs/logo-text-padding-dark.png">
<img alt="GreptimeDB Logo" src="/docs/logo-text-padding.png" width="400px">
</picture>
</p>
[![codecov](https://codecov.io/gh/GrepTimeTeam/greptimedb/branch/develop/graph/badge.svg?token=FITFDI3J3C)](https://codecov.io/gh/GrepTimeTeam/greptimedb)
GreptimeDB: the next-generation hybrid timeseries/analytics processing database in the cloud.
<h3 align="center">
The next-generation hybrid timeseries/analytics processing database in the cloud
</h3>
## Getting Started
<p align="center">
<a href="https://codecov.io/gh/GrepTimeTeam/greptimedb"><img src="https://codecov.io/gh/GrepTimeTeam/greptimedb/branch/develop/graph/badge.svg?token=FITFDI3J3C"></img></a>
&nbsp;
<a href="https://github.com/GreptimeTeam/greptimedb/actions/workflows/develop.yml"><img src="https://github.com/GreptimeTeam/greptimedb/actions/workflows/develop.yml/badge.svg" alt="CI"></img></a>
&nbsp;
<a href="https://github.com/greptimeTeam/greptimedb/blob/develop/LICENSE"><img src="https://img.shields.io/github/license/greptimeTeam/greptimedb"></a>
</p>
### Prerequisites
<p align="center">
<a href="https://twitter.com/greptime"><img src="https://img.shields.io/badge/twitter-follow_us-1d9bf0.svg"></a>
&nbsp;
<a href="https://www.linkedin.com/company/greptime/"><img src="https://img.shields.io/badge/linkedin-connect_with_us-0a66c2.svg"></a>
</p>
To compile GreptimeDB from source, you'll need the following:
- Rust
- Protobuf
## What is GreptimeDB
#### Rust
GreptimeDB is an open-source time-series database with a special focus on
scalability, analytical capabilities and efficiency. It's designed to work on
infrastructure of the cloud era, and users benefit from its elasticity and commodity
storage.
The easiest way to install Rust is to use [`rustup`](https://rustup.rs/), which will check our `rust-toolchain` file and install correct Rust version for you.
Our core developers have been building time-series data platform
for years. Based on their best-practices, GreptimeDB is born to give you:
#### Protobuf
- A standalone binary that scales to highly-available distributed cluster, providing a transparent experience for cluster users
- Optimized columnar layout for handling time-series data; compacted, compressed, stored on various storage backends
- Flexible index options, tackling high cardinality issues down
- Distributed, parallel query execution, leveraging elastic computing resource
- Native SQL, and Python scripting for advanced analytical scenarios
- Widely adopted database protocols and APIs
- Extensible table engine architecture for extensive workloads
`protoc` is required for compiling `.proto` files. `protobuf` is available from
major package manager on macos and linux distributions. You can find an
installation instructions [here](https://grpc.io/docs/protoc-installation/).
## Quick Start
### Build the Docker Image
### Build
#### Build from Source
To compile GreptimeDB from source, you'll need:
- C/C++ Toolchain: provides basic tools for compiling and linking. This is
available as `build-essential` on ubuntu and similar name on other platforms.
- Rust: the easiest way to install Rust is to use
[`rustup`](https://rustup.rs/), which will check our `rust-toolchain` file and
install correct Rust version for you.
- Protobuf: `protoc` is required for compiling `.proto` files. `protobuf` is
available from major package managers on macOS and Linux distributions. You can
find installation instructions [here](https://grpc.io/docs/protoc-installation/).
**Note that `protoc` version needs to be >= 3.15** because we have used the `optional`
keyword. You can check it with `protoc --version`.
#### Build with Docker
A docker image with necessary dependencies is provided:
```
docker build --network host -f docker/Dockerfile -t greptimedb .
```
## Usage
### Run
### Start Datanode
Start GreptimeDB from source code, in standalone mode:
```
// Start datanode with default options.
cargo run -- datanode start
OR
// Start datanode with `http-addr` option.
cargo run -- datanode start --http-addr=0.0.0.0:9999
OR
// Start datanode with `log-dir` and `log-level` options.
cargo run -- --log-dir=logs --log-level=debug datanode start
cargo run -- standalone start
```
Start datanode with config file:
Or if you built from docker:
```
cargo run -- --log-dir=logs --log-level=debug datanode start -c ./config/datanode.example.toml
docker run -p 4002:4002 -v "$(pwd):/tmp/greptimedb" greptime/greptimedb standalone start
```
Start datanode by runing docker container:
For more startup options, greptimedb's **distributed mode** and information
about Kubernetes deployment, check our [docs](https://docs.greptime.com/).
```
docker run -p 3000:3000 \
-p 3001:3001 \
-p 3306:3306 \
greptimedb
```
### Connect
### Start Frontend
Frontend should connect to Datanode, so **Datanode must have been started** at first!
```
// Connects to local Datanode at its default GRPC port: 3001
// Start Frontend with default options.
cargo run -- frontend start
OR
// Start Frontend with `mysql-addr` option.
cargo run -- frontend start --mysql-addr=0.0.0.0:9999
OR
// Start datanode with `log-dir` and `log-level` options.
cargo run -- --log-dir=logs --log-level=debug frontend start
```
Start datanode with config file:
```
cargo run -- --log-dir=logs --log-level=debug frontend start -c ./config/frontend.example.toml
```
### SQL Operations
1. Connecting DB by [mysql client](https://dev.mysql.com/downloads/mysql/):
1. Connect to GreptimeDB via standard [MySQL
client](https://dev.mysql.com/downloads/mysql/):
```
# The datanode listen on port 3306 by default.
mysql -h 127.0.0.1 -P 3306
# The standalone instance listens on port 4002 by default.
mysql -h 127.0.0.1 -P 4002
```
2. Create table:
@@ -110,29 +109,95 @@ cargo run -- --log-dir=logs --log-level=debug frontend start -c ./config/fronten
PRIMARY KEY(host)) ENGINE=mito WITH(regions=1);
```
3. Insert data:
3. Insert some data:
```SQL
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host1', 66.6, 1024, 1660897955);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host2', 77.7, 2048, 1660897956);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host3', 88.8, 4096, 1660897957);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host1', 66.6, 1024, 1660897955000);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host2', 77.7, 2048, 1660897956000);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host3', 88.8, 4096, 1660897957000);
```
4. Query data:
4. Query the data:
```SQL
mysql> SELECT * FROM monitor;
+-------+------------+------+--------+
| host | ts | cpu | memory |
+-------+------------+------+--------+
| host1 | 1660897955 | 66.6 | 1024 |
| host2 | 1660897956 | 77.7 | 2048 |
| host3 | 1660897957 | 88.8 | 4096 |
+-------+------------+------+--------+
SELECT * FROM monitor;
```
```TEXT
+-------+---------------------+------+--------+
| host | ts | cpu | memory |
+-------+---------------------+------+--------+
| host1 | 2022-08-19 08:32:35 | 66.6 | 1024 |
| host2 | 2022-08-19 08:32:36 | 77.7 | 2048 |
| host3 | 2022-08-19 08:32:37 | 88.8 | 4096 |
+-------+---------------------+------+--------+
3 rows in set (0.01 sec)
```
You can delete your data by removing `/tmp/greptimedb`.
You can always clean up the test database by removing `/tmp/greptimedb`.
## Resources
### Installation
- [Pre-built Binaries](https://github.com/GreptimeTeam/greptimedb/releases):
downloadable pre-built binaries for Linux and MacOS
- [Docker Images](https://hub.docker.com/r/greptime/greptimedb): pre-built
Docker images
- [`gtctl`](https://github.com/GreptimeTeam/gtctl): the command-line tool for
Kubernetes deployment
### Documentation
- GreptimeDB [User Guide](https://docs.greptime.com/user-guide/concepts.html)
- GreptimeDB [Developer
Guide](https://docs.greptime.com/developer-guide/overview.html)
### SDK
- [GreptimeDB Java
Client](https://github.com/GreptimeTeam/greptimedb-client-java)
## Project Status
This project is in its early stage and under heavy development. We move fast and
break things. Benchmark on development branch may not represent its potential
performance. We release pre-built binaries constantly for functional
evaluation. Do not use it in production at the moment.
For future plans, check out [GreptimeDB roadmap](https://github.com/GreptimeTeam/greptimedb/issues/669).
## Community
Our core team is thrilled to see you participate in any way you like. When you are stuck, try to
ask for help by filing an issue with a detailed description of what you were trying to do
and what went wrong. If you have any questions or if you would like to get involved in our
community, please check out:
- GreptimeDB Community on [Slack](https://greptime.com/slack)
- GreptimeDB GitHub [Discussions](https://github.com/GreptimeTeam/greptimedb/discussions)
- Greptime official [Website](https://greptime.com)
In addition, you may:
- View our official [Blog](https://greptime.com/blogs/index)
- Connect with us on [LinkedIn](https://www.linkedin.com/company/greptime/)
- Follow us on [Twitter](https://twitter.com/greptime)
## License
GreptimeDB uses the [Apache 2.0 license][1] to strike a balance between
open contributions and allowing you to use the software however you want.
[1]: <https://github.com/greptimeTeam/greptimedb/blob/develop/LICENSE>
## Contributing
Please refer to [contribution guidelines](CONTRIBUTING.md) for more information.
## Acknowledgement
- GreptimeDB uses [Apache Arrow](https://arrow.apache.org/) as the memory model and [Apache Parquet](https://parquet.apache.org/) as the persistent file format.
- GreptimeDB's query engine is powered by [Apache Arrow DataFusion](https://github.com/apache/arrow-datafusion).
- [OpenDAL](https://github.com/datafuselabs/opendal) from [Datafuse Labs](https://github.com/datafuselabs) gives GreptimeDB a very general and elegant data access abstraction layer.
- GreptimeDB's meta service is based on [etcd](https://etcd.io/).
- GreptimeDB uses [RustPython](https://github.com/RustPython/RustPython) for experimental embedded python scripting.
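
Note on the README sample data above: the updated `INSERT` statements use 13-digit `ts` values and the sample output shows them as date-times, which suggests the column holds Unix epoch milliseconds. Under that assumption, the displayed value can be cross-checked from a shell. The helper name `ms_to_utc` is hypothetical, and the `-d @SECONDS` form requires GNU `date`.

```shell
#!/usr/bin/env bash
# Assumption: `ts` is Unix epoch milliseconds; GNU `date` is needed for `-d @SECONDS`.

# Hypothetical helper: format epoch milliseconds as a UTC date-time string.
ms_to_utc() {
  local ms="$1"
  date -u -d "@$((ms / 1000))" "+%Y-%m-%d %H:%M:%S"
}

ms_to_utc 1660897955000
```

For the first sample row this gives `2022-08-19 08:32:35`, matching the `ts` shown in the query output.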

View File

@@ -1,13 +1,14 @@
[package]
name = "benchmarks"
version = "0.1.0"
edition = "2021"
version.workspace = true
edition.workspace = true
license.workspace = true
[dependencies]
arrow = "10"
arrow.workspace = true
clap = { version = "4.0", features = ["derive"] }
client = { path = "../src/client" }
indicatif = "0.17.1"
itertools = "0.10.5"
parquet = { version = "*" }
parquet.workspace = true
tokio = { version = "1.21", features = ["full"] }

View File

@@ -1,35 +1,36 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Use the taxi trip records from New York City dataset to bench. You can download the dataset from
//! [here](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page).
#![feature(once_cell)]
#![allow(clippy::print_stdout)]
use std::{
collections::HashMap,
path::{Path, PathBuf},
sync::Arc,
time::Instant,
};
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::Instant;
use arrow::{
array::{ArrayRef, PrimitiveArray, StringArray, TimestampNanosecondArray},
datatypes::{DataType, Float64Type, Int64Type},
record_batch::RecordBatch,
};
use arrow::array::{ArrayRef, PrimitiveArray, StringArray, TimestampNanosecondArray};
use arrow::datatypes::{DataType, Float64Type, Int64Type};
use arrow::record_batch::RecordBatch;
use clap::Parser;
use client::{
admin::Admin,
api::v1::{
codec::InsertBatch, column::Values, insert_expr, Column, ColumnDataType, ColumnDef,
CreateExpr, InsertExpr,
},
Client, Database, Select,
};
use client::admin::Admin;
use client::api::v1::column::Values;
use client::api::v1::{Column, ColumnDataType, ColumnDef, CreateTableExpr, InsertExpr, TableId};
use client::{Client, Database, Select};
use indicatif::{MultiProgress, ProgressBar, ProgressStyle};
use parquet::{
arrow::{ArrowReader, ParquetFileArrowReader},
file::{reader::FileReader, serialized_reader::SerializedFileReader},
};
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;
use tokio::task::JoinSet;
const DATABASE_NAME: &str = "greptime";
@@ -81,27 +82,30 @@ async fn write_data(
pb_style: ProgressStyle,
) -> u128 {
let file = std::fs::File::open(&path).unwrap();
let file_reader = Arc::new(SerializedFileReader::new(file).unwrap());
let row_num = file_reader.metadata().file_metadata().num_rows();
let record_batch_reader = ParquetFileArrowReader::new(file_reader)
.get_record_reader(batch_size)
let record_batch_reader_builder = ParquetRecordBatchReaderBuilder::try_new(file).unwrap();
let row_num = record_batch_reader_builder
.metadata()
.file_metadata()
.num_rows();
let record_batch_reader = record_batch_reader_builder
.with_batch_size(batch_size)
.build()
.unwrap();
let progress_bar = mpb.add(ProgressBar::new(row_num as _));
progress_bar.set_style(pb_style);
progress_bar.set_message(format!("{:?}", path));
progress_bar.set_message(format!("{path:?}"));
let mut total_rpc_elapsed_ms = 0;
for record_batch in record_batch_reader {
let record_batch = record_batch.unwrap();
let row_count = record_batch.num_rows();
let insert_batch = convert_record_batch(record_batch).into();
let (columns, row_count) = convert_record_batch(record_batch);
let insert_expr = InsertExpr {
schema_name: "public".to_string(),
table_name: TABLE_NAME.to_string(),
expr: Some(insert_expr::Expr::Values(insert_expr::Values {
values: vec![insert_batch],
})),
options: HashMap::default(),
region_number: 0,
columns,
row_count,
};
let now = Instant::now();
db.insert(insert_expr).await.unwrap();
@@ -110,14 +114,11 @@ async fn write_data(
progress_bar.inc(row_count as _);
}
progress_bar.finish_with_message(format!(
"file {:?} done in {}ms",
path, total_rpc_elapsed_ms
));
progress_bar.finish_with_message(format!("file {path:?} done in {total_rpc_elapsed_ms}ms",));
total_rpc_elapsed_ms
}
fn convert_record_batch(record_batch: RecordBatch) -> InsertBatch {
fn convert_record_batch(record_batch: RecordBatch) -> (Vec<Column>, u32) {
let schema = record_batch.schema();
let fields = schema.fields();
let row_count = record_batch.num_rows();
@@ -135,10 +136,7 @@ fn convert_record_batch(record_batch: RecordBatch) -> InsertBatch {
columns.push(column);
}
InsertBatch {
columns,
row_count: row_count as _,
}
(columns, row_count as _)
}
fn build_values(column: &ArrayRef) -> Values {
@@ -209,139 +207,142 @@ fn build_values(column: &ArrayRef) -> Values {
| DataType::FixedSizeList(_, _)
| DataType::LargeList(_)
| DataType::Struct(_)
| DataType::Union(_, _)
| DataType::Union(_, _, _)
| DataType::Dictionary(_, _)
| DataType::Decimal(_, _)
| DataType::Decimal128(_, _)
| DataType::Decimal256(_, _)
| DataType::Map(_, _) => todo!(),
}
}
fn create_table_expr() -> CreateExpr {
CreateExpr {
catalog_name: Some(CATALOG_NAME.to_string()),
schema_name: Some(SCHEMA_NAME.to_string()),
fn create_table_expr() -> CreateTableExpr {
CreateTableExpr {
catalog_name: CATALOG_NAME.to_string(),
schema_name: SCHEMA_NAME.to_string(),
table_name: TABLE_NAME.to_string(),
desc: None,
desc: "".to_string(),
column_defs: vec![
ColumnDef {
name: "VendorID".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "tpep_pickup_datetime".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "tpep_dropoff_datetime".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "passenger_count".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "trip_distance".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "RatecodeID".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "store_and_fwd_flag".to_string(),
datatype: ColumnDataType::String as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "PULocationID".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "DOLocationID".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "payment_type".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "fare_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "extra".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "mta_tax".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "tip_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "tolls_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "improvement_surcharge".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "total_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "congestion_surcharge".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "airport_fee".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: None,
default_constraint: vec![],
},
],
time_index: "tpep_pickup_datetime".to_string(),
primary_keys: vec!["VendorID".to_string()],
create_if_not_exists: false,
table_options: Default::default(),
region_ids: vec![0],
table_id: Some(TableId { id: 0 }),
}
}
@@ -350,12 +351,12 @@ fn query_set() -> HashMap<String, String> {
ret.insert(
"count_all".to_string(),
format!("SELECT COUNT(*) FROM {};", TABLE_NAME),
format!("SELECT COUNT(*) FROM {TABLE_NAME};"),
);
ret.insert(
"fare_amt_by_passenger".to_string(),
format!("SELECT passenger_count, MIN(fare_amount), MAX(fare_amount), SUM(fare_amount) FROM {} GROUP BY passenger_count",TABLE_NAME)
format!("SELECT passenger_count, MIN(fare_amount), MAX(fare_amount), SUM(fare_amount) FROM {TABLE_NAME} GROUP BY passenger_count")
);
ret
@@ -368,7 +369,7 @@ async fn do_write(args: &Args, client: &Client) {
let mut write_jobs = JoinSet::new();
let create_table_result = admin.create(create_table_expr()).await;
println!("Create table result: {:?}", create_table_result);
println!("Create table result: {create_table_result:?}");
let progress_bar_style = ProgressStyle::with_template(
"[{elapsed_precise}] {bar:60.cyan/blue} {pos:>7}/{len:7} {msg}",
@@ -401,7 +402,7 @@ async fn do_write(args: &Args, client: &Client) {
async fn do_query(num_iter: usize, db: &Database) {
for (query_name, query) in query_set() {
println!("Running query: {}", query);
println!("Running query: {query}");
for i in 0..num_iter {
let now = Instant::now();
let _res = db.select(Select::Sql(query.clone())).await.unwrap();

View File

@@ -1,10 +1,10 @@
# codecov config
coverage:
status:
patch: off # disable patch status
project:
default:
enable: yes
threshold: 1%
patch: off
ignore:
- "**/error*.rs" # ignore all error.rs files
- "tests/runner/*.rs" # ignore integration test runner

View File

@@ -1,71 +0,0 @@
import sys
# for annoying releative import beyond top-level package
sys.path.insert(0, "../")
from greptime import mock_tester, coprocessor, greptime as gt_builtin
from greptime.greptime import interval, vector, log, prev, sqrt, datetime
import greptime.greptime as greptime
import json
import numpy as np
def data_sample(k_lines, symbol, density=5 * 30 * 86400):
"""
Only return close data for simplicty for now
"""
k_lines = k_lines["result"] if k_lines["ret_msg"] == "OK" else None
if k_lines is None:
raise Exception("Expect a `OK`ed message")
close = [float(i["close"]) for i in k_lines]
return interval(close, density, "prev")
def as_table(kline: list):
col_len = len(kline)
ret = {
k: vector([fn(row[k]) for row in kline], str(ty))
for k, fn, ty in
[
("symbol", str, "str"),
("period", str, "str"),
("open_time", int, "int"),
("open", float, "float"),
("high", float, "float"),
("low", float, "float"),
("close", float, "float")
]
}
return ret
@coprocessor(args=["open_time", "close"], returns=[
"rv_7d",
"rv_15d",
"rv_30d",
"rv_60d",
"rv_90d",
"rv_180d"
])
def calc_rvs(open_time, close):
from greptime import vector, log, prev, sqrt, datetime, pow, sum, last
import greptime as g
def calc_rv(close, open_time, time, interval):
mask = (open_time < time) & (open_time > time - interval)
close = close[mask]
open_time = open_time[mask]
close = g.interval(open_time, close, datetime("10m"), lambda x:last(x))
avg_time_interval = (open_time[-1] - open_time[0])/(len(open_time)-1)
ref = log(close/prev(close))
var = sum(pow(ref, 2)/(len(ref)-1))
return sqrt(var/avg_time_interval)
# how to get env var,
# maybe through accessing scope and serde then send to remote?
timepoint = open_time[-1]
rv_7d = vector([calc_rv(close, open_time, timepoint, datetime("7d"))])
rv_15d = vector([calc_rv(close, open_time, timepoint, datetime("15d"))])
rv_30d = vector([calc_rv(close, open_time, timepoint, datetime("30d"))])
rv_60d = vector([calc_rv(close, open_time, timepoint, datetime("60d"))])
rv_90d = vector([calc_rv(close, open_time, timepoint, datetime("90d"))])
rv_180d = vector([calc_rv(close, open_time, timepoint, datetime("180d"))])
return rv_7d, rv_15d, rv_30d, rv_60d, rv_90d, rv_180d

View File

@@ -1 +0,0 @@
curl "https://api.bybit.com/v2/public/index-price-kline?symbol=BTCUSD&interval=1&limit=$1&from=1581231260" > kline.json

View File

@@ -1,108 +0,0 @@
{
"ret_code": 0,
"ret_msg": "OK",
"ext_code": "",
"ext_info": "",
"result": [
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 300,
"open": "10107",
"high": "10109.34",
"low": "10106.71",
"close": "10106.79"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 900,
"open": "10106.79",
"high": "10109.27",
"low": "10105.92",
"close": "10106.09"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1200,
"open": "10106.09",
"high": "10108.75",
"low": "10104.66",
"close": "10108.73"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1800,
"open": "10108.73",
"high": "10109.52",
"low": "10106.07",
"close": "10106.38"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 2400,
"open": "10106.38",
"high": "10109.48",
"low": "10104.81",
"close": "10106.95"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 3000,
"open": "10106.95",
"high": "10109.48",
"low": "10106.6",
"close": "10107.55"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 3600,
"open": "10107.55",
"high": "10109.28",
"low": "10104.68",
"close": "10104.68"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 4200,
"open": "10104.68",
"high": "10109.18",
"low": "10104.14",
"close": "10108.8"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 4800,
"open": "10108.8",
"high": "10117.36",
"low": "10108.8",
"close": "10115.96"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 5400,
"open": "10115.96",
"high": "10119.19",
"low": "10115.96",
"close": "10117.08"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 6000,
"open": "10117.08",
"high": "10120.73",
"low": "10116.96",
"close": "10120.43"
}
],
"time_now": "1661225351.158190"
}

View File

@@ -1,4 +0,0 @@
from .greptime import coprocessor, copr
from .greptime import vector, log, prev, next, first, last, sqrt, pow, datetime, sum, interval
from .mock import mock_tester
from .cfg import set_conn_addr, get_conn_addr

View File

@@ -1,11 +0,0 @@
GREPTIME_DB_CONN_ADDRESS = "localhost:3000"
"""The Global Variable for address for conntect to database"""
def set_conn_addr(addr: str):
"""set database address to given `addr`"""
global GREPTIME_DB_CONN_ADDRESS
GREPTIME_DB_CONN_ADDRESS = addr
def get_conn_addr()->str:
global GREPTIME_DB_CONN_ADDRESS
return GREPTIME_DB_CONN_ADDRESS

View File

@@ -1,207 +0,0 @@
"""
Be note that this is a mock library, if not connected to database,
it can only run on mock data and mock function which is supported by numpy
"""
import functools
import numpy as np
import json
from urllib import request
import inspect
import requests
from .cfg import set_conn_addr, get_conn_addr
log = np.log
sum = np.nansum
sqrt = np.sqrt
pow = np.power
nan = np.nan
class TimeStamp(str):
"""
TODO: impl date time
"""
pass
class i32(int):
"""
For Python Coprocessor Type Annotation ONLY
A signed 32-bit integer.
"""
def __repr__(self) -> str:
return "i32"
class i64(int):
"""
For Python Coprocessor Type Annotation ONLY
A signed 64-bit integer.
"""
def __repr__(self) -> str:
return "i64"
class f32(float):
"""
For Python Coprocessor Type Annotation ONLY
A 32-bit floating point number.
"""
def __repr__(self) -> str:
return "f32"
class f64(float):
"""
For Python Coprocessor Type Annotation ONLY
A 64-bit floating point number.
"""
def __repr__(self) -> str:
return "f64"
class vector(np.ndarray):
"""
A compact Vector with all elements of same Data type.
"""
_datatype: str | None = None
def __new__(
cls,
lst,
dtype=None
) -> ...:
self = np.asarray(lst).view(cls)
self._datatype = dtype
return self
def __str__(self) -> str:
return "vector({}, \"{}\")".format(super().__str__(), self.datatype())
def datatype(self):
return self._datatype
def filter(self, lst_bool):
return self[lst_bool]
def last(lst):
return lst[-1]
def first(lst):
return lst[0]
def prev(lst):
ret = np.zeros(len(lst))
ret[1:] = lst[0:-1]
ret[0] = nan
return ret
def next(lst):
ret = np.zeros(len(lst))
ret[:-1] = lst[1:]
ret[-1] = nan
return ret
def interval(ts: vector, arr: vector, duration: int, func):
"""
Note that this is a mock function with same functionailty to the actual Python Coprocessor
`arr` is a vector of integral or temporal type.
"""
start = np.min(ts)
end = np.max(ts)
masks = [(ts >= i) & (ts <= (i+duration)) for i in range(start, end, duration)]
lst_res = [func(arr[mask]) for mask in masks]
return lst_res
def factor(unit: str) -> int:
if unit == "d":
return 24 * 60 * 60
elif unit == "h":
return 60 * 60
elif unit == "m":
return 60
elif unit == "s":
return 1
else:
raise Exception("Only d,h,m,s, found{}".format(unit))
def datetime(input_time: str) -> int:
"""
support `d`(day) `h`(hour) `m`(minute) `s`(second)
support format:
`12s` `7d` `12d2h7m`
"""
prev = 0
cur = 0
state = "Num"
parse_res = []
for idx, ch in enumerate(input_time):
if ch.isdigit():
cur = idx
if state != "Num":
parse_res.append((state, input_time[prev:cur], (prev, cur)))
prev = idx
state = "Num"
else:
cur = idx
if state != "Symbol":
parse_res.append((state, input_time[prev:cur], (prev, cur)))
prev = idx
state = "Symbol"
parse_res.append((state, input_time[prev:cur+1], (prev, cur+1)))
cur_idx = 0
res_time = 0
while cur_idx < len(parse_res):
pair = parse_res[cur_idx]
if pair[0] == "Num":
val = int(pair[1])
nxt = parse_res[cur_idx+1]
res_time += val * factor(nxt[1])
cur_idx += 2
else:
raise Exception("Two symbol in a row is impossible")
return res_time
def coprocessor(args=None, returns=None, sql=None):
"""
The actual coprocessor, which will connect to database and update
whatever function decorated with `@coprocessor(args=[...], returns=[...], sql=...)`
"""
def decorator_copr(func):
@functools.wraps(func)
def wrapper_do_actual(*args, **kwargs):
if len(args)!=0 or len(kwargs)!=0:
raise Exception("Expect call with no arguements(for all args are given by coprocessor itself)")
source = inspect.getsource(func)
url = "http://{}/v1/scripts".format(get_conn_addr())
print("Posting to {}".format(url))
data = {
"script": source,
"engine": None,
}
res = requests.post(
url,
headers={"Content-Type": "application/json"},
json=data
)
return res
return wrapper_do_actual
return decorator_copr
# make a alias for short
copr = coprocessor

View File

@@ -1,82 +0,0 @@
"""
Note this is a mock library, if not connected to database,
it can only run on mock data and support by numpy
"""
from typing import Any
import numpy as np
from .greptime import i32,i64,f32,f64, vector, interval, prev, datetime, log, sum, sqrt, pow, nan, copr, coprocessor
import inspect
import functools
import ast
def mock_tester(
func,
env:dict,
table=None
):
"""
Mock tester helper function,
What it does is replace `@coprocessor` with `@mock_cpor` and add a keyword `env=env`
like `@mock_copr(args=...,returns=...,env=env)`
"""
code = inspect.getsource(func)
tree = ast.parse(code)
tree = HackyReplaceDecorator("env").visit(tree)
new_func = tree.body[0]
fn_name = new_func.name
code_obj = compile(tree, "<embedded>", "exec")
exec(code_obj)
ret = eval("{}()".format(fn_name))
return ret
def mock_copr(args, returns, sql=None, env:None|dict=None):
"""
This should not be used directly by user
"""
def decorator_copr(func):
@functools.wraps(func)
def wrapper_do_actual(*fn_args, **fn_kwargs):
real_args = [env[name] for name in args]
ret = func(*real_args)
return ret
return wrapper_do_actual
return decorator_copr
class HackyReplaceDecorator(ast.NodeTransformer):
"""
This class accept a `env` dict for environment to extract args from,
and put `env` dict in the param list of `mock_copr` decorator, i.e:
a `@copr(args=["a", "b"], returns=["c"])` with call like mock_helper(abc, env={"a":2, "b":3})
will be transform into `@mock_copr(args=["a", "b"], returns=["c"], env={"a":2, "b":3})`
"""
def __init__(self, env: str) -> None:
# just for add `env` keyword
self.env = env
def visit_FunctionDef(self, node: ast.FunctionDef) -> Any:
new_node = node
decorator_list = new_node.decorator_list
if len(decorator_list)!=1:
return node
deco = decorator_list[0]
if deco.func.id!="coprocessor" and deco.func.id !="copr":
raise Exception("Expect a @copr or @coprocessor, found {}.".format(deco.func.id))
deco.func = ast.Name(id="mock_copr", ctx=ast.Load())
new_kw = ast.keyword(arg="env", value=ast.Name(id=self.env, ctx=ast.Load()))
deco.keywords.append(new_kw)
# Tie up loose ends in the AST.
ast.copy_location(new_node, node)
ast.fix_missing_locations(new_node)
self.generic_visit(node)
return new_node

View File

@@ -1,60 +0,0 @@
from example.calc_rv import as_table, calc_rvs
from greptime import coprocessor, set_conn_addr, get_conn_addr, mock_tester
import sys
import json
import requests
'''
To run this script, you need to first start a http server of greptime, and
`
python3 component/script/python/test.py address:port
`
'''
@coprocessor(sql='select number from numbers limit 10', args=['number'], returns=['n'])
def test(n):
return n+2
def init_table(close, open_time):
req_init = "/v1/sql?sql=create table k_line (close double, open_time bigint, TIME INDEX (open_time))"
print(get_db(req_init).text)
for c1, c2 in zip(close, open_time):
req = "/v1/sql?sql=INSERT INTO k_line(close, open_time) VALUES ({}, {})".format(c1, c2)
print(get_db(req).text)
print(get_db("/v1/sql?sql=select * from k_line").text)
def get_db(req:str):
return requests.get("http://{}{}".format(get_conn_addr(), req))
if __name__ == "__main__":
with open("component/script/python/example/kline.json", "r") as kline_file:
kline = json.load(kline_file)
table = as_table(kline["result"])
close = table["close"]
open_time = table["open_time"]
env = {"close":close, "open_time": open_time}
res = mock_tester(calc_rvs, env=env)
print("Mock result:", [i[0] for i in res])
exit()
if len(sys.argv)!=2:
raise Exception("Expect only one address as cmd's args")
set_conn_addr(sys.argv[1])
res = test()
print(res.headers)
print(res.text)
with open("component/script/python/example/kline.json", "r") as kline_file:
kline = json.load(kline_file)
# vec = vector([1,2,3], int)
# print(vec, vec.datatype())
table = as_table(kline["result"])
# print(table)
close = table["close"]
open_time = table["open_time"]
init_table(close, open_time)
real = calc_rvs()
print(real)
try:
print(real.text["error"])
except:
print(real.text)

View File

@@ -1,22 +1,26 @@
node_id = 42
http_addr = '0.0.0.0:3000'
rpc_addr = '0.0.0.0:3001'
mode = 'distributed'
rpc_addr = '127.0.0.1:3001'
wal_dir = '/tmp/greptimedb/wal'
rpc_runtime_size = 8
mode = "standalone"
mysql_addr = '0.0.0.0:3306'
mysql_addr = '127.0.0.1:4406'
mysql_runtime_size = 4
enable_memory_catalog = false
# applied when postgres feature enbaled
postgres_addr = '0.0.0.0:5432'
postgres_runtime_size = 4
[wal]
dir = "/tmp/greptimedb/wal"
file_size = 1073741824
purge_interval = 600
purge_threshold = 53687091200
read_batch_size = 128
sync_write = false
[storage]
type = 'File'
data_dir = '/tmp/greptimedb/data/'
[meta_client_opts]
metasrv_addr = "1.1.1.1:3002"
metasrv_addrs = ['127.0.0.1:3002']
timeout_millis = 3000
connect_timeout_millis = 5000
tcp_nodelay = true
tcp_nodelay = false

View File

@@ -1,4 +1,12 @@
http_addr = '0.0.0.0:4000'
grpc_addr = '0.0.0.0:4001'
mysql_addr = '0.0.0.0:4003'
mysql_runtime_size = 4
mode = 'distributed'
datanode_rpc_addr = '127.0.0.1:3001'
[http_options]
addr = '127.0.0.1:4000'
timeout = "30s"
[meta_client_opts]
metasrv_addrs = ['127.0.0.1:3002']
timeout_millis = 3000
connect_timeout_millis = 5000
tcp_nodelay = false

View File

@@ -1,4 +1,4 @@
bind_addr = '127.0.0.1:3002'
server_addr = '0.0.0.0:3002'
store_addr = '127.0.0.1:2380'
datanode_lease_secs = 30
server_addr = '127.0.0.1:3002'
store_addr = '127.0.0.1:2379'
datanode_lease_secs = 15

View File

@@ -0,0 +1,44 @@
node_id = 0
mode = 'standalone'
enable_memory_catalog = false
[http_options]
addr = '127.0.0.1:4000'
timeout = "30s"
[wal]
dir = "/tmp/greptimedb/wal"
file_size = 1073741824
purge_interval = 600
purge_threshold = 53687091200
read_batch_size = 128
sync_write = false
[storage]
type = 'File'
data_dir = '/tmp/greptimedb/data/'
[grpc_options]
addr = '127.0.0.1:4001'
runtime_size = 8
[mysql_options]
addr = '127.0.0.1:4002'
runtime_size = 2
[influxdb_options]
enable = true
[opentsdb_options]
addr = '127.0.0.1:4242'
enable = true
runtime_size = 2
[prometheus_options]
enable = true
[postgres_options]
addr = '127.0.0.1:4003'
runtime_size = 2
check_pwd = false

View File

@@ -24,6 +24,8 @@ RUN cargo build --release
# TODO(zyy17): Maybe should use the more secure container image.
FROM ubuntu:22.04 as base
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install ca-certificates
WORKDIR /greptime
COPY --from=builder /greptimedb/target/release/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH

View File

@@ -1,5 +1,7 @@
FROM ubuntu:22.04
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install ca-certificates
ARG TARGETARCH
ADD $TARGETARCH/greptime /greptime/bin/

View File

@@ -55,7 +55,7 @@ The DataFusion basically execute aggregate like this:
2. Call `update_batch` on each accumulator with partitioned data, to let you update your aggregate calculation.
3. Call `state` to get each accumulator's internal state, the medial calculation result.
4. Call `merge_batch` to merge all accumulator's internal state to one.
5. Execute `evalute` on the chosen one to get the final calculation result.
5. Execute `evaluate` on the chosen one to get the final calculation result.
Once you know the meaning of each method, you can easily write your accumulator. You can refer to `Median` accumulator or `SUM` accumulator defined in file `my_sum_udaf_example.rs` for more details.
@@ -63,7 +63,7 @@ Once you know the meaning of each method, you can easily write your accumulator.
You can call `register_aggregate_function` method in query engine to register your aggregate function. To do that, you have to new an instance of struct `AggregateFunctionMeta`. The struct has three fields, first is the name of your aggregate function's name. The function name is case-sensitive due to DataFusion's restriction. We strongly recommend using lowercase for your name. If you have to use uppercase name, wrap your aggregate function with quotation marks. For example, if you define an aggregate function named "my_aggr", you can use "`SELECT MY_AGGR(x)`"; if you define "my_AGGR", you have to use "`SELECT "my_AGGR"(x)`".
The second field is arg_counts ,the count of the arguments. Like accumulator `percentile`, caculating the p_number of the column. We need to input the value of column and the value of p to cacalate, and so the count of the arguments is two.
The second field is arg_counts ,the count of the arguments. Like accumulator `percentile`, calculating the p_number of the column. We need to input the value of column and the value of p to cacalate, and so the count of the arguments is two.
The third field is a function about how to create your accumulator creator that you defined in step 1 above. Create creator, that's a bit intertwined, but it is how we make DataFusion use a newly created aggregate function each time it executes a SQL, preventing the stored input types from affecting each other. The key detail can be starting looking at our `DfContextProviderAdapter` struct's `get_aggregate_meta` method.

Binary files not shown: six images added (25 KiB; docs/logo-text-padding.png, executable file, 18 KiB; 34 KiB; 58 KiB; 35 KiB; 46 KiB).
View File

@@ -0,0 +1,175 @@
---
Feature Name: "promql-in-rust"
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/596
Date: 2022-12-20
Author: "Ruihang Xia <waynestxia@gmail.com>"
---
Rewrite PromQL in Rust
----------------------
# Summary
A Rust native implementation of PromQL, for GreptimeDB.
# Motivation
Prometheus and its query language PromQL prevail in the cloud-native observability area, which is an important scenario for time series databases like GreptimeDB. We already support its remote read and write protocols. Users can now integrate GreptimeDB as the storage backend of an existing Prometheus deployment, but cannot run PromQL queries directly on GreptimeDB as they can with SQL.
This RFC proposes to add support for PromQL. Because the original PromQL engine is written in Go, we can't easily reuse the existing code. For interoperability, performance and extensibility, porting its logic to Rust is a good choice.
# Details
## Overview
One of the goals is to make use of our existing basic operators, execution model and runtime to reduce the work. So the entire proposal is built on top of Apache Arrow DataFusion. The rewritten PromQL logic is manifested as `Expr` or `Execution Plan` in DataFusion. And both the intermediate data structures and the results are in the format of `Arrow`'s `RecordBatch`.
The following sections are organized in a top-down manner. They start with the evaluation procedure, then introduce the building blocks of our new PromQL operations, follow with an explanation of the data model, and end with an example logic plan.
*This RFC is heavily related to Prometheus and PromQL. It won't repeat their basic concepts.*
## Evaluation
The original implementation is like an interpreter of the parsed PromQL AST. It has two characteristics: (1) Operations are evaluated in place after they are parsed to AST, and some key parameters are separated from the AST because they are not present in the query but come from other places, like another field in the HTTP payload. (2) Calculation is performed per timestamp. You can see this pattern many times:
```go
for ts := ev.startTimestamp; ts <= ev.endTimestamp; ts += ev.interval {}
```
These bring out two differences in the proposed implementation. First, to make it more general and clear, the evaluation procedure is reorganized into several phases (the same as DataFusion's). Second, data are evaluated by time series (corresponding to "columnar calculation", if we think of the timestamp as the row number).
```
                                      Logic
   Query            AST               Plan
 ─────────► Parser ───────► Logical ────────► Physical ────┐
                            Planner           Planner      │
                                                           │
 ◄───────────────────────────── Executor ◄────────────────┘
  Evaluation Result                     Execution
                                          Plan
```
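The second difference, evaluating per series rather than per timestamp, can be illustrated with a minimal sketch. It does not use DataFusion or Arrow. The `Series` struct, the two function names and the "multiply by two" operation are all hypothetical and exist only to show the two loop orders side by side.

```rust
/// One time series with samples sorted by timestamp (hypothetical layout).
pub struct Series {
    pub timestamps: Vec<i64>,
    pub values: Vec<f64>,
}

impl Series {
    /// Value of the sample recorded exactly at `ts`, if any.
    fn at(&self, ts: i64) -> Option<f64> {
        self.timestamps
            .binary_search(&ts)
            .ok()
            .map(|idx| self.values[idx])
    }
}

/// Interpreter style: the outer loop walks evaluation timestamps, so every
/// step touches every series. The result is indexed as `[step][series]`.
pub fn eval_per_timestamp(
    series: &[Series],
    start: i64,
    end: i64,
    interval: i64,
) -> Vec<Vec<Option<f64>>> {
    assert!(interval > 0, "interval must be positive");
    let mut out = Vec::new();
    let mut ts = start;
    while ts <= end {
        out.push(series.iter().map(|s| s.at(ts).map(|v| v * 2.0)).collect());
        ts += interval;
    }
    out
}

/// Columnar style: the outer loop walks series, and each series is processed
/// as one whole column. The result is indexed as `[series][step]`.
pub fn eval_per_series(
    series: &[Series],
    start: i64,
    end: i64,
    interval: i64,
) -> Vec<Vec<Option<f64>>> {
    assert!(interval > 0, "interval must be positive");
    series
        .iter()
        .map(|s| {
            let mut column = Vec::new();
            let mut ts = start;
            while ts <= end {
                column.push(s.at(ts).map(|v| v * 2.0));
                ts += interval;
            }
            column
        })
        .collect()
}
```

Both functions produce the same cells, only transposed. The columnar form keeps each series contiguous, which is the shape an Arrow array expects.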
- Parser
Provided by [`promql-parser`](https://github.com/GreptimeTeam/promql-parser) crate. Same as the original implementation.
- Logical Planner
Generates a logical plan with all the needed parameters. It should accept something like `EvalStmt` in Go's implementation, which contains query time range, evaluation interval and lookback range.
Another important thing done here is assembling the logic plan, with all the operations baked in logically: what the filter and time range to read are, how the data then flows through a selector into a binary operation, what the output schema of every single step is, and so on. The generated logic plan is deterministic without variables, and can be `EXPLAIN`ed clearly.
- Physical Planner
This step converts a logic plan into an executable execution plan. There is not much that is special compared with the previous step, except when a query is going to be executed in a distributed manner. In this case, a logic plan will be divided into several parts and sent to several nodes. Each physical planner only sees its own part.
- Executor
As its name shows, this step calculates data into results. All new calculation logic, the implementation of PromQL in Rust, is placed here. The rewritten functions use `RecordBatch` and `Array` from `Arrow` as the intermediate data structures.
Each "batch" contains only data from a single time series. This comes from the underlying storage implementation. Though it's not a requirement of this RFC, having this property can simplify some functions.
Another thing to mention is that the rewritten functions are not aware of timestamp or value columns; they are defined only based on the input data types. For example, the `increase()` function in PromQL calculates the unbiased delta of data, and its implementation here only does this single thing. Let's compare the signatures of the two implementations:
- Go
```go
func funcIncrease(vals []parser.Value, args parser.Expressions) Vector {}
```
- Rust
```rust
fn prom_increase(input: Array) -> Array {}
```
Some unimportant parameters are omitted. The original Go version only writes the logic for `Point`'s value, either float or histogram. But the proposed rewritten one accepts a generic `Array` as input, which can be any suitable type, from `i8` to `u64` to `TimestampNanosecond`.
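To make the "defined only on the input data type" idea concrete, here is a minimal sketch of a type-generic increase over one window of samples. It uses a plain slice instead of an Arrow `Array`, and the name `window_increase` is hypothetical. The counter-reset handling (a drop in value is treated as a restart from zero) and the absence of extrapolation are assumptions of this sketch, not something the RFC specifies.

```rust
use std::ops::{Add, Sub};

/// Raw increase of one window of counter samples, generic over the numeric
/// type. Returns `None` when the window holds fewer than two samples.
pub fn window_increase<T>(window: &[T]) -> Option<T>
where
    T: Copy + PartialOrd + Default + Add<Output = T> + Sub<Output = T>,
{
    if window.len() < 2 {
        return None;
    }
    // `T::default()` is zero for the built-in numeric types.
    let mut total = T::default();
    for pair in window.windows(2) {
        let (prev, cur) = (pair[0], pair[1]);
        // A drop is treated as a counter reset: the counter restarted from
        // zero, so the whole current value counts as growth.
        total = total + if cur >= prev { cur - prev } else { cur };
    }
    Some(total)
}
```

The function never looks at timestamps, so the same body serves `f64`, `i64` or `u64` inputs. For example, `window_increase(&[1.0, 5.0, 2.0, 4.0])` counts 4 before the reset, 2 at the reset and 2 after it.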
## Plan and Expression
They are structures to express logic from PromQL. The proposed implementation is built on top of DataFusion, thus our plans and expressions are in the form of `ExtensionPlan` and `ScalarUDF`. The only difference between them in this context is the return type: a plan returns a record batch while an expression returns a single column.
This RFC proposes to add four new plans. They are fundamental building blocks that mainly handle the data selection logic in PromQL, preparing data for the calculation expressions that follow.
- `SeriesNormalize`
Sort data inside one series on the timestamp column, and apply the "offset" bias if there is one. This plan usually comes after the `TableScan` (or `TableScan` and `Filter`) plan.
- `VectorManipulator` and `MatrixManipulator`
Corresponding to `InstantSelector` and `RangeSelector`. We don't calculate timestamp by timestamp, thus we use "vector" instead of "instant"; this image shows the difference. And "matrix" is another name for "range vector", chosen to avoid confusion with our "vector". The following section will detail how they are implemented using Arrow.
![instant_and_vector](instant-and-vector.png)
Because of the "interval" parameter in PromQL, the data after a "selector" (or "manipulator" here) is usually shorter than the input, and we have to modify the entire record batch to shorten the timestamp, value and tag columns together. This is why they are implemented as plans.
- `PromAggregator`
The carrier of aggregator expressions. This should not be very different from the DataFusion built-in `Aggregate` plan, except PromQL can use "group without" to do reverse selection.
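The "group without" reverse selection can be sketched as a column-resolution step that runs before a regular aggregate. The `Grouping` enum and `group_columns` function below are hypothetical names for illustration and are not part of DataFusion or this RFC; the sketch only assumes that the tag columns of the table are known at planning time.

```rust
/// Hypothetical description of a PromQL grouping clause.
pub enum Grouping {
    /// `by (...)`: group on exactly the listed tag columns.
    By(Vec<String>),
    /// `without (...)`: group on every tag column except the listed ones.
    Without(Vec<String>),
}

/// Resolve the clause into the concrete columns to group on,
/// keeping the order of the table's tag columns.
pub fn group_columns(tag_columns: &[&str], grouping: &Grouping) -> Vec<String> {
    let (listed, keep_listed) = match grouping {
        Grouping::By(cols) => (cols, true),
        Grouping::Without(cols) => (cols, false),
    };
    tag_columns
        .iter()
        // Keep a tag if "it is listed" matches what the clause asks for.
        .filter(|tag| listed.iter().any(|l| l.as_str() == **tag) == keep_listed)
        .map(|tag| tag.to_string())
        .collect()
}
```

Once the clause is resolved into a plain column list, the rest of the aggregation can stay close to the built-in `Aggregate` plan.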
PromQL has around 70 expressions and functions. Luckily, we can reuse many of them from DataFusion, such as unary expressions, binary expressions and aggregators. We only need to implement the PromQL-specific ones, like `rate` or `percentile`. The following table lists some typical PromQL functions and their signatures in the proposed implementation. Other functions should follow the same pattern.
| Name | In Param(s) | Out Param(s) | Explain |
|-------------------- |------------------------------------------------------ |-------------- |-------------------- |
| instant_delta | Matrix T | Array T | idelta in PromQL |
| increase | Matrix T | Array T | increase in PromQL |
| extrapolate_factor | - Matrix T<br>- Array Timestamp<br>- Array Timestamp | Array T | * |
*: *`extrapolate_factor` is one of the "dark sides" in PromQL. In short it's a translation of this [paragraph](https://github.com/prometheus/prometheus/blob/0372e259baf014bbade3134fd79bcdfd8cbdef2c/promql/functions.go#L134-L159)*
To reuse this common calculation logic, we can break functions into several expressions and assemble them in the logical planning phase. For example, `rate()` in PromQL can be represented as `increase / extrapolate_factor`.
## Data Model
This part explains how data is represented. Following the data model in GreptimeDB, all data is stored in tables with tag columns, a timestamp column and a value column. Converting a table to record batches is straightforward, so an instant vector can be thought of as a row in the table (though, as said before, we don't use instant vectors). Of the four basic types in PromQL (scalar, string, instant vector and range vector), only the last one, the range vector, needs some tricks to fit our columnar calculation.
A range vector is a sort of matrix. It consists of small one-dimensional vectors, each of which is the input of a range function. Applying a range function to a range vector can be thought of as a kind of convolution.
![range-vector-with-matrix](range-vector-with-matrix.png)
(The left side is an illustration of a range vector. Note that the Y-axis has no meaning; it only separates the different pieces. The right side is an imagined "matrix" that acts as the range function. Multiplying the left side by it gives a one-dimensional "matrix" with four elements, which is the evaluation result of the range vector.)
To fit this range vector into a record batch, it should be represented by a column. This RFC proposes using `DictionaryArray` from Arrow to represent a range vector, or `Matrix`. This "misuses" `DictionaryArray` to carry some additional information about an array. Because the range vector slides over one series, we only need to know the `offset` and `length` of each slide to reconstruct the matrix from an array:
![matrix-from-array](matrix-from-array.png)
The length is not fixed; it depends on the input's timestamps. A PoC implementation of `Matrix` and `increase()` can be found in [this repo](https://github.com/waynexia/corroding-prometheus).
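The offset/length idea can be sketched without Arrow. The code below is not the PoC linked above and does not use `DictionaryArray`; it is a simplified model on plain `Vec`s, and the names `Matrix`, `sliding_ranges` and `evaluate` are assumptions for illustration. It also assumes that timestamps are sorted and that each window is the closed interval `[t - range, t]`.

```rust
/// A range vector stored as one flat array plus an (offset, length) pair per slide.
pub struct Matrix<T> {
    values: Vec<T>,
    ranges: Vec<(usize, usize)>,
}

impl<T> Matrix<T> {
    pub fn new(values: Vec<T>, ranges: Vec<(usize, usize)>) -> Self {
        // Every slide must lie inside the flat array.
        assert!(ranges.iter().all(|&(o, l)| o + l <= values.len()));
        Self { values, ranges }
    }

    /// Borrow the i-th slide without copying any data.
    pub fn window(&self, i: usize) -> Option<&[T]> {
        let &(offset, len) = self.ranges.get(i)?;
        self.values.get(offset..offset + len)
    }

    /// Apply a range function to every slide, producing one output per slide.
    pub fn evaluate<R>(&self, f: impl Fn(&[T]) -> R) -> Vec<R> {
        self.ranges
            .iter()
            .map(|&(o, l)| f(&self.values[o..o + l]))
            .collect()
    }
}

/// Compute the (offset, length) of each slide for evaluation times
/// `start, start + interval, ..., end` over sorted `timestamps`.
pub fn sliding_ranges(
    timestamps: &[i64],
    start: i64,
    end: i64,
    interval: i64,
    range: i64,
) -> Vec<(usize, usize)> {
    assert!(interval > 0 && range >= 0);
    let (mut lo, mut hi) = (0usize, 0usize);
    let mut ranges = Vec::new();
    let mut t = start;
    while t <= end {
        // Drop samples that fell out of the window on the left.
        while lo < timestamps.len() && timestamps[lo] < t - range {
            lo += 1;
        }
        // Take in samples that entered the window on the right.
        while hi < timestamps.len() && timestamps[hi] <= t {
            hi += 1;
        }
        ranges.push((lo, hi - lo));
        t += interval;
    }
    ranges
}
```

Because both cursors only move forward, computing all slides is linear in the number of samples plus the number of evaluation steps, and the value array itself is never copied.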
## Example
The logical plan of the PromQL query
```promql
# start: 2022-12-20T10:00:00
# end: 2022-12-21T10:00:00
# interval: 1m
# lookback: 30s
sum (rate(request_duration[5m])) by (idc)
```
looks like
<!-- title: 'PromAggregator: \naggr = sum, column = idc'
operator: prom
inputs:
- title: 'Matrix Manipulator: \ninterval = 1m, range = 5m, expr = div(increase(value), extrapolate_factor(timestamp))'
operator: prom
inputs:
- title: 'Series Normalize: \noffset = 0'
operator: prom
inputs:
- title: 'Filter: \ntimestamp > 2022-12-20T10:00:00 && timestamp < 2022-12-21T10:00:00'
operator: filter
inputs:
- title: 'Table Scan: \ntable = request_duration, timestamp > 2022-12-20T10:00:00 && timestamp < 2022-12-21T10:00:00'
operator: scan -->
![example](example.png)
# Drawbacks
Humans are always error-prone. Rewriting from the ground up is harder than translating line by line and requires more attention to ensure correctness. And since the evaluator's architecture is different, it might be painful to catch up with breaking changes in PromQL (if any) in the future.
Misusing Arrow's `DictionaryArray` as a `Matrix` is another drawback. This hack needs some `unsafe` function calls to bypass Arrow's checks. And although Arrow's API is stable, this still relies on undocumented behavior.
# Alternatives
There are a few alternatives we've considered:
- Wrap the existing PromQL's implementation via FFI, and import it to GreptimeDB.
- Translate its evaluator engine line-by-line, rather than rewrite one.
- Integrate the Prometheus server into GreptimeDB via RPC, making it a detached execution engine for PromQL.
The first and second options would create a separate execution engine inside GreptimeDB. They may alleviate the pain of rewriting, but they would have a negative impact on later evolution, such as resource management. The last option introduces another deployment component, which makes the deployment architecture more complex.
All of them also involve more or less redundant data transportation, which costs performance and resources. The proposed built-in execution procedure is also easy to integrate with and expose through the SQL interface GreptimeDB currently provides. Some concepts in PromQL, like sliding windows (range vectors in PromQL), are very convenient and ergonomic for analyzing series data. This makes it not only a PromQL evaluator, but also an enhancement to our query system.

View File

@@ -1 +1 @@
nightly-2022-07-14
nightly-2022-12-20

View File

@@ -1,3 +1,2 @@
group_imports = "StdExternalCrate"
imports_granularity = "Module"

View File

@@ -7,7 +7,7 @@ ARCH_TYPE=
VERSION=${1:-latest}
GITHUB_ORG=GreptimeTeam
GITHUB_REPO=greptimedb
BIN=greptimedb
BIN=greptime
get_os_type() {
os_type="$(uname -s)"

View File

@@ -1,11 +1,12 @@
[package]
name = "api"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
version.workspace = true
edition.workspace = true
license.workspace = true
[dependencies]
common-base = { path = "../common/base" }
common-error = { path = "../common/error" }
common-time = { path = "../common/time" }
datatypes = { path = "../datatypes" }
prost = "0.11"

View File

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::path::PathBuf;
fn main() {
@@ -6,9 +20,7 @@ fn main() {
.file_descriptor_set_path(default_out_dir.join("greptime_fd.bin"))
.compile(
&[
"greptime/v1/insert.proto",
"greptime/v1/select.proto",
"greptime/v1/physical_plan.proto",
"greptime/v1/greptime.proto",
"greptime/v1/meta/common.proto",
"greptime/v1/meta/heartbeat.proto",

View File

@@ -17,8 +17,10 @@ message AdminResponse {
message AdminExpr {
ExprHeader header = 1;
oneof expr {
CreateExpr create = 2;
CreateTableExpr create_table = 2;
AlterExpr alter = 3;
CreateDatabaseExpr create_database = 4;
DropTableExpr drop_table = 5;
}
}
@@ -29,27 +31,58 @@ message AdminResult {
}
}
message CreateExpr {
optional string catalog_name = 1;
optional string schema_name = 2;
message CreateTableExpr {
string catalog_name = 1;
string schema_name = 2;
string table_name = 3;
optional string desc = 4;
string desc = 4;
repeated ColumnDef column_defs = 5;
string time_index = 6;
repeated string primary_keys = 7;
bool create_if_not_exists = 8;
map<string, string> table_options = 9;
TableId table_id = 10;
repeated uint32 region_ids = 11;
}
message AlterExpr {
optional string catalog_name = 1;
optional string schema_name = 2;
string catalog_name = 1;
string schema_name = 2;
string table_name = 3;
oneof kind {
AddColumn add_column = 4;
AddColumns add_columns = 4;
DropColumns drop_columns = 5;
}
}
message DropTableExpr {
string catalog_name = 1;
string schema_name = 2;
string table_name = 3;
}
message CreateDatabaseExpr {
//TODO(hl): maybe rename to schema_name?
string database_name = 1;
}
message AddColumns {
repeated AddColumn add_columns = 1;
}
message DropColumns {
repeated DropColumn drop_columns = 1;
}
message AddColumn {
ColumnDef column_def = 1;
bool is_key = 2;
}
message DropColumn {
string name = 1;
}
message TableId {
uint32 id = 1;
}

View File

@@ -32,7 +32,10 @@ message Column {
repeated int32 date_values = 14;
repeated int64 datetime_values = 15;
repeated int64 ts_millis_values = 16;
repeated int64 ts_second_values = 16;
repeated int64 ts_millisecond_values = 17;
repeated int64 ts_microsecond_values = 18;
repeated int64 ts_nanosecond_values = 19;
}
// The array of non-null values in this column.
//
@@ -56,7 +59,7 @@ message ColumnDef {
string name = 1;
ColumnDataType datatype = 2;
bool is_nullable = 3;
optional bytes default_constraint = 4;
bytes default_constraint = 4;
}
enum ColumnDataType {
@@ -75,5 +78,8 @@ enum ColumnDataType {
STRING = 12;
DATE = 13;
DATETIME = 14;
TIMESTAMP = 15;
TIMESTAMP_SECOND = 15;
TIMESTAMP_MILLISECOND = 16;
TIMESTAMP_MICROSECOND = 17;
TIMESTAMP_NANOSECOND = 18;
}

View File

@@ -2,6 +2,7 @@ syntax = "proto3";
package greptime.v1;
import "greptime/v1/column.proto";
import "greptime/v1/common.proto";
message DatabaseRequest {
@@ -28,36 +29,23 @@ message SelectExpr {
oneof expr {
string sql = 1;
bytes logical_plan = 2;
PhysicalPlan physical_plan = 15;
}
}
message PhysicalPlan {
bytes original_ql = 1;
bytes plan = 2;
}
message InsertExpr {
string table_name = 1;
string schema_name = 1;
string table_name = 2;
message Values {
repeated bytes values = 1;
}
// Data is represented here.
repeated Column columns = 3;
oneof expr {
Values values = 2;
// The row_count of all columns, which include null and non-null values.
//
// Note: the row_count of all columns in a InsertExpr must be same.
uint32 row_count = 4;
// TODO(LFC): Remove field "sql" in InsertExpr.
// When Frontend instance received an insertion SQL (`insert into ...`), it's anticipated to parse the SQL and
// assemble the values to insert to feed Datanode. In other words, inserting data through Datanode instance's GRPC
// interface shouldn't use SQL directly.
// Then why the "sql" field exists here? It's because the Frontend needs table schema to create the values to insert,
// which is currently not able to find anywhere. (Maybe the table schema is suppose to be fetched from Meta?)
// The "sql" field is meant to be removed in the future.
string sql = 3;
}
map<string, bytes> options = 4;
// The region number of current insert request.
uint32 region_number = 5;
}
// TODO(jiachun)

View File

@@ -1,14 +0,0 @@
syntax = "proto3";
package greptime.v1.codec;
import "greptime/v1/column.proto";
message InsertBatch {
repeated Column columns = 1;
uint32 row_count = 2;
}
message RegionNumber {
uint32 id = 1;
}

View File

@@ -39,7 +39,7 @@ message NodeStat {
uint64 wcus = 2;
// Table number in this node
uint64 table_num = 3;
// Regon number in this node
// Region number in this node
uint64 region_num = 4;
double cpu_usage = 5;

View File

@@ -5,6 +5,8 @@ package greptime.v1.meta;
import "greptime/v1/meta/common.proto";
service Router {
rpc Create(CreateRequest) returns (RouteResponse) {}
// Fetch routing information for tables. The smallest unit is the complete
// routing information(all regions) of a table.
//
@@ -26,7 +28,14 @@ service Router {
//
rpc Route(RouteRequest) returns (RouteResponse) {}
rpc Create(CreateRequest) returns (RouteResponse) {}
rpc Delete(DeleteRequest) returns (RouteResponse) {}
}
message CreateRequest {
RequestHeader header = 1;
TableName table_name = 2;
repeated Partition partitions = 3;
}
message RouteRequest {
@@ -35,6 +44,12 @@ message RouteRequest {
repeated TableName table_names = 2;
}
message DeleteRequest {
RequestHeader header = 1;
TableName table_name = 2;
}
message RouteResponse {
ResponseHeader header = 1;
@@ -42,13 +57,6 @@ message RouteResponse {
repeated TableRoute table_routes = 3;
}
message CreateRequest {
RequestHeader header = 1;
TableName table_name = 2;
repeated Partition partitions = 3;
}
message TableRoute {
Table table = 1;
repeated RegionRoute region_routes = 2;
@@ -69,6 +77,7 @@ message Table {
}
message Region {
// TODO(LFC): Maybe use message RegionNumber?
uint64 id = 1;
string name = 2;
Partition partition = 3;

View File

@@ -20,6 +20,9 @@ service Store {
// DeleteRange deletes the given range from the key-value store.
rpc DeleteRange(DeleteRangeRequest) returns (DeleteRangeResponse);
// MoveValue atomically renames the key to the given updated key.
rpc MoveValue(MoveValueRequest) returns (MoveValueResponse);
}
message RangeRequest {
@@ -136,3 +139,21 @@ message DeleteRangeResponse {
// returned.
repeated KeyValue prev_kvs = 3;
}
message MoveValueRequest {
RequestHeader header = 1;
// If from_key does not exist, return the value of to_key (if it exists).
// If from_key exists, move the value of from_key to to_key (i.e. rename),
// and return the value.
bytes from_key = 2;
bytes to_key = 3;
}
message MoveValueResponse {
ResponseHeader header = 1;
// If from_key does not exist, return the value of to_key (if it exists).
// If from_key exists, return the value of from_key.
KeyValue kv = 2;
}

View File

@@ -1,33 +0,0 @@
syntax = "proto3";
package greptime.v1.codec;
message PhysicalPlanNode {
oneof PhysicalPlanType {
ProjectionExecNode projection = 1;
MockInputExecNode mock = 99;
// TODO(fys): impl other physical plan node
}
}
message ProjectionExecNode {
PhysicalPlanNode input = 1;
repeated PhysicalExprNode expr = 2;
repeated string expr_name = 3;
}
message PhysicalExprNode {
oneof ExprType {
PhysicalColumn column = 1;
// TODO(fys): impl other physical expr node
}
}
message PhysicalColumn {
string name = 1;
uint64 index = 2;
}
message MockInputExecNode {
string name = 1;
}

View File

@@ -1,6 +1,24 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use common_error::ext::ErrorExt;
use common_error::prelude::StatusCode;
use datatypes::prelude::ConcreteDataType;
use snafu::prelude::*;
use snafu::Backtrace;
use snafu::{Backtrace, ErrorCompat};
pub type Result<T> = std::result::Result<T, Error>;
@@ -15,4 +33,44 @@ pub enum Error {
from: ConcreteDataType,
backtrace: Backtrace,
},
#[snafu(display(
"Failed to convert column default constraint, column: {}, source: {}",
column,
source
))]
ConvertColumnDefaultConstraint {
column: String,
#[snafu(backtrace)]
source: datatypes::error::Error,
},
#[snafu(display(
"Invalid column default constraint, column: {}, source: {}",
column,
source
))]
InvalidColumnDefaultConstraint {
column: String,
#[snafu(backtrace)]
source: datatypes::error::Error,
},
}
impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::UnknownColumnDataType { .. } => StatusCode::InvalidArguments,
Error::IntoColumnDataType { .. } => StatusCode::Unexpected,
Error::ConvertColumnDefaultConstraint { source, .. }
| Error::InvalidColumnDefaultConstraint { source, .. } => source.status_code(),
}
}
fn backtrace_opt(&self) -> Option<&Backtrace> {
ErrorCompat::backtrace(self)
}
fn as_any(&self) -> &dyn Any {
self
}
}

View File

@@ -1,14 +1,28 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_base::BitVec;
use common_time::timestamp::TimeUnit;
use datatypes::prelude::ConcreteDataType;
use datatypes::types::TimestampType;
use datatypes::value::Value;
use datatypes::vectors::VectorRef;
use snafu::prelude::*;
use crate::error::{self, Result};
use crate::v1::column::Values;
use crate::v1::Column;
use crate::v1::ColumnDataType;
use crate::v1::{Column, ColumnDataType};
#[derive(Debug, PartialEq, Eq)]
pub struct ColumnDataTypeWrapper(ColumnDataType);
@@ -43,7 +57,16 @@ impl From<ColumnDataTypeWrapper> for ConcreteDataType {
ColumnDataType::String => ConcreteDataType::string_datatype(),
ColumnDataType::Date => ConcreteDataType::date_datatype(),
ColumnDataType::Datetime => ConcreteDataType::datetime_datatype(),
ColumnDataType::Timestamp => ConcreteDataType::timestamp_millis_datatype(),
ColumnDataType::TimestampSecond => ConcreteDataType::timestamp_second_datatype(),
ColumnDataType::TimestampMillisecond => {
ConcreteDataType::timestamp_millisecond_datatype()
}
ColumnDataType::TimestampMicrosecond => {
ConcreteDataType::timestamp_microsecond_datatype()
}
ColumnDataType::TimestampNanosecond => {
ConcreteDataType::timestamp_nanosecond_datatype()
}
}
}
}
@@ -68,7 +91,12 @@ impl TryFrom<ConcreteDataType> for ColumnDataTypeWrapper {
ConcreteDataType::String(_) => ColumnDataType::String,
ConcreteDataType::Date(_) => ColumnDataType::Date,
ConcreteDataType::DateTime(_) => ColumnDataType::Datetime,
ConcreteDataType::Timestamp(_) => ColumnDataType::Timestamp,
ConcreteDataType::Timestamp(unit) => match unit {
TimestampType::Second(_) => ColumnDataType::TimestampSecond,
TimestampType::Millisecond(_) => ColumnDataType::TimestampMillisecond,
TimestampType::Microsecond(_) => ColumnDataType::TimestampMicrosecond,
TimestampType::Nanosecond(_) => ColumnDataType::TimestampNanosecond,
},
ConcreteDataType::Null(_) | ConcreteDataType::List(_) => {
return error::IntoColumnDataTypeSnafu { from: datatype }.fail()
}
@@ -140,8 +168,20 @@ impl Values {
datetime_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::Timestamp => Values {
ts_millis_values: Vec::with_capacity(capacity),
ColumnDataType::TimestampSecond => Values {
ts_second_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::TimestampMillisecond => Values {
ts_millisecond_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::TimestampMicrosecond => Values {
ts_microsecond_values: Vec::with_capacity(capacity),
..Default::default()
},
ColumnDataType::TimestampNanosecond => Values {
ts_nanosecond_values: Vec::with_capacity(capacity),
..Default::default()
},
}
@@ -174,9 +214,12 @@ impl Column {
Value::Binary(val) => values.binary_values.push(val.to_vec()),
Value::Date(val) => values.date_values.push(val.val()),
Value::DateTime(val) => values.datetime_values.push(val.val()),
Value::Timestamp(val) => values
.ts_millis_values
.push(val.convert_to(TimeUnit::Millisecond)),
Value::Timestamp(val) => match val.unit() {
TimeUnit::Second => values.ts_second_values.push(val.value()),
TimeUnit::Millisecond => values.ts_millisecond_values.push(val.value()),
TimeUnit::Microsecond => values.ts_microsecond_values.push(val.value()),
TimeUnit::Nanosecond => values.ts_nanosecond_values.push(val.value()),
},
Value::List(_) => unreachable!(),
});
self.null_mask = null_mask.into_vec();
@@ -187,7 +230,10 @@ impl Column {
mod tests {
use std::sync::Arc;
use datatypes::vectors::BooleanVector;
use datatypes::vectors::{
BooleanVector, TimestampMicrosecondVector, TimestampMillisecondVector,
TimestampNanosecondVector, TimestampSecondVector,
};
use super::*;
@@ -245,8 +291,8 @@ mod tests {
let values = values.datetime_values;
assert_eq!(2, values.capacity());
let values = Values::with_capacity(ColumnDataType::Timestamp, 2);
let values = values.ts_millis_values;
let values = Values::with_capacity(ColumnDataType::TimestampMillisecond, 2);
let values = values.ts_millisecond_values;
assert_eq!(2, values.capacity());
}
@@ -313,8 +359,8 @@ mod tests {
ColumnDataTypeWrapper(ColumnDataType::Datetime).into()
);
assert_eq!(
ConcreteDataType::timestamp_millis_datatype(),
ColumnDataTypeWrapper(ColumnDataType::Timestamp).into()
ConcreteDataType::timestamp_millisecond_datatype(),
ColumnDataTypeWrapper(ColumnDataType::TimestampMillisecond).into()
);
}
@@ -381,8 +427,8 @@ mod tests {
ConcreteDataType::datetime_datatype().try_into().unwrap()
);
assert_eq!(
ColumnDataTypeWrapper(ColumnDataType::Timestamp),
ConcreteDataType::timestamp_millis_datatype()
ColumnDataTypeWrapper(ColumnDataType::TimestampMillisecond),
ConcreteDataType::timestamp_millisecond_datatype()
.try_into()
.unwrap()
);
@@ -399,7 +445,48 @@ mod tests {
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
"Failed to create column datatype from List(ListType { inner: Boolean(BooleanType) })"
"Failed to create column datatype from List(ListType { item_type: Boolean(BooleanType) })"
);
}
#[test]
fn test_column_put_timestamp_values() {
let mut column = Column {
column_name: "test".to_string(),
semantic_type: 0,
values: Some(Values {
..Default::default()
}),
null_mask: vec![],
datatype: 0,
};
let vector = Arc::new(TimestampNanosecondVector::from_vec(vec![1, 2, 3]));
column.push_vals(3, vector);
assert_eq!(
vec![1, 2, 3],
column.values.as_ref().unwrap().ts_nanosecond_values
);
let vector = Arc::new(TimestampMillisecondVector::from_vec(vec![4, 5, 6]));
column.push_vals(3, vector);
assert_eq!(
vec![4, 5, 6],
column.values.as_ref().unwrap().ts_millisecond_values
);
let vector = Arc::new(TimestampMicrosecondVector::from_vec(vec![7, 8, 9]));
column.push_vals(3, vector);
assert_eq!(
vec![7, 8, 9],
column.values.as_ref().unwrap().ts_microsecond_values
);
let vector = Arc::new(TimestampSecondVector::from_vec(vec![10, 11, 12]));
column.push_vals(3, vector);
assert_eq!(
vec![10, 11, 12],
column.values.as_ref().unwrap().ts_second_values
);
}

View File

@@ -1,6 +1,21 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
pub mod error;
pub mod helper;
pub mod prometheus;
pub mod result;
pub mod serde;
pub mod v1;

View File

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#![allow(clippy::derive_partial_eq_without_eq)]
pub mod remote {

View File

@@ -1,23 +1,39 @@
use api::v1::{
admin_result, codec::SelectResult, object_result, AdminResult, MutateResult, ObjectResult,
ResultHeader, SelectResult as SelectResultRaw,
};
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_error::prelude::ErrorExt;
use crate::v1::codec::SelectResult;
use crate::v1::{
admin_result, object_result, AdminResult, MutateResult, ObjectResult, ResultHeader,
SelectResult as SelectResultRaw,
};
pub const PROTOCOL_VERSION: u32 = 1;
pub type Success = u32;
pub type Failure = u32;
#[derive(Default)]
pub(crate) struct ObjectResultBuilder {
pub struct ObjectResultBuilder {
version: u32,
code: u32,
err_msg: Option<String>,
result: Option<Body>,
}
pub(crate) enum Body {
pub enum Body {
Mutate((Success, Failure)),
Select(SelectResult),
}
@@ -80,7 +96,7 @@ impl ObjectResultBuilder {
}
}
pub(crate) fn build_err_result(err: &impl ErrorExt) -> ObjectResult {
pub fn build_err_result(err: &impl ErrorExt) -> ObjectResult {
ObjectResultBuilder::new()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
@@ -88,7 +104,7 @@ pub(crate) fn build_err_result(err: &impl ErrorExt) -> ObjectResult {
}
#[derive(Debug)]
pub(crate) struct AdminResultBuilder {
pub struct AdminResultBuilder {
version: u32,
code: u32,
err_msg: Option<String>,
@@ -144,11 +160,11 @@ impl Default for AdminResultBuilder {
#[cfg(test)]
mod tests {
use api::v1::{object_result, MutateResult};
use common_error::status_code::StatusCode;
use super::*;
use crate::error::UnsupportedExprSnafu;
use crate::error::UnknownColumnDataTypeSnafu;
use crate::v1::{object_result, MutateResult};
#[test]
fn test_object_result_builder() {
@@ -175,14 +191,13 @@ mod tests {
#[test]
fn test_build_err_result() {
let err = UnsupportedExprSnafu { name: "select" }.build();
let err = UnknownColumnDataTypeSnafu { datatype: 1 }.build();
let err_result = build_err_result(&err);
let header = err_result.header.unwrap();
let result = err_result.result;
assert_eq!(PROTOCOL_VERSION, header.version);
assert_eq!(StatusCode::Internal as u32, header.code);
assert_eq!("Unsupported expr type: select", header.err_msg);
assert_eq!(StatusCode::InvalidArguments as u32, header.code);
assert!(result.is_none());
}
}

View File

@@ -1,9 +1,20 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
pub use prost::DecodeError;
use prost::Message;
use crate::v1::codec::InsertBatch;
use crate::v1::codec::PhysicalPlanNode;
use crate::v1::codec::RegionNumber;
use crate::v1::codec::SelectResult;
use crate::v1::meta::TableRouteValue;
@@ -25,10 +36,7 @@ macro_rules! impl_convert_with_bytes {
};
}
impl_convert_with_bytes!(InsertBatch);
impl_convert_with_bytes!(SelectResult);
impl_convert_with_bytes!(PhysicalPlanNode);
impl_convert_with_bytes!(RegionNumber);
impl_convert_with_bytes!(TableRouteValue);
#[cfg(test)]
@@ -36,57 +44,10 @@ mod tests {
use std::ops::Deref;
use crate::v1::codec::*;
use crate::v1::column;
use crate::v1::Column;
use crate::v1::{column, Column};
const SEMANTIC_TAG: i32 = 0;
#[test]
fn test_convert_insert_batch() {
let insert_batch = mock_insert_batch();
let bytes: Vec<u8> = insert_batch.into();
let insert: InsertBatch = bytes.deref().try_into().unwrap();
assert_eq!(8, insert.row_count);
assert_eq!(1, insert.columns.len());
let column = &insert.columns[0];
assert_eq!("foo", column.column_name);
assert_eq!(SEMANTIC_TAG, column.semantic_type);
assert_eq!(vec![1], column.null_mask);
assert_eq!(
vec![2, 3, 4, 5, 6, 7, 8],
column.values.as_ref().unwrap().i32_values
);
}
#[should_panic]
#[test]
fn test_convert_insert_batch_wrong() {
let insert_batch = mock_insert_batch();
let mut bytes: Vec<u8> = insert_batch.into();
// modify some bytes
bytes[0] = 0b1;
bytes[1] = 0b1;
let insert: InsertBatch = bytes.deref().try_into().unwrap();
assert_eq!(8, insert.row_count);
assert_eq!(1, insert.columns.len());
let column = &insert.columns[0];
assert_eq!("foo", column.column_name);
assert_eq!(SEMANTIC_TAG, column.semantic_type);
assert_eq!(vec![1], column.null_mask);
assert_eq!(
vec![2, 3, 4, 5, 6, 7, 8],
column.values.as_ref().unwrap().i32_values
);
}
#[test]
fn test_convert_select_result() {
let select_result = mock_select_result();
@@ -133,35 +94,6 @@ mod tests {
);
}
#[test]
fn test_convert_region_id() {
let region_id = RegionNumber { id: 12 };
let bytes: Vec<u8> = region_id.into();
let region_id: RegionNumber = bytes.deref().try_into().unwrap();
assert_eq!(12, region_id.id);
}
fn mock_insert_batch() -> InsertBatch {
let values = column::Values {
i32_values: vec![2, 3, 4, 5, 6, 7, 8],
..Default::default()
};
let null_mask = vec![1];
let column = Column {
column_name: "foo".to_string(),
semantic_type: SEMANTIC_TAG,
values: Some(values),
null_mask,
..Default::default()
};
InsertBatch {
columns: vec![column],
row_count: 8,
}
}
fn mock_select_result() -> SelectResult {
let values = column::Values {
i32_values: vec![2, 3, 4, 5, 6, 7, 8],

View File

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#![allow(clippy::derive_partial_eq_without_eq)]
tonic::include_proto!("greptime.v1");
@@ -7,4 +21,5 @@ pub mod codec {
tonic::include_proto!("greptime.v1.codec");
}
mod column_def;
pub mod meta;

View File

@@ -0,0 +1,39 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use datatypes::schema::{ColumnDefaultConstraint, ColumnSchema};
use snafu::ResultExt;
use crate::error::{self, Result};
use crate::helper::ColumnDataTypeWrapper;
use crate::v1::ColumnDef;
impl ColumnDef {
pub fn try_as_column_schema(&self) -> Result<ColumnSchema> {
let data_type = ColumnDataTypeWrapper::try_new(self.datatype)?;
let constraint = if self.default_constraint.is_empty() {
None
} else {
Some(
ColumnDefaultConstraint::try_from(self.default_constraint.as_slice())
.context(error::ConvertColumnDefaultConstraintSnafu { column: &self.name })?,
)
};
ColumnSchema::new(&self.name, data_type.into(), self.is_nullable)
.with_default_constraint(constraint)
.context(error::InvalidColumnDefaultConstraintSnafu { column: &self.name })
}
}
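
The conversion above treats an empty `default_constraint` byte buffer as "no default" and only attempts decoding when bytes are present. A minimal std-only sketch of that rule follows; `DefaultConstraint` and `decode_default_constraint` are hypothetical stand-ins, and the UTF-8 decoding is an assumption made for illustration, not the real `ColumnDefaultConstraint::try_from` logic.

```rust
/// Hypothetical stand-in for `ColumnDefaultConstraint` (the real type lives in `datatypes::schema`).
#[derive(Debug, PartialEq)]
struct DefaultConstraint(String);

/// Decodes an optional default constraint from its serialized bytes.
/// An empty slice means "no constraint", mirroring the `is_empty()` check in the diff.
/// Assumption: the payload is UTF-8 text; the real encoding may differ.
fn decode_default_constraint(bytes: &[u8]) -> Result<Option<DefaultConstraint>, String> {
    if bytes.is_empty() {
        return Ok(None);
    }
    let text = std::str::from_utf8(bytes)
        .map_err(|e| format!("invalid constraint bytes: {e}"))?;
    Ok(Some(DefaultConstraint(text.to_string())))
}
```

Returning `Result<Option<_>, _>` keeps "absent" and "malformed" distinguishable, which is what lets the caller attach the column name to the error only in the failure case.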


@@ -1,8 +1,21 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
tonic::include_proto!("greptime.v1.meta");
use std::collections::HashMap;
use std::hash::Hash;
use std::hash::Hasher;
use std::hash::{Hash, Hasher};
pub const PROTOCOL_VERSION: u64 = 1;
@@ -71,11 +84,22 @@ impl ResponseHeader {
error: Some(error),
}
}
#[inline]
pub fn is_not_leader(&self) -> bool {
if let Some(error) = &self.error {
if error.code == ErrorCode::NotLeader as i32 {
return true;
}
}
false
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
NoActiveDatanodes = 1,
NotLeader = 2,
}
impl Error {
@@ -86,6 +110,24 @@ impl Error {
err_msg: "No active datanodes".to_string(),
}
}
#[inline]
pub fn is_not_leader() -> Self {
Self {
code: ErrorCode::NotLeader as i32,
err_msg: "Current server is not leader".to_string(),
}
}
}
impl HeartbeatResponse {
#[inline]
pub fn is_not_leader(&self) -> bool {
if let Some(header) = &self.header {
return header.is_not_leader();
}
false
}
}
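
The `is_not_leader` helpers added above reduce to one comparison on the optional error code. A compact sketch with locally defined stand-ins for the generated protobuf types (the real `ResponseHeader` and `Error` come from `tonic::include_proto!` and carry more fields) could look like this:

```rust
/// Error codes as listed in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorCode {
    NoActiveDatanodes = 1,
    NotLeader = 2,
}

/// Simplified stand-in for the generated `Error` message.
struct Error {
    code: i32,
    err_msg: String,
}

/// Simplified stand-in for the generated `ResponseHeader` message.
struct ResponseHeader {
    error: Option<Error>,
}

impl ResponseHeader {
    /// True only when an error is present and its code is `NotLeader`.
    fn is_not_leader(&self) -> bool {
        matches!(&self.error, Some(e) if e.code == ErrorCode::NotLeader as i32)
    }
}
```

The `matches!` form behaves the same as the nested `if let` in the diff; which one reads better is a matter of taste.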
macro_rules! gen_set_header {
@@ -103,10 +145,12 @@ gen_set_header!(HeartbeatRequest);
gen_set_header!(RouteRequest);
gen_set_header!(CreateRequest);
gen_set_header!(RangeRequest);
gen_set_header!(DeleteRequest);
gen_set_header!(PutRequest);
gen_set_header!(BatchPutRequest);
gen_set_header!(CompareAndPutRequest);
gen_set_header!(DeleteRangeRequest);
gen_set_header!(MoveValueRequest);
#[cfg(test)]
mod tests {


@@ -1,8 +1,8 @@
[package]
name = "catalog"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
version.workspace = true
edition.workspace = true
license.workspace = true
[dependencies]
api = { path = "../api" }
@@ -18,15 +18,12 @@ common-recordbatch = { path = "../common/recordbatch" }
common-runtime = { path = "../common/runtime" }
common-telemetry = { path = "../common/telemetry" }
common-time = { path = "../common/time" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = [
"simd",
] }
datafusion.workspace = true
datatypes = { path = "../datatypes" }
futures = "0.3"
futures-util = "0.3"
lazy_static = "1.4"
meta-client = { path = "../meta-client" }
opendal = "0.17"
regex = "1.6"
serde = "1.0"
serde_json = "1.0"
@@ -38,9 +35,8 @@ tokio = { version = "1.18", features = ["full"] }
[dev-dependencies]
chrono = "0.4"
log-store = { path = "../log-store" }
mito = { path = "../mito", features = ["test"] }
object-store = { path = "../object-store" }
opendal = "0.17"
storage = { path = "../storage" }
table-engine = { path = "../table-engine" }
tempdir = "0.3"
tokio = { version = "1.0", features = ["full"] }


@@ -1,9 +1,23 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use common_error::ext::{BoxedError, ErrorExt};
use common_error::prelude::{Snafu, StatusCode};
use datafusion::error::DataFusionError;
use datatypes::arrow;
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::RawSchema;
use snafu::{Backtrace, ErrorCompat};
@@ -37,14 +51,12 @@ pub enum Error {
SystemCatalog { msg: String, backtrace: Backtrace },
#[snafu(display(
"System catalog table type mismatch, expected: binary, found: {:?} source: {}",
"System catalog table type mismatch, expected: binary, found: {:?}",
data_type,
source
))]
SystemCatalogTypeMismatch {
data_type: arrow::datatypes::DataType,
#[snafu(backtrace)]
source: datatypes::error::Error,
data_type: ConcreteDataType,
backtrace: Backtrace,
},
#[snafu(display("Invalid system catalog entry type: {:?}", entry_type))]
@@ -80,15 +92,27 @@ pub enum Error {
backtrace: Backtrace,
},
#[snafu(display("Table {} already exists", table))]
#[snafu(display("Table `{}` already exists", table))]
TableExists { table: String, backtrace: Backtrace },
#[snafu(display("Schema {} already exists", schema))]
SchemaExists {
schema: String,
backtrace: Backtrace,
},
#[snafu(display("Failed to register table"))]
RegisterTable {
#[snafu(backtrace)]
source: BoxedError,
},
#[snafu(display("Operation {} not implemented yet", operation))]
Unimplemented {
operation: String,
backtrace: Backtrace,
},
#[snafu(display("Failed to open table, table info: {}, source: {}", table_info, source))]
OpenTable {
table_info: String,
@@ -112,7 +136,7 @@ pub enum Error {
"Failed to insert table creation record to system catalog, source: {}",
source
))]
InsertTableRecord {
InsertCatalogRecord {
#[snafu(backtrace)]
source: table::error::Error,
},
@@ -165,21 +189,8 @@ pub enum Error {
source: meta_client::error::Error,
},
#[snafu(display("Failed to bump table id"))]
BumpTableId { msg: String, backtrace: Backtrace },
#[snafu(display("Failed to parse table id from metasrv, data: {:?}", data))]
ParseTableId { data: String, backtrace: Backtrace },
#[snafu(display("Failed to deserialize partition rule from string: {:?}", data))]
DeserializePartitionRule {
data: String,
source: serde_json::error::Error,
backtrace: Backtrace,
},
#[snafu(display("Invalid table schema in catalog, source: {:?}", source))]
InvalidSchemaInCatalog {
#[snafu(display("Invalid table info in catalog, source: {}", source))]
InvalidTableInfoInCatalog {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
@@ -209,28 +220,29 @@ impl ErrorExt for Error {
| Error::ValueDeserialize { .. }
| Error::Io { .. } => StatusCode::StorageUnavailable,
Error::RegisterTable { .. } | Error::SystemCatalogTypeMismatch { .. } => {
StatusCode::Internal
}
Error::ReadSystemCatalog { source, .. } => source.status_code(),
Error::SystemCatalogTypeMismatch { source, .. } => source.status_code(),
Error::InvalidCatalogValue { source, .. } => source.status_code(),
Error::RegisterTable { .. } => StatusCode::Internal,
Error::TableExists { .. } => StatusCode::TableAlreadyExists,
Error::SchemaExists { .. } => StatusCode::InvalidArguments,
Error::OpenSystemCatalog { source, .. }
| Error::CreateSystemCatalog { source, .. }
| Error::InsertTableRecord { source, .. }
| Error::InsertCatalogRecord { source, .. }
| Error::OpenTable { source, .. }
| Error::CreateTable { source, .. } => source.status_code(),
Error::MetaSrv { source, .. } => source.status_code(),
Error::SystemCatalogTableScan { source } => source.status_code(),
Error::SystemCatalogTableScanExec { source } => source.status_code(),
Error::InvalidTableSchema { source, .. } => source.status_code(),
Error::BumpTableId { .. } | Error::ParseTableId { .. } => {
StatusCode::StorageUnavailable
}
Error::DeserializePartitionRule { .. } => StatusCode::Unexpected,
Error::InvalidSchemaInCatalog { .. } => StatusCode::Unexpected,
Error::InvalidTableInfoInCatalog { .. } => StatusCode::Unexpected,
Error::Internal { source, .. } => source.status_code(),
Error::Unimplemented { .. } => StatusCode::Unsupported,
}
}
@@ -252,7 +264,6 @@ impl From<Error> for DataFusionError {
#[cfg(test)]
mod tests {
use common_error::mock::MockError;
use datatypes::arrow::datatypes::DataType;
use snafu::GenerateImplicitData;
use super::*;
@@ -301,11 +312,8 @@ mod tests {
assert_eq!(
StatusCode::Internal,
Error::SystemCatalogTypeMismatch {
data_type: DataType::Boolean,
source: datatypes::error::Error::UnsupportedArrowType {
arrow_type: DataType::Boolean,
backtrace: Backtrace::generate()
}
data_type: ConcreteDataType::binary_datatype(),
backtrace: Backtrace::generate(),
}
.status_code()
);


@@ -1,54 +1,70 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashMap;
use std::fmt::{Display, Formatter};
use common_catalog::error::{
DeserializeCatalogEntryValueSnafu, Error, InvalidCatalogSnafu, SerializeCatalogEntryValueSnafu,
};
use lazy_static::lazy_static;
use regex::Regex;
use serde::{Deserialize, Serialize, Serializer};
use snafu::{ensure, OptionExt, ResultExt};
use table::metadata::{RawTableMeta, TableId, TableVersion};
use table::metadata::{RawTableInfo, TableId, TableVersion};
use crate::consts::{
CATALOG_KEY_PREFIX, SCHEMA_KEY_PREFIX, TABLE_GLOBAL_KEY_PREFIX, TABLE_REGIONAL_KEY_PREFIX,
};
use crate::error::{
DeserializeCatalogEntryValueSnafu, Error, InvalidCatalogSnafu, SerializeCatalogEntryValueSnafu,
};
const CATALOG_KEY_PREFIX: &str = "__c";
const SCHEMA_KEY_PREFIX: &str = "__s";
const TABLE_GLOBAL_KEY_PREFIX: &str = "__tg";
const TABLE_REGIONAL_KEY_PREFIX: &str = "__tr";
const ALPHANUMERICS_NAME_PATTERN: &str = "[a-zA-Z_][a-zA-Z0-9_]*";
lazy_static! {
static ref CATALOG_KEY_PATTERN: Regex =
Regex::new(&format!("^{}-([a-zA-Z_]+)$", CATALOG_KEY_PREFIX)).unwrap();
static ref CATALOG_KEY_PATTERN: Regex = Regex::new(&format!(
"^{CATALOG_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})$"
))
.unwrap();
}
lazy_static! {
static ref SCHEMA_KEY_PATTERN: Regex = Regex::new(&format!(
"^{}-([a-zA-Z_]+)-([a-zA-Z_]+)$",
SCHEMA_KEY_PREFIX
"^{SCHEMA_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})$"
))
.unwrap();
}
lazy_static! {
static ref TABLE_GLOBAL_KEY_PATTERN: Regex = Regex::new(&format!(
"^{}-([a-zA-Z_]+)-([a-zA-Z_]+)-([a-zA-Z_]+)$",
TABLE_GLOBAL_KEY_PREFIX
"^{TABLE_GLOBAL_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})$"
))
.unwrap();
}
lazy_static! {
static ref TABLE_REGIONAL_KEY_PATTERN: Regex = Regex::new(&format!(
"^{}-([a-zA-Z_]+)-([a-zA-Z_]+)-([a-zA-Z_]+)-([0-9]+)$",
TABLE_REGIONAL_KEY_PREFIX
"^{TABLE_REGIONAL_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})-([0-9]+)$"
))
.unwrap();
}
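
The key patterns above switch from `[a-zA-Z_]+` to `ALPHANUMERICS_NAME_PATTERN`, so names may now contain digits after the first character. The sketch below checks the same rule for catalog keys using only the standard library; it is a hand-written stand-in for the regex, and `is_valid_name` and `parse_catalog_key` are hypothetical helper names that do not appear in the diff.

```rust
const CATALOG_KEY_PREFIX: &str = "__c";

/// Returns true if `name` matches `[a-zA-Z_][a-zA-Z0-9_]*`.
fn is_valid_name(name: &str) -> bool {
    let mut chars = name.chars();
    // The first character must be an ASCII letter or underscore.
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    // Remaining characters may also be ASCII digits.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Parses a key of the form `__c-<catalog>` and returns the catalog name.
fn parse_catalog_key(key: &str) -> Option<&str> {
    let name = key.strip_prefix(CATALOG_KEY_PREFIX)?.strip_prefix('-')?;
    is_valid_name(name).then_some(name)
}
```

Under the old pattern a key such as `__c-catalog_1` would have been rejected because of the digit; under the new rule it parses, while a name with a leading digit is still refused.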
pub fn build_catalog_prefix() -> String {
format!("{}-", CATALOG_KEY_PREFIX)
format!("{CATALOG_KEY_PREFIX}-")
}
pub fn build_schema_prefix(catalog_name: impl AsRef<str>) -> String {
format!("{}-{}-", SCHEMA_KEY_PREFIX, catalog_name.as_ref())
format!("{SCHEMA_KEY_PREFIX}-{}-", catalog_name.as_ref())
}
pub fn build_table_global_prefix(
@@ -56,8 +72,7 @@ pub fn build_table_global_prefix(
schema_name: impl AsRef<str>,
) -> String {
format!(
"{}-{}-{}-",
TABLE_GLOBAL_KEY_PREFIX,
"{TABLE_GLOBAL_KEY_PREFIX}-{}-{}-",
catalog_name.as_ref(),
schema_name.as_ref()
)
@@ -112,18 +127,20 @@ impl TableGlobalKey {
/// Table global info contains necessary info for a datanode to create table regions, including
/// table id, table meta(schema...), region id allocation across datanodes.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct TableGlobalValue {
/// Table id is the same across all datanodes.
pub id: TableId,
/// Id of datanode that created the global table info kv. only for debugging.
pub node_id: u64,
// TODO(LFC): Maybe remove it?
/// Allocation of region ids across all datanodes.
pub regions_id_map: HashMap<u64, Vec<u32>>,
/// Node id -> region ids
pub meta: RawTableMeta,
/// Partition rules for table
pub partition_rules: String,
pub table_info: RawTableInfo,
}
impl TableGlobalValue {
pub fn table_id(&self) -> TableId {
self.table_info.ident.table_id
}
}
/// Table regional info that varies between datanode, so it contains a `node_id` field.
@@ -245,6 +262,10 @@ macro_rules! define_catalog_value {
.context(DeserializeCatalogEntryValueSnafu { raw: s.as_ref() })
}
pub fn from_bytes(bytes: impl AsRef<[u8]>) -> Result<Self, Error> {
Self::parse(&String::from_utf8_lossy(bytes.as_ref()))
}
pub fn as_bytes(&self) -> Result<Vec<u8>, Error> {
Ok(serde_json::to_string(self)
.context(SerializeCatalogEntryValueSnafu)?
@@ -266,6 +287,7 @@ define_catalog_value!(
mod tests {
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, RawSchema, Schema};
use table::metadata::{RawTableMeta, TableIdent, TableType};
use super::*;
@@ -326,15 +348,26 @@ mod tests {
region_numbers: vec![1],
};
let table_info = RawTableInfo {
ident: TableIdent {
table_id: 42,
version: 1,
},
name: "table_1".to_string(),
desc: Some("blah".to_string()),
catalog_name: "catalog_1".to_string(),
schema_name: "schema_1".to_string(),
meta,
table_type: TableType::Base,
};
let value = TableGlobalValue {
id: 42,
node_id: 0,
regions_id_map: HashMap::from([(0, vec![1, 2, 3])]),
meta,
partition_rules: "{}".to_string(),
table_info,
};
let serialized = serde_json::to_string(&value).unwrap();
let deserialized = TableGlobalValue::parse(&serialized).unwrap();
let deserialized = TableGlobalValue::parse(serialized).unwrap();
assert_eq!(value, deserialized);
}
}


@@ -1,6 +1,21 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#![feature(assert_matches)]
use std::any::Any;
use std::fmt::{Debug, Formatter};
use std::sync::Arc;
use common_telemetry::info;
@@ -14,6 +29,7 @@ use crate::error::{CreateTableSnafu, Result};
pub use crate::schema::{SchemaProvider, SchemaProviderRef};
pub mod error;
pub mod helper;
pub mod local;
pub mod remote;
pub mod schema;
@@ -69,17 +85,24 @@ pub trait CatalogManager: CatalogList {
/// Starts a catalog manager.
async fn start(&self) -> Result<()>;
/// Returns next available table id.
async fn next_table_id(&self) -> Result<TableId>;
/// Registers a table within given catalog/schema to catalog manager,
/// returns whether the table registered.
async fn register_table(&self, request: RegisterTableRequest) -> Result<bool>;
/// Registers a table given catalog/schema to catalog manager,
/// returns table registered.
async fn register_table(&self, request: RegisterTableRequest) -> Result<usize>;
/// Deregisters a table within given catalog/schema to catalog manager,
/// returns whether the table deregistered.
async fn deregister_table(&self, request: DeregisterTableRequest) -> Result<bool>;
/// Register a schema with catalog name and schema name. Returns whether the
/// schema registered.
async fn register_schema(&self, request: RegisterSchemaRequest) -> Result<bool>;
/// Register a system table, should be called before starting the manager.
async fn register_system_table(&self, request: RegisterSystemTableRequest)
-> error::Result<()>;
fn schema(&self, catalog: &str, schema: &str) -> Result<Option<SchemaProviderRef>>;
/// Returns the table by catalog, schema and table name.
fn table(&self, catalog: &str, schema: &str, table_name: &str) -> Result<Option<TableRef>>;
}
@@ -107,9 +130,34 @@ pub struct RegisterTableRequest {
pub table: TableRef,
}
impl Debug for RegisterTableRequest {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
f.debug_struct("RegisterTableRequest")
.field("catalog", &self.catalog)
.field("schema", &self.schema)
.field("table_name", &self.table_name)
.field("table_id", &self.table_id)
.field("table", &self.table.table_info())
.finish()
}
}
#[derive(Clone)]
pub struct DeregisterTableRequest {
pub catalog: String,
pub schema: String,
pub table_name: String,
}
#[derive(Debug, Clone)]
pub struct RegisterSchemaRequest {
pub catalog: String,
pub schema: String,
}
/// Formats table fully-qualified name
pub fn format_full_table_name(catalog: &str, schema: &str, table: &str) -> String {
format!("{}.{}.{}", catalog, schema, table)
format!("{catalog}.{schema}.{table}")
}
pub trait CatalogProviderFactory {
@@ -139,8 +187,7 @@ pub(crate) async fn handle_system_table_request<'a, M: CatalogManager>(
.await
.with_context(|_| CreateTableSnafu {
table_info: format!(
"{}.{}.{}, id: {}",
catalog_name, schema_name, table_name, table_id,
"{catalog_name}.{schema_name}.{table_name}, id: {table_id}",
),
})?;
manager
@@ -152,7 +199,7 @@ pub(crate) async fn handle_system_table_request<'a, M: CatalogManager>(
table: table.clone(),
})
.await?;
info!("Created and registered system table: {}", table_name);
info!("Created and registered system table: {table_name}");
table
};
if let Some(hook) = req.open_hook {


@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
pub mod manager;
pub mod memory;


@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
@@ -7,7 +21,7 @@ use common_catalog::consts::{
SYSTEM_CATALOG_NAME, SYSTEM_CATALOG_TABLE_NAME,
};
use common_recordbatch::{RecordBatch, SendableRecordBatchStream};
use common_telemetry::info;
use common_telemetry::{error, info};
use datatypes::prelude::ScalarVector;
use datatypes::vectors::{BinaryVector, UInt8Vector};
use futures_util::lock::Mutex;
@@ -16,13 +30,14 @@ use table::engine::{EngineContext, TableEngineRef};
use table::metadata::TableId;
use table::requests::OpenTableRequest;
use table::table::numbers::NumbersTable;
use table::table::TableIdProvider;
use table::TableRef;
use crate::error::{
CatalogNotFoundSnafu, IllegalManagerStateSnafu, OpenTableSnafu, SchemaNotFoundSnafu,
SystemCatalogSnafu, SystemCatalogTypeMismatchSnafu, TableExistsSnafu, TableNotFoundSnafu,
CatalogNotFoundSnafu, IllegalManagerStateSnafu, OpenTableSnafu, ReadSystemCatalogSnafu, Result,
SchemaExistsSnafu, SchemaNotFoundSnafu, SystemCatalogSnafu, SystemCatalogTypeMismatchSnafu,
TableExistsSnafu, TableNotFoundSnafu, UnimplementedSnafu,
};
use crate::error::{ReadSystemCatalogSnafu, Result};
use crate::local::memory::{MemoryCatalogManager, MemoryCatalogProvider, MemorySchemaProvider};
use crate::system::{
decode_system_catalog, Entry, SystemCatalogTable, TableEntry, ENTRY_TYPE_INDEX, KEY_INDEX,
@@ -31,8 +46,8 @@ use crate::system::{
use crate::tables::SystemCatalog;
use crate::{
format_full_table_name, handle_system_table_request, CatalogList, CatalogManager,
CatalogProvider, CatalogProviderRef, RegisterSystemTableRequest, RegisterTableRequest,
SchemaProvider,
CatalogProvider, CatalogProviderRef, DeregisterTableRequest, RegisterSchemaRequest,
RegisterSystemTableRequest, RegisterTableRequest, SchemaProvider, SchemaProviderRef,
};
/// A `CatalogManager` consists of a system catalog and a bunch of user catalogs.
@@ -42,6 +57,7 @@ pub struct LocalCatalogManager {
engine: TableEngineRef,
next_table_id: AtomicU32,
init_lock: Mutex<bool>,
register_lock: Mutex<()>,
system_table_requests: Mutex<Vec<RegisterSystemTableRequest>>,
}
@@ -61,6 +77,7 @@ impl LocalCatalogManager {
engine,
next_table_id: AtomicU32::new(MIN_USER_TABLE_ID),
init_lock: Mutex::new(false),
register_lock: Mutex::new(()),
system_table_requests: Mutex::new(Vec::default()),
})
}
@@ -128,27 +145,34 @@ impl LocalCatalogManager {
/// Convert `RecordBatch` to a vector of `Entry`.
fn record_batch_to_entry(rb: RecordBatch) -> Result<Vec<Entry>> {
ensure!(
rb.df_recordbatch.columns().len() >= 6,
rb.num_columns() >= 6,
SystemCatalogSnafu {
msg: format!("Length mismatch: {}", rb.df_recordbatch.columns().len())
msg: format!("Length mismatch: {}", rb.num_columns())
}
);
let entry_type = UInt8Vector::try_from_arrow_array(&rb.df_recordbatch.columns()[0])
.with_context(|_| SystemCatalogTypeMismatchSnafu {
data_type: rb.df_recordbatch.columns()[ENTRY_TYPE_INDEX]
.data_type()
.clone(),
let entry_type = rb
.column(ENTRY_TYPE_INDEX)
.as_any()
.downcast_ref::<UInt8Vector>()
.with_context(|| SystemCatalogTypeMismatchSnafu {
data_type: rb.column(ENTRY_TYPE_INDEX).data_type(),
})?;
let key = BinaryVector::try_from_arrow_array(&rb.df_recordbatch.columns()[1])
.with_context(|_| SystemCatalogTypeMismatchSnafu {
data_type: rb.df_recordbatch.columns()[KEY_INDEX].data_type().clone(),
let key = rb
.column(KEY_INDEX)
.as_any()
.downcast_ref::<BinaryVector>()
.with_context(|| SystemCatalogTypeMismatchSnafu {
data_type: rb.column(KEY_INDEX).data_type(),
})?;
let value = BinaryVector::try_from_arrow_array(&rb.df_recordbatch.columns()[3])
.with_context(|_| SystemCatalogTypeMismatchSnafu {
data_type: rb.df_recordbatch.columns()[VALUE_INDEX].data_type().clone(),
let value = rb
.column(VALUE_INDEX)
.as_any()
.downcast_ref::<BinaryVector>()
.with_context(|| SystemCatalogTypeMismatchSnafu {
data_type: rb.column(VALUE_INDEX).data_type(),
})?;
let mut res = Vec::with_capacity(rb.num_rows());
@@ -226,6 +250,7 @@ impl LocalCatalogManager {
schema_name: t.schema_name.clone(),
table_name: t.table_name.clone(),
table_id: t.table_id,
region_numbers: vec![0],
};
let option = self
@@ -278,6 +303,13 @@ impl CatalogList for LocalCatalogManager {
}
}
#[async_trait::async_trait]
impl TableIdProvider for LocalCatalogManager {
async fn next_table_id(&self) -> table::Result<TableId> {
Ok(self.next_table_id.fetch_add(1, Ordering::Relaxed))
}
}
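
The `TableIdProvider` implementation above hands out ids with a single atomic `fetch_add`, which returns the previous value and increments the counter. A synchronous sketch of that allocation scheme follows; `TableIdAllocator` is a hypothetical name, and the value given to `MIN_USER_TABLE_ID` here is an assumed placeholder rather than the constant from `common_catalog::consts`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Assumed placeholder; the real constant is defined in `common_catalog::consts`.
const MIN_USER_TABLE_ID: u32 = 1024;

/// Hands out monotonically increasing table ids.
struct TableIdAllocator {
    next: AtomicU32,
}

impl TableIdAllocator {
    fn new() -> Self {
        Self {
            next: AtomicU32::new(MIN_USER_TABLE_ID),
        }
    }

    /// Returns the current id and advances the counter by one.
    /// `Relaxed` is enough for uniqueness because `fetch_add` is a single atomic
    /// read-modify-write; it does not order other memory accesses.
    fn next_table_id(&self) -> u32 {
        self.next.fetch_add(1, Ordering::Relaxed)
    }
}
```

Because the method takes `&self`, the allocator can be shared across tasks behind an `Arc` without an additional lock.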
#[async_trait::async_trait]
impl CatalogManager for LocalCatalogManager {
/// Start [LocalCatalogManager] to load all information from system catalog table.
@@ -286,12 +318,7 @@ impl CatalogManager for LocalCatalogManager {
self.init().await
}
#[inline]
async fn next_table_id(&self) -> Result<TableId> {
Ok(self.next_table_id.fetch_add(1, Ordering::Relaxed))
}
async fn register_table(&self, request: RegisterTableRequest) -> Result<usize> {
async fn register_table(&self, request: RegisterTableRequest) -> Result<bool> {
let started = self.init_lock.lock().await;
ensure!(
@@ -311,27 +338,82 @@ impl CatalogManager for LocalCatalogManager {
let schema = catalog
.schema(schema_name)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{}.{}", catalog_name, schema_name),
schema_info: format!("{catalog_name}.{schema_name}"),
})?;
if schema.table_exist(&request.table_name)? {
return TableExistsSnafu {
table: format_full_table_name(catalog_name, schema_name, &request.table_name),
{
let _lock = self.register_lock.lock().await;
if let Some(existing) = schema.table(&request.table_name)? {
if existing.table_info().ident.table_id != request.table_id {
error!(
"Unexpected table register request: {:?}, existing: {:?}",
request,
existing.table_info()
);
return TableExistsSnafu {
table: format_full_table_name(
catalog_name,
schema_name,
&request.table_name,
),
}
.fail();
}
// Try to register table with same table id, just ignore.
Ok(false)
} else {
// table does not exist
self.system
.register_table(
catalog_name.clone(),
schema_name.clone(),
request.table_name.clone(),
request.table_id,
)
.await?;
schema.register_table(request.table_name, request.table)?;
Ok(true)
}
.fail();
}
}
self.system
.register_table(
catalog_name.clone(),
schema_name.clone(),
request.table_name.clone(),
request.table_id,
)
.await?;
async fn deregister_table(&self, _request: DeregisterTableRequest) -> Result<bool> {
UnimplementedSnafu {
operation: "deregister table",
}
.fail()
}
schema.register_table(request.table_name, request.table)?;
Ok(1)
async fn register_schema(&self, request: RegisterSchemaRequest) -> Result<bool> {
let started = self.init_lock.lock().await;
ensure!(
*started,
IllegalManagerStateSnafu {
msg: "Catalog manager not started",
}
);
let catalog_name = &request.catalog;
let schema_name = &request.schema;
let catalog = self
.catalogs
.catalog(catalog_name)?
.context(CatalogNotFoundSnafu { catalog_name })?;
{
let _lock = self.register_lock.lock().await;
ensure!(
catalog.schema(schema_name)?.is_none(),
SchemaExistsSnafu {
schema: schema_name,
}
);
self.system
.register_schema(request.catalog, schema_name.clone())
.await?;
catalog.register_schema(request.schema, Arc::new(MemorySchemaProvider::new()))?;
Ok(true)
}
}
async fn register_system_table(&self, request: RegisterSystemTableRequest) -> Result<()> {
@@ -348,6 +430,15 @@ impl CatalogManager for LocalCatalogManager {
Ok(())
}
fn schema(&self, catalog: &str, schema: &str) -> Result<Option<SchemaProviderRef>> {
self.catalogs
.catalog(catalog)?
.context(CatalogNotFoundSnafu {
catalog_name: catalog,
})?
.schema(schema)
}
fn table(
&self,
catalog_name: &str,
@@ -361,7 +452,7 @@ impl CatalogManager for LocalCatalogManager {
let schema = catalog
.schema(schema_name)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{}.{}", catalog_name, schema_name),
schema_info: format!("{catalog_name}.{schema_name}"),
})?;
schema.table(table_name)
}


@@ -1,20 +1,35 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::sync::RwLock;
use std::sync::{Arc, RwLock};
use common_catalog::consts::MIN_USER_TABLE_ID;
use common_telemetry::error;
use snafu::OptionExt;
use table::metadata::TableId;
use table::table::TableIdProvider;
use table::TableRef;
use crate::error::{CatalogNotFoundSnafu, Result, SchemaNotFoundSnafu, TableExistsSnafu};
use crate::schema::SchemaProvider;
use crate::{
CatalogList, CatalogManager, CatalogProvider, CatalogProviderRef, RegisterSystemTableRequest,
RegisterTableRequest, SchemaProviderRef,
CatalogList, CatalogManager, CatalogProvider, CatalogProviderRef, DeregisterTableRequest,
RegisterSchemaRequest, RegisterSystemTableRequest, RegisterTableRequest, SchemaProviderRef,
};
/// Simple in-memory list of catalogs
@@ -41,6 +56,13 @@ impl Default for MemoryCatalogManager {
}
}
#[async_trait::async_trait]
impl TableIdProvider for MemoryCatalogManager {
async fn next_table_id(&self) -> table::error::Result<TableId> {
Ok(self.table_id.fetch_add(1, Ordering::Relaxed))
}
}
#[async_trait::async_trait]
impl CatalogManager for MemoryCatalogManager {
async fn start(&self) -> Result<()> {
@@ -48,11 +70,7 @@ impl CatalogManager for MemoryCatalogManager {
Ok(())
}
async fn next_table_id(&self) -> Result<TableId> {
Ok(self.table_id.fetch_add(1, Ordering::Relaxed))
}
async fn register_table(&self, request: RegisterTableRequest) -> Result<usize> {
async fn register_table(&self, request: RegisterTableRequest) -> Result<bool> {
let catalogs = self.catalogs.write().unwrap();
let catalog = catalogs
.get(&request.catalog)
@@ -67,11 +85,50 @@ impl CatalogManager for MemoryCatalogManager {
})?;
schema
.register_table(request.table_name, request.table)
.map(|v| if v.is_some() { 0 } else { 1 })
.map(|v| v.is_none())
}
async fn deregister_table(&self, request: DeregisterTableRequest) -> Result<bool> {
let catalogs = self.catalogs.write().unwrap();
let catalog = catalogs
.get(&request.catalog)
.context(CatalogNotFoundSnafu {
catalog_name: &request.catalog,
})?
.clone();
let schema = catalog
.schema(&request.schema)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{}.{}", &request.catalog, &request.schema),
})?;
schema
.deregister_table(&request.table_name)
.map(|v| v.is_some())
}
async fn register_schema(&self, request: RegisterSchemaRequest) -> Result<bool> {
let catalogs = self.catalogs.write().unwrap();
let catalog = catalogs
.get(&request.catalog)
.context(CatalogNotFoundSnafu {
catalog_name: &request.catalog,
})?;
catalog.register_schema(request.schema, Arc::new(MemorySchemaProvider::new()))?;
Ok(true)
}
async fn register_system_table(&self, _request: RegisterSystemTableRequest) -> Result<()> {
unimplemented!()
// TODO(ruihang): support register system table request
Ok(())
}
fn schema(&self, catalog: &str, schema: &str) -> Result<Option<SchemaProviderRef>> {
let catalogs = self.catalogs.read().unwrap();
if let Some(c) = catalogs.get(catalog) {
c.schema(schema)
} else {
Ok(None)
}
}
fn table(&self, catalog: &str, schema: &str, table_name: &str) -> Result<Option<TableRef>> {
@@ -214,11 +271,21 @@ impl SchemaProvider for MemorySchemaProvider {
}
fn register_table(&self, name: String, table: TableRef) -> Result<Option<TableRef>> {
if self.table_exist(name.as_str())? {
return TableExistsSnafu { table: name }.fail()?;
}
let mut tables = self.tables.write().unwrap();
Ok(tables.insert(name, table))
if let Some(existing) = tables.get(name.as_str()) {
// if table with the same name but different table id exists, then it's a fatal bug
if existing.table_info().ident.table_id != table.table_info().ident.table_id {
error!(
"Unexpected table register: {:?}, existing: {:?}",
table.table_info(),
existing.table_info()
);
return TableExistsSnafu { table: name }.fail()?;
}
Ok(Some(existing.clone()))
} else {
Ok(tables.insert(name, table))
}
}
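
The reworked `register_table` above makes registration idempotent: re-registering the same name with the same table id succeeds and returns the existing entry, while a different id under the same name is an error. A simplified sketch of that rule, using a plain `HashMap` and hypothetical `Table` and `SchemaTables` types in place of `TableRef` and `MemorySchemaProvider` (and without the `RwLock`):

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a table handle; only the id matters here.
#[derive(Debug, Clone, PartialEq)]
struct Table {
    id: u32,
}

/// Hypothetical stand-in for the table map of one schema.
#[derive(Default)]
struct SchemaTables {
    tables: HashMap<String, Table>,
}

impl SchemaTables {
    /// Ok(None): the table was newly registered.
    /// Ok(Some(existing)): the same name and id were already registered; nothing changes.
    /// Err: the name is taken by a table with a different id.
    fn register_table(&mut self, name: &str, table: Table) -> Result<Option<Table>, String> {
        if let Some(existing) = self.tables.get(name) {
            if existing.id != table.id {
                return Err(format!("Table `{name}` already exists"));
            }
            return Ok(Some(existing.clone()));
        }
        Ok(self.tables.insert(name.to_string(), table))
    }
}
```

A caller that only needs to know whether anything was added can map the result with `.map(|v| v.is_none())`, which is how the manager-level `register_table` in the diff derives its `bool`.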
fn deregister_table(&self, name: &str) -> Result<Option<TableRef>> {
@@ -278,7 +345,7 @@ mod tests {
.unwrap()
.is_none());
assert!(provider.table_exist(table_name).unwrap());
let other_table = NumbersTable::default();
let other_table = NumbersTable::new(12);
let result = provider.register_table(table_name.to_string(), Arc::new(other_table));
let err = result.err().unwrap();
assert!(err.backtrace_opt().is_some());
@@ -303,4 +370,34 @@ mod tests {
.downcast_ref::<MemoryCatalogManager>()
.unwrap();
}
#[tokio::test]
pub async fn test_catalog_deregister_table() {
let catalog = MemoryCatalogManager::default();
let schema = catalog
.schema(DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME)
.unwrap()
.unwrap();
let register_table_req = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "numbers".to_string(),
table_id: 2333,
table: Arc::new(NumbersTable::default()),
};
catalog.register_table(register_table_req).await.unwrap();
assert!(schema.table_exist("numbers").unwrap());
let deregister_table_req = DeregisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "numbers".to_string(),
};
catalog
.deregister_table(deregister_table_req)
.await
.unwrap();
assert!(!schema.table_exist("numbers").unwrap());
}
}
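
Review note: the `register_table` hunk above makes registration idempotent. Re-registering the same name with the same table id returns the existing entry, while a different id is rejected. Below is a minimal, std-only sketch of that behaviour. `Table`, `RegisterError`, and `SchemaSketch` are hypothetical stand-ins for illustration, not the crate's real types.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for a table handle; only the id matters here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Table {
    pub id: u32,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum RegisterError {
    /// A different table already owns this name.
    TableExists(String),
}

#[derive(Default)]
pub struct SchemaSketch {
    tables: RwLock<HashMap<String, Table>>,
}

impl SchemaSketch {
    /// Returns `Ok(None)` on first registration, `Ok(Some(existing))` when the
    /// same table id is registered again, and an error on an id mismatch.
    pub fn register_table(
        &self,
        name: String,
        table: Table,
    ) -> Result<Option<Table>, RegisterError> {
        // Take the write lock once so the existence check and the insert
        // cannot be interleaved with another writer.
        let mut tables = self.tables.write().unwrap();
        if let Some(existing) = tables.get(&name) {
            if existing.id != table.id {
                return Err(RegisterError::TableExists(name));
            }
            return Ok(Some(existing.clone()));
        }
        Ok(tables.insert(name, table))
    }
}
```

Holding a single write guard across the check and the insert is what closes the race that a separate `table_exist` call followed by a later `insert` would leave open.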


@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt::Debug;
use std::pin::Pin;
use std::sync::Arc;


@@ -1,4 +1,19 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt::Debug;
use std::sync::Arc;
use async_stream::stream;
use common_telemetry::info;
@@ -10,7 +25,7 @@ use crate::error::{Error, MetaSrvSnafu};
use crate::remote::{Kv, KvBackend, ValueIter};
#[derive(Debug)]
pub struct MetaKvBackend {
pub client: MetaClient,
pub client: Arc<MetaClient>,
}
/// Implement `KvBackend` trait for `MetaKvBackend` instead of opendal's `Accessor` since


@@ -1,19 +1,26 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::collections::HashMap;
use std::pin::Pin;
use std::sync::Arc;
use std::time::Duration;
use arc_swap::ArcSwap;
use async_stream::stream;
use backoff::exponential::ExponentialBackoffBuilder;
use backoff::ExponentialBackoff;
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, MIN_USER_TABLE_ID};
use common_catalog::{
build_catalog_prefix, build_schema_prefix, build_table_global_prefix, CatalogKey, CatalogValue,
SchemaKey, SchemaValue, TableGlobalKey, TableGlobalValue, TableRegionalKey, TableRegionalValue,
};
use common_telemetry::{debug, error, info};
use common_telemetry::{debug, info};
use futures::Stream;
use futures_util::StreamExt;
use snafu::{OptionExt, ResultExt};
@@ -25,14 +32,18 @@ use table::TableRef;
use tokio::sync::Mutex;
use crate::error::{
BumpTableIdSnafu, CatalogNotFoundSnafu, CreateTableSnafu, InvalidCatalogValueSnafu,
OpenTableSnafu, ParseTableIdSnafu, SchemaNotFoundSnafu, TableExistsSnafu,
CatalogNotFoundSnafu, CreateTableSnafu, InvalidCatalogValueSnafu, InvalidTableSchemaSnafu,
OpenTableSnafu, Result, SchemaNotFoundSnafu, TableExistsSnafu, UnimplementedSnafu,
};
use crate::helper::{
build_catalog_prefix, build_schema_prefix, build_table_global_prefix, CatalogKey, CatalogValue,
SchemaKey, SchemaValue, TableGlobalKey, TableGlobalValue, TableRegionalKey, TableRegionalValue,
};
use crate::error::{InvalidTableSchemaSnafu, Result};
use crate::remote::{Kv, KvBackendRef};
use crate::{
handle_system_table_request, CatalogList, CatalogManager, CatalogProvider, CatalogProviderRef,
RegisterSystemTableRequest, RegisterTableRequest, SchemaProvider, SchemaProviderRef,
DeregisterTableRequest, RegisterSchemaRequest, RegisterSystemTableRequest,
RegisterTableRequest, SchemaProvider, SchemaProviderRef,
};
/// Catalog manager based on metasrv.
@@ -65,6 +76,7 @@ impl RemoteCatalogManager {
fn new_catalog_provider(&self, catalog_name: &str) -> CatalogProviderRef {
Arc::new(RemoteCatalogProvider {
node_id: self.node_id,
catalog_name: catalog_name.to_string(),
backend: self.backend.clone(),
schemas: Default::default(),
@@ -126,8 +138,6 @@ impl RemoteCatalogManager {
}
/// Iterate over all table entries on metasrv
/// TODO(hl): table entries with different version is not currently considered.
/// Ideally deprecated table entry must be deleted when deregistering from catalog.
async fn iter_remote_tables(
&self,
catalog_name: &str,
@@ -144,10 +154,10 @@ impl RemoteCatalogManager {
}
let table_key = TableGlobalKey::parse(&String::from_utf8_lossy(&k))
.context(InvalidCatalogValueSnafu)?;
let table_value = TableGlobalValue::parse(&String::from_utf8_lossy(&v))
.context(InvalidCatalogValueSnafu)?;
let table_value =
TableGlobalValue::from_bytes(&v).context(InvalidCatalogValueSnafu)?;
debug!(
info!(
"Found catalog table entry, key: {}, value: {:?}",
table_key, table_value
);
@@ -232,17 +242,21 @@ impl RemoteCatalogManager {
schema: SchemaProviderRef,
mut max_table_id: TableId,
) -> Result<()> {
info!("initializing tables in {}.{}", catalog_name, schema_name);
let mut table_num = 0;
let mut tables = self.iter_remote_tables(catalog_name, schema_name).await;
while let Some(r) = tables.next().await {
let (table_key, table_value) = r?;
let table_ref = self.open_or_create_table(&table_key, &table_value).await?;
schema.register_table(table_key.table_name.to_string(), table_ref)?;
info!("Registered table {}", &table_key.table_name);
if table_value.id > max_table_id {
info!("Max table id: {} -> {}", max_table_id, table_value.id);
max_table_id = table_value.id;
}
max_table_id = max_table_id.max(table_value.table_id());
table_num += 1;
}
info!(
"initialized tables in {}.{}, total: {}",
catalog_name, schema_name, table_num
);
Ok(())
}
@@ -294,44 +308,61 @@ impl RemoteCatalogManager {
..
} = table_key;
let table_id = table_value.table_id();
let TableGlobalValue {
id,
meta,
table_info,
regions_id_map,
..
} = table_value;
// unwrap safety: checked in yielding this table when `iter_remote_tables`
let region_numbers = regions_id_map.get(&self.node_id).unwrap();
let request = OpenTableRequest {
catalog_name: catalog_name.clone(),
schema_name: schema_name.clone(),
table_name: table_name.clone(),
table_id: *id,
table_id,
region_numbers: region_numbers.clone(),
};
match self
.engine
.open_table(&context, request)
.await
.with_context(|_| OpenTableSnafu {
table_info: format!("{}.{}.{}, id:{}", catalog_name, schema_name, table_name, id,),
table_info: format!("{catalog_name}.{schema_name}.{table_name}, id:{table_id}"),
})? {
Some(table) => Ok(table),
Some(table) => {
info!(
"Table opened: {}.{}.{}",
catalog_name, schema_name, table_name
);
Ok(table)
}
None => {
info!(
"Try create table: {}.{}.{}",
catalog_name, schema_name, table_name
);
let meta = &table_info.meta;
let schema = meta
.schema
.clone()
.try_into()
.context(InvalidTableSchemaSnafu {
table_info: format!("{}.{}.{}", catalog_name, schema_name, table_name,),
table_info: format!("{catalog_name}.{schema_name}.{table_name}"),
schema: meta.schema.clone(),
})?;
let req = CreateTableRequest {
id: *id,
id: table_id,
catalog_name: catalog_name.clone(),
schema_name: schema_name.clone(),
table_name: table_name.clone(),
desc: None,
schema: Arc::new(schema),
region_numbers: regions_id_map.get(&self.node_id).unwrap().clone(), // this unwrap is safe because region_id_map is checked in `iter_remote_tables`
region_numbers: region_numbers.clone(),
primary_key_indices: meta.primary_key_indices.clone(),
create_if_not_exists: true,
table_options: meta.options.clone(),
@@ -343,7 +374,7 @@ impl RemoteCatalogManager {
.context(CreateTableSnafu {
table_info: format!(
"{}.{}.{}, id:{}",
&catalog_name, &schema_name, &table_name, id
&catalog_name, &schema_name, &table_name, table_id
),
})
}
@@ -377,64 +408,7 @@ impl CatalogManager for RemoteCatalogManager {
Ok(())
}
/// Bump table id in a CAS manner with backoff.
async fn next_table_id(&self) -> Result<TableId> {
let key = common_catalog::consts::TABLE_ID_KEY_PREFIX.as_bytes();
let op = || async {
// TODO(hl): optimize this get
let (prev, prev_bytes) = match self.backend.get(key).await? {
None => (MIN_USER_TABLE_ID, vec![]),
Some(kv) => (parse_table_id(&kv.1)?, kv.1),
};
match self
.backend
.compare_and_set(key, &prev_bytes, &(prev + 1).to_le_bytes())
.await
{
Ok(cas_res) => match cas_res {
Ok(_) => Ok(prev),
Err(e) => {
info!("Table id {:?} already occupied", e);
Err(backoff::Error::transient(
BumpTableIdSnafu {
msg: "Table id occupied",
}
.build(),
))
}
},
Err(e) => {
error!(e;"Failed to CAS table id");
Err(backoff::Error::permanent(
BumpTableIdSnafu {
msg: format!("Failed to perform CAS operation: {:?}", e),
}
.build(),
))
}
}
};
let retry_policy: ExponentialBackoff = ExponentialBackoffBuilder::new()
.with_initial_interval(Duration::from_millis(4))
.with_multiplier(2.0)
.with_max_interval(Duration::from_millis(1000))
.with_max_elapsed_time(Some(Duration::from_millis(3000)))
.build();
backoff::future::retry(retry_policy, op).await.map_err(|e| {
BumpTableIdSnafu {
msg: format!(
"Bump table id exceeds max fail times, last error msg: {:?}",
e
),
}
.build()
})
}
async fn register_table(&self, request: RegisterTableRequest) -> Result<usize> {
async fn register_table(&self, request: RegisterTableRequest) -> Result<bool> {
let catalog_name = request.catalog;
let schema_name = request.schema;
let catalog_provider = self.catalog(&catalog_name)?.context(CatalogNotFoundSnafu {
@@ -453,7 +427,25 @@ impl CatalogManager for RemoteCatalogManager {
.fail();
}
schema_provider.register_table(request.table_name, request.table)?;
Ok(1)
Ok(true)
}
async fn deregister_table(&self, _request: DeregisterTableRequest) -> Result<bool> {
UnimplementedSnafu {
operation: "deregister table",
}
.fail()
}
async fn register_schema(&self, request: RegisterSchemaRequest) -> Result<bool> {
let catalog_name = request.catalog;
let schema_name = request.schema;
let catalog_provider = self.catalog(&catalog_name)?.context(CatalogNotFoundSnafu {
catalog_name: &catalog_name,
})?;
let schema_provider = self.new_schema_provider(&catalog_name, &schema_name);
catalog_provider.register_schema(schema_name, schema_provider)?;
Ok(true)
}
async fn register_system_table(&self, request: RegisterSystemTableRequest) -> Result<()> {
@@ -462,6 +454,14 @@ impl CatalogManager for RemoteCatalogManager {
Ok(())
}
fn schema(&self, catalog: &str, schema: &str) -> Result<Option<SchemaProviderRef>> {
self.catalog(catalog)?
.context(CatalogNotFoundSnafu {
catalog_name: catalog,
})?
.schema(schema)
}
fn table(
&self,
catalog_name: &str,
@@ -474,7 +474,7 @@ impl CatalogManager for RemoteCatalogManager {
let schema = catalog
.schema(schema_name)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{}.{}", catalog_name, schema_name),
schema_info: format!("{catalog_name}.{schema_name}"),
})?;
schema.table(table_name)
}
@@ -530,6 +530,7 @@ impl CatalogList for RemoteCatalogManager {
}
pub struct RemoteCatalogProvider {
node_id: u64,
catalog_name: String,
backend: KvBackendRef,
schemas: Arc<ArcSwap<HashMap<String, SchemaProviderRef>>>,
@@ -537,8 +538,9 @@ pub struct RemoteCatalogProvider {
}
impl RemoteCatalogProvider {
pub fn new(catalog_name: String, backend: KvBackendRef) -> Self {
pub fn new(catalog_name: String, backend: KvBackendRef, node_id: u64) -> Self {
Self {
node_id,
catalog_name,
backend,
schemas: Default::default(),
@@ -546,6 +548,48 @@ impl RemoteCatalogProvider {
}
}
pub fn refresh_schemas(&self) -> Result<()> {
let schemas = self.schemas.clone();
let schema_prefix = build_schema_prefix(&self.catalog_name);
let catalog_name = self.catalog_name.clone();
let mutex = self.mutex.clone();
let backend = self.backend.clone();
let node_id = self.node_id;
std::thread::spawn(move || {
common_runtime::block_on_write(async move {
let _guard = mutex.lock().await;
let prev_schemas = schemas.load();
let mut new_schemas = HashMap::with_capacity(prev_schemas.len() + 1);
new_schemas.clone_from(&prev_schemas);
let mut remote_schemas = backend.range(schema_prefix.as_bytes());
while let Some(r) = remote_schemas.next().await {
let Kv(k, _) = r?;
let schema_key = SchemaKey::parse(&String::from_utf8_lossy(&k))
.context(InvalidCatalogValueSnafu)?;
if !new_schemas.contains_key(&schema_key.schema_name) {
new_schemas.insert(
schema_key.schema_name.clone(),
Arc::new(RemoteSchemaProvider::new(
catalog_name.clone(),
schema_key.schema_name,
node_id,
backend.clone(),
)),
);
}
}
schemas.store(Arc::new(new_schemas));
Ok(())
})
})
.join()
.unwrap()?;
Ok(())
}
fn build_schema_key(&self, schema_name: impl AsRef<str>) -> SchemaKey {
SchemaKey {
catalog_name: self.catalog_name.clone(),
@@ -560,6 +604,7 @@ impl CatalogProvider for RemoteCatalogProvider {
}
fn schema_names(&self) -> Result<Vec<String>> {
self.refresh_schemas()?;
Ok(self.schemas.load().keys().cloned().collect::<Vec<_>>())
}
@@ -598,20 +643,12 @@ impl CatalogProvider for RemoteCatalogProvider {
}
fn schema(&self, name: &str) -> Result<Option<Arc<dyn SchemaProvider>>> {
// TODO(hl): We should refresh whole catalog before calling datafusion's query engine.
self.refresh_schemas()?;
Ok(self.schemas.load().get(name).cloned())
}
}
/// Parse u8 slice to `TableId`
fn parse_table_id(val: &[u8]) -> Result<TableId> {
Ok(TableId::from_le_bytes(val.try_into().map_err(|_| {
ParseTableIdSnafu {
data: format!("{:?}", val),
}
.build()
})?))
}
pub struct RemoteSchemaProvider {
catalog_name: String,
schema_name: String,
@@ -733,17 +770,3 @@ impl SchemaProvider for RemoteSchemaProvider {
Ok(self.tables.load().contains_key(name))
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_table_id() {
assert_eq!(12, parse_table_id(&12_i32.to_le_bytes()).unwrap());
let mut data = vec![];
data.extend_from_slice(&12_i32.to_le_bytes());
data.push(0);
assert!(parse_table_id(&data).is_err());
}
}
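
Review note: `refresh_schemas` in this file follows a copy-on-write pattern. It takes a mutex, clones the current schema map, merges in the remote entries that are missing, and publishes the new map in one step. The sketch below shows the same idea with the standard library only. It substitutes `RwLock<Arc<HashMap<..>>>` for `ArcSwap`, and `SchemaCache` with its `u64` values is a hypothetical placeholder for the real schema providers.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical schema cache: readers take a cheap snapshot, refreshers
/// build a new map and publish it in one step.
pub struct SchemaCache {
    // The diff uses `ArcSwap`; `RwLock<Arc<..>>` is a std-only substitute.
    snapshot: RwLock<Arc<HashMap<String, u64>>>,
    // Serializes refreshes so two refreshers cannot overwrite each other.
    refresh_lock: Mutex<()>,
}

impl SchemaCache {
    pub fn new() -> Self {
        Self {
            snapshot: RwLock::new(Arc::new(HashMap::new())),
            refresh_lock: Mutex::new(()),
        }
    }

    /// Returns the current snapshot; later refreshes do not change it.
    pub fn load(&self) -> Arc<HashMap<String, u64>> {
        self.snapshot.read().unwrap().clone()
    }

    /// Merges remote schema names into the cache, keeping existing entries.
    /// The stored value is the id of the node that first saw the schema.
    pub fn refresh(&self, remote_names: &[&str], node_id: u64) {
        let _guard = self.refresh_lock.lock().unwrap();
        let prev = self.load();
        let mut next: HashMap<String, u64> = (*prev).clone();
        for name in remote_names {
            next.entry((*name).to_string()).or_insert(node_id);
        }
        // Publish the new map; readers holding the old Arc are unaffected.
        *self.snapshot.write().unwrap() = Arc::new(next);
    }
}
```

Readers that called `load` before a refresh keep a consistent view of the old map, which is the property the snapshot-and-swap approach is meant to provide.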


@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::sync::Arc;


@@ -1,21 +1,33 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::collections::HashMap;
use std::sync::Arc;
use common_catalog::consts::{
INFORMATION_SCHEMA_NAME, SYSTEM_CATALOG_NAME, SYSTEM_CATALOG_TABLE_ID,
SYSTEM_CATALOG_TABLE_NAME,
DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, INFORMATION_SCHEMA_NAME, SYSTEM_CATALOG_NAME,
SYSTEM_CATALOG_TABLE_ID, SYSTEM_CATALOG_TABLE_NAME,
};
use common_query::logical_plan::Expr;
use common_query::physical_plan::PhysicalPlanRef;
use common_query::physical_plan::RuntimeEnv;
use common_query::physical_plan::{PhysicalPlanRef, SessionContext};
use common_recordbatch::SendableRecordBatchStream;
use common_telemetry::debug;
use common_time::timestamp::Timestamp;
use common_time::util;
use datatypes::prelude::{ConcreteDataType, ScalarVector};
use datatypes::schema::{ColumnSchema, Schema, SchemaBuilder, SchemaRef};
use datatypes::vectors::{BinaryVector, TimestampVector, UInt8Vector};
use datatypes::vectors::{BinaryVector, TimestampMillisecondVector, UInt8Vector};
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt, ResultExt};
use table::engine::{EngineContext, TableEngineRef};
@@ -30,7 +42,6 @@ use crate::error::{
pub const ENTRY_TYPE_INDEX: usize = 0;
pub const KEY_INDEX: usize = 1;
pub const TIMESTAMP_INDEX: usize = 2;
pub const VALUE_INDEX: usize = 3;
pub struct SystemCatalogTable {
@@ -50,7 +61,7 @@ impl Table for SystemCatalogTable {
async fn scan(
&self,
_projection: &Option<Vec<usize>>,
_projection: Option<&Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> table::Result<PhysicalPlanRef> {
@@ -74,6 +85,7 @@ impl SystemCatalogTable {
schema_name: INFORMATION_SCHEMA_NAME.to_string(),
table_name: SYSTEM_CATALOG_TABLE_NAME.to_string(),
table_id: SYSTEM_CATALOG_TABLE_ID,
region_numbers: vec![0],
};
let schema = Arc::new(build_system_catalog_schema());
let ctx = EngineContext::default();
@@ -97,7 +109,7 @@ impl SystemCatalogTable {
desc: Some("System catalog table".to_string()),
schema: schema.clone(),
region_numbers: vec![0],
primary_key_indices: vec![ENTRY_TYPE_INDEX, KEY_INDEX, TIMESTAMP_INDEX],
primary_key_indices: vec![ENTRY_TYPE_INDEX, KEY_INDEX],
create_if_not_exists: true,
table_options: HashMap::new(),
};
@@ -114,14 +126,14 @@ impl SystemCatalogTable {
/// Create a stream of all entries inside system catalog table
pub async fn records(&self) -> Result<SendableRecordBatchStream> {
let full_projection = None;
let ctx = SessionContext::new();
let scan = self
.table
.scan(&full_projection, &[], None)
.scan(full_projection, &[], None)
.await
.context(error::SystemCatalogTableScanSnafu)?;
let stream = scan
.execute(0, Arc::new(RuntimeEnv::default()))
.await
.execute(0, ctx.task_ctx())
.context(error::SystemCatalogTableScanExecSnafu)?;
Ok(stream)
}
@@ -149,9 +161,10 @@ fn build_system_catalog_schema() -> Schema {
),
ColumnSchema::new(
"timestamp".to_string(),
ConcreteDataType::timestamp_millis_datatype(),
ConcreteDataType::timestamp_millisecond_datatype(),
false,
),
)
.with_time_index(true),
ColumnSchema::new(
"value".to_string(),
ConcreteDataType::binary_datatype(),
@@ -159,66 +172,78 @@ fn build_system_catalog_schema() -> Schema {
),
ColumnSchema::new(
"gmt_created".to_string(),
ConcreteDataType::timestamp_millis_datatype(),
ConcreteDataType::timestamp_millisecond_datatype(),
false,
),
ColumnSchema::new(
"gmt_modified".to_string(),
ConcreteDataType::timestamp_millis_datatype(),
ConcreteDataType::timestamp_millisecond_datatype(),
false,
),
];
// The schema of this table must be valid.
SchemaBuilder::try_from(cols)
.unwrap()
.timestamp_index(Some(2))
.build()
.unwrap()
SchemaBuilder::try_from(cols).unwrap().build().unwrap()
}
pub fn build_table_insert_request(full_table_name: String, table_id: TableId) -> InsertRequest {
build_insert_request(
EntryType::Table,
full_table_name.as_bytes(),
serde_json::to_string(&TableEntryValue { table_id })
.unwrap()
.as_bytes(),
)
}
pub fn build_schema_insert_request(catalog_name: String, schema_name: String) -> InsertRequest {
let full_schema_name = format!("{catalog_name}.{schema_name}");
build_insert_request(
EntryType::Schema,
full_schema_name.as_bytes(),
serde_json::to_string(&SchemaEntryValue {})
.unwrap()
.as_bytes(),
)
}
pub fn build_insert_request(entry_type: EntryType, key: &[u8], value: &[u8]) -> InsertRequest {
let mut columns_values = HashMap::with_capacity(6);
columns_values.insert(
"entry_type".to_string(),
Arc::new(UInt8Vector::from_slice(&[EntryType::Table as u8])) as _,
Arc::new(UInt8Vector::from_slice(&[entry_type as u8])) as _,
);
columns_values.insert(
"key".to_string(),
Arc::new(BinaryVector::from_slice(&[full_table_name.as_bytes()])) as _,
Arc::new(BinaryVector::from_slice(&[key])) as _,
);
// Timestamp in key part is intentionally left to 0
columns_values.insert(
"timestamp".to_string(),
Arc::new(TimestampVector::from_slice(&[Timestamp::from_millis(0)])) as _,
Arc::new(TimestampMillisecondVector::from_slice(&[0])) as _,
);
columns_values.insert(
"value".to_string(),
Arc::new(BinaryVector::from_slice(&[serde_json::to_string(
&TableEntryValue { table_id },
)
.unwrap()
.as_bytes()])) as _,
Arc::new(BinaryVector::from_slice(&[value])) as _,
);
let now = util::current_time_millis();
columns_values.insert(
"gmt_created".to_string(),
Arc::new(TimestampVector::from_slice(&[Timestamp::from_millis(
util::current_time_millis(),
)])) as _,
Arc::new(TimestampMillisecondVector::from_slice(&[now])) as _,
);
columns_values.insert(
"gmt_modified".to_string(),
Arc::new(TimestampVector::from_slice(&[Timestamp::from_millis(
util::current_time_millis(),
)])) as _,
Arc::new(TimestampMillisecondVector::from_slice(&[now])) as _,
);
InsertRequest {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
table_name: SYSTEM_CATALOG_TABLE_NAME.to_string(),
columns_values,
}
@@ -324,6 +349,9 @@ pub struct SchemaEntry {
pub schema_name: String,
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct SchemaEntryValue;
#[derive(Debug, PartialEq, Eq, Ord, PartialOrd)]
pub struct TableEntry {
pub catalog_name: String,
@@ -339,20 +367,20 @@ pub struct TableEntryValue {
#[cfg(test)]
mod tests {
use log_store::fs::noop::NoopLogStore;
use log_store::NoopLogStore;
use mito::config::EngineConfig;
use mito::engine::MitoEngine;
use object_store::ObjectStore;
use storage::config::EngineConfig as StorageEngineConfig;
use storage::EngineImpl;
use table::metadata::TableType;
use table::metadata::TableType::Base;
use table_engine::config::EngineConfig;
use table_engine::engine::MitoEngine;
use tempdir::TempDir;
use super::*;
#[test]
pub fn test_decode_catalog_enrty() {
pub fn test_decode_catalog_entry() {
let entry = decode_system_catalog(
Some(EntryType::Catalog as u8),
Some("some_catalog".as_bytes()),
@@ -362,7 +390,7 @@ mod tests {
if let Entry::Catalog(e) = entry {
assert_eq!("some_catalog", e.catalog_name);
} else {
panic!("Unexpected type: {:?}", entry);
panic!("Unexpected type: {entry:?}");
}
}
@@ -379,7 +407,7 @@ mod tests {
assert_eq!("some_catalog", e.catalog_name);
assert_eq!("some_schema", e.schema_name);
} else {
panic!("Unexpected type: {:?}", entry);
panic!("Unexpected type: {entry:?}");
}
}
@@ -398,7 +426,7 @@ mod tests {
assert_eq!("some_table", e.table_name);
assert_eq!(42, e.table_id);
} else {
panic!("Unexpected type: {:?}", entry);
panic!("Unexpected type: {entry:?}");
}
}
@@ -424,7 +452,7 @@ mod tests {
pub async fn prepare_table_engine() -> (TempDir, TableEngineRef) {
let dir = TempDir::new("system-table-test").unwrap();
let store_dir = dir.path().to_string_lossy();
let accessor = opendal::services::fs::Builder::default()
let accessor = object_store::backend::fs::Builder::default()
.root(&store_dir)
.build()
.unwrap();
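
Review note: the refactor in this file pulls a generic `build_insert_request(entry_type, key, value)` out of the table-specific builder, so schema and table entries share one row layout. The std-only sketch below mirrors that shape. `Cell`, `build_row`, and `schema_key` are hypothetical names, the numeric `EntryType` discriminants are assumptions, and the current time is passed in as a parameter rather than read from a clock.

```rust
use std::collections::HashMap;

/// Entry kinds; the numeric values are assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum EntryType {
    Catalog = 1,
    Schema = 2,
    Table = 3,
}

/// Hypothetical cell type standing in for the typed column vectors.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Cell {
    U8(u8),
    Bytes(Vec<u8>),
    Millis(i64),
}

/// Builds one system-catalog row for any entry type.
pub fn build_row(
    entry_type: EntryType,
    key: &[u8],
    value: &[u8],
    now_millis: i64,
) -> HashMap<&'static str, Cell> {
    let mut row = HashMap::with_capacity(6);
    row.insert("entry_type", Cell::U8(entry_type as u8));
    row.insert("key", Cell::Bytes(key.to_vec()));
    // The timestamp in the key part is intentionally left at 0.
    row.insert("timestamp", Cell::Millis(0));
    row.insert("value", Cell::Bytes(value.to_vec()));
    // Read the clock once so both audit columns carry the same value.
    row.insert("gmt_created", Cell::Millis(now_millis));
    row.insert("gmt_modified", Cell::Millis(now_millis));
    row
}

/// Formats a schema key as `catalog.schema`, as the diff does.
pub fn schema_key(catalog: &str, schema: &str) -> String {
    format!("{catalog}.{schema}")
}
```

A schema row would then be built with something like `build_row(EntryType::Schema, schema_key("greptime", "public").as_bytes(), value, now)`, where `value` is whatever serialized payload the caller chooses.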


@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// The `tables` table in system catalog keeps a record of all tables created by user.
use std::any::Any;
@@ -12,9 +26,9 @@ use common_query::logical_plan::Expr;
use common_query::physical_plan::PhysicalPlanRef;
use common_recordbatch::error::Result as RecordBatchResult;
use common_recordbatch::{RecordBatch, RecordBatchStream};
use datatypes::prelude::{ConcreteDataType, VectorBuilder};
use datatypes::prelude::{ConcreteDataType, DataType};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::value::Value;
use datatypes::value::ValueRef;
use datatypes::vectors::VectorRef;
use futures::Stream;
use snafu::ResultExt;
@@ -24,8 +38,8 @@ use table::metadata::{TableId, TableInfoRef};
use table::table::scan::SimpleTableScan;
use table::{Table, TableRef};
use crate::error::{Error, InsertTableRecordSnafu};
use crate::system::{build_table_insert_request, SystemCatalogTable};
use crate::error::{Error, InsertCatalogRecordSnafu};
use crate::system::{build_schema_insert_request, build_table_insert_request, SystemCatalogTable};
use crate::{
format_full_table_name, CatalogListRef, CatalogProvider, SchemaProvider, SchemaProviderRef,
};
@@ -63,7 +77,7 @@ impl Table for Tables {
async fn scan(
&self,
_projection: &Option<Vec<usize>>,
_projection: Option<&Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> table::error::Result<PhysicalPlanRef> {
@@ -135,26 +149,33 @@ fn tables_to_record_batch(
engine: &str,
) -> Vec<VectorRef> {
let mut catalog_vec =
VectorBuilder::with_capacity(ConcreteDataType::string_datatype(), table_names.len());
ConcreteDataType::string_datatype().create_mutable_vector(table_names.len());
let mut schema_vec =
VectorBuilder::with_capacity(ConcreteDataType::string_datatype(), table_names.len());
ConcreteDataType::string_datatype().create_mutable_vector(table_names.len());
let mut table_name_vec =
VectorBuilder::with_capacity(ConcreteDataType::string_datatype(), table_names.len());
ConcreteDataType::string_datatype().create_mutable_vector(table_names.len());
let mut engine_vec =
VectorBuilder::with_capacity(ConcreteDataType::string_datatype(), table_names.len());
ConcreteDataType::string_datatype().create_mutable_vector(table_names.len());
for table_name in table_names {
catalog_vec.push(&Value::String(catalog_name.into()));
schema_vec.push(&Value::String(schema_name.into()));
table_name_vec.push(&Value::String(table_name.into()));
engine_vec.push(&Value::String(engine.into()));
// Safety: All these vectors are string type.
catalog_vec
.push_value_ref(ValueRef::String(catalog_name))
.unwrap();
schema_vec
.push_value_ref(ValueRef::String(schema_name))
.unwrap();
table_name_vec
.push_value_ref(ValueRef::String(&table_name))
.unwrap();
engine_vec.push_value_ref(ValueRef::String(engine)).unwrap();
}
vec![
catalog_vec.finish(),
schema_vec.finish(),
table_name_vec.finish(),
engine_vec.finish(),
catalog_vec.to_vector(),
schema_vec.to_vector(),
table_name_vec.to_vector(),
engine_vec.to_vector(),
]
}
@@ -254,7 +275,20 @@ impl SystemCatalog {
.system
.insert(request)
.await
.context(InsertTableRecordSnafu)
.context(InsertCatalogRecordSnafu)
}
pub async fn register_schema(
&self,
catalog: String,
schema: String,
) -> crate::error::Result<usize> {
let request = build_schema_insert_request(catalog, schema);
self.information_schema
.system
.insert(request)
.await
.context(InsertCatalogRecordSnafu)
}
}
@@ -313,9 +347,7 @@ fn build_schema_for_tables() -> Schema {
#[cfg(test)]
mod tests {
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use common_query::physical_plan::RuntimeEnv;
use datatypes::arrow::array::Utf8Array;
use datatypes::arrow::datatypes::DataType;
use common_query::physical_plan::SessionContext;
use futures_util::StreamExt;
use table::table::numbers::NumbersTable;
@@ -338,58 +370,48 @@ mod tests {
.unwrap();
let tables = Tables::new(catalog_list, "test_engine".to_string());
let tables_stream = tables.scan(&None, &[], None).await.unwrap();
let mut tables_stream = tables_stream
.execute(0, Arc::new(RuntimeEnv::default()))
.await
.unwrap();
let tables_stream = tables.scan(None, &[], None).await.unwrap();
let session_ctx = SessionContext::new();
let mut tables_stream = tables_stream.execute(0, session_ctx.task_ctx()).unwrap();
if let Some(t) = tables_stream.next().await {
let batch = t.unwrap().df_recordbatch;
let batch = t.unwrap();
assert_eq!(1, batch.num_rows());
assert_eq!(4, batch.num_columns());
assert_eq!(&DataType::Utf8, batch.column(0).data_type());
assert_eq!(&DataType::Utf8, batch.column(1).data_type());
assert_eq!(&DataType::Utf8, batch.column(2).data_type());
assert_eq!(&DataType::Utf8, batch.column(3).data_type());
assert_eq!(
ConcreteDataType::string_datatype(),
batch.column(0).data_type()
);
assert_eq!(
ConcreteDataType::string_datatype(),
batch.column(1).data_type()
);
assert_eq!(
ConcreteDataType::string_datatype(),
batch.column(2).data_type()
);
assert_eq!(
ConcreteDataType::string_datatype(),
batch.column(3).data_type()
);
assert_eq!(
"greptime",
batch
.column(0)
.as_any()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
batch.column(0).get_ref(0).as_string().unwrap().unwrap()
);
assert_eq!(
"public",
batch
.column(1)
.as_any()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
batch.column(1).get_ref(0).as_string().unwrap().unwrap()
);
assert_eq!(
"test_table",
batch
.column(2)
.as_any()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
batch.column(2).get_ref(0).as_string().unwrap().unwrap()
);
assert_eq!(
"test_engine",
batch
.column(3)
.as_any()
.downcast_ref::<Utf8Array<i32>>()
.unwrap()
.value(0)
batch.column(3).get_ref(0).as_string().unwrap().unwrap()
);
} else {
panic!("Record batch should not be empty!")


@@ -0,0 +1,131 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#[cfg(test)]
mod tests {
use std::sync::Arc;
use catalog::local::LocalCatalogManager;
use catalog::{CatalogManager, RegisterTableRequest};
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use common_telemetry::{error, info};
use mito::config::EngineConfig;
use table::table::numbers::NumbersTable;
use table::TableRef;
use tokio::sync::Mutex;
async fn create_local_catalog_manager() -> Result<LocalCatalogManager, catalog::error::Error> {
let (_dir, object_store) =
mito::table::test_util::new_test_object_store("setup_mock_engine_and_table").await;
let mock_engine = Arc::new(mito::table::test_util::MockMitoEngine::new(
EngineConfig::default(),
mito::table::test_util::MockEngine::default(),
object_store,
));
let catalog_manager = LocalCatalogManager::try_new(mock_engine).await.unwrap();
catalog_manager.start().await?;
Ok(catalog_manager)
}
#[tokio::test]
async fn test_duplicate_register() {
let catalog_manager = create_local_catalog_manager().await.unwrap();
let request = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "test_table".to_string(),
table_id: 42,
table: Arc::new(NumbersTable::new(42)),
};
assert!(catalog_manager
.register_table(request.clone())
.await
.unwrap());
// Registering the same table name with the same table id again succeeds and returns `false`.
assert!(!catalog_manager.register_table(request).await.unwrap());
let err = catalog_manager
.register_table(RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "test_table".to_string(),
table_id: 43,
table: Arc::new(NumbersTable::new(43)),
})
.await
.unwrap_err();
assert!(
err.to_string()
.contains("Table `greptime.public.test_table` already exists"),
"Actual error message: {err}",
);
}
#[test]
fn test_concurrent_register() {
common_telemetry::init_default_ut_logging();
let rt = Arc::new(tokio::runtime::Builder::new_multi_thread().build().unwrap());
let catalog_manager =
Arc::new(rt.block_on(async { create_local_catalog_manager().await.unwrap() }));
let succeed: Arc<Mutex<Option<TableRef>>> = Arc::new(Mutex::new(None));
let mut handles = Vec::with_capacity(8);
for i in 0..8 {
let catalog = catalog_manager.clone();
let succeed = succeed.clone();
let handle = rt.spawn(async move {
let table_id = 42 + i;
let table = Arc::new(NumbersTable::new(table_id));
let req = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "test_table".to_string(),
table_id,
table: table.clone(),
};
match catalog.register_table(req).await {
Ok(res) => {
if res {
let mut succeed = succeed.lock().await;
info!("Successfully registered table: {}", table_id);
*succeed = Some(table);
}
}
Err(_) => {
error!("Failed to register table {}", table_id);
}
}
});
handles.push(handle);
}
rt.block_on(async move {
for handle in handles {
handle.await.unwrap();
}
let guard = succeed.lock().await;
let table = guard.as_ref().unwrap();
let table_registered = catalog_manager
.table(DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, "test_table")
.unwrap()
.unwrap();
assert_eq!(
table_registered.table_info().ident.table_id,
table.table_info().ident.table_id
);
});
}
}
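The registration semantics exercised by `test_duplicate_register` above can be sketched in isolation. This is a minimal illustration only: `ToyCatalog` and `RegisterError` are hypothetical names that do not exist in the repository, and the real `LocalCatalogManager` persists to a system catalog table rather than an in-memory map.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch (not the crate's `catalog::error::Error`).
#[derive(Debug, PartialEq)]
pub enum RegisterError {
    /// The name is already bound to a different table id.
    TableExists(String),
}

/// Hypothetical in-memory catalog mapping a table name to its table id.
#[derive(Default)]
pub struct ToyCatalog {
    tables: HashMap<String, u32>,
}

impl ToyCatalog {
    /// Returns `Ok(true)` when the table is newly registered, `Ok(false)` when
    /// the same name and id were already registered, and an error when the
    /// name is already taken by a different id.
    pub fn register_table(&mut self, name: &str, table_id: u32) -> Result<bool, RegisterError> {
        match self.tables.get(name) {
            Some(&existing) if existing == table_id => Ok(false),
            Some(_) => Err(RegisterError::TableExists(name.to_string())),
            None => {
                self.tables.insert(name.to_string(), table_id);
                Ok(true)
            }
        }
    }
}
```

Under these assumptions, registering `("test_table", 42)` twice yields `true` and then `false`, while `("test_table", 43)` is rejected, which mirrors the three assertions in the test.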

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::btree_map::Entry;
use std::collections::{BTreeMap, HashMap};
use std::fmt::{Display, Formatter};
@@ -13,7 +27,7 @@ use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, Schema};
use datatypes::vectors::StringVector;
use serde::Serializer;
use table::engine::{EngineContext, TableEngine};
use table::engine::{EngineContext, TableEngine, TableReference};
use table::metadata::TableId;
use table::requests::{AlterTableRequest, CreateTableRequest, DropTableRequest, OpenTableRequest};
use table::test_util::MemTable;
@@ -151,6 +165,7 @@ impl TableEngine for MockTableEngine {
table_id,
catalog_name,
schema_name,
vec![0],
)) as Arc<_>;
let mut tables = self.tables.write().await;
@@ -174,19 +189,35 @@ impl TableEngine for MockTableEngine {
unimplemented!()
}
fn get_table(&self, _ctx: &EngineContext, name: &str) -> table::Result<Option<TableRef>> {
futures::executor::block_on(async { Ok(self.tables.read().await.get(name).cloned()) })
fn get_table(
&self,
_ctx: &EngineContext,
table_ref: &TableReference,
) -> table::Result<Option<TableRef>> {
futures::executor::block_on(async {
Ok(self
.tables
.read()
.await
.get(&table_ref.to_string())
.cloned())
})
}
fn table_exists(&self, _ctx: &EngineContext, name: &str) -> bool {
futures::executor::block_on(async { self.tables.read().await.contains_key(name) })
fn table_exists(&self, _ctx: &EngineContext, table_ref: &TableReference) -> bool {
futures::executor::block_on(async {
self.tables
.read()
.await
.contains_key(&table_ref.to_string())
})
}
async fn drop_table(
&self,
_ctx: &EngineContext,
_request: DropTableRequest,
) -> table::Result<()> {
) -> table::Result<bool> {
unimplemented!()
}
}
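The mock engine above now looks tables up by the string form of a `TableReference` instead of a bare name. A rough sketch of that idea follows. The struct here is a hypothetical stand-in, and the `catalog.schema.table` display format is an assumption inferred from the error message `greptime.public.test_table` seen elsewhere in this diff.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical stand-in for `table::engine::TableReference`.
pub struct ToyTableReference<'a> {
    pub catalog: &'a str,
    pub schema: &'a str,
    pub table: &'a str,
}

impl fmt::Display for ToyTableReference<'_> {
    // Assumed format: fully qualified `catalog.schema.table`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}.{}.{}", self.catalog, self.schema, self.table)
    }
}

/// Hypothetical engine keyed by the fully qualified table name.
#[derive(Default)]
pub struct ToyEngine {
    pub tables: HashMap<String, u32>,
}

impl ToyEngine {
    /// Two tables with the same short name in different schemas no longer collide,
    /// because the map key includes catalog and schema.
    pub fn table_exists(&self, table_ref: &ToyTableReference<'_>) -> bool {
        self.tables.contains_key(&table_ref.to_string())
    }
}
```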

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#![feature(assert_matches)]
mod mock;
@@ -8,12 +22,12 @@ mod tests {
use std::collections::HashSet;
use std::sync::Arc;
use catalog::helper::{CatalogKey, CatalogValue, SchemaKey, SchemaValue};
use catalog::remote::{
KvBackend, KvBackendRef, RemoteCatalogManager, RemoteCatalogProvider, RemoteSchemaProvider,
};
use catalog::{CatalogManager, CatalogManagerRef, RegisterTableRequest};
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, MIN_USER_TABLE_ID};
use common_catalog::{CatalogKey, CatalogValue, SchemaKey, SchemaValue};
use catalog::{CatalogList, CatalogManager, RegisterTableRequest};
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use datatypes::schema::Schema;
use futures_util::StreamExt;
use table::engine::{EngineContext, TableEngineRef};
@@ -61,7 +75,9 @@ mod tests {
);
}
async fn prepare_components(node_id: u64) -> (KvBackendRef, TableEngineRef, CatalogManagerRef) {
async fn prepare_components(
node_id: u64,
) -> (KvBackendRef, TableEngineRef, Arc<RemoteCatalogManager>) {
let backend = Arc::new(MockKvBackend::default()) as KvBackendRef;
let table_engine = Arc::new(MockTableEngine::default());
let catalog_manager =
@@ -186,7 +202,7 @@ mod tests {
table_id,
table,
};
assert_eq!(1, catalog_manager.register_table(reg_req).await.unwrap());
assert!(catalog_manager.register_table(reg_req).await.unwrap());
assert_eq!(
HashSet::from([table_name, "numbers".to_string()]),
default_schema
@@ -207,6 +223,7 @@ mod tests {
let catalog = Arc::new(RemoteCatalogProvider::new(
catalog_name.clone(),
backend.clone(),
node_id,
));
// register catalog to catalog manager
@@ -270,26 +287,11 @@ mod tests {
.register_schema(schema_name.clone(), schema.clone())
.expect("Register schema should not fail");
assert!(prev.is_none());
assert_eq!(1, catalog_manager.register_table(reg_req).await.unwrap());
assert!(catalog_manager.register_table(reg_req).await.unwrap());
assert_eq!(
HashSet::from([schema_name.clone()]),
new_catalog.schema_names().unwrap().into_iter().collect()
)
}
#[tokio::test]
async fn test_next_table_id() {
let node_id = 42;
let (_, _, catalog_manager) = prepare_components(node_id).await;
assert_eq!(
MIN_USER_TABLE_ID,
catalog_manager.next_table_id().await.unwrap()
);
assert_eq!(
MIN_USER_TABLE_ID + 1,
catalog_manager.next_table_id().await.unwrap()
);
}
}

@@ -1,8 +1,8 @@
[package]
name = "client"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
version.workspace = true
edition.workspace = true
license.workspace = true
[dependencies]
api = { path = "../api" }
@@ -10,12 +10,11 @@ async-stream = "0.3"
common-base = { path = "../common/base" }
common-error = { path = "../common/error" }
common-grpc = { path = "../common/grpc" }
common-grpc-expr = { path = "../common/grpc-expr" }
common-query = { path = "../common/query" }
common-recordbatch = { path = "../common/recordbatch" }
common-time = { path = "../common/time" }
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", branch = "arrow2", features = [
"simd",
] }
datafusion.workspace = true
datatypes = { path = "../datatypes" }
enum_dispatch = "0.3"
parking_lot = "0.12"

@@ -1,6 +1,18 @@
use std::collections::HashMap;
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::{codec::InsertBatch, *};
use api::v1::*;
use client::{Client, Database};
fn main() {
@@ -15,19 +27,21 @@ async fn run() {
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let db = Database::new("greptime", client);
let (columns, row_count) = insert_data();
let expr = InsertExpr {
schema_name: "public".to_string(),
table_name: "demo".to_string(),
expr: Some(insert_expr::Expr::Values(insert_expr::Values {
values: insert_batches(),
})),
options: HashMap::default(),
region_number: 0,
columns,
row_count,
};
db.insert(expr).await.unwrap();
}
fn insert_batches() -> Vec<Vec<u8>> {
fn insert_data() -> (Vec<Column>, u32) {
const SEMANTIC_TAG: i32 = 0;
const SEMANTIC_FEILD: i32 = 1;
const SEMANTIC_FIELD: i32 = 1;
const SEMANTIC_TS: i32 = 2;
let row_count = 4;
@@ -55,7 +69,7 @@ fn insert_batches() -> Vec<Vec<u8>> {
};
let cpu_column = Column {
column_name: "cpu".to_string(),
semantic_type: SEMANTIC_FEILD,
semantic_type: SEMANTIC_FIELD,
values: Some(cpu_vals),
null_mask: vec![2],
..Default::default()
@@ -67,7 +81,7 @@ fn insert_batches() -> Vec<Vec<u8>> {
};
let mem_column = Column {
column_name: "memory".to_string(),
semantic_type: SEMANTIC_FEILD,
semantic_type: SEMANTIC_FIELD,
values: Some(mem_vals),
null_mask: vec![4],
..Default::default()
@@ -85,9 +99,8 @@ fn insert_batches() -> Vec<Vec<u8>> {
..Default::default()
};
let insert_batch = InsertBatch {
columns: vec![host_column, cpu_column, mem_column, ts_column],
(
vec![host_column, cpu_column, mem_column, ts_column],
row_count,
};
vec![insert_batch.into()]
)
}
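The `null_mask` fields in the example above (`vec![2]`, `vec![4]` with `row_count = 4`) can be read as bitmaps over rows. The sketch below shows one way such a mask could be expanded. It assumes least-significant-bit-first ordering within each byte and that `values` holds only the non-null entries; the function name and the sample values in the usage note are illustrative, not taken from the repository.

```rust
/// Expands densely packed non-null `values` into one slot per row.
/// Bit `row % 8` of byte `row / 8` in `null_mask` marks a null row
/// (LSB-first ordering is an assumption of this sketch).
/// Returns `None` if `values` runs out before all non-null rows are filled.
pub fn expand_with_nulls<T: Copy>(
    values: &[T],
    null_mask: &[u8],
    rows: usize,
) -> Option<Vec<Option<T>>> {
    let mut values = values.iter().copied();
    let mut out = Vec::with_capacity(rows);
    for row in 0..rows {
        // Rows beyond the end of the mask are treated as non-null.
        let is_null = null_mask
            .get(row / 8)
            .map_or(false, |&byte| (byte >> (row % 8)) & 1 == 1);
        if is_null {
            out.push(None);
        } else {
            out.push(Some(values.next()?));
        }
    }
    Some(out)
}
```

With a mask of `[2]` (binary `0010`) and four rows, the second row is null, so three values such as `[0.1, 0.2, 0.3]` expand to `[Some(0.1), None, Some(0.2), Some(0.3)]`. A mask of `[4]` marks the third row instead.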

@@ -1,12 +1,25 @@
use api::v1::{ColumnDataType, ColumnDef, CreateExpr};
use client::{admin::Admin, Client, Database};
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::{ColumnDataType, ColumnDef, CreateTableExpr, TableId};
use client::admin::Admin;
use client::{Client, Database};
use prost_09::Message;
use substrait_proto::protobuf::{
plan_rel::RelType as PlanRelType,
read_rel::{NamedTable, ReadType},
rel::RelType,
PlanRel, ReadRel, Rel,
};
use substrait_proto::protobuf::plan_rel::RelType as PlanRelType;
use substrait_proto::protobuf::read_rel::{NamedTable, ReadType};
use substrait_proto::protobuf::rel::RelType;
use substrait_proto::protobuf::{PlanRel, ReadRel, Rel};
use tracing::{event, Level};
fn main() {
@@ -20,35 +33,37 @@ fn main() {
async fn run() {
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let create_table_expr = CreateExpr {
catalog_name: Some("greptime".to_string()),
schema_name: Some("public".to_string()),
let create_table_expr = CreateTableExpr {
catalog_name: "greptime".to_string(),
schema_name: "public".to_string(),
table_name: "test_logical_dist_exec".to_string(),
desc: None,
desc: "".to_string(),
column_defs: vec![
ColumnDef {
name: "timestamp".to_string(),
datatype: ColumnDataType::Timestamp as i32,
datatype: ColumnDataType::TimestampMillisecond as i32,
is_nullable: false,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "key".to_string(),
datatype: ColumnDataType::Uint64 as i32,
is_nullable: false,
default_constraint: None,
default_constraint: vec![],
},
ColumnDef {
name: "value".to_string(),
datatype: ColumnDataType::Uint64 as i32,
is_nullable: false,
default_constraint: None,
default_constraint: vec![],
},
],
time_index: "timestamp".to_string(),
primary_keys: vec!["key".to_string()],
create_if_not_exists: false,
table_options: Default::default(),
table_id: Some(TableId { id: 1024 }),
region_ids: vec![0],
};
let admin = Admin::new("create table", client.clone());

@@ -1,37 +0,0 @@
use std::sync::Arc;
use client::{Client, Database};
use common_grpc::MockExecution;
use datafusion::physical_plan::{
expressions::Column, projection::ProjectionExec, ExecutionPlan, PhysicalExpr,
};
use tracing::{event, Level};
fn main() {
tracing::subscriber::set_global_default(tracing_subscriber::FmtSubscriber::builder().finish())
.unwrap();
run();
}
#[tokio::main]
async fn run() {
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let db = Database::new("greptime", client);
let physical = mock_physical_plan();
let result = db.physical_plan(physical, None).await;
event!(Level::INFO, "result: {:#?}", result);
}
fn mock_physical_plan() -> Arc<dyn ExecutionPlan> {
let id_expr = Arc::new(Column::new("id", 0)) as Arc<dyn PhysicalExpr>;
let age_expr = Arc::new(Column::new("age", 2)) as Arc<dyn PhysicalExpr>;
let expr = vec![(id_expr, "id".to_string()), (age_expr, "age".to_string())];
let input =
Arc::new(MockExecution::new("mock_input_exec".to_string())) as Arc<dyn ExecutionPlan>;
let projection = ProjectionExec::try_new(expr, input).unwrap();
Arc::new(projection)
}

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use client::{Client, Database, Select};
use tracing::{event, Level};

@@ -1,12 +1,24 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::*;
use common_error::prelude::StatusCode;
use common_query::Output;
use snafu::prelude::*;
use crate::database::PROTOCOL_VERSION;
use crate::error;
use crate::Client;
use crate::Result;
use crate::{error, Client, Result};
#[derive(Clone, Debug)]
pub struct Admin {
@@ -22,13 +34,13 @@ impl Admin {
}
}
pub async fn create(&self, expr: CreateExpr) -> Result<AdminResult> {
pub async fn create(&self, expr: CreateTableExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::Create(expr)),
expr: Some(admin_expr::Expr::CreateTable(expr)),
};
self.do_request(expr).await
}
@@ -46,7 +58,19 @@ impl Admin {
header: Some(header),
expr: Some(admin_expr::Expr::Alter(expr)),
};
Ok(self.do_requests(vec![expr]).await?.remove(0))
self.do_request(expr).await
}
pub async fn drop_table(&self, expr: DropTableExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::DropTable(expr)),
};
self.do_request(expr).await
}
/// Invariants: the lengths of input vec (`Vec<AdminExpr>`) and output vec (`Vec<AdminResult>`) are equal.
@@ -70,6 +94,17 @@ impl Admin {
);
Ok(results)
}
pub async fn create_database(&self, expr: CreateDatabaseExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::CreateDatabase(expr)),
};
Ok(self.do_requests(vec![expr]).await?.remove(0))
}
}
pub fn admin_result_to_output(admin_result: AdminResult) -> Result<Output> {

@@ -1,17 +1,28 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use api::v1::greptime_client::GreptimeClient;
use api::v1::*;
use common_grpc::channel_manager::ChannelManager;
use parking_lot::RwLock;
use snafu::OptionExt;
use snafu::ResultExt;
use snafu::{OptionExt, ResultExt};
use tonic::transport::Channel;
use crate::error;
use crate::load_balance::LoadBalance;
use crate::load_balance::Loadbalancer;
use crate::Result;
use crate::load_balance::{LoadBalance, Loadbalancer};
use crate::{error, Result};
#[derive(Clone, Debug, Default)]
pub struct Client {
@@ -128,8 +139,11 @@ impl Client {
.context(error::IllegalGrpcClientStateSnafu {
err_msg: "No available peer found",
})?;
let mut client = self.make_client(peer)?;
let result = client.batch(req).await.context(error::TonicStatusSnafu)?;
let mut client = self.make_client(&peer)?;
let result = client
.batch(req)
.await
.context(error::TonicStatusSnafu { addr: peer })?;
Ok(result.into_inner())
}

@@ -1,31 +1,35 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::codec::SelectResult as GrpcSelectResult;
use api::v1::column::SemanticType;
use api::v1::{
column::Values, object_expr, object_result, select_expr, Column, ColumnDataType,
DatabaseRequest, ExprHeader, InsertExpr, MutateResult as GrpcMutateResult, ObjectExpr,
ObjectResult as GrpcObjectResult, PhysicalPlan, SelectExpr,
object_expr, object_result, select_expr, DatabaseRequest, ExprHeader, InsertExpr,
MutateResult as GrpcMutateResult, ObjectExpr, ObjectResult as GrpcObjectResult, SelectExpr,
};
use common_base::BitVec;
use common_error::status_code::StatusCode;
use common_grpc::AsExcutionPlan;
use common_grpc::DefaultAsPlanImpl;
use common_grpc_expr::column_to_vector;
use common_query::Output;
use common_recordbatch::{RecordBatch, RecordBatches};
use common_time::date::Date;
use common_time::datetime::DateTime;
use common_time::timestamp::Timestamp;
use datafusion::physical_plan::ExecutionPlan;
use datatypes::prelude::*;
use datatypes::schema::{ColumnSchema, Schema};
use snafu::{ensure, OptionExt, ResultExt};
use crate::error;
use crate::{
error::{ConvertSchemaSnafu, DatanodeSnafu, DecodeSelectSnafu, EncodePhysicalSnafu},
Client, Result,
};
use crate::error::{ColumnToVectorSnafu, ConvertSchemaSnafu, DatanodeSnafu, DecodeSelectSnafu};
use crate::{error, Client, Result};
pub const PROTOCOL_VERSION: u32 = 1;
@@ -85,24 +89,6 @@ impl Database {
self.do_select(select_expr).await
}
pub async fn physical_plan(
&self,
physical: Arc<dyn ExecutionPlan>,
original_ql: Option<String>,
) -> Result<ObjectResult> {
let plan = DefaultAsPlanImpl::try_from_physical_plan(physical.clone())
.context(EncodePhysicalSnafu { physical })?
.bytes;
let original_ql = original_ql.unwrap_or_default();
let select_expr = SelectExpr {
expr: Some(select_expr::Expr::PhysicalPlan(PhysicalPlan {
original_ql: original_ql.into_bytes(),
plan,
})),
};
self.do_select(select_expr).await
}
pub async fn logical_plan(&self, logical_plan: Vec<u8>) -> Result<ObjectResult> {
let select_expr = SelectExpr {
expr: Some(select_expr::Expr::LogicalPlan(logical_plan)),
@@ -124,8 +110,6 @@ impl Database {
obj_result.try_into()
}
// TODO(jiachun) update/delete
pub async fn object(&self, expr: ObjectExpr) -> Result<GrpcObjectResult> {
let res = self.objects(vec![expr]).await?.pop().unwrap();
Ok(res)
@@ -201,7 +185,9 @@ impl TryFrom<ObjectResult> for Output {
let vectors = select
.columns
.iter()
.map(|column| column_to_vector(column, select.row_count))
.map(|column| {
column_to_vector(column, select.row_count).context(ColumnToVectorSnafu)
})
.collect::<Result<Vec<VectorRef>>>()?;
let column_schemas = select
@@ -211,7 +197,12 @@ impl TryFrom<ObjectResult> for Output {
.map(|(column, vector)| {
let datatype = vector.data_type();
// nullable or not, does not affect the output
ColumnSchema::new(&column.column_name, datatype, true)
let mut column_schema =
ColumnSchema::new(&column.column_name, datatype, true);
if column.semantic_type == SemanticType::Timestamp as i32 {
column_schema = column_schema.with_time_index(true);
}
column_schema
})
.collect::<Vec<ColumnSchema>>();
@@ -239,100 +230,11 @@ impl TryFrom<ObjectResult> for Output {
}
}
fn column_to_vector(column: &Column, rows: u32) -> Result<VectorRef> {
let wrapper =
ColumnDataTypeWrapper::try_new(column.datatype).context(error::ColumnDataTypeSnafu)?;
let column_datatype = wrapper.datatype();
let rows = rows as usize;
let mut vector = VectorBuilder::with_capacity(wrapper.into(), rows);
if let Some(values) = &column.values {
let values = collect_column_values(column_datatype, values);
let mut values_iter = values.into_iter();
let null_mask = BitVec::from_slice(&column.null_mask);
let mut nulls_iter = null_mask.iter().by_vals().fuse();
for i in 0..rows {
if let Some(true) = nulls_iter.next() {
vector.push_null();
} else {
let value_ref = values_iter.next().context(error::InvalidColumnProtoSnafu {
err_msg: format!(
"value not found at position {} of column {}",
i, &column.column_name
),
})?;
vector
.try_push_ref(value_ref)
.context(error::CreateVectorSnafu)?;
}
}
} else {
(0..rows).for_each(|_| vector.push_null());
}
Ok(vector.finish())
}
fn collect_column_values(column_datatype: ColumnDataType, values: &Values) -> Vec<ValueRef> {
macro_rules! collect_values {
($value: expr, $mapper: expr) => {
$value.iter().map($mapper).collect::<Vec<ValueRef>>()
};
}
match column_datatype {
ColumnDataType::Boolean => collect_values!(values.bool_values, |v| ValueRef::from(*v)),
ColumnDataType::Int8 => collect_values!(values.i8_values, |v| ValueRef::from(*v as i8)),
ColumnDataType::Int16 => {
collect_values!(values.i16_values, |v| ValueRef::from(*v as i16))
}
ColumnDataType::Int32 => {
collect_values!(values.i32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Int64 => {
collect_values!(values.i64_values, |v| ValueRef::from(*v as i64))
}
ColumnDataType::Uint8 => {
collect_values!(values.u8_values, |v| ValueRef::from(*v as u8))
}
ColumnDataType::Uint16 => {
collect_values!(values.u16_values, |v| ValueRef::from(*v as u16))
}
ColumnDataType::Uint32 => {
collect_values!(values.u32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Uint64 => {
collect_values!(values.u64_values, |v| ValueRef::from(*v as u64))
}
ColumnDataType::Float32 => collect_values!(values.f32_values, |v| ValueRef::from(*v)),
ColumnDataType::Float64 => collect_values!(values.f64_values, |v| ValueRef::from(*v)),
ColumnDataType::Binary => {
collect_values!(values.binary_values, |v| ValueRef::from(v.as_slice()))
}
ColumnDataType::String => {
collect_values!(values.string_values, |v| ValueRef::from(v.as_str()))
}
ColumnDataType::Date => {
collect_values!(values.date_values, |v| ValueRef::Date(Date::new(*v)))
}
ColumnDataType::Datetime => {
collect_values!(values.datetime_values, |v| ValueRef::DateTime(
DateTime::new(*v)
))
}
ColumnDataType::Timestamp => {
collect_values!(values.ts_millis_values, |v| ValueRef::Timestamp(
Timestamp::from_millis(*v)
))
}
}
}
#[cfg(test)]
mod tests {
use datanode::server::grpc::select::{null_mask, values};
use api::helper::ColumnDataTypeWrapper;
use api::v1::Column;
use common_grpc::select::{null_mask, values};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
Int16Vector, Int32Vector, Int64Vector, Int8Vector, StringVector, UInt16Vector,
@@ -416,12 +318,11 @@ mod tests {
fn create_test_column(vector: VectorRef) -> Column {
let wrapper: ColumnDataTypeWrapper = vector.data_type().try_into().unwrap();
let array = vector.to_arrow_array();
Column {
column_name: "test".to_string(),
semantic_type: 1,
values: Some(values(&[array.clone()]).unwrap()),
null_mask: null_mask(&vec![array], vector.len()),
values: Some(values(&[vector.clone()]).unwrap()),
null_mask: null_mask(&[vector.clone()], vector.len()),
datatype: wrapper.datatype() as i32,
}
}

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::sync::Arc;
@@ -25,8 +39,9 @@ pub enum Error {
#[snafu(display("Missing result header"))]
MissingHeader,
#[snafu(display("Tonic internal error, source: {}", source))]
#[snafu(display("Tonic internal error, addr: {}, source: {}", addr, source))]
TonicStatus {
addr: String,
source: tonic::Status,
backtrace: Backtrace,
},
@@ -47,24 +62,12 @@ pub enum Error {
#[snafu(display("Mutate result has failure {}", failure))]
MutateFailure { failure: u32, backtrace: Backtrace },
#[snafu(display("Invalid column proto: {}", err_msg))]
InvalidColumnProto {
err_msg: String,
backtrace: Backtrace,
},
#[snafu(display("Column datatype error, source: {}", source))]
ColumnDataType {
#[snafu(backtrace)]
source: api::error::Error,
},
#[snafu(display("Failed to create vector, source: {}", source))]
CreateVector {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
#[snafu(display("Failed to create RecordBatches, source: {}", source))]
CreateRecordBatches {
#[snafu(backtrace)]
@@ -96,6 +99,12 @@ pub enum Error {
#[snafu(backtrace)]
source: common_grpc::error::Error,
},
#[snafu(display("Failed to convert column to vector, source: {}", source))]
ColumnToVector {
#[snafu(backtrace)]
source: common_grpc_expr::error::Error,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -111,15 +120,13 @@ impl ErrorExt for Error {
| Error::Datanode { .. }
| Error::EncodePhysical { .. }
| Error::MutateFailure { .. }
| Error::InvalidColumnProto { .. }
| Error::ColumnDataType { .. }
| Error::MissingField { .. } => StatusCode::Internal,
Error::ConvertSchema { source } | Error::CreateVector { source } => {
source.status_code()
}
Error::ConvertSchema { source } => source.status_code(),
Error::CreateRecordBatches { source } => source.status_code(),
Error::CreateChannel { source, .. } => source.status_code(),
Error::IllegalGrpcClientState { .. } => StatusCode::Unexpected,
Error::ColumnToVector { source, .. } => source.status_code(),
}
}

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
pub mod admin;
mod client;
mod database;
@@ -6,8 +20,6 @@ pub mod load_balance;
pub use api;
pub use self::{
client::Client,
database::{Database, ObjectResult, Select},
error::{Error, Result},
};
pub use self::client::Client;
pub use self::database::{Database, ObjectResult, Select};
pub use self::error::{Error, Result};

@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use enum_dispatch::enum_dispatch;
use rand::seq::SliceRandom;

@@ -1,7 +1,8 @@
[package]
name = "cmd"
version = "0.1.0"
edition = "2021"
version.workspace = true
edition.workspace = true
license.workspace = true
default-run = "greptime"
[[bin]]
@@ -9,6 +10,7 @@ name = "greptime"
path = "src/bin/greptime.rs"
[dependencies]
anymap = "1.0.0-beta.2"
clap = { version = "3.1", features = ["derive"] }
common-error = { path = "../common/error" }
common-telemetry = { path = "../common/telemetry", features = [
@@ -17,7 +19,10 @@ common-telemetry = { path = "../common/telemetry", features = [
datanode = { path = "../datanode" }
frontend = { path = "../frontend" }
futures = "0.3"
meta-client = { path = "../meta-client" }
meta-srv = { path = "../meta-srv" }
serde = "1.0"
servers = { path = "../servers" }
snafu = { version = "0.7", features = ["backtraces"] }
tokio = { version = "1.18", features = ["full"] }
toml = "0.5"
@@ -25,3 +30,6 @@ toml = "0.5"
[dev-dependencies]
serde = "1.0"
tempdir = "0.3"
[build-dependencies]
build-data = "0.1.3"

src/cmd/build.rs Normal file

@@ -0,0 +1,29 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
const DEFAULT_VALUE: &str = "unknown";
fn main() {
println!(
"cargo:rustc-env=GIT_COMMIT={}",
build_data::get_git_commit().unwrap_or_else(|_| DEFAULT_VALUE.to_string())
);
println!(
"cargo:rustc-env=GIT_BRANCH={}",
build_data::get_git_branch().unwrap_or_else(|_| DEFAULT_VALUE.to_string())
);
println!(
"cargo:rustc-env=GIT_DIRTY={}",
build_data::get_git_dirty().map_or(DEFAULT_VALUE.to_string(), |v| v.to_string())
);
}
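
Note on the build script above: each `cargo:rustc-env=NAME=value` line it prints becomes a compile-time environment variable, which is what `print_version` in `greptime.rs` reads through `env!`. The sketch below is a std-only variant that uses `option_env!` so it still compiles when the build script has not run. The `version_string` helper and its "unknown" fallback are assumptions made for illustration and are not part of this diff.

```rust
/// Hypothetical helper (not in the diff): builds a version banner from
/// compile-time variables, falling back to "unknown" when a variable
/// was not set by a build script.
fn version_string() -> String {
    format!(
        "\nbranch: {}\ncommit: {}\ndirty: {}",
        // `option_env!` yields `Option<&'static str>` at compile time,
        // whereas `env!` (used in the diff) fails the build if the variable is unset.
        option_env!("GIT_BRANCH").unwrap_or("unknown"),
        option_env!("GIT_COMMIT").unwrap_or("unknown"),
        option_env!("GIT_DIRTY").unwrap_or("unknown"),
    )
}
```

The diff instead puts the fallback in `build.rs`, so `env!` always finds the variables. The trade-off is the same either way: a build outside a git checkout reports "unknown" rather than failing.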


@@ -1,15 +1,26 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt;
use clap::Parser;
use cmd::datanode;
use cmd::error::Result;
use cmd::frontend;
use cmd::metasrv;
use common_telemetry::logging::error;
use common_telemetry::logging::info;
use cmd::{datanode, frontend, metasrv, standalone};
use common_telemetry::logging::{error, info};
#[derive(Parser)]
#[clap(name = "greptimedb")]
#[clap(name = "greptimedb", version = print_version())]
struct Command {
#[clap(long, default_value = "/tmp/greptimedb/logs")]
log_dir: String,
@@ -33,6 +44,8 @@ enum SubCommand {
Frontend(frontend::Command),
#[clap(name = "metasrv")]
Metasrv(metasrv::Command),
#[clap(name = "standalone")]
Standalone(standalone::Command),
}
impl SubCommand {
@@ -41,6 +54,7 @@ impl SubCommand {
SubCommand::Datanode(cmd) => cmd.run().await,
SubCommand::Frontend(cmd) => cmd.run().await,
SubCommand::Metasrv(cmd) => cmd.run().await,
SubCommand::Standalone(cmd) => cmd.run().await,
}
}
}
@@ -51,10 +65,24 @@ impl fmt::Display for SubCommand {
SubCommand::Datanode(..) => write!(f, "greptime-datanode"),
SubCommand::Frontend(..) => write!(f, "greptime-frontend"),
SubCommand::Metasrv(..) => write!(f, "greptime-metasrv"),
SubCommand::Standalone(..) => write!(f, "greptime-standalone"),
}
}
}
fn print_version() -> &'static str {
concat!(
"\nbranch: ",
env!("GIT_BRANCH"),
"\ncommit: ",
env!("GIT_COMMIT"),
"\ndirty: ",
env!("GIT_DIRTY"),
"\nversion: ",
env!("CARGO_PKG_VERSION")
)
}
#[tokio::main]
async fn main() -> Result<()> {
let cmd = Command::parse();


@@ -1,6 +1,22 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use clap::Parser;
use common_telemetry::logging;
use datanode::datanode::{Datanode, DatanodeOptions, Mode};
use datanode::datanode::{Datanode, DatanodeOptions, ObjectStoreConfig};
use meta_client::MetaClientOpts;
use servers::Mode;
use snafu::ResultExt;
use crate::error::{Error, MissingConfigSnafu, Result, StartDatanodeSnafu};
@@ -31,22 +47,22 @@ impl SubCommand {
}
}
#[derive(Debug, Parser)]
#[derive(Debug, Parser, Default)]
struct StartCommand {
#[clap(long)]
node_id: Option<u64>,
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
rpc_addr: Option<String>,
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
#[clap(long)]
metasrv_addr: Option<String>,
#[clap(short, long)]
config_file: Option<String>,
#[clap(long)]
data_dir: Option<String>,
#[clap(long)]
wal_dir: Option<String>,
}
impl StartCommand {
@@ -75,43 +91,41 @@ impl TryFrom<StartCommand> for DatanodeOptions {
DatanodeOptions::default()
};
if let Some(addr) = cmd.http_addr {
opts.http_addr = addr;
}
if let Some(addr) = cmd.rpc_addr {
opts.rpc_addr = addr;
}
if let Some(addr) = cmd.mysql_addr {
opts.mysql_addr = addr;
}
if let Some(addr) = cmd.postgres_addr {
opts.postgres_addr = addr;
if let Some(node_id) = cmd.node_id {
opts.node_id = Some(node_id);
}
match (cmd.metasrv_addr, cmd.node_id) {
(Some(meta_addr), Some(node_id)) => {
// Running mode is only set to Distributed when
// both metasrv addr and node id are set in
// commandline options
opts.meta_client_opts.metasrv_addr = meta_addr;
opts.node_id = node_id;
opts.mode = Mode::Distributed;
}
(None, None) => {
opts.mode = Mode::Standalone;
}
(None, Some(_)) => {
return MissingConfigSnafu {
msg: "Missing metasrv address option",
}
.fail();
}
(Some(_), None) => {
return MissingConfigSnafu {
msg: "Missing node id option",
}
.fail();
if let Some(meta_addr) = cmd.metasrv_addr {
opts.meta_client_opts
.get_or_insert_with(MetaClientOpts::default)
.metasrv_addrs = meta_addr
.split(',')
.map(&str::trim)
.map(&str::to_string)
.collect::<_>();
opts.mode = Mode::Distributed;
}
if let (Mode::Distributed, None) = (&opts.mode, &opts.node_id) {
return MissingConfigSnafu {
msg: "Missing node id option",
}
.fail();
}
if let Some(data_dir) = cmd.data_dir {
opts.storage = ObjectStoreConfig::File { data_dir };
}
if let Some(wal_dir) = cmd.wal_dir {
opts.wal.dir = wal_dir;
}
Ok(opts)
}
@@ -119,45 +133,44 @@ impl TryFrom<StartCommand> for DatanodeOptions {
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use datanode::datanode::ObjectStoreConfig;
use servers::Mode;
use super::*;
#[test]
fn test_read_from_config_file() {
let cmd = StartCommand {
node_id: None,
http_addr: None,
rpc_addr: None,
mysql_addr: None,
postgres_addr: None,
metasrv_addr: None,
config_file: Some(format!(
"{}/../../config/datanode.example.toml",
std::env::current_dir().unwrap().as_path().to_str().unwrap()
)),
..Default::default()
};
let options: DatanodeOptions = cmd.try_into().unwrap();
assert_eq!("0.0.0.0:3000".to_string(), options.http_addr);
assert_eq!("0.0.0.0:3001".to_string(), options.rpc_addr);
assert_eq!("/tmp/greptimedb/wal".to_string(), options.wal_dir);
assert_eq!("0.0.0.0:3306".to_string(), options.mysql_addr);
assert_eq!("127.0.0.1:3001".to_string(), options.rpc_addr);
assert_eq!("/tmp/greptimedb/wal".to_string(), options.wal.dir);
assert_eq!("127.0.0.1:4406".to_string(), options.mysql_addr);
assert_eq!(4, options.mysql_runtime_size);
assert_eq!(
"1.1.1.1:3002".to_string(),
options.meta_client_opts.metasrv_addr
);
assert_eq!(5000, options.meta_client_opts.connect_timeout_millis);
assert_eq!(3000, options.meta_client_opts.timeout_millis);
assert!(options.meta_client_opts.tcp_nodelay);
let MetaClientOpts {
metasrv_addrs: metasrv_addr,
timeout_millis,
connect_timeout_millis,
tcp_nodelay,
} = options.meta_client_opts.unwrap();
assert_eq!("0.0.0.0:5432".to_string(), options.postgres_addr);
assert_eq!(4, options.postgres_runtime_size);
assert_eq!(vec!["127.0.0.1:3002".to_string()], metasrv_addr);
assert_eq!(5000, connect_timeout_millis);
assert_eq!(3000, timeout_millis);
assert!(!tcp_nodelay);
match options.storage {
ObjectStoreConfig::File { data_dir } => {
assert_eq!("/tmp/greptimedb/data/".to_string(), data_dir)
}
ObjectStoreConfig::S3 { .. } => unreachable!(),
};
}
@@ -165,53 +178,54 @@ mod tests {
fn test_try_from_cmd() {
assert_eq!(
Mode::Standalone,
DatanodeOptions::try_from(StartCommand {
node_id: None,
http_addr: None,
rpc_addr: None,
mysql_addr: None,
postgres_addr: None,
metasrv_addr: None,
config_file: None
})
.unwrap()
.mode
DatanodeOptions::try_from(StartCommand::default())
.unwrap()
.mode
);
assert_eq!(
Mode::Distributed,
DatanodeOptions::try_from(StartCommand {
node_id: Some(42),
http_addr: None,
rpc_addr: None,
mysql_addr: None,
postgres_addr: None,
metasrv_addr: Some("127.0.0.1:3002".to_string()),
config_file: None
})
.unwrap()
.mode
);
assert!(DatanodeOptions::try_from(StartCommand {
node_id: None,
http_addr: None,
rpc_addr: None,
mysql_addr: None,
postgres_addr: None,
metasrv_addr: Some("127.0.0.1:3002".to_string()),
config_file: None,
})
.is_err());
assert!(DatanodeOptions::try_from(StartCommand {
let mode = DatanodeOptions::try_from(StartCommand {
node_id: Some(42),
http_addr: None,
rpc_addr: None,
mysql_addr: None,
postgres_addr: None,
metasrv_addr: None,
config_file: None,
metasrv_addr: Some("127.0.0.1:3002".to_string()),
..Default::default()
})
.unwrap()
.mode;
assert_matches!(mode, Mode::Distributed);
assert!(DatanodeOptions::try_from(StartCommand {
metasrv_addr: Some("127.0.0.1:3002".to_string()),
..Default::default()
})
.is_err());
// Providing node_id while leaving metasrv_addr absent is ok, since metasrv_addr has a default value
DatanodeOptions::try_from(StartCommand {
node_id: Some(42),
..Default::default()
})
.unwrap();
}
#[test]
fn test_merge_config() {
let dn_opts = DatanodeOptions::try_from(StartCommand {
config_file: Some(format!(
"{}/../../config/datanode.example.toml",
std::env::current_dir().unwrap().as_path().to_str().unwrap()
)),
..Default::default()
})
.unwrap();
assert_eq!(Some(42), dn_opts.node_id);
let MetaClientOpts {
metasrv_addrs: metasrv_addr,
timeout_millis,
connect_timeout_millis,
tcp_nodelay,
} = dn_opts.meta_client_opts.unwrap();
assert_eq!(vec!["127.0.0.1:3002".to_string()], metasrv_addr);
assert_eq!(3000, timeout_millis);
assert_eq!(5000, connect_timeout_millis);
assert!(!tcp_nodelay);
}
}
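
Note on the new `TryFrom<StartCommand>` logic above: the `--metasrv-addr` value is now split on commas into a list, and supplying it switches the node to distributed mode, which then requires a node id. The following is a minimal sketch of that flow using simplified stand-in types. `Opts`, `parse_metasrv_addrs`, `resolve`, and the `String` error are hypothetical names introduced here, and the sketch starts from a standalone default rather than from a config file as the real code does.

```rust
#[derive(Debug, PartialEq)]
enum Mode {
    Standalone,
    Distributed,
}

/// Simplified stand-in for `DatanodeOptions` (assumption, not the real struct).
#[derive(Debug)]
struct Opts {
    mode: Mode,
    node_id: Option<u64>,
    metasrv_addrs: Vec<String>,
}

/// Splits a comma-separated address list and trims each entry,
/// mirroring the `split(',')` / `trim` / `to_string` chain in the diff.
/// Empty segments are kept as empty strings, as in the diff.
fn parse_metasrv_addrs(raw: &str) -> Vec<String> {
    raw.split(',').map(str::trim).map(str::to_string).collect()
}

/// Applies command-line overrides in the same order as the diff:
/// node id first, then metasrv addresses (which imply distributed mode),
/// then the check that distributed mode has a node id.
fn resolve(node_id: Option<u64>, metasrv_addr: Option<&str>) -> Result<Opts, String> {
    let mut opts = Opts {
        mode: Mode::Standalone, // assumed default when no config file is given
        node_id: None,
        metasrv_addrs: Vec::new(),
    };
    if let Some(id) = node_id {
        opts.node_id = Some(id);
    }
    if let Some(raw) = metasrv_addr {
        opts.metasrv_addrs = parse_metasrv_addrs(raw);
        opts.mode = Mode::Distributed;
    }
    if opts.mode == Mode::Distributed && opts.node_id.is_none() {
        return Err("Missing node id option".to_string());
    }
    Ok(opts)
}
```

Under these assumptions, `resolve(Some(42), Some("127.0.0.1:3002, 127.0.0.1:3003"))` yields distributed mode with two addresses, `resolve(None, Some("127.0.0.1:3002"))` is an error, and `resolve(Some(42), None)` succeeds in standalone mode. This matches the cases exercised by `test_try_from_cmd`.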


@@ -1,3 +1,17 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use common_error::prelude::*;
@@ -38,6 +52,15 @@ pub enum Error {
#[snafu(display("Missing config, msg: {}", msg))]
MissingConfig { msg: String, backtrace: Backtrace },
#[snafu(display("Illegal config: {}", msg))]
IllegalConfig { msg: String, backtrace: Backtrace },
#[snafu(display("Illegal auth config: {}", source))]
IllegalAuthConfig {
#[snafu(backtrace)]
source: servers::auth::Error,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -51,6 +74,8 @@ impl ErrorExt for Error {
Error::ReadConfig { .. } | Error::ParseConfig { .. } | Error::MissingConfig { .. } => {
StatusCode::InvalidArguments
}
Error::IllegalConfig { .. } => StatusCode::InvalidArguments,
Error::IllegalAuthConfig { .. } => StatusCode::InvalidArguments,
}
}
@@ -72,10 +97,7 @@ mod tests {
#[test]
fn test_start_node_error() {
fn throw_datanode_error() -> StdResult<datanode::error::Error> {
datanode::error::MissingFieldSnafu {
field: "test_field",
}
.fail()
datanode::error::MissingNodeIdSnafu {}.fail()
}
let e = throw_datanode_error()

Some files were not shown because too many files have changed in this diff