Compare commits


122 Commits

Author SHA1 Message Date
Discord9
f995204060 test: more reduce tests 2023-09-06 16:38:51 +08:00
Discord9
93561291e4 support more binary functions 2023-09-06 16:38:51 +08:00
Discord9
9f59d68391 eval func 2023-09-06 16:37:49 +08:00
Discord9
51083b12bd reduce_bucketed 2023-09-06 16:37:49 +08:00
Discord9
c80165c377 test: simple render 2023-09-06 16:37:49 +08:00
Discord9
76d8709774 sink&source 2023-09-06 16:37:49 +08:00
Discord9
2cf7d6d569 feat: build_accumulable 2023-09-06 16:37:49 +08:00
Discord9
045c8079e6 feat: flow util func 2023-09-06 16:37:49 +08:00
Discord9
54f2f6495f mfp & reduce partially 2023-09-06 16:37:49 +08:00
Discord9
2798d266f5 feat: render plan partially written 2023-09-06 16:37:49 +08:00
Discord9
824d03a642 working on reduce 2023-09-06 16:36:41 +08:00
Discord9
47f41371d0 Arrangement&types 2023-09-06 16:36:41 +08:00
Discord9
d702b6e5c4 use newer DD 2023-09-06 16:36:41 +08:00
Discord9
13c02f3f92 basic skeleton 2023-09-06 16:36:41 +08:00
Discord9
b52eb2313e renamed as greptime-flow 2023-09-06 16:36:41 +08:00
Discord9
d422bc8401 basic demo 2023-09-06 16:36:41 +08:00
Zou Wei
b8c50d00aa feat: sqlness test for interval type (#2265)
* feat: add integration-test for interval type.

* chore: add two cases.

* chore: cr

* chore: Field to Column
2023-09-04 14:30:48 +08:00
Ruihang Xia
a12ee5cab8 fix: qualify inputs on handling join in promql (#2297)
* add qualifier to join inputs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add one more case

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test results

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-09-01 11:51:34 +08:00
ZonaHe
a0d15b489a feat: update dashboard to v0.3.2 (#2295)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2023-08-31 22:05:00 +08:00
shuiyisong
baa372520d fix: json compatibility to null (#2287)
* fix: existing null value for schema name value

* chore: fix null check

* fix: change catalognamevalue and schemanamevalue to option

* fix: fix null case
2023-08-31 14:21:10 +08:00
shuiyisong
5df4d44761 feat: schema level opts (#2283)
* chore: update proto

* chore: add try from for schema name value

* chore: merge schema opts to table opts while creating table

* chore: use table ttl opts first

* chore: add unit test

* chore: update proto version
2023-08-30 08:11:08 +00:00
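
A note on the mechanics of #2283: schema-level options are merged into a table's options when the table is created, and the table's own settings win ("use table ttl opts first"). A minimal Rust sketch of that precedence, with names and semantics assumed rather than taken from the crate:

use std::collections::HashMap;

// Sketch only: schema-level options form the base and table-level
// options override them, so a table's own `ttl` beats the schema's.
fn merge_opts(
    schema_opts: &HashMap<String, String>,
    table_opts: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut merged = schema_opts.clone();
    merged.extend(table_opts.clone()); // later inserts win
    merged
}

fn main() {
    let schema_opts = HashMap::from([("ttl".to_string(), "7d".to_string())]);
    let table_opts = HashMap::from([("ttl".to_string(), "1d".to_string())]);
    assert_eq!(merge_opts(&schema_opts, &table_opts)["ttl"], "1d");
}
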
Weny Xu
8e9f2ffce4 fix: skip procedure if target route is not found (#2277)
* fix: skip procedure if target route is not found

* chore: apply suggestions from CR
2023-08-30 06:59:50 +00:00
Weny Xu
1101e7bb18 fix: deregister table after keeper closes table (#2278)
* fix: deregister table after keeper closes table

* chore: apply suggestions from CR
2023-08-30 03:43:04 +00:00
zyy17
5fbc941023 ci: upload the latest artifacts to 'latest/' directory of S3 bucket in scheduled and formal release (#2276)
Signed-off-by: zyy17 <zyylsxm@gmail.com>
2023-08-29 09:00:45 +00:00
Bamboo1
68600a2cf9 feat(mito2): add file purger and cooperate with scheduler to purge sst files (#2251)
* feat: add file purger and use scheduler

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* feat: print some information about handling error message

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: resolve conversion

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: resolve conversation

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: resolve conflicting files

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

---------

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>
2023-08-29 07:55:03 +00:00
Yingwen
805f254d15 feat(mito): Flush framework for mito2 (#2262)
* feat: write buffer manager

* feat: skeleton

* feat: add flush logic to write path

* feat: add methods to memtable trait

* feat: freeze memtable

* feat: define flush task

* feat: schedule_flush wip

* feat: adding pending requests/tasks

* feat: separate ddl request and background request

* feat: Remove RegionTask and RequestBody

* feat: handle flush related requests

* feat: make tests pass

* style: fix clippy

* docs: update comment

* refactor: rename background requests

* feat: replace Option<RegionWriteCtx> with an enum MaybeStalling
2023-08-29 07:13:15 +00:00
Zhenchi
2a6c830ca7 refactor(table): remove Table impl for system (#2270)
* refactor(table): remove Table impl for system

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: format & import

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-29 03:43:43 +00:00
Weny Xu
22dea02485 fix: use RegionId region number instead (#2273) 2023-08-29 02:52:24 +00:00
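
Context for #2273: a RegionId packs the table id together with a per-table region number into one u64, so the region number can be recovered from the id itself. A rough sketch of such a layout (the high/low 32-bit split is an assumption here):

// Assumed layout: table id in the high 32 bits, region number in the low 32.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct RegionId(u64);

impl RegionId {
    fn new(table_id: u32, region_number: u32) -> Self {
        Self(((table_id as u64) << 32) | region_number as u64)
    }
    fn table_id(self) -> u32 {
        (self.0 >> 32) as u32
    }
    fn region_number(self) -> u32 {
        self.0 as u32
    }
}

fn main() {
    let id = RegionId::new(1024, 1);
    assert_eq!(id.table_id(), 1024);
    assert_eq!(id.region_number(), 1);
}
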
LFC
ef75e8f7c3 feat: create distributed Mito2 table (#2246)
* feat: create distributed Mito2 table

* rebase develop
2023-08-28 12:07:52 +00:00
Weny Xu
71fc3c42d9 fix: open region does not register catalog/schema (#2271)
* fix: open region does not register catalog/schema

* fix: fix ci
2023-08-28 12:06:10 +00:00
JeremyHi
c02ac36ce8 feat: avoid confusion in desc table (#2272)
feat: Field to Column to avoid confusion in DESC TABLE
2023-08-28 11:50:33 +00:00
Lei, HUANG
c112b9a763 feat(mito2): WAL replay (#2264)
* feat: replay memtable when opening table

* test: region replay

* refactor: save logstore in TestEnv

* fix: some cr comments

* chore: rebase develop

* chore: update last entry id during replay
2023-08-28 11:45:23 +00:00
Weny Xu
96fd17aa0a fix: fix typos (#2268) 2023-08-28 09:26:00 +00:00
Ruihang Xia
6b8cf0bbf0 feat: impl region engine for mito (#2269)
* update proto

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* convert request

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update proto

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* import result convertor

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename symbols

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-28 09:24:12 +00:00
Yingwen
e2522dff21 feat(mito): Skeleton for scanning a region (#2230)
* feat: define stream builder

* feat: scan region wip

* feat: create SeqScan in ScanRegion

* feat: scanner

* feat: engine handles scan request

* feat: map projection index to column id

* feat: Impl record batch stream

* refactor: change BatchConverter to ProjectionMapper

* feat: add column_ids to mapper

* feat: implement SeqScan::build()

* chore: fix typo

* docs: add mermaid for ScanRegion

* style: fix clippy

* test: fix record batch test

* fix: update sequence and entry id

* test: test query

* feat: address CR comment

* chore: address CR comments

* chore: Update src/mito2/src/read/scan_region.rs

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>

---------

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
2023-08-28 06:59:31 +00:00
LFC
d8f851bef2 fix: keep region failover state not changed upon failure (#2261) 2023-08-28 04:40:47 +00:00
JeremyHi
63b22b2403 feat: prometheus row inserter (#2263)
* feat: prometheus row inserter

* chore: add unit test

* refactor: to row_insert_requests

* chore: typo

* chore: alloc row by TableData

* chore: by review comment
2023-08-28 03:22:23 +00:00
Weny Xu
c56f5e39cd refactor: set default metasrv procedure retry times to 12 (#2242) 2023-08-26 07:41:15 +00:00
Weny Xu
7ff200c0fa fix: align region numbers to real regions (#2257) 2023-08-25 11:48:58 +00:00
dennis zhuang
5160838d04 chore: change version to 0.4.0-nightly (#2258)
* chore: change version to 0.4.0-nightly

* fix: test
2023-08-25 09:44:39 +00:00
shuiyisong
f16f58266e refactor: query_ctx from http middleware (#2253)
* chore: change userinfo to query_ctx in http handler

* chore: minor change

* chore: move prometheus http to http mod

* chore: fix unit test

* chore: add back schema check

* chore: minor change

* chore: remove clone
2023-08-25 09:36:33 +00:00
Ruihang Xia
8d446ed741 fix: quote ident on rendered SQL (#2248)
* fix: quote ident on rendered SQL

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* read quote style from query context

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-25 07:25:21 +00:00
JeremyHi
de1daec680 feat: upgrade desc table output (#2256) 2023-08-25 06:52:22 +00:00
Zhenchi
9d87c8b6de refactor(table): cleanup dist table (#2255)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-25 06:37:39 +00:00
Lei, HUANG
6bf260a05c chore: write to mito2 (#2250)
* chore: write to mito2

* fix: clippy

* feat: bridge memtable

* chore: rebase develop
2023-08-25 06:18:42 +00:00
WU Jingdi
15912afd96 fix: the inconsistent order of input/output in range select (#2229)
* fix: the inconsistent order of input/output in range select

* chore: apply CR
2023-08-25 04:12:59 +00:00
Lei, HUANG
dbe0e95f2f feat(mito2): concat and projection (#2243)
* refactor: use arrow::compute::concat instead of push values to vector builders

* feat: support projection

* refactor: remove sequence

* refactor: concatenate

* fix: series must not be empty

* refactor: projection
2023-08-25 03:25:27 +00:00
Ruihang Xia
20b7f907b2 fix: promql planner should clear its states on each selector (#2247)
* reset planner status on selector

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add empty line

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* sort result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* mask fields to keep ordering

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-25 03:07:44 +00:00
Weny Xu
b13d932e4e fix: fix RegionAliveKeeper does not find the table after restarting (#2249) 2023-08-25 03:05:17 +00:00
Bamboo1
48348aa364 fix: fix test_scheduler_continuous_stop in scheduler (#2252)
* fix: fix test_scheduler_continuous_stop in scheduler

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: add document annotation

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

---------

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>
2023-08-25 02:59:48 +00:00
Zhenchi
9ce73e7ca1 refactor(frontend): TableScan instead of scan_to_stream for COPY TO (#2244)
* refactor(frontend): TableScan instead of `scan_to_stream` for `COPY TO`

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-24 12:46:54 +00:00
Ruihang Xia
b633a16667 feat: apply rewriter to subquery exprs (#2245)
* apply rewriter to subquery exprs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* workaround for datafusion's check

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix typo

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change time index type

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-24 11:48:04 +00:00
Zhenchi
0a6ab2a287 refactor(script): not to call scan_to_stream on table (#2241)
* refactor(script): not to call `scan_to_stream` on table

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: build plan via LogicalPlanBuilder

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-24 08:10:07 +00:00
JeremyHi
7746e5b172 feat: dist row inserter (#2231)
* feat: frontend row inserter

* feat: row splitter

chore: row splitter's unit test

* feat: RowDistInserter

* feat: make influxdb line protocol using row-based protocol

* Update src/partition/src/row_splitter.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* Update src/frontend/src/instance/distributed/row_inserter.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* chore: by review comment

* Update src/frontend/src/instance/distributed/row_inserter.rs

Co-authored-by: LFC <bayinamine@gmail.com>

* chore: by comment

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: LFC <bayinamine@gmail.com>
2023-08-24 06:58:05 +00:00
Weny Xu
a7e0e2330e fix: invalidate cache after altering (#2239) 2023-08-24 03:56:17 +00:00
Lei, HUANG
19d2d77b41 fix: parse large timestamp (#2185)
* feat: support parsing large timestamp values

* chore: update sqlness tests

* fix: tests

* fix: allow larger window
2023-08-24 03:52:15 +00:00
Yingwen
4ee1034012 feat(mito): merge reader for mito2 (#2210)
* feat: Implement slice and first/last timestamp for Batch

* feat(mito): implements sort/concat for Batch

* chore: fix typo

* chore: remove comments

* feat: sort and dedup

* test: test batch operations

* chore: cast enum to test op type

* test: test filter related api

* style: fix clippy

* feat: implement Node and CompareFirst

* feat: merge reader wip

* feat: merge wip

* feat: use batch's operation to sort and dedup

* feat: implement BatchReader for MergeReader

* feat: simplify codes

* test: test merge reader

* refactor: use test util to create batch

* refactor: remove unused imports

* feat: update comment

* chore: remove metadata() from Source

* chore: update comment

* feat: source supports batch iterator

* chore: update comment
2023-08-24 03:37:51 +00:00
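
The merge reader in #2210 leans on a sort-and-dedup invariant: rows ordered by (primary key, timestamp, sequence descending), keeping only the newest row per (key, timestamp). A toy illustration of that invariant, not the crate's actual code:

fn main() {
    // (primary key, timestamp, sequence)
    let mut rows = vec![("a", 100, 2), ("a", 100, 1), ("a", 200, 1), ("b", 100, 3)];
    // Sort by key and timestamp ascending, then by sequence descending.
    rows.sort_by(|x, y| {
        (x.0, x.1, std::cmp::Reverse(x.2)).cmp(&(y.0, y.1, std::cmp::Reverse(y.2)))
    });
    // Dedup keeps the first, i.e. newest, row per (key, timestamp).
    rows.dedup_by_key(|r| (r.0, r.1));
    assert_eq!(rows, vec![("a", 100, 2), ("a", 200, 1), ("b", 100, 3)]);
}
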
Ruihang Xia
e5ba3d1708 feat: rewrite the dist analyzer (#2238)
* it works!

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* clean up

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add documents

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unstable timestamp from sqlness test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename rewriter struct

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-24 03:29:08 +00:00
dennis zhuang
8b1f4eb958 feat: types sqlness tests (#2073)
* feat: timestamp types sqlness tests

* feat: adds timestamp tests

* test: add string tests

* test: comment a case in timestamp

* test: add float type tests

* chore: adds TODO

* feat: set TZ=UTC for sqlness test
2023-08-24 03:26:19 +00:00
discord9
eca7e87129 chore: try from value (#2236)
* chore: try from value

* chore: add TryFromValueError variant
2023-08-24 02:44:13 +00:00
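
#2236 adds fallible conversions out of the dynamically typed Value, with a dedicated error variant for mismatches. A hand-written sketch of the idea using illustrative types, not the crate's real definitions:

#[derive(Debug)]
enum Value {
    Int64(i64),
    Null,
}

#[derive(Debug)]
enum Error {
    TryFromValue { reason: String },
}

impl TryFrom<Value> for i64 {
    type Error = Error;
    fn try_from(v: Value) -> Result<i64, Error> {
        match v {
            Value::Int64(i) => Ok(i),
            other => Err(Error::TryFromValue {
                reason: format!("expected Int64, got {other:?}"),
            }),
        }
    }
}

fn main() {
    assert_eq!(i64::try_from(Value::Int64(5)).unwrap(), 5);
    assert!(i64::try_from(Value::Null).is_err());
}
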
Weny Xu
beb92ba1d2 refactor: use table id instead of table ident (#2233) 2023-08-23 13:28:08 +00:00
Lei, HUANG
fdb5ad23bf refactor: use Batch::sort_and_dedup instead of Values::sort_in_place (#2235) 2023-08-23 08:56:49 +00:00
Ruihang Xia
d581688fd2 fix: dist planner has wrong behavior in table with multiple partitions (#2237)
* fix: dist planner has wrong behavior in table with multiple partitions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update tests/cases/distributed/explain/multi_partitions.sql

Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-23 08:32:20 +00:00
Bamboo1
4dbc32f532 refactor: remove associate type in scheduler to simplify it #2153 (#2194)
* feature: add a simple scheduler using flume

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: only use one sender rather than cloning many senders

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: use select to avoid loop

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* feat: add parameters to the new function to configure the flume capacity and the number of receivers

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* test: add countdownlatch test concurrency

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* test: replace countdownlatch with a barrier to test concurrency, and wait for all tasks to finish in stop

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: add some document annotation

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: add license header

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: add Cargo.lock

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: Cargo.toml format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: delete println in test

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* feat: add error handle

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: fix error handle and add test scheduler stop

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: spelling mistake

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: wait for all tasks to finish

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: add todo which need wrap Future returned by send_async

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* test: remove unnecessary sleep in test

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: resolve some conflicts

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* fix: resolve conversation

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* chore: code format

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

* feat: make schedule synchronous and drop the sender after stopping the scheduler

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>

---------

Signed-off-by: ZhuZiyi <zyzhu2001@gmail.com>
2023-08-23 06:28:00 +00:00
Zhenchi
af95e46512 refactor(table): eliminate calls to DistTable.delete (#2225)
* refactor(table): eliminate calls to DistTable.delete

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: clippy

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-23 02:33:48 +00:00
Weny Xu
d81ddd8879 chore: fix clippy (#2232) 2023-08-23 02:24:29 +00:00
Ning Sun
88247e4284 fix!: resolve residual issues with removing prometheus port (#2227)
* fix: resolve residual issues when removing prometheus port

* fix: remove prometheus from sample config as well
2023-08-23 01:49:11 +00:00
Ruihang Xia
18250c4803 feat: implement Flight and gRPC services for RegionServer (#2226)
* extract FlightCraft trait

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* split service handler in GrpcServer

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* left grpc server implement

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* start region server if configured

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-22 13:30:09 +00:00
dennis zhuang
18fa0e01ed feat: remove checkpoint_on_startup (#2228)
feat: update flushed manifest version when it is larger
2023-08-22 13:09:34 +00:00
Yingwen
cc3e198975 feat(mito): Implement operations like concat and sort for Batch (#2203)
* feat: Implement slice and first/last timestamp for Batch

* feat(mito): implements sort/concat for Batch

* chore: fix typo

* chore: remove comments

* feat: sort and dedup

* test: test batch operations

* chore: cast enum to test op type

* test: test filter related api

* style: fix clippy

* docs: comment for slice

* chore: address CR comment

Don't return Option in get_timestamp()/get_sequence()
2023-08-22 12:03:02 +00:00
Yingwen
cd3755c615 feat(mito): Support handling RegionWriteRequest (#2218)
* feat: convert region request to worker write request

* chore: remove unused codes

* test: fix tests compiler errors

* chore: remove create/close/open request from worker requests

* chore: add comment

* chore: fix typo
2023-08-22 11:16:00 +00:00
Lei, HUANG
be1e13c713 feat(mito2): time series memtable (#2208)
* feat: time series memtable

* feat: add some test

* fix: some clippy warnings

* chore: some rustdoc

* refactor: test

* fix: remove useless functions

* feat: add config for TimeSeriesMemtable

* chore: some optimize

* refactor: remove bucketing

* refactor: avoid cloning RegionMetadataRef across all Series; make initial_builder_capacity a const; sort batch only by timestamp and sequence
2023-08-22 08:40:46 +00:00
Zhenchi
cb3561f3b3 refactor(table): eliminate calls to DistTable.insert (#2219)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-22 06:15:02 +00:00
Niwaka
b3b43fe1c3 fix: table options can't be found in distributed mode (#2209)
* fix: table options can't be found in distributed mode

* refactor: use iterator for regions_numbers

* chore: remove TODO
2023-08-22 03:53:56 +00:00
WU Jingdi
b411769de6 feat: Implement a basic range select query (#2138)
* feat: Implement a basic range select query

* chore: support any timestamp type & CR fix
2023-08-22 03:07:14 +00:00
niebayes
e5f4ca2dab feat: streaming do_get (#2171)
* feat: rewrite do_get for streaming get flight data

* feat: rewrite do_get call stack but leave the async stream adapter not modified yet

* feat: rewrite the async stream adapter to accept greptime record batch stream

* fix: resolve some PR comments

* feat: rewrite tests to adapt to the streaming do_get

* feat: add unit tests for streaming do_get

* feat: rewrite timer metric of merge scan

* remove unhelpful unit tests for streaming do_get

* add a new metric timer for merge scan and fix some test errors

* rewrite mysql writer to write query results in a streaming manner

* fix: fix fmt errors

* fix: rewrite sqlness runner to take into account the streaming do_get

* fix: fix toml format errors

* fix: resolve some PR comments

* fix: resolve some PR comments

* fix: refactor do_get to increase readability

* fix: refactor mysql try_write_one to increase readability
2023-08-22 02:54:05 +00:00
Weny Xu
5b7b2cf77d fix: fix ddl client can not update leader addr (#2205)
* fix: fix ddl client can not update leader addr

* chore: apply suggestions from CR

* feat: add message to context

* fix: only retry if unavailable or deadline exceeded

* chore: apply suggestions from CR
2023-08-21 13:57:29 +00:00
shuiyisong
9352649f22 chore: add table region key to delete in upgrade tool (#2214) 2023-08-21 08:16:10 +00:00
shuiyisong
c5f507c20e fix: add user_info extension to prom_store handler (#2212)
chore: add user_info extension to prom_store auth
2023-08-21 04:55:34 +00:00
JeremyHi
033b650d0d feat: row write protocol (#2189)
* feat: datanode's row inserter

* refactor: ExprFactory

* feat: row inserter in standalone mode

* chore: minor refactor

* feat: influxdb line protocol's row protocol

* chore: minor refactor

* improve: avoid using too many strings

* no longer async

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* chore: do not check empty data

* chore: by review comment

* chore: by comment

* chore: by review comment

* chore: by review comment

---------

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-19 13:08:44 +00:00
dennis zhuang
272f649b22 fix: some TODO in sqlness cases and refactor meta-client error (#2207)
* fix: some TODO in sqlness cases and refactor meta-client error

* fix: delete tests/cases/standalone/alter/drop_col_not_null_next.output
2023-08-18 10:09:11 +00:00
Ruihang Xia
3150f4b22e fix: specify input ordering and distribution for prom plan (#2204)
* fix: specify input ordering and distribution for prom plan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update sqlness result

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-18 09:45:46 +00:00
Weny Xu
e1ce1d86a1 refactor: unite key serialization method (#2195) 2023-08-18 09:42:19 +00:00
ZonaHe
b8595e1960 feat: update dashboard to v0.3.1 (#2192)
Co-authored-by: ZonaHex <ZonaHex@users.noreply.github.com>
2023-08-18 09:42:18 +00:00
shuiyisong
61e6656fea fix: auth in prometheus gateway service (#2206)
* fix: auth in prometheus gateway service

* chore: remove unused code

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

---------

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-18 09:41:38 +00:00
Ruihang Xia
1bbec75f5b fix: skip partition clause in show create table (#2200)
* fix: skip partition clause in show create table

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update test results

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-18 09:10:31 +00:00
Zhenchi
8d6a2d0b59 refactor: apply numbers to ThinTable (#2202)
* refactor: apply numbers to `ThinTable`

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: tiny polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: unused import

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-18 03:11:37 +00:00
Weny Xu
177036475a fix: support to copy from parquet with typecast (#2201) 2023-08-18 03:09:54 +00:00
Zhenchi
87a730658a refactor: add ThinTable to proxy tables from infoschema (#2193)
* refactor: add thin table to proxy tables in info_schema

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix(catalog): fix typo in DataSourceAdapter struct name

* fix: remove redundant Send + Sync

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor(catalog): rename DataSourceAdapter to InformationTableDataSource

* feat(catalog): add ThinTableAdapter for adapting ThinTable to Table interface

* rebase develop

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: default impl for table_type of InformationTable

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: filter_pushdown as table field

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: remove explicit type declaration

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-17 15:19:14 +00:00
JeremyHi
b67e5bbf70 fix: invalid err msg (#2196) 2023-08-17 11:12:35 +00:00
Ruihang Xia
4aaf6aa51b feat: implement query API for RegionServer (#2197)
* some initial change

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* impl dummy structs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* decode and send logical plan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* implement table scan

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add some comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-17 11:02:31 +00:00
Weny Xu
6e6ff5a606 refactor: update table metadata in single txn (#2172)
* refactor: table-metadata-manager

* feat: remove comparing when deleting metadata

* fix: fix comment typos

* chore: apply suggestions from CR

* test: add tests for updating DatanodeTable

* fix: fix clippy

* chore: apply suggestions from CR

* refactor: improve update table route tests

* refactor: return Txn instead of TxnRequest

* chore: apply suggestions from CR

* chore: apply suggestions from CR

* refactor: update table metadata in single txn

* feat: check table exists before drop table executing

* test: add tests for table metadata manager

* refactor: remove table region manager

* chore: apply suggestions from CR

* feat: add bench program

* chore: apply suggestions from CR
2023-08-17 06:29:19 +00:00
Yingwen
4ba12155fe feat(mito): Implement SST format for mito2 (#2178)
* chore: update comment

* feat: stream writer takes arrow's types

* feat: Define Batch struct

* feat: arrow_schema_to_store

* refactor: rename

* feat: write parquet in new format with tsids

* feat: reader support projection

* feat: Impl read compat

* refactor: rename SchemaCompat to CompatRecordBatch

* feat: changing sst format

* feat: make it compile

* feat: remove tsid and some structs

* feat: from_sst_record_batch wip

* chore: push array

* chore: wip

* feat: decode batches from RecordBatch

* feat: reader converts record batches

* feat: remove compat mod

* chore: remove some codes

* feat: sort fields by column id

* test: test to_sst_arrow_schema

* feat: do not sort fields

* test: more test helpers

* feat: simplify projection

* fix: projection indices are incorrect

* refactor: define write/read format

* test: test write format

* test: test projection

* test: test convert record batch

* feat: remove unused errors

* refactor: wrap get_field_batch_columns

* chore: clippy

* chore: fix clippy

* feat: build arrow schema from region meta in ReadFormat

* feat: initialize the parquet reader at `build()`

* chore: fix typo
2023-08-17 06:25:50 +00:00
Weny Xu
832e5dcfd7 chore: remove allow-unused (#2184) 2023-08-17 03:15:12 +00:00
shuiyisong
d45ee8b42a chore: fix collect region stat on non-base table (#2190) 2023-08-17 02:13:49 +00:00
JeremyHi
6cd7319d67 refactor: grpc insert (#2188)
* feat: interval type for row protocol

* feat: minor refactor grpc insert

* Update src/common/grpc-expr/src/util.rs

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* fix: by comment

---------

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-16 11:25:25 +00:00
Yingwen
bb062003ef ci: fallback to run_id to avoid cancelling other jobs (#2186)
ci: fallback to run id to avoid cancelling other jobs
2023-08-16 09:24:17 +00:00
Weny Xu
8ea1763033 refactor: refactor table metadata manager (#2159)
* refactor: table-metadata-manager

* feat: remove comparing when deleting metadata

* fix: fix comment typos

* chore: apply suggestions from CR

* test: add tests for updating DatanodeTable

* fix: fix clippy

* chore: apply suggestions from CR

* refactor: improve update table route tests

* refactor: return Txn instead of TxnRequest

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2023-08-16 06:43:03 +00:00
Zhenchi
1afe96e397 refactor: prevent dist table from invoking scan (#2179)
* refactor: prevent dist table from invoking `scan`

* refactor: reorg code

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add comment

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-08-16 04:43:33 +00:00
Ruihang Xia
814c599029 ci: cancel in-progress actions on new commit (#2182)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-16 04:21:14 +00:00
Ruihang Xia
4c3169431b feat: move region metadata to store-api (#2181)
* add metadata & handle_read

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* move metadata to store-api

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* dep aquamarine

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove deadcode

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove temporary code

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/store-api/Cargo.toml

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>

* remove old mod

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
2023-08-16 04:18:26 +00:00
sh2
202540823f refactor!: move prometheus routes to default http server (#2005)
* move prometheus routes to default http server

Signed-off-by: sh2 <shawnhxh@outlook.com>

* fix ci test and remove the server logic of prometheus

* remove unused import and prometheus relevant code

* fix ci: rustfmt and test

* fix ci: silly fmt

* fix ci: silly silly fmt

* change `/prom_store` back to `/prometheus`

* remove unused variable

---------

Signed-off-by: sh2 <shawnhxh@outlook.com>
2023-08-16 03:21:14 +00:00
dennis zhuang
0967678a51 feat: don't enable telemetry for debug building (#2177) 2023-08-16 01:53:11 +00:00
shuiyisong
c8cde704cf chore: minor auth crate change (#2176)
* chore: pub auth_mysql

* chore: pub all error

* chore: remove back to error

* chore: wrap failed permission check result to err

* chore: minor change
2023-08-15 10:49:22 +00:00
JeremyHi
24dc827ff9 feat: grpc handler result (#2107)
* feat: grpc handler inner result

* feat: ext header, x-greptime-err-code, x-greptime-err-msg

* fix: sqlness case

* chore: by comment

* fix: convert status to Error
2023-08-15 09:34:00 +00:00
Weny Xu
f5e44ba4cf docs: rfc of update metadata in single txn (#2165)
* docs: rfc of update metadata in single txn

* chore: apply suggestion from CR
2023-08-15 17:44:07 +08:00
zyy17
32c3ac4fcf refactor: improve the image building performance (#2175)
* refactor: use '--output type=local' in 'build-greptime-by-buildx' target to reduce unnecessary 'docker cp'

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* refactor: improve the image building performance

* ci: release centos dev builder

* ci: use 'make build-by-dev-builder' to improve docker build performance

* refactor: add 'which' command in centos

* fix: add 'OUTPUT_DIR' to fix 'make docker-image-buildx' error

* fix: fix incorrect dockerfile path

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* refactor: remove configure-aws-credentials action and use env variables

Signed-off-by: zyy17 <zyylsxm@gmail.com>

* ci: update slack notification prompt

* refactor: clean up the target directory before building artifacts of centos7

---------

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2023-08-15 09:28:09 +00:00
Niwaka
a8f2e4468d feat: handle multiple grpc deletes (#2150)
* feat: handle multiple grpc deletes

* fix: make DistDeleter::grpc_delete return usize

* fix: remove backtrace from MissingTimeIndexColumn

* fix: avoid using unwrap in PartitionRuleManager::split_delete_request

* fix: simplify MissingTimeIndexColumn
2023-08-15 08:22:46 +00:00
Yingwen
d4565c0a94 feat(mito): Defines the read Batch struct for mito2 (#2174)
* feat: define batch

* feat: define Batch struct

* feat: stream writer takes arrow's types

* feat: make it compile

* feat: use uint64vector and uint8vector

* feat: add timestamps and primary key
2023-08-15 06:39:21 +00:00
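
Read together, the bullets of #2174 suggest a Batch carrying one encoded primary key plus columnar timestamps, sequences (u64) and op types (u8). A loose sketch with plain Vecs standing in for the real vector types:

struct Batch {
    primary_key: Vec<u8>, // one encoded key shared by all rows in the batch
    timestamps: Vec<i64>,
    sequences: Vec<u64>, // a UInt64Vector in the real code
    op_types: Vec<u8>,   // a UInt8Vector in the real code, e.g. put vs delete
}

fn main() {
    let batch = Batch {
        primary_key: b"host=web-1".to_vec(),
        timestamps: vec![100, 200],
        sequences: vec![1, 2],
        op_types: vec![0, 0],
    };
    // Every per-row column must have the same length.
    assert_eq!(batch.timestamps.len(), batch.sequences.len());
    assert_eq!(batch.timestamps.len(), batch.op_types.len());
    assert!(!batch.primary_key.is_empty());
}
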
Ruihang Xia
2168970814 feat: define region server and related requests (#2160)
* define region server and related requests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fill request body

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* change mito2's request type

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: bump greptime-proto to d9167cab (row insert/delete)

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix test compile

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove name_to_index

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* address cr comments

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* finalize

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-15 06:27:27 +00:00
Weny Xu
69a2036cee feat!: add deserializer for Partition (#2169)
* feat!: add deserializer for Partition

* fix: fix tests
2023-08-15 03:36:58 +00:00
Lei, HUANG
e924b44e83 refactor: KeyValues return ValueRef (#2170)
* refactor: KeyValues return ValueRef

* 1. Change KeyValues returned value from pb value to ValueRef
2. Replace OpType/SemanticType with pb's OpType and SemanticType to avoid duplicated conversions.

* feat: define min value of OpType as a const

* fix: toml format
2023-08-14 14:51:13 +00:00
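
The point of #2170 is returning a borrowed ValueRef instead of cloning owned (or protobuf) values while iterating key-values. An illustrative borrowed-view sketch, with types assumed:

#[derive(Debug)]
enum Value {
    String(String),
    Int64(i64),
}

#[derive(Debug, PartialEq)]
enum ValueRef<'a> {
    String(&'a str),
    Int64(i64),
}

impl Value {
    // Borrow a cheap view instead of cloning the owned data.
    fn as_value_ref(&self) -> ValueRef<'_> {
        match self {
            Value::String(s) => ValueRef::String(s.as_str()),
            Value::Int64(i) => ValueRef::Int64(*i),
        }
    }
}

fn main() {
    let v = Value::String("tag-a".to_string());
    assert_eq!(v.as_value_ref(), ValueRef::String("tag-a"));
    assert_eq!(Value::Int64(7).as_value_ref(), ValueRef::Int64(7));
}
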
Yingwen
768239eb49 fix: panic on truncate table in distributed mode (#2173) 2023-08-14 14:20:20 +00:00
Ning Sun
f3157df190 fix: normalize otlp string keys (#2168) 2023-08-14 09:39:54 +00:00
dennis zhuang
b353bd20db fix: print_anonymous_usage_data_disclaimer at wrong place (#2167) 2023-08-14 08:01:10 +00:00
Lei, HUANG
55b5df9c51 feat: row wise converter (#2162)
* feat: impl mem-comparable encoding for timestamp

* fix: test cases

* impl time series encode/decoder

* fix: merge unsupported match arms

* fix: clippy

* chore: big number delimiter

* feat: encode timestamps as i64

* fix: remove useless error variant
2023-08-14 07:13:39 +00:00
Ruihang Xia
393047a541 feat: implement metric for MergeScanExec (#2166)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-14 07:10:45 +00:00
LFC
606b489d53 feat: redact secrets in sql when logging (#2141) 2023-08-14 06:40:00 +00:00
Weny Xu
d0b3607633 feat: add table route manager and upgrade tool (#2145)
* feat: add table route manager and upgrade tool

* test: add table route manager tests

* feat: add new TableRouteValue struct

* chore: apply suggestions from CR

* refactor: change HashMap to BTreeMap

* feat: add version to TableRouteValue
2023-08-14 04:19:44 +00:00
Weny Xu
5b012a1f67 feat!: switch to new catalog/schema key (#2140)
* feat!: switch to new catalog/schema key

* chore: apply suggestions from CR
2023-08-14 03:08:43 +00:00
Ruihang Xia
f6b53984da fix(metasrv)!: do not overwrite boolean options unconditionally (#2161)
* fix: do not overwrite boolean options unconditionally

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix sqlness start command

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-08-14 03:04:54 +00:00
shuiyisong
7f51141ed0 refactor: auth crate (#2148)
* chore: move user_info to auth crate

* chore: temp commit before resolving tests compile error

* chore: fix compile issue

* chore: minor fix

* chore: tmp save

* chore: change user_info to trait

* chore: minor change & use auth result user info in pg session setup

* chore: add as_any to user_info

* chore: rename user_info

* chore: remove ice file

* chore: add permission checker

* chore: add grpc permission check

* chore: add session spawn user_info to query_ctx

* chore: minor update

* chore: add permission checker to sql handler & temp save

* chore: add permission checker to prometheus handler

* chore: add permission checker to opentsdb handler

* chore: add permission checker to other handlers

* chore: add test

* chore: add user_info setting on http entrance

* chore: fix toml

* chore: remove box in permission req

* chore: cr issue

* chore: cr issue
2023-08-14 02:51:26 +00:00
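
#2148 threads a permission check through every protocol handler before a request is served. A hypothetical shape of that hook, with names and signatures invented for illustration:

enum PermissionReq<'a> {
    Sql(&'a str),
    GrpcWrite,
    PromQuery,
}

trait PermissionChecker {
    fn check(&self, user: &str, req: &PermissionReq<'_>) -> Result<(), String>;
}

// A trivial checker so the sketch runs; a real one would consult a policy.
struct AllowAll;

impl PermissionChecker for AllowAll {
    fn check(&self, _user: &str, _req: &PermissionReq<'_>) -> Result<(), String> {
        Ok(())
    }
}

fn main() {
    let checker = AllowAll;
    assert!(checker.check("greptime", &PermissionReq::Sql("SELECT 1")).is_ok());
    assert!(checker.check("greptime", &PermissionReq::GrpcWrite).is_ok());
    let _ = PermissionReq::PromQuery; // listed for completeness
}
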
482 changed files with 30079 additions and 8791 deletions


@@ -42,10 +42,21 @@ runs:
username: ${{ inputs.dockerhub-image-registry-username }}
password: ${{ inputs.dockerhub-image-registry-token }}
- name: Build and push dev builder image to dockerhub
- name: Build and push ubuntu dev builder image to dockerhub
shell: bash
run:
make dev-builder \
BASE_IMAGE=ubuntu \
BUILDX_MULTI_PLATFORM_BUILD=true \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
IMAGE_TAG=${{ inputs.version }}
- name: Build and push centos dev builder image to dockerhub
shell: bash
run:
make dev-builder \
BASE_IMAGE=centos \
BUILDX_MULTI_PLATFORM_BUILD=true \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
@@ -59,11 +70,23 @@ runs:
username: ${{ inputs.acr-image-registry-username }}
password: ${{ inputs.acr-image-registry-password }}
- name: Build and push dev builder image to ACR
- name: Build and push ubuntu dev builder image to ACR
shell: bash
continue-on-error: true
run: # buildx will cache the images that were already built, so it will not take long to build them again.
make dev-builder \
BASE_IMAGE=ubuntu \
BUILDX_MULTI_PLATFORM_BUILD=true \
IMAGE_REGISTRY=${{ inputs.acr-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.acr-image-namespace }} \
IMAGE_TAG=${{ inputs.version }}
- name: Build and push centos dev builder image to ACR
shell: bash
continue-on-error: true
run: # buildx will cache the images that were already built, so it will not take long to build them again.
make dev-builder \
BASE_IMAGE=centos \
BUILDX_MULTI_PLATFORM_BUILD=true \
IMAGE_REGISTRY=${{ inputs.acr-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.acr-image-namespace }} \


@@ -32,6 +32,10 @@ inputs:
description: Upload to S3
required: false
default: 'true'
upload-latest-artifacts:
description: Upload the latest artifacts to S3
required: false
default: 'true'
working-dir:
description: Working directory to build the artifacts
required: false
@@ -43,7 +47,7 @@ runs:
shell: bash
run: |
cd ${{ inputs.working-dir }} && \
make build-greptime-by-buildx \
make build-by-dev-builder \
CARGO_PROFILE=${{ inputs.cargo-profile }} \
FEATURES=${{ inputs.features }} \
BASE_IMAGE=${{ inputs.base-image }}
@@ -52,11 +56,12 @@ runs:
uses: ./.github/actions/upload-artifacts
with:
artifacts-dir: ${{ inputs.artifacts-dir }}
target-file: ./greptime
target-file: ./target/${{ inputs.cargo-profile }}/greptime
version: ${{ inputs.version }}
release-to-s3-bucket: ${{ inputs.release-to-s3-bucket }}
aws-access-key-id: ${{ inputs.aws-access-key-id }}
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
upload-to-s3: ${{ inputs.upload-to-s3 }}
upload-latest-artifacts: ${{ inputs.upload-latest-artifacts }}
working-dir: ${{ inputs.working-dir }}


@@ -40,7 +40,7 @@ runs:
image-registry-password: ${{ inputs.image-registry-password }}
image-name: ${{ inputs.image-name }}
image-tag: ${{ inputs.version }}
docker-file: docker/ci/Dockerfile
docker-file: docker/ci/ubuntu/Dockerfile
amd64-artifact-name: greptime-linux-amd64-pyo3-${{ inputs.version }}
arm64-artifact-name: greptime-linux-arm64-pyo3-${{ inputs.version }}
platforms: linux/amd64,linux/arm64
@@ -56,7 +56,7 @@ runs:
image-registry-password: ${{ inputs.image-registry-password }}
image-name: ${{ inputs.image-name }}-centos
image-tag: ${{ inputs.version }}
docker-file: docker/ci/Dockerfile-centos
docker-file: docker/ci/centos/Dockerfile
amd64-artifact-name: greptime-linux-amd64-centos-${{ inputs.version }}
platforms: linux/amd64
push-latest-tag: ${{ inputs.push-latest-tag }}


@@ -33,6 +33,10 @@ inputs:
description: Upload to S3
required: false
default: 'true'
upload-latest-artifacts:
description: Upload the latest artifacts to S3
required: false
default: 'true'
working-dir:
description: Working directory to build the artifacts
required: false
@@ -69,6 +73,7 @@ runs:
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
upload-to-s3: ${{ inputs.upload-to-s3 }}
upload-latest-artifacts: ${{ inputs.upload-latest-artifacts }}
working-dir: ${{ inputs.working-dir }}
- name: Build greptime without pyo3
@@ -85,8 +90,14 @@ runs:
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
upload-to-s3: ${{ inputs.upload-to-s3 }}
upload-latest-artifacts: ${{ inputs.upload-latest-artifacts }}
working-dir: ${{ inputs.working-dir }}
- name: Clean up the target directory # Clean up the target directory for the centos7 base image, or it will still use the objects of the last build.
shell: bash
run: |
rm -rf ./target/
- name: Build greptime on centos base image
uses: ./.github/actions/build-greptime-binary
if: ${{ inputs.arch == 'amd64' && inputs.dev-mode == 'false' }} # Only build centos7 base image for amd64.
@@ -101,4 +112,5 @@ runs:
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
upload-to-s3: ${{ inputs.upload-to-s3 }}
upload-latest-artifacts: ${{ inputs.upload-latest-artifacts }}
working-dir: ${{ inputs.working-dir }}


@@ -26,6 +26,18 @@ inputs:
description: Upload to S3
required: false
default: 'true'
upload-latest-artifacts:
description: Upload the latest artifacts to S3
required: false
default: 'true'
upload-max-retry-times:
description: Max retry times for uploading artifacts to S3
required: false
default: "20"
upload-retry-timeout:
description: Timeout for uploading artifacts to S3
required: false
default: "10" # minutes
working-dir:
description: Working directory to upload the artifacts
required: false
@@ -66,20 +78,16 @@ runs:
name: ${{ inputs.artifacts-dir }}.sha256sum
path: ${{ inputs.working-dir }}/${{ inputs.artifacts-dir }}.sha256sum
- name: Configure AWS credentials
if: ${{ inputs.upload-to-s3 == 'true' }}
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ inputs.aws-access-key-id }}
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
- name: Upload artifacts to S3
if: ${{ inputs.upload-to-s3 == 'true' }}
uses: nick-invision/retry@v2
env:
AWS_ACCESS_KEY_ID: ${{ inputs.aws-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-secret-access-key }}
AWS_DEFAULT_REGION: ${{ inputs.aws-region }}
with:
max_attempts: 20
timeout_minutes: 5
max_attempts: ${{ inputs.upload-max-retry-times }}
timeout_minutes: ${{ inputs.upload-retry-timeout }}
# The bucket layout will be:
# releases/greptimedb
# ├── v0.1.0
@@ -96,3 +104,22 @@ runs:
aws s3 cp \
${{ inputs.artifacts-dir }}.sha256sum \
s3://${{ inputs.release-to-s3-bucket }}/releases/greptimedb/${{ inputs.version }}/${{ inputs.artifacts-dir }}.sha256sum
- name: Upload latest artifacts to S3
if: ${{ inputs.upload-to-s3 == 'true' && inputs.upload-latest-artifacts == 'true' }} # We'll also upload the latest artifacts to S3 in the scheduled and formal release.
uses: nick-invision/retry@v2
env:
AWS_ACCESS_KEY_ID: ${{ inputs.aws-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-secret-access-key }}
AWS_DEFAULT_REGION: ${{ inputs.aws-region }}
with:
max_attempts: ${{ inputs.upload-max-retry-times }}
timeout_minutes: ${{ inputs.upload-retry-timeout }}
command: |
cd ${{ inputs.working-dir }} && \
aws s3 cp \
${{ inputs.artifacts-dir }}.tar.gz \
s3://${{ inputs.release-to-s3-bucket }}/releases/greptimedb/latest/${{ inputs.artifacts-dir }}.tar.gz && \
aws s3 cp \
${{ inputs.artifacts-dir }}.sha256sum \
s3://${{ inputs.release-to-s3-bucket }}/releases/greptimedb/latest/${{ inputs.artifacts-dir }}.sha256sum


@@ -334,11 +334,11 @@ jobs:
if: ${{ needs.release-images-to-dockerhub.outputs.build-result == 'success' }}
with:
payload: |
{"text": "GreptimeDB ${{ env.NEXT_RELEASE_VERSION }} build successful"}
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has completed successfully."}
- name: Notify nightly build failed result
uses: slackapi/slack-github-action@v1.23.0
if: ${{ needs.release-images-to-dockerhub.outputs.build-result != 'success' }}
with:
payload: |
{"text": "GreptimeDB ${{ env.NEXT_RELEASE_VERSION }} build failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/${{ env.NEXT_RELEASE_VERSION }}-build.yml'"}
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/${{ env.NEXT_RELEASE_VERSION }}-build.yml'."}


@@ -24,6 +24,10 @@ on:
name: CI
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
RUST_TOOLCHAIN: nightly-2023-08-07


@@ -151,6 +151,7 @@ jobs:
aws-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-latest-artifacts: false
build-linux-arm64-artifacts:
name: Build linux-arm64 artifacts
@@ -174,6 +175,7 @@ jobs:
aws-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-latest-artifacts: false
release-images-to-dockerhub:
name: Build and push images to DockerHub
@@ -299,11 +301,11 @@ jobs:
if: ${{ needs.release-images-to-dockerhub.outputs.nightly-build-result == 'success' }}
with:
payload: |
{"text": "GreptimeDB nightly build successful"}
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has completed successfully."}
- name: Notify nightly build failed result
uses: slackapi/slack-github-action@v1.23.0
if: ${{ needs.release-images-to-dockerhub.outputs.nightly-build-result != 'success' }}
with:
payload: |
{"text": "GreptimeDB nightly build failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/nightly-build.yml'"}
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/${{ env.NEXT_RELEASE_VERSION }}-build.yml'."}

Cargo.lock (generated, 749 changes): diff suppressed because it is too large.


@@ -2,6 +2,7 @@
members = [
"benchmarks",
"src/api",
"src/auth",
"src/catalog",
"src/client",
"src/cmd",
@@ -45,6 +46,7 @@ members = [
"src/sql",
"src/storage",
"src/store-api",
"src/flow",
"src/table",
"src/table-procedure",
"tests-integration",
@@ -53,7 +55,7 @@ members = [
resolver = "2"
[workspace.package]
version = "0.3.2"
version = "0.4.0-nightly"
edition = "2021"
license = "Apache-2.0"
@@ -66,17 +68,18 @@ arrow-schema = { version = "43.0", features = ["serde"] }
async-stream = "0.3"
async-trait = "0.1"
chrono = { version = "0.4", features = ["serde"] }
datafusion = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion-common = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion-expr = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion-optimizer = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion-physical-expr = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion-sql = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion-substrait = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "2ceb7f927c40787773fdc466d6a4b79f3a6c0001" }
datafusion = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
datafusion-common = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
datafusion-expr = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
datafusion-optimizer = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
datafusion-physical-expr = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
datafusion-sql = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
datafusion-substrait = { git = "https://github.com/waynexia/arrow-datafusion.git", rev = "c0b0fca548e99d020c76e1a1cd7132aab26000e1" }
derive_builder = "0.12"
futures = "0.3"
futures-util = "0.3"
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "940694cfd05c1e93c1dd7aab486184c9e2853098" }
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "4a277f27caa035a801d5b9c020a0449777736614" }
humantime-serde = "1.1"
itertools = "0.10"
lazy_static = "1.4"
once_cell = "1.18"
@@ -89,9 +92,10 @@ regex = "1.8"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
snafu = { version = "0.7", features = ["backtraces"] }
sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "c3814f08afa19786b13d72b1731a1e8b3cac4ab9", features = [
sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "296a4f6c73b129d6f565a42a2e5e53c6bc2b9da4", features = [
"visitor",
] }
strum = { version = "0.25", features = ["derive"] }
tempfile = "3"
tokio = { version = "1.28", features = ["full"] }
tokio-util = { version = "0.7", features = ["io-util", "compat"] }
@@ -102,6 +106,7 @@ metrics = "0.20"
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "abbd357c1e193cd270ea65ee7652334a150b628f" }
## workspaces members
api = { path = "src/api" }
auth = { path = "src/auth" }
catalog = { path = "src/catalog" }
client = { path = "src/client" }
cmd = { path = "src/cmd" }


@@ -12,6 +12,8 @@ BUILDX_BUILDER_NAME ?= gtbuilder
BASE_IMAGE ?= ubuntu
RUST_TOOLCHAIN ?= $(shell cat rust-toolchain.toml | grep channel | cut -d'"' -f2)
CARGO_REGISTRY_CACHE ?= ${HOME}/.cargo/registry
ARCH := $(shell uname -m | sed 's/x86_64/amd64/' | sed 's/aarch64/arm64/')
OUTPUT_DIR := $(shell if [ "$(RELEASE)" = "true" ]; then echo "release"; elif [ ! -z "$(CARGO_PROFILE)" ]; then echo "$(CARGO_PROFILE)" ; else echo "debug"; fi)
# The arguments for running integration tests.
ETCD_VERSION ?= v3.5.9
@@ -43,6 +45,10 @@ ifneq ($(strip $(TARGET)),)
CARGO_BUILD_OPTS += --target ${TARGET}
endif
ifneq ($(strip $(RELEASE)),)
CARGO_BUILD_OPTS += --release
endif
ifeq ($(BUILDX_MULTI_PLATFORM_BUILD), true)
BUILDX_MULTI_PLATFORM_BUILD_OPTS := --platform linux/amd64,linux/arm64 --push
else
@@ -52,26 +58,20 @@ endif
##@ Build
.PHONY: build
build: ## Build debug version greptime. If USE_DEV_BUILDER is true, the binary will be built in dev-builder.
ifeq ($(USE_DEV_BUILDER), true)
docker run --network=host \
-v ${PWD}:/greptimedb -v ${CARGO_REGISTRY_CACHE}:/root/.cargo/registry \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder:latest \
make build CARGO_PROFILE=${CARGO_PROFILE} FEATURES=${FEATURES} TARGET_DIR=${TARGET_DIR}
else
build: ## Build debug version greptime.
cargo build ${CARGO_BUILD_OPTS}
endif
.PHONY: release
release: ## Build release version greptime. If USE_DEV_BUILDER is true, the binary will be built in dev-builder.
ifeq ($(USE_DEV_BUILDER), true)
.PHONY: build-by-dev-builder
build-by-dev-builder: ## Build greptime by dev-builder.
docker run --network=host \
-v ${PWD}:/greptimedb -v ${CARGO_REGISTRY_CACHE}:/root/.cargo/registry \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder:latest \
make release CARGO_PROFILE=${CARGO_PROFILE} FEATURES=${FEATURES} TARGET_DIR=${TARGET_DIR}
else
cargo build --release ${CARGO_BUILD_OPTS}
endif
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-${BASE_IMAGE}:latest \
make build \
CARGO_PROFILE=${CARGO_PROFILE} \
FEATURES=${FEATURES} \
TARGET_DIR=${TARGET_DIR} \
TARGET=${TARGET} \
RELEASE=${RELEASE}
.PHONY: clean
clean: ## Clean the project.
@@ -90,30 +90,27 @@ check-toml: ## Check all TOML files.
taplo format --check
.PHONY: docker-image
docker-image: multi-platform-buildx ## Build docker image.
docker-image: build-by-dev-builder ## Build docker image.
mkdir -p ${ARCH} && \
cp ./target/${OUTPUT_DIR}/greptime ${ARCH}/greptime && \
docker build -f docker/ci/${BASE_IMAGE}/Dockerfile -t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/greptimedb:${IMAGE_TAG} . && \
rm -r ${ARCH}
.PHONY: docker-image-buildx
docker-image-buildx: multi-platform-buildx ## Build docker image by buildx.
docker buildx build --builder ${BUILDX_BUILDER_NAME} \
--build-arg="CARGO_PROFILE=${CARGO_PROFILE}" --build-arg="FEATURES=${FEATURES}" \
-f docker/${BASE_IMAGE}/Dockerfile \
--build-arg="CARGO_PROFILE=${CARGO_PROFILE}" \
--build-arg="FEATURES=${FEATURES}" \
--build-arg="OUTPUT_DIR=${OUTPUT_DIR}" \
-f docker/buildx/${BASE_IMAGE}/Dockerfile \
-t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/greptimedb:${IMAGE_TAG} ${BUILDX_MULTI_PLATFORM_BUILD_OPTS} .
.PHONY: build-greptime-by-buildx
build-greptime-by-buildx: multi-platform-buildx ## Build greptime binary by docker buildx. The binary will be copied to the current directory.
docker buildx build --builder ${BUILDX_BUILDER_NAME} \
--target=builder \
--build-arg="CARGO_PROFILE=${CARGO_PROFILE}" --build-arg="FEATURES=${FEATURES}" \
-f docker/${BASE_IMAGE}/Dockerfile \
-t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/greptimedb-builder:${IMAGE_TAG} ${BUILDX_MULTI_PLATFORM_BUILD_OPTS} .
docker run --rm -v ${PWD}:/data \
--entrypoint cp ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/greptimedb-builder:${IMAGE_TAG} \
/out/target/${CARGO_PROFILE}/greptime /data/greptime
.PHONY: dev-builder
dev-builder: multi-platform-buildx ## Build dev-builder image.
docker buildx build --builder ${BUILDX_BUILDER_NAME} \
--build-arg="RUST_TOOLCHAIN=${RUST_TOOLCHAIN}" \
-f docker/dev-builder/Dockerfile \
-t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder:${IMAGE_TAG} ${BUILDX_MULTI_PLATFORM_BUILD_OPTS} .
-f docker/dev-builder/${BASE_IMAGE}/Dockerfile \
-t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-${BASE_IMAGE}:${IMAGE_TAG} ${BUILDX_MULTI_PLATFORM_BUILD_OPTS} .
.PHONY: multi-platform-buildx
multi-platform-buildx: ## Create buildx multi-platform builder.
@@ -155,7 +152,7 @@ stop-etcd: ## Stop single node etcd for testing purpose.
run-it-in-container: start-etcd ## Run integration tests in dev-builder.
docker run --network=host \
-v ${PWD}:/greptimedb -v ${CARGO_REGISTRY_CACHE}:/root/.cargo/registry -v /tmp:/tmp \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder:latest \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-${BASE_IMAGE}:latest \
make test sqlness-test BUILD_JOBS=${BUILD_JOBS}
##@ General


@@ -57,8 +57,6 @@ max_purge_tasks = 32
checkpoint_margin = 10
# Region manifest logs and checkpoints gc execution duration
gc_duration = '10m'
# Whether to try creating a manifest checkpoint on region opening
checkpoint_on_startup = false
# Storage flush options
[storage.flush]


@@ -53,10 +53,6 @@ enable = true
[prom_store_options]
enable = true
# Prometheus protocol options, see `standalone.example.toml`.
[prometheus_options]
addr = "127.0.0.1:4004"
# Metasrv client options, see `datanode.example.toml`.
[meta_client_options]
metasrv_addrs = ["127.0.0.1:3002"]


@@ -26,7 +26,7 @@ enable_telemetry = true
# Procedure storage options.
[procedure]
# Procedure max retry time.
max_retry_times = 3
max_retry_times = 12
# Initial retry delay of procedures, increases exponentially
retry_delay = "500ms"


@@ -76,11 +76,6 @@ enable = true
# Whether to enable Prometheus remote write and read in HTTP API, true by default.
enable = true
# Prometheus protocol options
[prometheus_options]
# Prometheus API server address, "127.0.0.1:4004" by default.
addr = "127.0.0.1:4004"
# WAL options.
[wal]
# WAL data directory
@@ -121,8 +116,6 @@ max_purge_tasks = 32
checkpoint_margin = 10
# Region manifest logs and checkpoints gc execution duration
gc_duration = '10m'
# Whether to try creating a manifest checkpoint on region opening
checkpoint_on_startup = false
# Storage flush options
[storage.flush]


@@ -2,6 +2,7 @@ FROM centos:7 as builder
ARG CARGO_PROFILE
ARG FEATURES
ARG OUTPUT_DIR
ENV LANG en_US.utf8
WORKDIR /greptimedb
@@ -13,7 +14,8 @@ RUN yum install -y epel-release \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel
rh-python38-python-devel \
which
# Install protoc
RUN curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip
@@ -35,17 +37,18 @@ RUN --mount=target=.,rw \
# Export the binary to the clean image.
FROM centos:7 as base
ARG CARGO_PROFILE
ARG OUTPUT_DIR
RUN yum install -y epel-release \
openssl \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel
rh-python38-python-devel \
which
WORKDIR /greptime
COPY --from=builder /out/target/${CARGO_PROFILE}/greptime /greptime/bin/
COPY --from=builder /out/target/${OUTPUT_DIR}/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH
ENTRYPOINT ["greptime"]


@@ -2,6 +2,7 @@ FROM ubuntu:22.04 as builder
ARG CARGO_PROFILE
ARG FEATURES
ARG OUTPUT_DIR
ENV LANG en_US.utf8
WORKDIR /greptimedb
@@ -25,7 +26,7 @@ RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-mo
ENV PATH /root/.cargo/bin/:$PATH
# Build the project in release mode.
RUN --mount=target=.,rw \
RUN --mount=target=. \
--mount=type=cache,target=/root/.cargo/registry \
make build \
CARGO_PROFILE=${CARGO_PROFILE} \
@@ -36,7 +37,7 @@ RUN --mount=target=.,rw \
# TODO(zyy17): Maybe we should use a more secure container image.
FROM ubuntu:22.04 as base
ARG CARGO_PROFILE
ARG OUTPUT_DIR
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get \
-y install ca-certificates \
@@ -50,7 +51,7 @@ COPY ./docker/python/requirements.txt /etc/greptime/requirements.txt
RUN python3 -m pip install -r /etc/greptime/requirements.txt
WORKDIR /greptime
COPY --from=builder /out/target/${CARGO_PROFILE}/greptime /greptime/bin/
COPY --from=builder /out/target/${OUTPUT_DIR}/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH
ENTRYPOINT ["greptime"]


@@ -0,0 +1,29 @@
FROM centos:7 as builder
ENV LANG en_US.utf8
# Install dependencies
RUN ulimit -n 1024000 && yum groupinstall -y 'Development Tools'
RUN yum install -y epel-release \
openssl \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel \
which
# Install protoc
RUN curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip
RUN unzip protoc-3.15.8-linux-x86_64.zip -d /usr/local/
# Install Rust
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /opt/rh/rh-python38/root/usr/bin:/usr/local/bin:/root/.cargo/bin/:$PATH
# Install Rust toolchains.
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install nextest.
RUN cargo install cargo-nextest --locked


@@ -0,0 +1,90 @@
---
Feature Name: Update Metadata in a Single Transaction
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/1715
Date: 2023-08-13
Author: "Feng Yangsen <fengys1996@gmail.com>, Xu Wenkang <wenymedia@gmail.com>"
---
# Summary
Update table metadata in a single transaction.
# Motivation
Currently, a procedure involves multiple transactions. This implementation is inefficient, and it is hard to keep the data consistent. Therefore, we can update multiple metadata keys in a single transaction.
# Details
Now we have the following table metadata keys:
**TableInfo**
```rust
// __table_info/{table_id}
pub struct TableInfoKey {
table_id: TableId,
}
pub struct TableInfoValue {
pub table_info: RawTableInfo,
version: u64,
}
```
**TableRoute**
```rust
// __table_route/{table_id}
pub struct NextTableRouteKey {
table_id: TableId,
}
pub struct TableRoute {
pub region_routes: Vec<RegionRoute>,
}
```
**DatanodeTable**
```rust
// __dn_table/{datanode_id}/{table_id}
pub struct DatanodeTableKey {
datanode_id: DatanodeId,
table_id: TableId,
}
pub struct DatanodeTableValue {
pub table_id: TableId,
pub regions: Vec<RegionNumber>,
version: u64,
}
```
**TableNameKey**
```rust
// __table_name/{CatalogName}/{SchemaName}/{TableName}
pub struct TableNameKey<'a> {
pub catalog: &'a str,
pub schema: &'a str,
pub table: &'a str,
}
pub struct TableNameValue {
table_id: TableId,
}
```
These table metadata keys are only updated in the following operations.
## Region Failover
It needs to update the `TableRoute` key and the `DatanodeTable` keys. If the current `TableRoute` equals the snapshot of `TableRoute` taken when the failover task was submitted, we can safely update these keys.
Between submitting a failover task and acquiring the locks for execution, the `TableRoute` may be updated by another task. After acquiring the lock, we can fetch the latest `TableRoute` again and re-execute if needed.
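For illustration, this check-then-write amounts to a compare-and-put against the route key. A minimal sketch, assuming an etcd-style KV interface (the `KvStore` type and the function below are placeholders, not the actual meta KV API):
```rust
use std::collections::HashMap;

/// In-memory stand-in for the meta KV store (illustrative only).
#[derive(Default)]
struct KvStore {
    data: HashMap<String, Vec<u8>>,
}

impl KvStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }

    /// Writes `value` only if the current value of `key` still equals `expected`.
    fn compare_and_put(&mut self, key: &str, expected: Option<&[u8]>, value: Vec<u8>) -> bool {
        if self.data.get(key).map(|v| v.as_slice()) == expected {
            self.data.insert(key.to_string(), value);
            true
        } else {
            false
        }
    }
}

/// `snapshot` is the `TableRoute` captured when the failover task was submitted;
/// the `DatanodeTable` keys would be updated in the same transaction.
fn failover_update_route(
    store: &mut KvStore,
    table_id: u32,
    snapshot: Option<Vec<u8>>,
    new_route: Vec<u8>,
) -> bool {
    let key = format!("__table_route/{table_id}");
    if store.compare_and_put(&key, snapshot.as_deref(), new_route) {
        // No concurrent update since the snapshot, so the write is safe.
        true
    } else {
        // Another task changed the route after submission: re-read the latest
        // value and let the caller decide whether failover is still needed.
        let _latest = store.get(&key);
        false
    }
}
```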
## Create Table DDL
Creates all of the above keys. `TableRoute` and `TableInfo` should be empty before the transaction.
The **TableNameKey**'s lock will be held by the procedure framework.
## Drop Table DDL
`TableInfoKey` and `NextTableRouteKey` will be rewritten with a `__removed-` prefix, and the other keys above will be deleted. The transaction does not compare any keys.
## Alter Table DDL
1. Rename table: updates `TableInfo` and `TableName`. The transaction compares `TableInfo`, which should equal the snapshot taken when the DDL was submitted, and the new `TableNameKey` should be empty.
The old and new **TableNameKey**'s lock will be held by the procedure framework.
2. Alter table: updates `TableInfo`. `TableInfo` should equal the snapshot taken when the DDL was submitted.
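Putting these rules together, a rename can be issued as one transaction that compares `TableInfo` against its snapshot, requires the new `TableNameKey` to be absent, and applies all writes atomically. A minimal sketch under the same assumptions as above (the `Compare`/`txn_commit` abstraction is illustrative, not the real metasrv transaction API):
```rust
use std::collections::HashMap;

enum Compare<'a> {
    Equals(&'a str, &'a [u8]), // key must currently hold this exact value
    Missing(&'a str),          // key must not exist
}

/// Applies all puts/deletes only if every compare holds (all-or-nothing).
fn txn_commit(
    store: &mut HashMap<String, Vec<u8>>,
    compares: &[Compare<'_>],
    puts: Vec<(String, Vec<u8>)>,
    deletes: &[&str],
) -> bool {
    let ok = compares.iter().all(|c| match c {
        Compare::Equals(k, v) => store.get(*k).map(|cur| cur.as_slice()) == Some(*v),
        Compare::Missing(k) => !store.contains_key(*k),
    });
    if ok {
        for (k, v) in puts {
            store.insert(k, v);
        }
        for k in deletes {
            store.remove(*k);
        }
    }
    ok
}

/// Rename: `TableInfo` must equal the snapshot taken at DDL submission and
/// the new `TableNameKey` must be empty; the old name key is deleted.
fn rename_table(
    store: &mut HashMap<String, Vec<u8>>,
    table_id: u32,
    info_snapshot: &[u8],
    new_info: Vec<u8>,
    old_name_key: &str,
    new_name_key: &str,
) -> bool {
    let info_key = format!("__table_info/{table_id}");
    txn_commit(
        store,
        &[
            Compare::Equals(info_key.as_str(), info_snapshot),
            Compare::Missing(new_name_key),
        ],
        vec![
            (info_key.clone(), new_info),
            (new_name_key.to_string(), table_id.to_le_bytes().to_vec()),
        ],
        &[old_name_key],
    )
}
```
Drop table would use the same `txn_commit` with no compares, rewriting `TableInfoKey` and `NextTableRouteKey` under the `__removed-` prefix and deleting the rest.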


@@ -16,3 +16,6 @@ tonic.workspace = true
[build-dependencies]
tonic-build = "0.9"
[dev-dependencies]
paste = "1.0"


@@ -12,18 +12,33 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use common_base::BitVec;
use common_time::interval::IntervalUnit;
use common_time::time::Time;
use common_time::timestamp::TimeUnit;
use common_time::Interval;
use datatypes::prelude::ConcreteDataType;
use datatypes::types::{IntervalType, TimeType, TimestampType};
use datatypes::value::Value;
use datatypes::vectors::VectorRef;
use common_time::{Date, DateTime, Interval, Timestamp};
use datatypes::prelude::{ConcreteDataType, ValueRef};
use datatypes::scalars::ScalarVector;
use datatypes::types::{
Int16Type, Int8Type, IntervalType, TimeType, TimestampType, UInt16Type, UInt8Type,
};
use datatypes::value::{OrderedF32, OrderedF64, Value};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
Int32Vector, Int64Vector, IntervalDayTimeVector, IntervalMonthDayNanoVector,
IntervalYearMonthVector, PrimitiveVector, StringVector, TimeMicrosecondVector,
TimeMillisecondVector, TimeNanosecondVector, TimeSecondVector, TimestampMicrosecondVector,
TimestampMillisecondVector, TimestampNanosecondVector, TimestampSecondVector, UInt32Vector,
UInt64Vector, VectorRef,
};
use greptime_proto::v1;
use greptime_proto::v1::ddl_request::Expr;
use greptime_proto::v1::greptime_request::Request;
use greptime_proto::v1::query_request::Query;
use greptime_proto::v1::{DdlRequest, IntervalMonthDayNano, QueryRequest};
use greptime_proto::v1::value::ValueData;
use greptime_proto::v1::{DdlRequest, IntervalMonthDayNano, QueryRequest, SemanticType};
use snafu::prelude::*;
use crate::error::{self, Result};
@@ -40,6 +55,10 @@ impl ColumnDataTypeWrapper {
Ok(Self(datatype))
}
pub fn new(datatype: ColumnDataType) -> Self {
Self(datatype)
}
pub fn datatype(&self) -> ColumnDataType {
self.0
}
@@ -297,7 +316,9 @@ pub fn request_type(request: &Request) -> &'static str {
Request::Inserts(_) => "inserts",
Request::Query(query_req) => query_request_type(query_req),
Request::Ddl(ddl_req) => ddl_request_type(ddl_req),
Request::Delete(_) => "delete",
Request::Deletes(_) => "deletes",
Request::RowInserts(_) => "row_inserts",
Request::RowDeletes(_) => "row_deletes",
}
}
@@ -336,16 +357,479 @@ pub fn convert_i128_to_interval(v: i128) -> IntervalMonthDayNano {
}
}
pub fn pb_value_to_value_ref(value: &v1::Value) -> ValueRef {
let Some(value) = &value.value_data else {
return ValueRef::Null;
};
match value {
ValueData::I8Value(v) => ValueRef::Int8(*v as i8),
ValueData::I16Value(v) => ValueRef::Int16(*v as i16),
ValueData::I32Value(v) => ValueRef::Int32(*v),
ValueData::I64Value(v) => ValueRef::Int64(*v),
ValueData::U8Value(v) => ValueRef::UInt8(*v as u8),
ValueData::U16Value(v) => ValueRef::UInt16(*v as u16),
ValueData::U32Value(v) => ValueRef::UInt32(*v),
ValueData::U64Value(v) => ValueRef::UInt64(*v),
ValueData::F32Value(f) => ValueRef::Float32(OrderedF32::from(*f)),
ValueData::F64Value(f) => ValueRef::Float64(OrderedF64::from(*f)),
ValueData::BoolValue(b) => ValueRef::Boolean(*b),
ValueData::BinaryValue(bytes) => ValueRef::Binary(bytes.as_slice()),
ValueData::StringValue(string) => ValueRef::String(string.as_str()),
ValueData::DateValue(d) => ValueRef::Date(Date::from(*d)),
ValueData::DatetimeValue(d) => ValueRef::DateTime(DateTime::new(*d)),
ValueData::TsSecondValue(t) => ValueRef::Timestamp(Timestamp::new_second(*t)),
ValueData::TsMillisecondValue(t) => ValueRef::Timestamp(Timestamp::new_millisecond(*t)),
ValueData::TsMicrosecondValue(t) => ValueRef::Timestamp(Timestamp::new_microsecond(*t)),
ValueData::TsNanosecondValue(t) => ValueRef::Timestamp(Timestamp::new_nanosecond(*t)),
ValueData::TimeSecondValue(t) => ValueRef::Time(Time::new_second(*t)),
ValueData::TimeMillisecondValue(t) => ValueRef::Time(Time::new_millisecond(*t)),
ValueData::TimeMicrosecondValue(t) => ValueRef::Time(Time::new_microsecond(*t)),
ValueData::TimeNanosecondValue(t) => ValueRef::Time(Time::new_nanosecond(*t)),
ValueData::IntervalYearMonthValues(v) => ValueRef::Interval(Interval::from_i32(*v)),
ValueData::IntervalDayTimeValues(v) => ValueRef::Interval(Interval::from_i64(*v)),
ValueData::IntervalMonthDayNanoValues(v) => {
let interval = Interval::from_month_day_nano(v.months, v.days, v.nanoseconds);
ValueRef::Interval(interval)
}
}
}
pub fn pb_values_to_vector_ref(data_type: &ConcreteDataType, values: Values) -> VectorRef {
match data_type {
ConcreteDataType::Boolean(_) => Arc::new(BooleanVector::from(values.bool_values)),
ConcreteDataType::Int8(_) => Arc::new(PrimitiveVector::<Int8Type>::from_iter_values(
values.i8_values.into_iter().map(|x| x as i8),
)),
ConcreteDataType::Int16(_) => Arc::new(PrimitiveVector::<Int16Type>::from_iter_values(
values.i16_values.into_iter().map(|x| x as i16),
)),
ConcreteDataType::Int32(_) => Arc::new(Int32Vector::from_vec(values.i32_values)),
ConcreteDataType::Int64(_) => Arc::new(Int64Vector::from_vec(values.i64_values)),
ConcreteDataType::UInt8(_) => Arc::new(PrimitiveVector::<UInt8Type>::from_iter_values(
values.u8_values.into_iter().map(|x| x as u8),
)),
ConcreteDataType::UInt16(_) => Arc::new(PrimitiveVector::<UInt16Type>::from_iter_values(
values.u16_values.into_iter().map(|x| x as u16),
)),
ConcreteDataType::UInt32(_) => Arc::new(UInt32Vector::from_vec(values.u32_values)),
ConcreteDataType::UInt64(_) => Arc::new(UInt64Vector::from_vec(values.u64_values)),
ConcreteDataType::Float32(_) => Arc::new(Float32Vector::from_vec(values.f32_values)),
ConcreteDataType::Float64(_) => Arc::new(Float64Vector::from_vec(values.f64_values)),
ConcreteDataType::Binary(_) => Arc::new(BinaryVector::from(values.binary_values)),
ConcreteDataType::String(_) => Arc::new(StringVector::from_vec(values.string_values)),
ConcreteDataType::Date(_) => Arc::new(DateVector::from_vec(values.date_values)),
ConcreteDataType::DateTime(_) => Arc::new(DateTimeVector::from_vec(values.datetime_values)),
ConcreteDataType::Timestamp(unit) => match unit {
TimestampType::Second(_) => {
Arc::new(TimestampSecondVector::from_vec(values.ts_second_values))
}
TimestampType::Millisecond(_) => Arc::new(TimestampMillisecondVector::from_vec(
values.ts_millisecond_values,
)),
TimestampType::Microsecond(_) => Arc::new(TimestampMicrosecondVector::from_vec(
values.ts_microsecond_values,
)),
TimestampType::Nanosecond(_) => Arc::new(TimestampNanosecondVector::from_vec(
values.ts_nanosecond_values,
)),
},
ConcreteDataType::Time(unit) => match unit {
TimeType::Second(_) => Arc::new(TimeSecondVector::from_iter_values(
values.time_second_values.iter().map(|x| *x as i32),
)),
TimeType::Millisecond(_) => Arc::new(TimeMillisecondVector::from_iter_values(
values.time_millisecond_values.iter().map(|x| *x as i32),
)),
TimeType::Microsecond(_) => Arc::new(TimeMicrosecondVector::from_vec(
values.time_microsecond_values,
)),
TimeType::Nanosecond(_) => Arc::new(TimeNanosecondVector::from_vec(
values.time_nanosecond_values,
)),
},
ConcreteDataType::Interval(unit) => match unit {
IntervalType::YearMonth(_) => Arc::new(IntervalYearMonthVector::from_vec(
values.interval_year_month_values,
)),
IntervalType::DayTime(_) => Arc::new(IntervalDayTimeVector::from_vec(
values.interval_day_time_values,
)),
IntervalType::MonthDayNano(_) => {
Arc::new(IntervalMonthDayNanoVector::from_iter_values(
values.interval_month_day_nano_values.iter().map(|x| {
Interval::from_month_day_nano(x.months, x.days, x.nanoseconds).to_i128()
}),
))
}
},
ConcreteDataType::Null(_) | ConcreteDataType::List(_) | ConcreteDataType::Dictionary(_) => {
unreachable!()
}
}
}
pub fn pb_values_to_values(data_type: &ConcreteDataType, values: Values) -> Vec<Value> {
// TODO(fys): use macros to optimize code
match data_type {
ConcreteDataType::Int64(_) => values
.i64_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Float64(_) => values
.f64_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::String(_) => values
.string_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Boolean(_) => values
.bool_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Int8(_) => values
.i8_values
.into_iter()
// Safety: the i32 values here only hold i8 data, so casting to i8 is safe.
.map(|val| (val as i8).into())
.collect(),
ConcreteDataType::Int16(_) => values
.i16_values
.into_iter()
// Safety: the i32 values here only hold i16 data, so casting to i16 is safe.
.map(|val| (val as i16).into())
.collect(),
ConcreteDataType::Int32(_) => values
.i32_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::UInt8(_) => values
.u8_values
.into_iter()
// Safety: the u32 values here only hold u8 data, so casting to u8 is safe.
.map(|val| (val as u8).into())
.collect(),
ConcreteDataType::UInt16(_) => values
.u16_values
.into_iter()
// Safety: the u32 values here only hold u16 data, so casting to u16 is safe.
.map(|val| (val as u16).into())
.collect(),
ConcreteDataType::UInt32(_) => values
.u32_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::UInt64(_) => values
.u64_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Float32(_) => values
.f32_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Binary(_) => values
.binary_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::DateTime(_) => values
.datetime_values
.into_iter()
.map(|v| Value::DateTime(v.into()))
.collect(),
ConcreteDataType::Date(_) => values
.date_values
.into_iter()
.map(|v| Value::Date(v.into()))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Second(_)) => values
.ts_second_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_second(v)))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Millisecond(_)) => values
.ts_millisecond_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_millisecond(v)))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Microsecond(_)) => values
.ts_microsecond_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_microsecond(v)))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Nanosecond(_)) => values
.ts_nanosecond_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_nanosecond(v)))
.collect(),
ConcreteDataType::Time(TimeType::Second(_)) => values
.time_second_values
.into_iter()
.map(|v| Value::Time(Time::new_second(v)))
.collect(),
ConcreteDataType::Time(TimeType::Millisecond(_)) => values
.time_millisecond_values
.into_iter()
.map(|v| Value::Time(Time::new_millisecond(v)))
.collect(),
ConcreteDataType::Time(TimeType::Microsecond(_)) => values
.time_microsecond_values
.into_iter()
.map(|v| Value::Time(Time::new_microsecond(v)))
.collect(),
ConcreteDataType::Time(TimeType::Nanosecond(_)) => values
.time_nanosecond_values
.into_iter()
.map(|v| Value::Time(Time::new_nanosecond(v)))
.collect(),
ConcreteDataType::Interval(IntervalType::YearMonth(_)) => values
.interval_year_month_values
.into_iter()
.map(|v| Value::Interval(Interval::from_i32(v)))
.collect(),
ConcreteDataType::Interval(IntervalType::DayTime(_)) => values
.interval_day_time_values
.into_iter()
.map(|v| Value::Interval(Interval::from_i64(v)))
.collect(),
ConcreteDataType::Interval(IntervalType::MonthDayNano(_)) => values
.interval_month_day_nano_values
.into_iter()
.map(|v| {
Value::Interval(Interval::from_month_day_nano(
v.months,
v.days,
v.nanoseconds,
))
})
.collect(),
ConcreteDataType::Null(_) | ConcreteDataType::List(_) | ConcreteDataType::Dictionary(_) => {
unreachable!()
}
}
}
/// Returns true if the pb semantic type value equals the given semantic type.
pub fn is_semantic_type_eq(type_value: i32, semantic_type: SemanticType) -> bool {
type_value == semantic_type as i32
}
/// Returns true if the pb type value is a valid [ColumnDataType] and equals the expected type.
pub fn is_column_type_value_eq(type_value: i32, expect_type: &ConcreteDataType) -> bool {
let Some(column_type) = ColumnDataType::from_i32(type_value) else {
return false;
};
is_column_type_eq(column_type, expect_type)
}
/// Convert value into proto's value.
pub fn to_proto_value(value: Value) -> Option<v1::Value> {
let proto_value = match value {
Value::Null => v1::Value { value_data: None },
Value::Boolean(v) => v1::Value {
value_data: Some(ValueData::BoolValue(v)),
},
Value::UInt8(v) => v1::Value {
value_data: Some(ValueData::U8Value(v.into())),
},
Value::UInt16(v) => v1::Value {
value_data: Some(ValueData::U16Value(v.into())),
},
Value::UInt32(v) => v1::Value {
value_data: Some(ValueData::U32Value(v)),
},
Value::UInt64(v) => v1::Value {
value_data: Some(ValueData::U64Value(v)),
},
Value::Int8(v) => v1::Value {
value_data: Some(ValueData::I8Value(v.into())),
},
Value::Int16(v) => v1::Value {
value_data: Some(ValueData::I16Value(v.into())),
},
Value::Int32(v) => v1::Value {
value_data: Some(ValueData::I32Value(v)),
},
Value::Int64(v) => v1::Value {
value_data: Some(ValueData::I64Value(v)),
},
Value::Float32(v) => v1::Value {
value_data: Some(ValueData::F32Value(*v)),
},
Value::Float64(v) => v1::Value {
value_data: Some(ValueData::F64Value(*v)),
},
Value::String(v) => v1::Value {
value_data: Some(ValueData::StringValue(v.as_utf8().to_string())),
},
Value::Binary(v) => v1::Value {
value_data: Some(ValueData::BinaryValue(v.to_vec())),
},
Value::Date(v) => v1::Value {
value_data: Some(ValueData::DateValue(v.val())),
},
Value::DateTime(v) => v1::Value {
value_data: Some(ValueData::DatetimeValue(v.val())),
},
Value::Timestamp(v) => match v.unit() {
TimeUnit::Second => v1::Value {
value_data: Some(ValueData::TsSecondValue(v.value())),
},
TimeUnit::Millisecond => v1::Value {
value_data: Some(ValueData::TsMillisecondValue(v.value())),
},
TimeUnit::Microsecond => v1::Value {
value_data: Some(ValueData::TsMicrosecondValue(v.value())),
},
TimeUnit::Nanosecond => v1::Value {
value_data: Some(ValueData::TsNanosecondValue(v.value())),
},
},
Value::Time(v) => match v.unit() {
TimeUnit::Second => v1::Value {
value_data: Some(ValueData::TimeSecondValue(v.value())),
},
TimeUnit::Millisecond => v1::Value {
value_data: Some(ValueData::TimeMillisecondValue(v.value())),
},
TimeUnit::Microsecond => v1::Value {
value_data: Some(ValueData::TimeMicrosecondValue(v.value())),
},
TimeUnit::Nanosecond => v1::Value {
value_data: Some(ValueData::TimeNanosecondValue(v.value())),
},
},
Value::Interval(v) => match v.unit() {
IntervalUnit::YearMonth => v1::Value {
value_data: Some(ValueData::IntervalYearMonthValues(v.to_i32())),
},
IntervalUnit::DayTime => v1::Value {
value_data: Some(ValueData::IntervalDayTimeValues(v.to_i64())),
},
IntervalUnit::MonthDayNano => v1::Value {
value_data: Some(ValueData::IntervalMonthDayNanoValues(
convert_i128_to_interval(v.to_i128()),
)),
},
},
Value::List(_) => return None,
};
Some(proto_value)
}
/// Returns the [ColumnDataType] of the value.
///
/// If value is null, returns `None`.
pub fn proto_value_type(value: &v1::Value) -> Option<ColumnDataType> {
let value_type = match value.value_data.as_ref()? {
ValueData::I8Value(_) => ColumnDataType::Int8,
ValueData::I16Value(_) => ColumnDataType::Int16,
ValueData::I32Value(_) => ColumnDataType::Int32,
ValueData::I64Value(_) => ColumnDataType::Int64,
ValueData::U8Value(_) => ColumnDataType::Uint8,
ValueData::U16Value(_) => ColumnDataType::Uint16,
ValueData::U32Value(_) => ColumnDataType::Uint32,
ValueData::U64Value(_) => ColumnDataType::Uint64,
ValueData::F32Value(_) => ColumnDataType::Float32,
ValueData::F64Value(_) => ColumnDataType::Float64,
ValueData::BoolValue(_) => ColumnDataType::Boolean,
ValueData::BinaryValue(_) => ColumnDataType::Binary,
ValueData::StringValue(_) => ColumnDataType::String,
ValueData::DateValue(_) => ColumnDataType::Date,
ValueData::DatetimeValue(_) => ColumnDataType::Datetime,
ValueData::TsSecondValue(_) => ColumnDataType::TimestampSecond,
ValueData::TsMillisecondValue(_) => ColumnDataType::TimestampMillisecond,
ValueData::TsMicrosecondValue(_) => ColumnDataType::TimestampMicrosecond,
ValueData::TsNanosecondValue(_) => ColumnDataType::TimestampNanosecond,
ValueData::TimeSecondValue(_) => ColumnDataType::TimeSecond,
ValueData::TimeMillisecondValue(_) => ColumnDataType::TimeMillisecond,
ValueData::TimeMicrosecondValue(_) => ColumnDataType::TimeMicrosecond,
ValueData::TimeNanosecondValue(_) => ColumnDataType::TimeNanosecond,
ValueData::IntervalYearMonthValues(_) => ColumnDataType::IntervalYearMonth,
ValueData::IntervalDayTimeValues(_) => ColumnDataType::IntervalDayTime,
ValueData::IntervalMonthDayNanoValues(_) => ColumnDataType::IntervalMonthDayNano,
};
Some(value_type)
}
/// Convert [ConcreteDataType] to [ColumnDataType].
pub fn to_column_data_type(data_type: &ConcreteDataType) -> Option<ColumnDataType> {
let column_data_type = match data_type {
ConcreteDataType::Boolean(_) => ColumnDataType::Boolean,
ConcreteDataType::Int8(_) => ColumnDataType::Int8,
ConcreteDataType::Int16(_) => ColumnDataType::Int16,
ConcreteDataType::Int32(_) => ColumnDataType::Int32,
ConcreteDataType::Int64(_) => ColumnDataType::Int64,
ConcreteDataType::UInt8(_) => ColumnDataType::Uint8,
ConcreteDataType::UInt16(_) => ColumnDataType::Uint16,
ConcreteDataType::UInt32(_) => ColumnDataType::Uint32,
ConcreteDataType::UInt64(_) => ColumnDataType::Uint64,
ConcreteDataType::Float32(_) => ColumnDataType::Float32,
ConcreteDataType::Float64(_) => ColumnDataType::Float64,
ConcreteDataType::Binary(_) => ColumnDataType::Binary,
ConcreteDataType::String(_) => ColumnDataType::String,
ConcreteDataType::Date(_) => ColumnDataType::Date,
ConcreteDataType::DateTime(_) => ColumnDataType::Datetime,
ConcreteDataType::Timestamp(TimestampType::Second(_)) => ColumnDataType::TimestampSecond,
ConcreteDataType::Timestamp(TimestampType::Millisecond(_)) => {
ColumnDataType::TimestampMillisecond
}
ConcreteDataType::Timestamp(TimestampType::Microsecond(_)) => {
ColumnDataType::TimestampMicrosecond
}
ConcreteDataType::Timestamp(TimestampType::Nanosecond(_)) => {
ColumnDataType::TimestampNanosecond
}
ConcreteDataType::Time(TimeType::Second(_)) => ColumnDataType::TimeSecond,
ConcreteDataType::Time(TimeType::Millisecond(_)) => ColumnDataType::TimeMillisecond,
ConcreteDataType::Time(TimeType::Microsecond(_)) => ColumnDataType::TimeMicrosecond,
ConcreteDataType::Time(TimeType::Nanosecond(_)) => ColumnDataType::TimeNanosecond,
ConcreteDataType::Null(_)
| ConcreteDataType::Interval(_)
| ConcreteDataType::List(_)
| ConcreteDataType::Dictionary(_) => return None,
};
Some(column_data_type)
}
/// Returns true if the column type is equal to expected type.
fn is_column_type_eq(column_type: ColumnDataType, expect_type: &ConcreteDataType) -> bool {
if let Some(expect) = to_column_data_type(expect_type) {
column_type == expect
} else {
false
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use datatypes::types::{
IntervalDayTimeType, IntervalMonthDayNanoType, IntervalYearMonthType, TimeMillisecondType,
TimeSecondType, TimestampMillisecondType, TimestampSecondType,
};
use datatypes::vectors::{
BooleanVector, IntervalDayTimeVector, IntervalMonthDayNanoVector, IntervalYearMonthVector,
TimeMicrosecondVector, TimeMillisecondVector, TimeNanosecondVector, TimeSecondVector,
TimestampMicrosecondVector, TimestampMillisecondVector, TimestampNanosecondVector,
TimestampSecondVector, Vector,
};
use paste::paste;
use super::*;
@@ -766,4 +1250,278 @@ mod tests {
assert_eq!(interval.days, 0);
assert_eq!(interval.nanoseconds, 3000);
}
#[test]
fn test_convert_timestamp_values() {
// second
let actual = pb_values_to_values(
&ConcreteDataType::Timestamp(TimestampType::Second(TimestampSecondType)),
Values {
ts_second_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Timestamp(Timestamp::new_second(1_i64)),
Value::Timestamp(Timestamp::new_second(2_i64)),
Value::Timestamp(Timestamp::new_second(3_i64)),
];
assert_eq!(expect, actual);
// millisecond
let actual = pb_values_to_values(
&ConcreteDataType::Timestamp(TimestampType::Millisecond(TimestampMillisecondType)),
Values {
ts_millisecond_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Timestamp(Timestamp::new_millisecond(1_i64)),
Value::Timestamp(Timestamp::new_millisecond(2_i64)),
Value::Timestamp(Timestamp::new_millisecond(3_i64)),
];
assert_eq!(expect, actual);
}
#[test]
fn test_convert_time_values() {
// second
let actual = pb_values_to_values(
&ConcreteDataType::Time(TimeType::Second(TimeSecondType)),
Values {
time_second_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Time(Time::new_second(1_i64)),
Value::Time(Time::new_second(2_i64)),
Value::Time(Time::new_second(3_i64)),
];
assert_eq!(expect, actual);
// millisecond
let actual = pb_values_to_values(
&ConcreteDataType::Time(TimeType::Millisecond(TimeMillisecondType)),
Values {
time_millisecond_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Time(Time::new_millisecond(1_i64)),
Value::Time(Time::new_millisecond(2_i64)),
Value::Time(Time::new_millisecond(3_i64)),
];
assert_eq!(expect, actual);
}
#[test]
fn test_convert_interval_values() {
// year_month
let actual = pb_values_to_values(
&ConcreteDataType::Interval(IntervalType::YearMonth(IntervalYearMonthType)),
Values {
interval_year_month_values: vec![1_i32, 2_i32, 3_i32],
..Default::default()
},
);
let expect = vec![
Value::Interval(Interval::from_year_month(1_i32)),
Value::Interval(Interval::from_year_month(2_i32)),
Value::Interval(Interval::from_year_month(3_i32)),
];
assert_eq!(expect, actual);
// day_time
let actual = pb_values_to_values(
&ConcreteDataType::Interval(IntervalType::DayTime(IntervalDayTimeType)),
Values {
interval_day_time_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Interval(Interval::from_i64(1_i64)),
Value::Interval(Interval::from_i64(2_i64)),
Value::Interval(Interval::from_i64(3_i64)),
];
assert_eq!(expect, actual);
// month_day_nano
let actual = pb_values_to_values(
&ConcreteDataType::Interval(IntervalType::MonthDayNano(IntervalMonthDayNanoType)),
Values {
interval_month_day_nano_values: vec![
IntervalMonthDayNano {
months: 1,
days: 2,
nanoseconds: 3,
},
IntervalMonthDayNano {
months: 5,
days: 6,
nanoseconds: 7,
},
IntervalMonthDayNano {
months: 9,
days: 10,
nanoseconds: 11,
},
],
..Default::default()
},
);
let expect = vec![
Value::Interval(Interval::from_month_day_nano(1, 2, 3)),
Value::Interval(Interval::from_month_day_nano(5, 6, 7)),
Value::Interval(Interval::from_month_day_nano(9, 10, 11)),
];
assert_eq!(expect, actual);
}
macro_rules! test_convert_values {
($grpc_data_type: ident, $values: expr, $concrete_data_type: ident, $expected_ret: expr) => {
paste! {
#[test]
fn [<test_convert_ $grpc_data_type _values>]() {
let values = Values {
[<$grpc_data_type _values>]: $values,
..Default::default()
};
let data_type = ConcreteDataType::[<$concrete_data_type _datatype>]();
let result = pb_values_to_values(&data_type, values);
assert_eq!(
$expected_ret,
result
);
}
}
};
}
test_convert_values!(
i8,
vec![1_i32, 2, 3],
int8,
vec![Value::Int8(1), Value::Int8(2), Value::Int8(3)]
);
test_convert_values!(
u8,
vec![1_u32, 2, 3],
uint8,
vec![Value::UInt8(1), Value::UInt8(2), Value::UInt8(3)]
);
test_convert_values!(
i16,
vec![1_i32, 2, 3],
int16,
vec![Value::Int16(1), Value::Int16(2), Value::Int16(3)]
);
test_convert_values!(
u16,
vec![1_u32, 2, 3],
uint16,
vec![Value::UInt16(1), Value::UInt16(2), Value::UInt16(3)]
);
test_convert_values!(
i32,
vec![1, 2, 3],
int32,
vec![Value::Int32(1), Value::Int32(2), Value::Int32(3)]
);
test_convert_values!(
u32,
vec![1, 2, 3],
uint32,
vec![Value::UInt32(1), Value::UInt32(2), Value::UInt32(3)]
);
test_convert_values!(
i64,
vec![1, 2, 3],
int64,
vec![Value::Int64(1), Value::Int64(2), Value::Int64(3)]
);
test_convert_values!(
u64,
vec![1, 2, 3],
uint64,
vec![Value::UInt64(1), Value::UInt64(2), Value::UInt64(3)]
);
test_convert_values!(
f32,
vec![1.0, 2.0, 3.0],
float32,
vec![
Value::Float32(1.0.into()),
Value::Float32(2.0.into()),
Value::Float32(3.0.into())
]
);
test_convert_values!(
f64,
vec![1.0, 2.0, 3.0],
float64,
vec![
Value::Float64(1.0.into()),
Value::Float64(2.0.into()),
Value::Float64(3.0.into())
]
);
test_convert_values!(
string,
vec!["1".to_string(), "2".to_string(), "3".to_string()],
string,
vec![
Value::String("1".into()),
Value::String("2".into()),
Value::String("3".into())
]
);
test_convert_values!(
binary,
vec!["1".into(), "2".into(), "3".into()],
binary,
vec![
Value::Binary(b"1".to_vec().into()),
Value::Binary(b"2".to_vec().into()),
Value::Binary(b"3".to_vec().into())
]
);
test_convert_values!(
date,
vec![1, 2, 3],
date,
vec![
Value::Date(1.into()),
Value::Date(2.into()),
Value::Date(3.into())
]
);
test_convert_values!(
datetime,
vec![1.into(), 2.into(), 3.into()],
datetime,
vec![
Value::DateTime(1.into()),
Value::DateTime(2.into()),
Value::DateTime(3.into())
]
);
}

src/auth/Cargo.toml (new file)

@@ -0,0 +1,26 @@
[package]
name = "auth"
version.workspace = true
edition.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[features]
default = []
testing = []
[dependencies]
api.workspace = true
async-trait.workspace = true
common-error.workspace = true
digest = "0.10"
hex = { version = "0.4" }
secrecy = { version = "0.8", features = ["serde", "alloc"] }
sha1 = "0.10"
snafu.workspace = true
sql.workspace = true
tokio.workspace = true
[dev-dependencies]
common-test-util.workspace = true

src/auth/src/common.rs (new file)

@@ -0,0 +1,147 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use digest::Digest;
use secrecy::SecretString;
use sha1::Sha1;
use snafu::{ensure, OptionExt};
use crate::error::{IllegalParamSnafu, InvalidConfigSnafu, Result, UserPasswordMismatchSnafu};
use crate::user_info::DefaultUserInfo;
use crate::user_provider::static_user_provider::{StaticUserProvider, STATIC_USER_PROVIDER};
use crate::{UserInfoRef, UserProviderRef};
pub(crate) const DEFAULT_USERNAME: &str = "greptime";
/// Constructs a [`UserInfo`] implementation with the given name.
/// Uses the default username `greptime` if `None` is provided.
pub fn userinfo_by_name(username: Option<String>) -> UserInfoRef {
DefaultUserInfo::with_name(username.unwrap_or_else(|| DEFAULT_USERNAME.to_string()))
}
pub fn user_provider_from_option(opt: &String) -> Result<UserProviderRef> {
let (name, content) = opt.split_once(':').context(InvalidConfigSnafu {
value: opt.to_string(),
msg: "UserProviderOption must be in format `<option>:<value>`",
})?;
match name {
STATIC_USER_PROVIDER => {
let provider =
StaticUserProvider::try_from(content).map(|p| Arc::new(p) as UserProviderRef)?;
Ok(provider)
}
_ => InvalidConfigSnafu {
value: name.to_string(),
msg: "Invalid UserProviderOption",
}
.fail(),
}
}
type Username<'a> = &'a str;
type HostOrIp<'a> = &'a str;
#[derive(Debug, Clone)]
pub enum Identity<'a> {
UserId(Username<'a>, Option<HostOrIp<'a>>),
}
pub type HashedPassword<'a> = &'a [u8];
pub type Salt<'a> = &'a [u8];
/// Authentication information sent by the client.
pub enum Password<'a> {
PlainText(SecretString),
MysqlNativePassword(HashedPassword<'a>, Salt<'a>),
PgMD5(HashedPassword<'a>, Salt<'a>),
}
pub fn auth_mysql(
auth_data: HashedPassword,
salt: Salt,
username: &str,
save_pwd: &[u8],
) -> Result<()> {
ensure!(
auth_data.len() == 20,
IllegalParamSnafu {
msg: "Illegal mysql password length"
}
);
// ref: https://github.com/mysql/mysql-server/blob/a246bad76b9271cb4333634e954040a970222e0a/sql/auth/password.cc#L62
let hash_stage_2 = double_sha1(save_pwd);
let tmp = sha1_two(salt, &hash_stage_2);
// xor auth_data and tmp
let mut xor_result = [0u8; 20];
for i in 0..20 {
xor_result[i] = auth_data[i] ^ tmp[i];
}
let candidate_stage_2 = sha1_one(&xor_result);
if candidate_stage_2 == hash_stage_2 {
Ok(())
} else {
UserPasswordMismatchSnafu {
username: username.to_string(),
}
.fail()
}
}
fn sha1_two(input_1: &[u8], input_2: &[u8]) -> Vec<u8> {
let mut hasher = Sha1::new();
hasher.update(input_1);
hasher.update(input_2);
hasher.finalize().to_vec()
}
fn sha1_one(data: &[u8]) -> Vec<u8> {
let mut hasher = Sha1::new();
hasher.update(data);
hasher.finalize().to_vec()
}
fn double_sha1(data: &[u8]) -> Vec<u8> {
sha1_one(&sha1_one(data))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_sha() {
let sha_1_answer: Vec<u8> = vec![
124, 74, 141, 9, 202, 55, 98, 175, 97, 229, 149, 32, 148, 61, 194, 100, 148, 248, 148,
27,
];
let sha_1 = sha1_one("123456".as_bytes());
assert_eq!(sha_1, sha_1_answer);
let double_sha1_answer: Vec<u8> = vec![
107, 180, 131, 126, 183, 67, 41, 16, 94, 228, 86, 141, 218, 125, 198, 126, 210, 202,
42, 217,
];
let double_sha1 = double_sha1("123456".as_bytes());
assert_eq!(double_sha1, double_sha1_answer);
let sha1_2_answer: Vec<u8> = vec![
132, 115, 215, 211, 99, 186, 164, 206, 168, 152, 217, 192, 117, 47, 240, 252, 142, 244,
37, 204,
];
let sha1_2 = sha1_two("123456".as_bytes(), "654321".as_bytes());
assert_eq!(sha1_2, sha1_2_answer);
}
}


@@ -4,7 +4,7 @@
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
@@ -12,83 +12,9 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use common_error::ext::{BoxedError, ErrorExt};
use common_error::status_code::StatusCode;
use secrecy::SecretString;
use session::context::UserInfo;
use snafu::{Location, OptionExt, Snafu};
use crate::auth::user_provider::StaticUserProvider;
pub mod user_provider;
#[async_trait::async_trait]
pub trait UserProvider: Send + Sync {
fn name(&self) -> &str;
/// [`authenticate`] checks whether a user is valid and allowed to access the database.
async fn authenticate(&self, id: Identity<'_>, password: Password<'_>) -> Result<UserInfo>;
/// [`authorize`] checks whether a connection request
/// from a certain user to a certain catalog/schema is legal.
/// This method should be called after [`authenticate`].
async fn authorize(&self, catalog: &str, schema: &str, user_info: &UserInfo) -> Result<()>;
/// [`auth`] is a combination of [`authenticate`] and [`authorize`].
/// In most cases it's preferred for both convenience and performance.
async fn auth(
&self,
id: Identity<'_>,
password: Password<'_>,
catalog: &str,
schema: &str,
) -> Result<UserInfo> {
let user_info = self.authenticate(id, password).await?;
self.authorize(catalog, schema, &user_info).await?;
Ok(user_info)
}
}
pub type UserProviderRef = Arc<dyn UserProvider>;
type Username<'a> = &'a str;
type HostOrIp<'a> = &'a str;
#[derive(Debug, Clone)]
pub enum Identity<'a> {
UserId(Username<'a>, Option<HostOrIp<'a>>),
}
pub type HashedPassword<'a> = &'a [u8];
pub type Salt<'a> = &'a [u8];
/// Authentication information sent by the client.
pub enum Password<'a> {
PlainText(SecretString),
MysqlNativePassword(HashedPassword<'a>, Salt<'a>),
PgMD5(HashedPassword<'a>, Salt<'a>),
}
pub fn user_provider_from_option(opt: &String) -> Result<UserProviderRef> {
let (name, content) = opt.split_once(':').context(InvalidConfigSnafu {
value: opt.to_string(),
msg: "UserProviderOption must be in format `<option>:<value>`",
})?;
match name {
user_provider::STATIC_USER_PROVIDER => {
let provider =
StaticUserProvider::try_from(content).map(|p| Arc::new(p) as UserProviderRef)?;
Ok(provider)
}
_ => InvalidConfigSnafu {
value: name.to_string(),
msg: "Invalid UserProviderOption",
}
.fail(),
}
}
use snafu::{Location, Snafu};
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
@@ -134,6 +60,9 @@ pub enum Error {
schema: String,
username: String,
},
#[snafu(display("User is not authorized to perform this action"))]
PermissionDenied { location: Location },
}
impl ErrorExt for Error {
@@ -149,6 +78,7 @@ impl ErrorExt for Error {
Error::UnsupportedPasswordType { .. } => StatusCode::UnsupportedPasswordType,
Error::UserPasswordMismatch { .. } => StatusCode::UserPasswordMismatch,
Error::AccessDenied { .. } => StatusCode::AccessDenied,
Error::PermissionDenied { .. } => StatusCode::PermissionDenied,
}
}

src/auth/src/lib.rs (new file)

@@ -0,0 +1,34 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
mod common;
pub mod error;
mod permission;
mod user_info;
mod user_provider;
#[cfg(feature = "testing")]
pub mod tests;
pub use common::{
auth_mysql, user_provider_from_option, userinfo_by_name, HashedPassword, Identity, Password,
};
pub use permission::{PermissionChecker, PermissionReq, PermissionResp};
pub use user_info::UserInfo;
pub use user_provider::UserProvider;
/// Public type aliases.
pub type UserInfoRef = std::sync::Arc<dyn UserInfo>;
pub type UserProviderRef = std::sync::Arc<dyn UserProvider>;
pub type PermissionCheckerRef = std::sync::Arc<dyn PermissionChecker>;


@@ -0,0 +1,64 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt::Debug;
use api::v1::greptime_request::Request;
use sql::statements::statement::Statement;
use crate::error::{PermissionDeniedSnafu, Result};
use crate::{PermissionCheckerRef, UserInfoRef};
#[derive(Debug, Clone)]
pub enum PermissionReq<'a> {
GrpcRequest(&'a Request),
SqlStatement(&'a Statement),
PromQuery,
Opentsdb,
LineProtocol,
PromStoreWrite,
PromStoreRead,
Otlp,
}
#[derive(Debug)]
pub enum PermissionResp {
Allow,
Reject,
}
pub trait PermissionChecker: Send + Sync {
fn check_permission(
&self,
user_info: Option<UserInfoRef>,
req: PermissionReq,
) -> Result<PermissionResp>;
}
impl PermissionChecker for Option<&PermissionCheckerRef> {
fn check_permission(
&self,
user_info: Option<UserInfoRef>,
req: PermissionReq,
) -> Result<PermissionResp> {
match self {
Some(checker) => match checker.check_permission(user_info, req) {
Ok(PermissionResp::Reject) => PermissionDeniedSnafu.fail(),
Ok(PermissionResp::Allow) => Ok(PermissionResp::Allow),
Err(e) => Err(e),
},
None => Ok(PermissionResp::Allow),
}
}
}


@@ -11,14 +11,14 @@
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use secrecy::ExposeSecret;
use servers::auth::user_provider::auth_mysql;
use servers::auth::{
AccessDeniedSnafu, Identity, Password, UnsupportedPasswordTypeSnafu, UserNotFoundSnafu,
UserPasswordMismatchSnafu, UserProvider,
use crate::error::{
AccessDeniedSnafu, Result, UnsupportedPasswordTypeSnafu, UserNotFoundSnafu,
UserPasswordMismatchSnafu,
};
use session::context::UserInfo;
use crate::user_info::DefaultUserInfo;
use crate::{auth_mysql, Identity, Password, UserInfoRef, UserProvider};
pub struct DatabaseAuthInfo<'a> {
pub catalog: &'a str,
@@ -56,17 +56,13 @@ impl UserProvider for MockUserProvider {
"mock_user_provider"
}
async fn authenticate(
&self,
id: Identity<'_>,
password: Password<'_>,
) -> servers::auth::Result<UserInfo> {
async fn authenticate(&self, id: Identity<'_>, password: Password<'_>) -> Result<UserInfoRef> {
match id {
Identity::UserId(username, _host) => match password {
Password::PlainText(password) => {
if username == "greptime" {
if password.expose_secret() == "greptime" {
Ok(UserInfo::new("greptime"))
Ok(DefaultUserInfo::with_name("greptime"))
} else {
UserPasswordMismatchSnafu {
username: username.to_string(),
@@ -82,7 +78,7 @@ impl UserProvider for MockUserProvider {
}
Password::MysqlNativePassword(auth_data, salt) => {
auth_mysql(auth_data, salt, username, "greptime".as_bytes())
.map(|_| UserInfo::new(username))
.map(|_| DefaultUserInfo::with_name(username))
}
_ => UnsupportedPasswordTypeSnafu {
password_type: "mysql_native_password",
@@ -92,12 +88,7 @@ impl UserProvider for MockUserProvider {
}
}
async fn authorize(
&self,
catalog: &str,
schema: &str,
user_info: &UserInfo,
) -> servers::auth::Result<()> {
async fn authorize(&self, catalog: &str, schema: &str, user_info: &UserInfoRef) -> Result<()> {
if catalog == self.catalog && schema == self.schema && user_info.username() == self.username
{
Ok(())
@@ -114,6 +105,8 @@ impl UserProvider for MockUserProvider {
#[tokio::test]
async fn test_auth_by_plain_text() {
use crate::error;
let user_provider = MockUserProvider::default();
assert_eq!("mock_user_provider", user_provider.name());
@@ -137,7 +130,7 @@ async fn test_auth_by_plain_text() {
assert!(auth_result.is_err());
assert!(matches!(
auth_result.err().unwrap(),
servers::auth::Error::UnsupportedPasswordType { .. }
error::Error::UnsupportedPasswordType { .. }
));
// auth failed, err: user not exist.
@@ -150,7 +143,7 @@ async fn test_auth_by_plain_text() {
assert!(auth_result.is_err());
assert!(matches!(
auth_result.err().unwrap(),
servers::auth::Error::UserNotFound { .. }
error::Error::UserNotFound { .. }
));
// auth failed, err: wrong password
@@ -163,7 +156,7 @@ async fn test_auth_by_plain_text() {
assert!(auth_result.is_err());
assert!(matches!(
auth_result.err().unwrap(),
servers::auth::Error::UserPasswordMismatch { .. }
error::Error::UserPasswordMismatch { .. }
))
}
@@ -176,8 +169,8 @@ async fn test_schema_validate() {
username: "test_user",
});
let right_user = UserInfo::new("test_user");
let wrong_user = UserInfo::default();
let right_user = DefaultUserInfo::with_name("test_user");
let wrong_user = DefaultUserInfo::with_name("greptime");
// check catalog
let re = validator

src/auth/src/user_info.rs (new file)

@@ -0,0 +1,47 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;
use crate::UserInfoRef;
pub trait UserInfo: Debug + Sync + Send {
fn as_any(&self) -> &dyn Any;
fn username(&self) -> &str;
}
#[derive(Debug)]
pub(crate) struct DefaultUserInfo {
username: String,
}
impl DefaultUserInfo {
pub(crate) fn with_name(username: impl Into<String>) -> UserInfoRef {
Arc::new(Self {
username: username.into(),
})
}
}
impl UserInfo for DefaultUserInfo {
fn as_any(&self) -> &dyn Any {
self
}
fn username(&self) -> &str {
self.username.as_str()
}
}


@@ -0,0 +1,46 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
pub(crate) mod static_user_provider;
use crate::common::{Identity, Password};
use crate::error::Result;
use crate::UserInfoRef;
#[async_trait::async_trait]
pub trait UserProvider: Send + Sync {
fn name(&self) -> &str;
/// [`authenticate`] checks whether a user is valid and allowed to access the database.
async fn authenticate(&self, id: Identity<'_>, password: Password<'_>) -> Result<UserInfoRef>;
/// [`authorize`] checks whether a connection request
/// from a certain user to a certain catalog/schema is legal.
/// This method should be called after [`authenticate`].
async fn authorize(&self, catalog: &str, schema: &str, user_info: &UserInfoRef) -> Result<()>;
/// [`auth`] is a combination of [`authenticate`] and [`authorize`].
/// In most cases it's preferred for both convenience and performance.
async fn auth(
&self,
id: Identity<'_>,
password: Password<'_>,
catalog: &str,
schema: &str,
) -> Result<UserInfoRef> {
let user_info = self.authenticate(id, password).await?;
self.authorize(catalog, schema, &user_info).await?;
Ok(user_info)
}
}


@@ -19,20 +19,17 @@ use std::io::BufRead;
use std::path::Path;
use async_trait::async_trait;
use digest;
use digest::Digest;
use secrecy::ExposeSecret;
use session::context::UserInfo;
use sha1::Sha1;
use snafu::{ensure, OptionExt, ResultExt};
use crate::auth::{
Error, HashedPassword, Identity, IllegalParamSnafu, InvalidConfigSnafu, IoSnafu, Password,
Result, Salt, UnsupportedPasswordTypeSnafu, UserNotFoundSnafu, UserPasswordMismatchSnafu,
UserProvider,
use crate::error::{
Error, IllegalParamSnafu, InvalidConfigSnafu, IoSnafu, Result, UnsupportedPasswordTypeSnafu,
UserNotFoundSnafu, UserPasswordMismatchSnafu,
};
use crate::user_info::DefaultUserInfo;
use crate::{auth_mysql, Identity, Password, UserInfoRef, UserProvider};
pub const STATIC_USER_PROVIDER: &str = "static_user_provider";
pub(crate) const STATIC_USER_PROVIDER: &str = "static_user_provider";
impl TryFrom<&str> for StaticUserProvider {
type Error = Error;
@@ -91,7 +88,7 @@ impl TryFrom<&str> for StaticUserProvider {
}
}
pub struct StaticUserProvider {
pub(crate) struct StaticUserProvider {
users: HashMap<String, Vec<u8>>,
}
@@ -105,7 +102,7 @@ impl UserProvider for StaticUserProvider {
&self,
input_id: Identity<'_>,
input_pwd: Password<'_>,
) -> Result<UserInfo> {
) -> Result<UserInfoRef> {
match input_id {
Identity::UserId(username, _) => {
ensure!(
@@ -127,7 +124,7 @@ impl UserProvider for StaticUserProvider {
}
);
return if save_pwd == pwd.expose_secret().as_bytes() {
Ok(UserInfo::new(username))
Ok(DefaultUserInfo::with_name(username))
} else {
UserPasswordMismatchSnafu {
username: username.to_string(),
@@ -136,14 +133,8 @@ impl UserProvider for StaticUserProvider {
};
}
Password::MysqlNativePassword(auth_data, salt) => {
ensure!(
auth_data.len() == 20,
IllegalParamSnafu {
msg: "Illegal MySQL native password format, length != 20"
}
);
auth_mysql(auth_data, salt, username, save_pwd)
.map(|_| UserInfo::new(username))
.map(|_| DefaultUserInfo::with_name(username))
}
Password::PgMD5(_, _) => UnsupportedPasswordTypeSnafu {
password_type: "pg_md5",
@@ -154,88 +145,28 @@ impl UserProvider for StaticUserProvider {
}
}
async fn authorize(&self, _catalog: &str, _schema: &str, _user_info: &UserInfo) -> Result<()> {
async fn authorize(
&self,
_catalog: &str,
_schema: &str,
_user_info: &UserInfoRef,
) -> Result<()> {
// default allow all
Ok(())
}
}
pub fn auth_mysql(
auth_data: HashedPassword,
salt: Salt,
username: &str,
save_pwd: &[u8],
) -> Result<()> {
// ref: https://github.com/mysql/mysql-server/blob/a246bad76b9271cb4333634e954040a970222e0a/sql/auth/password.cc#L62
let hash_stage_2 = double_sha1(save_pwd);
let tmp = sha1_two(salt, &hash_stage_2);
// xor auth_data and tmp
let mut xor_result = [0u8; 20];
for i in 0..20 {
xor_result[i] = auth_data[i] ^ tmp[i];
}
let candidate_stage_2 = sha1_one(&xor_result);
if candidate_stage_2 == hash_stage_2 {
Ok(())
} else {
UserPasswordMismatchSnafu {
username: username.to_string(),
}
.fail()
}
}
fn sha1_two(input_1: &[u8], input_2: &[u8]) -> Vec<u8> {
let mut hasher = Sha1::new();
hasher.update(input_1);
hasher.update(input_2);
hasher.finalize().to_vec()
}
fn sha1_one(data: &[u8]) -> Vec<u8> {
let mut hasher = Sha1::new();
hasher.update(data);
hasher.finalize().to_vec()
}
fn double_sha1(data: &[u8]) -> Vec<u8> {
sha1_one(&sha1_one(data))
}
#[cfg(test)]
pub mod test {
use std::fs::File;
use std::io::{LineWriter, Write};
use common_test_util::temp_dir::create_temp_dir;
use session::context::UserInfo;
use crate::auth::user_provider::{double_sha1, sha1_one, sha1_two, StaticUserProvider};
use crate::auth::{Identity, Password, UserProvider};
#[test]
fn test_sha() {
let sha_1_answer: Vec<u8> = vec![
124, 74, 141, 9, 202, 55, 98, 175, 97, 229, 149, 32, 148, 61, 194, 100, 148, 248, 148,
27,
];
let sha_1 = sha1_one("123456".as_bytes());
assert_eq!(sha_1, sha_1_answer);
let double_sha1_answer: Vec<u8> = vec![
107, 180, 131, 126, 183, 67, 41, 16, 94, 228, 86, 141, 218, 125, 198, 126, 210, 202,
42, 217,
];
let double_sha1 = double_sha1("123456".as_bytes());
assert_eq!(double_sha1, double_sha1_answer);
let sha1_2_answer: Vec<u8> = vec![
132, 115, 215, 211, 99, 186, 164, 206, 168, 152, 217, 192, 117, 47, 240, 252, 142, 244,
37, 204,
];
let sha1_2 = sha1_two("123456".as_bytes(), "654321".as_bytes());
assert_eq!(sha1_2, sha1_2_answer);
}
use crate::user_info::DefaultUserInfo;
use crate::user_provider::static_user_provider::StaticUserProvider;
use crate::user_provider::{Identity, Password};
use crate::UserProvider;
async fn test_authenticate(provider: &dyn UserProvider, username: &str, password: &str) {
let re = provider
@@ -249,9 +180,10 @@ pub mod test {
#[tokio::test]
async fn test_authorize() {
let user_info = DefaultUserInfo::with_name("root");
let provider = StaticUserProvider::try_from("cmd:root=123456,admin=654321").unwrap();
provider
.authorize("catalog", "schema", &UserInfo::new("root"))
.authorize("catalog", "schema", &user_info)
.await
.unwrap();
}

src/auth/tests/mod.rs (new file)

@@ -0,0 +1,61 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#![feature(assert_matches)]
use std::assert_matches::assert_matches;
use std::sync::Arc;
use api::v1::greptime_request::Request;
use auth::error::Error::InternalState;
use auth::{PermissionChecker, PermissionCheckerRef, PermissionReq, PermissionResp, UserInfoRef};
use sql::statements::show::{ShowDatabases, ShowKind};
use sql::statements::statement::Statement;
struct DummyPermissionChecker;
impl PermissionChecker for DummyPermissionChecker {
fn check_permission(
&self,
_user_info: Option<UserInfoRef>,
req: PermissionReq,
) -> auth::error::Result<PermissionResp> {
match req {
PermissionReq::GrpcRequest(_) => Ok(PermissionResp::Allow),
PermissionReq::SqlStatement(_) => Ok(PermissionResp::Reject),
_ => Err(InternalState {
msg: "testing".to_string(),
}),
}
}
}
#[test]
fn test_permission_checker() {
let checker: PermissionCheckerRef = Arc::new(DummyPermissionChecker);
let grpc_result = checker.check_permission(
None,
PermissionReq::GrpcRequest(&Request::Query(Default::default())),
);
assert_matches!(grpc_result, Ok(PermissionResp::Allow));
let sql_result = checker.check_permission(
None,
PermissionReq::SqlStatement(&Statement::ShowDatabases(ShowDatabases::new(ShowKind::All))),
);
assert_matches!(sql_result, Ok(PermissionResp::Reject));
let err_result = checker.check_permission(None, PermissionReq::Opentsdb);
assert_matches!(err_result, Err(InternalState { msg }) if msg == "testing");
}


@@ -27,6 +27,19 @@ use crate::DeregisterTableRequest;
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
#[snafu(display("Failed to list catalogs, source: {}", source))]
ListCatalogs {
location: Location,
source: BoxedError,
},
#[snafu(display("Failed to list {}'s schemas, source: {}", catalog, source))]
ListSchemas {
location: Location,
catalog: String,
source: BoxedError,
},
#[snafu(display(
"Failed to re-compile script due to internal error, source: {}",
source
@@ -284,6 +297,10 @@ impl ErrorExt for Error {
StatusCode::InvalidArguments
}
Error::ListCatalogs { source, .. } | Error::ListSchemas { source, .. } => {
source.status_code()
}
Error::OpenSystemCatalog { source, .. }
| Error::CreateSystemCatalog { source, .. }
| Error::InsertCatalogRecord { source, .. }


@@ -15,30 +15,27 @@
mod columns;
mod tables;
use std::any::Any;
use std::collections::HashMap;
use std::sync::{Arc, Weak};
use async_trait::async_trait;
use common_catalog::consts::{
INFORMATION_SCHEMA_COLUMNS_TABLE_ID, INFORMATION_SCHEMA_NAME,
INFORMATION_SCHEMA_TABLES_TABLE_ID,
};
use common_catalog::consts::INFORMATION_SCHEMA_NAME;
use common_error::ext::BoxedError;
use common_recordbatch::{RecordBatchStreamAdaptor, SendableRecordBatchStream};
use datatypes::schema::SchemaRef;
use futures_util::StreamExt;
use snafu::ResultExt;
use store_api::data_source::DataSource;
use store_api::storage::{ScanRequest, TableId};
use table::data_source::DataSource;
use table::error::{SchemaConversionSnafu, TablesRecordBatchSnafu};
use table::metadata::{TableIdent, TableInfoBuilder, TableMetaBuilder, TableType};
use table::{Result as TableResult, Table, TableRef};
use table::metadata::{
FilterPushDownType, TableInfoBuilder, TableInfoRef, TableMetaBuilder, TableType,
};
use table::thin_table::{ThinTable, ThinTableAdapter};
use table::TableRef;
use self::columns::InformationSchemaColumns;
use crate::error::Result;
use crate::information_schema::tables::InformationSchemaTables;
use crate::table_factory::TableFactory;
use crate::CatalogManager;
pub const TABLES: &str = "tables";
@@ -63,192 +60,117 @@ impl InformationSchemaProvider {
catalog_name: String,
catalog_manager: Weak<dyn CatalogManager>,
) -> HashMap<String, TableRef> {
let provider = Self::new(catalog_name, catalog_manager);
let mut schema = HashMap::new();
schema.insert(
TABLES.to_string(),
Arc::new(InformationTable::new(
catalog_name.clone(),
INFORMATION_SCHEMA_TABLES_TABLE_ID,
TABLES.to_string(),
Arc::new(InformationSchemaTables::new(
catalog_name.clone(),
catalog_manager.clone(),
)),
)) as _,
);
schema.insert(
COLUMNS.to_string(),
Arc::new(InformationTable::new(
catalog_name.clone(),
INFORMATION_SCHEMA_COLUMNS_TABLE_ID,
COLUMNS.to_string(),
Arc::new(InformationSchemaColumns::new(catalog_name, catalog_manager)),
)) as _,
);
schema.insert(TABLES.to_owned(), provider.table(TABLES).unwrap());
schema.insert(COLUMNS.to_owned(), provider.table(COLUMNS).unwrap());
schema
}
pub fn table(&self, name: &str) -> Result<Option<TableRef>> {
let (stream_builder, table_id) = match name.to_ascii_lowercase().as_ref() {
TABLES => (
Arc::new(InformationSchemaTables::new(
self.catalog_name.clone(),
self.catalog_manager.clone(),
)) as _,
INFORMATION_SCHEMA_TABLES_TABLE_ID,
),
COLUMNS => (
Arc::new(InformationSchemaColumns::new(
self.catalog_name.clone(),
self.catalog_manager.clone(),
)) as _,
INFORMATION_SCHEMA_COLUMNS_TABLE_ID,
),
_ => {
return Ok(None);
}
};
pub fn table(&self, name: &str) -> Option<TableRef> {
self.information_table(name).map(|table| {
let table_info = Self::table_info(self.catalog_name.clone(), &table);
let filter_pushdown = FilterPushDownType::Unsupported;
let thin_table = ThinTable::new(table_info, filter_pushdown);
Ok(Some(Arc::new(InformationTable::new(
self.catalog_name.clone(),
table_id,
name.to_string(),
stream_builder,
))))
let data_source = Arc::new(InformationTableDataSource::new(table));
Arc::new(ThinTableAdapter::new(thin_table, data_source)) as _
})
}
pub fn table_factory(&self, name: &str) -> Result<Option<TableFactory>> {
let (stream_builder, table_id) = match name.to_ascii_lowercase().as_ref() {
TABLES => (
Arc::new(InformationSchemaTables::new(
self.catalog_name.clone(),
self.catalog_manager.clone(),
)) as _,
INFORMATION_SCHEMA_TABLES_TABLE_ID,
),
COLUMNS => (
Arc::new(InformationSchemaColumns::new(
self.catalog_name.clone(),
self.catalog_manager.clone(),
)) as _,
INFORMATION_SCHEMA_COLUMNS_TABLE_ID,
),
_ => {
return Ok(None);
}
};
let data_source = Arc::new(InformationTable::new(
self.catalog_name.clone(),
table_id,
name.to_string(),
stream_builder,
));
Ok(Some(Arc::new(move || data_source.clone())))
}
}
// TODO(ruihang): make it a more generic trait:
// https://github.com/GreptimeTeam/greptimedb/pull/1639#discussion_r1205001903
pub trait InformationStreamBuilder: Send + Sync {
fn to_stream(&self) -> Result<SendableRecordBatchStream>;
fn schema(&self) -> SchemaRef;
}
pub struct InformationTable {
catalog_name: String,
table_id: TableId,
name: String,
stream_builder: Arc<dyn InformationStreamBuilder>,
}
impl InformationTable {
pub fn new(
catalog_name: String,
table_id: TableId,
name: String,
stream_builder: Arc<dyn InformationStreamBuilder>,
) -> Self {
Self {
catalog_name,
table_id,
name,
stream_builder,
fn information_table(&self, name: &str) -> Option<InformationTableRef> {
match name.to_ascii_lowercase().as_str() {
TABLES => Some(Arc::new(InformationSchemaTables::new(
self.catalog_name.clone(),
self.catalog_manager.clone(),
)) as _),
COLUMNS => Some(Arc::new(InformationSchemaColumns::new(
self.catalog_name.clone(),
self.catalog_manager.clone(),
)) as _),
_ => None,
}
}
}
#[async_trait]
impl Table for InformationTable {
fn as_any(&self) -> &dyn Any {
self
}
fn schema(&self) -> SchemaRef {
self.stream_builder.schema()
}
fn table_info(&self) -> table::metadata::TableInfoRef {
fn table_info(catalog_name: String, table: &InformationTableRef) -> TableInfoRef {
let table_meta = TableMetaBuilder::default()
.schema(self.stream_builder.schema())
.schema(table.schema())
.primary_key_indices(vec![])
.next_column_id(0)
.build()
.unwrap();
Arc::new(
TableInfoBuilder::default()
.ident(TableIdent {
table_id: self.table_id,
version: 0,
})
.name(self.name.clone())
.catalog_name(self.catalog_name.clone())
.schema_name(INFORMATION_SCHEMA_NAME.to_string())
.meta(table_meta)
.table_type(TableType::Temporary)
.build()
.unwrap(),
)
let table_info = TableInfoBuilder::default()
.table_id(table.table_id())
.name(table.table_name().to_owned())
.catalog_name(catalog_name)
.schema_name(INFORMATION_SCHEMA_NAME.to_owned())
.meta(table_meta)
.table_type(table.table_type())
.build()
.unwrap();
Arc::new(table_info)
}
}
trait InformationTable {
fn table_id(&self) -> TableId;
fn table_name(&self) -> &'static str;
fn schema(&self) -> SchemaRef;
fn to_stream(&self) -> Result<SendableRecordBatchStream>;
fn table_type(&self) -> TableType {
TableType::Temporary
}
}
async fn scan_to_stream(&self, request: ScanRequest) -> TableResult<SendableRecordBatchStream> {
self.get_stream(request)
type InformationTableRef = Arc<dyn InformationTable + Send + Sync>;
struct InformationTableDataSource {
table: InformationTableRef,
}
impl InformationTableDataSource {
fn new(table: InformationTableRef) -> Self {
Self { table }
}
fn try_project(&self, projection: &[usize]) -> std::result::Result<SchemaRef, BoxedError> {
let schema = self
.table
.schema()
.try_project(projection)
.context(SchemaConversionSnafu)
.map_err(BoxedError::new)?;
Ok(Arc::new(schema))
}
}
impl DataSource for InformationTable {
fn get_stream(&self, request: ScanRequest) -> TableResult<SendableRecordBatchStream> {
impl DataSource for InformationTableDataSource {
fn get_stream(
&self,
request: ScanRequest,
) -> std::result::Result<SendableRecordBatchStream, BoxedError> {
let projection = request.projection;
let projected_schema = if let Some(projection) = &projection {
Arc::new(
self.schema()
.try_project(projection)
.context(SchemaConversionSnafu)?,
)
} else {
self.schema()
let projected_schema = match &projection {
Some(projection) => self.try_project(projection)?,
None => self.table.schema(),
};
let stream = self
.stream_builder
.table
.to_stream()
.map_err(BoxedError::new)
.context(TablesRecordBatchSnafu)?
.map(move |batch| {
batch.and_then(|batch| {
if let Some(projection) = &projection {
batch.try_project(projection)
} else {
Ok(batch)
}
})
.context(TablesRecordBatchSnafu)
.map_err(BoxedError::new)?
.map(move |batch| match &projection {
Some(p) => batch.and_then(|b| b.try_project(p)),
None => batch,
});
let stream = RecordBatchStreamAdaptor {
schema: projected_schema,
stream: Box::pin(stream),

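The refactor above turns InformationTable from a concrete Table impl into a small private trait, and moves the Table plumbing into ThinTable/ThinTableAdapter plus an InformationTableDataSource. Under that design, adding another virtual table is just one more trait impl; a sketch (the type, name, and id below are invented for illustration, and to_stream is left unimplemented):

struct InformationSchemaBuildInfo {
    schema: SchemaRef,
}

impl InformationTable for InformationSchemaBuildInfo {
    fn table_id(&self) -> TableId {
        // A real table would reserve a constant in common_catalog::consts.
        u32::MAX
    }

    fn table_name(&self) -> &'static str {
        "build_info"
    }

    fn schema(&self) -> SchemaRef {
        self.schema.clone()
    }

    fn to_stream(&self) -> Result<SendableRecordBatchStream> {
        // Build a stream of record batches for the table contents here;
        // table_type() falls back to the TableType::Temporary default.
        unimplemented!()
    }
}
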
@@ -16,8 +16,8 @@ use std::sync::{Arc, Weak};
use arrow_schema::SchemaRef as ArrowSchemaRef;
use common_catalog::consts::{
INFORMATION_SCHEMA_NAME, SEMANTIC_TYPE_FIELD, SEMANTIC_TYPE_PRIMARY_KEY,
SEMANTIC_TYPE_TIME_INDEX,
INFORMATION_SCHEMA_COLUMNS_TABLE_ID, INFORMATION_SCHEMA_NAME, SEMANTIC_TYPE_FIELD,
SEMANTIC_TYPE_PRIMARY_KEY, SEMANTIC_TYPE_TIME_INDEX,
};
use common_error::ext::BoxedError;
use common_query::physical_plan::TaskContext;
@@ -31,9 +31,10 @@ use datatypes::scalars::ScalarVectorBuilder;
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::vectors::{StringVectorBuilder, VectorRef};
use snafu::{OptionExt, ResultExt};
use store_api::storage::TableId;
use super::tables::InformationSchemaTables;
use super::{InformationStreamBuilder, COLUMNS, TABLES};
use super::{InformationTable, COLUMNS, TABLES};
use crate::error::{
CreateRecordBatchSnafu, InternalSnafu, Result, UpgradeWeakCatalogManagerRefSnafu,
};
@@ -81,7 +82,15 @@ impl InformationSchemaColumns {
}
}
impl InformationStreamBuilder for InformationSchemaColumns {
impl InformationTable for InformationSchemaColumns {
fn table_id(&self) -> TableId {
INFORMATION_SCHEMA_COLUMNS_TABLE_ID
}
fn table_name(&self) -> &'static str {
COLUMNS
}
fn schema(&self) -> SchemaRef {
self.schema.clone()
}

@@ -30,13 +30,14 @@ use datatypes::prelude::{ConcreteDataType, ScalarVectorBuilder, VectorRef};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::vectors::{StringVectorBuilder, UInt32VectorBuilder};
use snafu::{OptionExt, ResultExt};
use store_api::storage::TableId;
use table::metadata::TableType;
use super::{COLUMNS, TABLES};
use crate::error::{
CreateRecordBatchSnafu, InternalSnafu, Result, UpgradeWeakCatalogManagerRefSnafu,
};
use crate::information_schema::InformationStreamBuilder;
use crate::information_schema::InformationTable;
use crate::CatalogManager;
pub(super) struct InformationSchemaTables {
@@ -74,7 +75,15 @@ impl InformationSchemaTables {
}
}
impl InformationStreamBuilder for InformationSchemaTables {
impl InformationTable for InformationSchemaTables {
fn table_id(&self) -> TableId {
INFORMATION_SCHEMA_TABLES_TABLE_ID
}
fn table_name(&self) -> &'static str {
TABLES
}
fn schema(&self) -> SchemaRef {
self.schema.clone()
}

@@ -25,7 +25,7 @@ use api::v1::meta::{RegionStat, TableIdent, TableName};
use common_telemetry::{info, warn};
use snafu::ResultExt;
use table::engine::{EngineContext, TableEngineRef};
use table::metadata::TableId;
use table::metadata::{TableId, TableType};
use table::requests::CreateTableRequest;
use table::TableRef;
@@ -37,7 +37,6 @@ pub mod local;
mod metrics;
pub mod remote;
pub mod system;
pub mod table_factory;
pub mod table_source;
pub mod tables;
@@ -240,6 +239,10 @@ pub async fn datanode_stat(catalog_manager: &CatalogManagerRef) -> (u64, Vec<Reg
continue;
};
if table.table_type() != TableType::Base {
continue;
}
let table_info = table.table_info();
let region_numbers = &table_info.meta.region_numbers;
region_number += region_numbers.len() as u64;

@@ -136,18 +136,17 @@ impl LocalCatalogManager {
schema: INFORMATION_SCHEMA_NAME.to_string(),
table_name: SYSTEM_CATALOG_TABLE_NAME.to_string(),
table_id: SYSTEM_CATALOG_TABLE_ID,
table: self.system.information_schema.system.clone(),
table: self.system.information_schema.system.as_table_ref(),
};
self.catalogs.register_table(register_table_req).await?;
// Add numbers table for test
let numbers_table = Arc::new(NumbersTable::default());
let register_number_table_req = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: NUMBERS_TABLE_NAME.to_string(),
table_id: NUMBERS_TABLE_ID,
table: numbers_table,
table: NumbersTable::table(NUMBERS_TABLE_ID),
};
self.catalogs

@@ -97,26 +97,7 @@ impl CatalogManager for MemoryCatalogManager {
}
async fn deregister_table(&self, request: DeregisterTableRequest) -> Result<()> {
let mut catalogs = self.catalogs.write().unwrap();
let schema = catalogs
.get_mut(&request.catalog)
.with_context(|| CatalogNotFoundSnafu {
catalog_name: &request.catalog,
})?
.get_mut(&request.schema)
.with_context(|| SchemaNotFoundSnafu {
catalog: &request.catalog,
schema: &request.schema,
})?;
let result = schema.remove(&request.table_name);
if result.is_some() {
decrement_gauge!(
crate::metrics::METRIC_CATALOG_MANAGER_TABLE_COUNT,
1.0,
&[crate::metrics::db_label(&request.catalog, &request.schema)],
);
}
Ok(())
self.deregister_table_sync(request)
}
async fn register_schema(&self, request: RegisterSchemaRequest) -> Result<bool> {
@@ -157,15 +138,7 @@ impl CatalogManager for MemoryCatalogManager {
}
async fn schema_exist(&self, catalog: &str, schema: &str) -> Result<bool> {
Ok(self
.catalogs
.read()
.unwrap()
.get(catalog)
.with_context(|| CatalogNotFoundSnafu {
catalog_name: catalog,
})?
.contains_key(schema))
self.schema_exist_sync(catalog, schema)
}
async fn table(
@@ -187,7 +160,7 @@ impl CatalogManager for MemoryCatalogManager {
}
async fn catalog_exist(&self, catalog: &str) -> Result<bool> {
Ok(self.catalogs.read().unwrap().get(catalog).is_some())
self.catalog_exist_sync(catalog)
}
async fn table_exist(&self, catalog: &str, schema: &str, table: &str) -> Result<bool> {
@@ -245,7 +218,7 @@ impl CatalogManager for MemoryCatalogManager {
}
impl MemoryCatalogManager {
/// Create a manager with some default setups
/// Creates a manager with some default setups
/// (e.g. default catalog/schema and information schema)
pub fn with_default_setup() -> Arc<Self> {
let manager = Arc::new(Self {
@@ -267,19 +240,23 @@ impl MemoryCatalogManager {
manager
}
/// Registers a catalog and returns whether the catalog already exists.
pub fn register_catalog_if_absent(&self, name: String) -> bool {
let mut catalogs = self.catalogs.write().unwrap();
let entry = catalogs.entry(name);
match entry {
Entry::Occupied(_) => true,
Entry::Vacant(v) => {
let _ = v.insert(HashMap::new());
false
}
}
fn schema_exist_sync(&self, catalog: &str, schema: &str) -> Result<bool> {
Ok(self
.catalogs
.read()
.unwrap()
.get(catalog)
.with_context(|| CatalogNotFoundSnafu {
catalog_name: catalog,
})?
.contains_key(schema))
}
fn catalog_exist_sync(&self, catalog: &str) -> Result<bool> {
Ok(self.catalogs.read().unwrap().get(catalog).is_some())
}
/// Registers a catalog if it does not exist and returns false if the catalog already exists.
pub fn register_catalog_sync(self: &Arc<Self>, name: String) -> Result<bool> {
let mut catalogs = self.catalogs.write().unwrap();
@@ -294,6 +271,32 @@ impl MemoryCatalogManager {
}
}
pub fn deregister_table_sync(&self, request: DeregisterTableRequest) -> Result<()> {
let mut catalogs = self.catalogs.write().unwrap();
let schema = catalogs
.get_mut(&request.catalog)
.with_context(|| CatalogNotFoundSnafu {
catalog_name: &request.catalog,
})?
.get_mut(&request.schema)
.with_context(|| SchemaNotFoundSnafu {
catalog: &request.catalog,
schema: &request.schema,
})?;
let result = schema.remove(&request.table_name);
if result.is_some() {
decrement_gauge!(
crate::metrics::METRIC_CATALOG_MANAGER_TABLE_COUNT,
1.0,
&[crate::metrics::db_label(&request.catalog, &request.schema)],
);
}
Ok(())
}
/// Registers a schema if it does not exist.
/// It returns an error if the catalog does not exist,
/// and returns false if the schema exists.
pub fn register_schema_sync(&self, request: RegisterSchemaRequest) -> Result<bool> {
let mut catalogs = self.catalogs.write().unwrap();
let catalog = catalogs
@@ -312,6 +315,7 @@ impl MemoryCatalogManager {
}
}
/// Registers a table and returns an error if the catalog or schema does not exist.
pub fn register_table_sync(&self, request: RegisterTableRequest) -> Result<bool> {
let mut catalogs = self.catalogs.write().unwrap();
let schema = catalogs
@@ -353,9 +357,25 @@ impl MemoryCatalogManager {
#[cfg(any(test, feature = "testing"))]
pub fn new_with_table(table: TableRef) -> Arc<Self> {
let manager = Self::with_default_setup();
let catalog = &table.table_info().catalog_name;
let schema = &table.table_info().schema_name;
if !manager.catalog_exist_sync(catalog).unwrap() {
manager.register_catalog_sync(catalog.to_string()).unwrap();
}
if !manager.schema_exist_sync(catalog, schema).unwrap() {
manager
.register_schema_sync(RegisterSchemaRequest {
catalog: catalog.to_string(),
schema: schema.to_string(),
})
.unwrap();
}
let request = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
catalog: catalog.to_string(),
schema: schema.to_string(),
table_name: table.table_info().name.clone(),
table_id: table.table_info().ident.table_id,
table,
@@ -388,7 +408,7 @@ mod tests {
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: NUMBERS_TABLE_NAME.to_string(),
table_id: NUMBERS_TABLE_ID,
table: Arc::new(NumbersTable::default()),
table: NumbersTable::table(NUMBERS_TABLE_ID),
};
let _ = catalog_list.register_table(register_request).await.unwrap();
@@ -423,7 +443,7 @@ mod tests {
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: table_name.to_string(),
table_id,
table: Arc::new(NumbersTable::new(table_id)),
table: NumbersTable::table(table_id),
};
assert!(catalog.register_table(register_request).await.unwrap());
assert!(catalog
@@ -465,7 +485,7 @@ mod tests {
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: new_table_name.to_string(),
table_id: table_id + 1,
table: Arc::new(NumbersTable::new(table_id + 1)),
table: NumbersTable::table(table_id + 1),
};
let result = catalog.register_table(dup_register_request).await;
let err = result.err().unwrap();
@@ -477,7 +497,7 @@ mod tests {
let catalog = MemoryCatalogManager::with_default_setup();
let table_name = "num";
let table_id = 2333;
let table: TableRef = Arc::new(NumbersTable::new(table_id));
let table = NumbersTable::table(table_id);
// register table
let register_table_req = RegisterTableRequest {
@@ -524,10 +544,14 @@ mod tests {
}
#[test]
pub fn test_register_if_absent() {
pub fn test_register_catalog_sync() {
let list = MemoryCatalogManager::with_default_setup();
assert!(!list.register_catalog_if_absent("test_catalog".to_string(),));
assert!(list.register_catalog_if_absent("test_catalog".to_string()));
assert!(list
.register_catalog_sync("test_catalog".to_string())
.unwrap());
assert!(!list
.register_catalog_sync("test_catalog".to_string())
.unwrap());
}
#[tokio::test]
@@ -540,7 +564,7 @@ mod tests {
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: table_name.to_string(),
table_id: 2333,
table: Arc::new(NumbersTable::default()),
table: NumbersTable::table(2333),
};
let _ = catalog.register_table(register_table_req).await.unwrap();
assert!(catalog
@@ -582,7 +606,7 @@ mod tests {
schema: schema_name.clone(),
table_name,
table_id: 0,
table: Arc::new(NumbersTable::default()),
table: NumbersTable::table(0),
};
catalog
.clone()

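With deregister_table, schema_exist, and catalog_exist now delegating to synchronous *_sync twins, a caller can assemble a catalog hierarchy without an async context. A minimal sketch using only methods from this diff (the catalog and schema names are illustrative):

fn setup_example() {
    let manager = MemoryCatalogManager::with_default_setup();
    // true: the catalog was newly registered.
    assert!(manager
        .register_catalog_sync("my_catalog".to_string())
        .unwrap());
    // true: the schema was newly registered under the catalog.
    assert!(manager
        .register_schema_sync(RegisterSchemaRequest {
            catalog: "my_catalog".to_string(),
            schema: "my_schema".to_string(),
        })
        .unwrap());
}
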
@@ -17,12 +17,13 @@ use std::sync::Arc;
use async_trait::async_trait;
use common_catalog::consts::MITO_ENGINE;
use common_meta::helper::{CatalogKey, SchemaKey};
use common_meta::ident::TableIdent;
use common_meta::key::catalog_name::CatalogNameKey;
use common_meta::key::datanode_table::DatanodeTableValue;
use common_meta::key::schema_name::SchemaNameKey;
use common_meta::key::TableMetadataManagerRef;
use common_meta::kv_backend::KvBackendRef;
use common_telemetry::{error, info, warn};
use futures_util::TryStreamExt;
use metrics::increment_gauge;
use snafu::{ensure, OptionExt, ResultExt};
use table::engine::manager::TableEngineManagerRef;
@@ -45,7 +46,6 @@ use crate::{
/// Catalog manager based on metasrv.
pub struct RemoteCatalogManager {
node_id: u64,
backend: KvBackendRef,
engine_manager: TableEngineManagerRef,
system_table_requests: Mutex<Vec<RegisterSystemTableRequest>>,
region_alive_keepers: Arc<RegionAliveKeepers>,
@@ -57,14 +57,12 @@ impl RemoteCatalogManager {
pub fn new(
engine_manager: TableEngineManagerRef,
node_id: u64,
backend: KvBackendRef,
region_alive_keepers: Arc<RegionAliveKeepers>,
table_metadata_manager: TableMetadataManagerRef,
) -> Self {
Self {
engine_manager,
node_id,
backend,
system_table_requests: Default::default(),
region_alive_keepers,
memory_catalog_manager: MemoryCatalogManager::with_default_setup(),
@@ -77,6 +75,7 @@ impl RemoteCatalogManager {
.table_metadata_manager
.datanode_table_manager()
.tables(self.node_id)
.try_collect::<Vec<_>>()
.await
.context(TableMetadataManagerSnafu)?;
@@ -86,6 +85,7 @@ impl RemoteCatalogManager {
let engine_manager = self.engine_manager.clone();
let memory_catalog_manager = self.memory_catalog_manager.clone();
let table_metadata_manager = self.table_metadata_manager.clone();
let region_alive_keepers = self.region_alive_keepers.clone();
common_runtime::spawn_bg(async move {
let table_id = datanode_table_value.table_id;
if let Err(e) = open_and_register_table(
@@ -93,6 +93,7 @@ impl RemoteCatalogManager {
datanode_table_value,
memory_catalog_manager,
table_metadata_manager,
region_alive_keepers,
)
.await
{
@@ -110,13 +111,6 @@ impl RemoteCatalogManager {
.context(ParallelOpenTableSnafu)?;
Ok(())
}
fn build_schema_key(&self, catalog_name: String, schema_name: String) -> SchemaKey {
SchemaKey {
catalog_name,
schema_name,
}
}
}
async fn open_and_register_table(
@@ -124,6 +118,7 @@ async fn open_and_register_table(
datanode_table_value: DatanodeTableValue,
memory_catalog_manager: Arc<MemoryCatalogManager>,
table_metadata_manager: TableMetadataManagerRef,
region_alive_keepers: Arc<RegionAliveKeepers>,
) -> Result<()> {
let context = EngineContext {};
@@ -200,7 +195,8 @@ async fn open_and_register_table(
table_id,
table,
};
let registered = memory_catalog_manager.register_table_sync(request)?;
let registered =
register_table(&memory_catalog_manager, &region_alive_keepers, request).await?;
ensure!(
registered,
TableExistsSnafu {
@@ -211,6 +207,32 @@ async fn open_and_register_table(
Ok(())
}
async fn register_table(
memory_catalog_manager: &Arc<MemoryCatalogManager>,
region_alive_keepers: &Arc<RegionAliveKeepers>,
request: RegisterTableRequest,
) -> Result<bool> {
let table = request.table.clone();
let registered = memory_catalog_manager.register_table_sync(request)?;
if registered {
let table_info = table.table_info();
let table_ident = TableIdent {
catalog: table_info.catalog_name.clone(),
schema: table_info.schema_name.clone(),
table: table_info.name.clone(),
table_id: table_info.table_id(),
engine: table_info.meta.engine.clone(),
};
region_alive_keepers
.register_table(table_ident, table, memory_catalog_manager.clone())
.await?;
}
Ok(registered)
}
#[async_trait]
impl CatalogManager for RemoteCatalogManager {
async fn start(&self) -> Result<()> {
@@ -229,25 +251,12 @@ impl CatalogManager for RemoteCatalogManager {
}
async fn register_table(&self, request: RegisterTableRequest) -> Result<bool> {
let table = request.table.clone();
let registered = self.memory_catalog_manager.register_table_sync(request)?;
if registered {
let table_info = table.table_info();
let table_ident = TableIdent {
catalog: table_info.catalog_name.clone(),
schema: table_info.schema_name.clone(),
table: table_info.name.clone(),
table_id: table_info.table_id(),
engine: table_info.meta.engine.clone(),
};
self.region_alive_keepers
.register_table(table_ident, table)
.await?;
}
Ok(registered)
register_table(
&self.memory_catalog_manager,
&self.region_alive_keepers,
request,
)
.await
}
async fn deregister_table(&self, request: DeregisterTableRequest) -> Result<()> {
@@ -323,16 +332,12 @@ impl CatalogManager for RemoteCatalogManager {
return Ok(true);
}
let key = self
.build_schema_key(catalog.to_string(), schema.to_string())
.to_string();
let remote_schema_exists = self
.backend
.get(key.as_bytes())
.table_metadata_manager
.schema_manager()
.exist(SchemaNameKey::new(catalog, schema))
.await
.context(TableMetadataManagerSnafu)?
.is_some();
.context(TableMetadataManagerSnafu)?;
// Create schema locally if remote schema exists. Since local schema is managed by memory
// catalog manager, creating a local schema is relatively cheap (just a HashMap).
// Besides, if this method ("schema_exist") is called, it's very likely that someone wants to
@@ -368,16 +373,13 @@ impl CatalogManager for RemoteCatalogManager {
return Ok(true);
}
let key = CatalogKey {
catalog_name: catalog.to_string(),
};
let key = CatalogNameKey::new(catalog);
let remote_catalog_exists = self
.backend
.get(key.to_string().as_bytes())
.table_metadata_manager
.catalog_manager()
.exist(key)
.await
.context(TableMetadataManagerSnafu)?
.is_some();
.context(TableMetadataManagerSnafu)?;
// Create catalog locally if remote catalog exists. Since local catalog is managed by memory
// catalog manager, creating a local catalog is relatively cheap (just a HashMap).

@@ -29,6 +29,7 @@ use snafu::{OptionExt, ResultExt};
use store_api::storage::RegionNumber;
use table::engine::manager::TableEngineManagerRef;
use table::engine::{CloseTableResult, EngineContext, TableEngineRef};
use table::metadata::TableId;
use table::requests::CloseTableRequest;
use table::TableRef;
use tokio::sync::{mpsc, oneshot, Mutex};
@@ -36,11 +37,13 @@ use tokio::task::JoinHandle;
use tokio::time::{Duration, Instant};
use crate::error::{Result, TableEngineNotFoundSnafu};
use crate::local::MemoryCatalogManager;
use crate::DeregisterTableRequest;
/// [RegionAliveKeepers] manages all [RegionAliveKeeper] in a scope of tables.
pub struct RegionAliveKeepers {
table_engine_manager: TableEngineManagerRef,
keepers: Arc<Mutex<HashMap<TableIdent, Arc<RegionAliveKeeper>>>>,
keepers: Arc<Mutex<HashMap<TableId, Arc<RegionAliveKeeper>>>>,
heartbeat_interval_millis: u64,
started: AtomicBool,
@@ -65,12 +68,18 @@ impl RegionAliveKeepers {
}
}
pub async fn find_keeper(&self, table_ident: &TableIdent) -> Option<Arc<RegionAliveKeeper>> {
self.keepers.lock().await.get(table_ident).cloned()
pub async fn find_keeper(&self, table_id: TableId) -> Option<Arc<RegionAliveKeeper>> {
self.keepers.lock().await.get(&table_id).cloned()
}
pub async fn register_table(&self, table_ident: TableIdent, table: TableRef) -> Result<()> {
let keeper = self.find_keeper(&table_ident).await;
pub async fn register_table(
&self,
table_ident: TableIdent,
table: TableRef,
catalog_manager: Arc<MemoryCatalogManager>,
) -> Result<()> {
let table_id = table_ident.table_id;
let keeper = self.find_keeper(table_id).await;
if keeper.is_some() {
return Ok(());
}
@@ -84,6 +93,7 @@ impl RegionAliveKeepers {
let keeper = Arc::new(RegionAliveKeeper::new(
table_engine,
catalog_manager,
table_ident.clone(),
self.heartbeat_interval_millis,
));
@@ -92,7 +102,7 @@ impl RegionAliveKeepers {
}
let mut keepers = self.keepers.lock().await;
let _ = keepers.insert(table_ident.clone(), keeper.clone());
let _ = keepers.insert(table_id, keeper.clone());
if self.started.load(Ordering::Relaxed) {
keeper.start().await;
@@ -108,15 +118,16 @@ impl RegionAliveKeepers {
&self,
table_ident: &TableIdent,
) -> Option<Arc<RegionAliveKeeper>> {
self.keepers.lock().await.remove(table_ident).map(|x| {
let table_id = table_ident.table_id;
self.keepers.lock().await.remove(&table_id).map(|x| {
info!("Deregister RegionAliveKeeper for table {table_ident}");
x
})
}
pub async fn register_region(&self, region_ident: &RegionIdent) {
let table_ident = &region_ident.table_ident;
let Some(keeper) = self.find_keeper(table_ident).await else {
let table_id = region_ident.table_ident.table_id;
let Some(keeper) = self.find_keeper(table_id).await else {
// Alive keeper could be affected by lagging msg, just warn and ignore.
warn!("Alive keeper for region {region_ident} is not found!");
return;
@@ -125,8 +136,8 @@ impl RegionAliveKeepers {
}
pub async fn deregister_region(&self, region_ident: &RegionIdent) {
let table_ident = &region_ident.table_ident;
let Some(keeper) = self.find_keeper(table_ident).await else {
let table_id = region_ident.table_ident.table_id;
let Some(keeper) = self.find_keeper(table_id).await else {
// Alive keeper could be affected by lagging msg, just warn and ignore.
warn!("Alive keeper for region {region_ident} is not found!");
return;
@@ -178,7 +189,8 @@ impl HeartbeatResponseHandler for RegionAliveKeepers {
}
};
let Some(keeper) = self.keepers.lock().await.get(&table_ident).cloned() else {
let table_id = table_ident.table_id;
let Some(keeper) = self.keepers.lock().await.get(&table_id).cloned() else {
// Alive keeper could be affected by lagging msg, just warn and ignore.
warn!("Alive keeper for table {table_ident} is not found!");
continue;
@@ -199,6 +211,7 @@ impl HeartbeatResponseHandler for RegionAliveKeepers {
/// Datanode, it will "extend" the region's "lease", with a deadline for [RegionAliveKeeper] to
/// count down.
pub struct RegionAliveKeeper {
catalog_manager: Arc<MemoryCatalogManager>,
table_engine: TableEngineRef,
table_ident: TableIdent,
countdown_task_handles: Arc<Mutex<HashMap<RegionNumber, Arc<CountdownTaskHandle>>>>,
@@ -209,10 +222,12 @@ pub struct RegionAliveKeeper {
impl RegionAliveKeeper {
fn new(
table_engine: TableEngineRef,
catalog_manager: Arc<MemoryCatalogManager>,
table_ident: TableIdent,
heartbeat_interval_millis: u64,
) -> Self {
Self {
catalog_manager,
table_engine,
table_ident,
countdown_task_handles: Arc::new(Mutex::new(HashMap::new())),
@@ -240,11 +255,29 @@ impl RegionAliveKeeper {
let _ = x.lock().await.remove(&region);
} // Else the countdown task handles map could be dropped because the keeper is dropped.
};
let catalog_manager = self.catalog_manager.clone();
let ident = self.table_ident.clone();
let handle = Arc::new(CountdownTaskHandle::new(
self.table_engine.clone(),
self.table_ident.clone(),
region,
|| on_task_finished,
move |result: Option<CloseTableResult>| {
if matches!(result, Some(CloseTableResult::Released(_))) {
let result = catalog_manager.deregister_table_sync(DeregisterTableRequest {
catalog: ident.catalog.to_string(),
schema: ident.schema.to_string(),
table_name: ident.table.to_string(),
});
info!(
"Deregister table: {} after countdown task finished, result: {result:?}",
ident.table_id
);
} else {
debug!("Countdown task returns: {result:?}");
}
on_task_finished
},
));
let mut handles = self.countdown_task_handles.lock().await;
@@ -343,7 +376,7 @@ impl CountdownTaskHandle {
table_engine: TableEngineRef,
table_ident: TableIdent,
region: RegionNumber,
on_task_finished: impl FnOnce() -> Fut + Send + 'static,
on_task_finished: impl FnOnce(Option<CloseTableResult>) -> Fut + Send + 'static,
) -> Self
where
Fut: Future<Output = ()> + Send,
@@ -357,8 +390,8 @@ impl CountdownTaskHandle {
rx,
};
let handler = common_runtime::spawn_bg(async move {
countdown_task.run().await;
on_task_finished().await;
let result = countdown_task.run().await;
on_task_finished(result).await;
});
Self {
@@ -410,7 +443,8 @@ struct CountdownTask {
}
impl CountdownTask {
async fn run(&mut self) {
// Returns the close result if the countdown elapsed and the region was closed, or `None` otherwise.
async fn run(&mut self) -> Option<CloseTableResult> {
// 30 years. See `Instant::far_future`.
let far_future = Instant::now() + Duration::from_secs(86400 * 365 * 30);
@@ -464,10 +498,11 @@ impl CountdownTask {
"Region {region} of table {table_ident} is closed, result: {result:?}. \
RegionAliveKeeper out.",
);
break;
return Some(result);
}
}
}
None
}
async fn close_region(&self) -> CloseTableResult {
@@ -543,11 +578,16 @@ mod test {
table_options: TableOptions::default(),
engine: "MockTableEngine".to_string(),
}));
let catalog_manager = MemoryCatalogManager::new_with_table(table.clone());
keepers
.register_table(table_ident.clone(), table)
.register_table(table_ident.clone(), table, catalog_manager)
.await
.unwrap();
assert!(keepers.keepers.lock().await.contains_key(&table_ident));
assert!(keepers
.keepers
.lock()
.await
.contains_key(&table_ident.table_id));
(table_ident, keepers)
}
@@ -602,7 +642,7 @@ mod test {
.keepers
.lock()
.await
.get(&table_ident)
.get(&table_ident.table_id)
.cloned()
.unwrap();
@@ -649,7 +689,7 @@ mod test {
})
.await;
let mut regions = keepers
.find_keeper(&table_ident)
.find_keeper(table_ident.table_id)
.await
.unwrap()
.countdown_task_handles
@@ -676,7 +716,8 @@ mod test {
table_id: 1024,
engine: "mito".to_string(),
};
let keeper = RegionAliveKeeper::new(table_engine, table_ident, 1000);
let catalog_manager = MemoryCatalogManager::with_default_setup();
let keeper = RegionAliveKeeper::new(table_engine, catalog_manager, table_ident, 1000);
let region = 1;
assert!(keeper.find_handle(&region).await.is_none());
@@ -719,7 +760,7 @@ mod test {
table_engine.clone(),
table_ident.clone(),
1,
|| async move { finished_clone.store(true, Ordering::Relaxed) },
|_| async move { finished_clone.store(true, Ordering::Relaxed) },
);
let tx = handle.tx.clone();
@@ -741,7 +782,7 @@ mod test {
let finished = Arc::new(AtomicBool::new(false));
let finished_clone = finished.clone();
let handle = CountdownTaskHandle::new(table_engine, table_ident, 1, || async move {
let handle = CountdownTaskHandle::new(table_engine, table_ident, 1, |_| async move {
finished_clone.store(true, Ordering::Relaxed)
});
handle.tx.send(CountdownCommand::Start(100)).await.unwrap();

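The closure handed to CountdownTaskHandle::new now receives the Option<CloseTableResult> produced by CountdownTask::run, which is what lets the keeper deregister the table from the memory catalog only when a region was actually released. The underlying "countdown with a completion callback" pattern, reduced to plain tokio (all names here are illustrative, not this crate's API):

use std::time::Duration;
use tokio::sync::mpsc;

// Counts down to a deadline that lease extensions can push forward;
// returns the reason it finished, if any.
async fn run_countdown(mut rx: mpsc::Receiver<Duration>) -> Option<&'static str> {
    let mut deadline = tokio::time::Instant::now() + Duration::from_secs(86400);
    loop {
        tokio::select! {
            cmd = rx.recv() => match cmd {
                // A heartbeat-driven extension moves the deadline out.
                Some(lease) => deadline = tokio::time::Instant::now() + lease,
                // All senders dropped: the keeper is gone, exit quietly.
                None => return None,
            },
            _ = tokio::time::sleep_until(deadline) => {
                return Some("deadline elapsed, region closed");
            }
        }
    }
}

// Spawns the countdown and forwards its outcome to the callback,
// mirroring how `on_task_finished(result)` is invoked above.
fn spawn_countdown<F, Fut>(on_finished: F) -> mpsc::Sender<Duration>
where
    F: FnOnce(Option<&'static str>) -> Fut + Send + 'static,
    Fut: std::future::Future<Output = ()> + Send + 'static,
{
    let (tx, rx) = mpsc::channel(8);
    let _ = tokio::spawn(async move {
        let result = run_countdown(rx).await;
        on_finished(result).await;
    });
    tx
}
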
@@ -12,7 +12,6 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::any::Any;
use std::collections::HashMap;
use std::sync::Arc;
@@ -21,24 +20,23 @@ use common_catalog::consts::{
SYSTEM_CATALOG_NAME, SYSTEM_CATALOG_TABLE_ID, SYSTEM_CATALOG_TABLE_NAME,
};
use common_recordbatch::SendableRecordBatchStream;
use common_telemetry::debug;
use common_telemetry::{debug, warn};
use common_time::util;
use datatypes::prelude::{ConcreteDataType, ScalarVector, VectorRef};
use datatypes::schema::{ColumnSchema, RawSchema, SchemaRef};
use datatypes::schema::{ColumnSchema, RawSchema};
use datatypes::vectors::{BinaryVector, TimestampMillisecondVector, UInt8Vector};
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt, ResultExt};
use store_api::storage::ScanRequest;
use table::engine::{EngineContext, TableEngineRef};
use table::metadata::{TableId, TableInfoRef, TableType};
use table::requests::{
CreateTableRequest, DeleteRequest, InsertRequest, OpenTableRequest, TableOptions,
};
use table::{Result as TableResult, Table, TableRef};
use table::metadata::TableId;
use table::requests::{CreateTableRequest, InsertRequest, OpenTableRequest, TableOptions};
use table::TableRef;
use crate::error::{
self, CreateSystemCatalogSnafu, EmptyValueSnafu, Error, InvalidEntryTypeSnafu, InvalidKeySnafu,
OpenSystemCatalogSnafu, Result, ValueDeserializeSnafu,
self, CreateSystemCatalogSnafu, DeregisterTableSnafu, EmptyValueSnafu, Error,
InsertCatalogRecordSnafu, InvalidEntryTypeSnafu, InvalidKeySnafu, OpenSystemCatalogSnafu,
Result, ValueDeserializeSnafu,
};
use crate::DeregisterTableRequest;
@@ -48,42 +46,6 @@ pub const VALUE_INDEX: usize = 3;
pub struct SystemCatalogTable(TableRef);
#[async_trait::async_trait]
impl Table for SystemCatalogTable {
fn as_any(&self) -> &dyn Any {
self
}
fn schema(&self) -> SchemaRef {
self.0.schema()
}
async fn scan_to_stream(&self, request: ScanRequest) -> TableResult<SendableRecordBatchStream> {
self.0.scan_to_stream(request).await
}
/// Insert values into table.
async fn insert(&self, request: InsertRequest) -> TableResult<usize> {
self.0.insert(request).await
}
fn table_info(&self) -> TableInfoRef {
self.0.table_info()
}
fn table_type(&self) -> TableType {
self.0.table_type()
}
async fn delete(&self, request: DeleteRequest) -> TableResult<usize> {
self.0.delete(request).await
}
fn statistics(&self) -> Option<table::stats::TableStatistics> {
self.0.statistics()
}
}
impl SystemCatalogTable {
pub async fn new(engine: TableEngineRef) -> Result<Self> {
let request = OpenTableRequest {
@@ -126,6 +88,54 @@ impl SystemCatalogTable {
}
}
pub async fn register_table(
&self,
catalog: String,
schema: String,
table_name: String,
table_id: TableId,
engine: String,
) -> Result<usize> {
let insert_request =
build_table_insert_request(catalog, schema, table_name, table_id, engine);
self.0
.insert(insert_request)
.await
.context(InsertCatalogRecordSnafu)
}
pub(crate) async fn deregister_table(
&self,
request: &DeregisterTableRequest,
table_id: TableId,
) -> Result<()> {
let deletion_request = build_table_deletion_request(request, table_id);
self.0
.insert(deletion_request)
.await
.map(|x| {
if x != 1 {
let table = common_catalog::format_full_table_name(
&request.catalog,
&request.schema,
&request.table_name
);
warn!("Failed to delete table record from information_schema, unexpected returned result: {x}, table: {table}");
}
})
.with_context(|_| DeregisterTableSnafu {
request: request.clone(),
})
}
pub async fn register_schema(&self, catalog: String, schema: String) -> Result<usize> {
let insert_request = build_schema_insert_request(catalog, schema);
self.0
.insert(insert_request)
.await
.context(InsertCatalogRecordSnafu)
}
/// Create a stream of all entries inside system catalog table
pub async fn records(&self) -> Result<SendableRecordBatchStream> {
let full_projection = None;
@@ -137,11 +147,16 @@ impl SystemCatalogTable {
limit: None,
};
let stream = self
.0
.scan_to_stream(scan_req)
.await
.context(error::SystemCatalogTableScanSnafu)?;
Ok(stream)
}
pub fn as_table_ref(&self) -> TableRef {
self.0.clone()
}
}
/// Build system catalog table schema.
@@ -541,14 +556,14 @@ mod tests {
async fn test_system_table_type() {
let (_dir, table_engine) = prepare_table_engine().await;
let system_table = SystemCatalogTable::new(table_engine).await.unwrap();
assert_eq!(Base, system_table.table_type());
assert_eq!(Base, system_table.as_table_ref().table_type());
}
#[tokio::test]
async fn test_system_table_info() {
let (_dir, table_engine) = prepare_table_engine().await;
let system_table = SystemCatalogTable::new(table_engine).await.unwrap();
let info = system_table.table_info();
let info = system_table.as_table_ref().table_info();
assert_eq!(TableType::Base, info.table_type);
assert_eq!(SYSTEM_CATALOG_TABLE_NAME, info.name);
assert_eq!(SYSTEM_CATALOG_TABLE_ID, info.ident.table_id);
@@ -561,14 +576,16 @@ mod tests {
let (_, table_engine) = prepare_table_engine().await;
let catalog_table = SystemCatalogTable::new(table_engine).await.unwrap();
let table_insertion = build_table_insert_request(
DEFAULT_CATALOG_NAME.to_string(),
DEFAULT_SCHEMA_NAME.to_string(),
"my_table".to_string(),
1,
MITO_ENGINE.to_string(),
);
let result = catalog_table.insert(table_insertion).await.unwrap();
let result = catalog_table
.register_table(
DEFAULT_CATALOG_NAME.to_string(),
DEFAULT_SCHEMA_NAME.to_string(),
"my_table".to_string(),
1,
MITO_ENGINE.to_string(),
)
.await
.unwrap();
assert_eq!(result, 1);
let records = catalog_table.records().await.unwrap();
@@ -598,16 +615,17 @@ mod tests {
});
assert_eq!(entry, expected);
let table_deletion = build_table_deletion_request(
&DeregisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "my_table".to_string(),
},
1,
);
let result = catalog_table.insert(table_deletion).await.unwrap();
assert_eq!(result, 1);
catalog_table
.deregister_table(
&DeregisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "my_table".to_string(),
},
1,
)
.await
.unwrap();
let records = catalog_table.records().await.unwrap();
let batches = RecordBatches::try_collect(records).await.unwrap().take();

@@ -16,16 +16,9 @@
use std::sync::Arc;
use common_telemetry::logging;
use snafu::ResultExt;
use table::metadata::TableId;
use table::Table;
use crate::error::{self, InsertCatalogRecordSnafu, Result as CatalogResult};
use crate::system::{
build_schema_insert_request, build_table_deletion_request, build_table_insert_request,
SystemCatalogTable,
};
use crate::system::SystemCatalogTable;
use crate::DeregisterTableRequest;
pub struct InformationSchema {
@@ -54,36 +47,21 @@ impl SystemCatalog {
table_id: TableId,
engine: String,
) -> crate::error::Result<usize> {
let request = build_table_insert_request(catalog, schema, table_name, table_id, engine);
self.information_schema
.system
.insert(request)
.register_table(catalog, schema, table_name, table_id, engine)
.await
.context(InsertCatalogRecordSnafu)
}
pub(crate) async fn deregister_table(
&self,
request: &DeregisterTableRequest,
table_id: TableId,
) -> CatalogResult<()> {
) -> crate::error::Result<()> {
self.information_schema
.system
.insert(build_table_deletion_request(request, table_id))
.deregister_table(request, table_id)
.await
.map(|x| {
if x != 1 {
let table = common_catalog::format_full_table_name(
&request.catalog,
&request.schema,
&request.table_name
);
logging::warn!("Failed to delete table record from information_schema, unexpected returned result: {x}, table: {table}");
}
})
.with_context(|_| error::DeregisterTableSnafu {
request: request.clone(),
})
}
pub async fn register_schema(
@@ -91,11 +69,9 @@ impl SystemCatalog {
catalog: String,
schema: String,
) -> crate::error::Result<usize> {
let request = build_schema_insert_request(catalog, schema);
self.information_schema
.system
.insert(request)
.register_schema(catalog, schema)
.await
.context(InsertCatalogRecordSnafu)
}
}

@@ -24,7 +24,6 @@ mod tests {
use mito::config::EngineConfig;
use table::engine::manager::MemoryTableEngineManager;
use table::table::numbers::NumbersTable;
use table::TableRef;
use tokio::sync::Mutex;
async fn create_local_catalog_manager(
@@ -49,13 +48,12 @@ mod tests {
// register table
let table_name = "test_table";
let table_id = 42;
let table = Arc::new(NumbersTable::new(table_id));
let request = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: table_name.to_string(),
table_id,
table: table.clone(),
table: NumbersTable::table(table_id),
};
assert!(catalog_manager.register_table(request).await.unwrap());
@@ -89,7 +87,7 @@ mod tests {
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "test_table".to_string(),
table_id: 42,
table: Arc::new(NumbersTable::new(42)),
table: NumbersTable::table(42),
};
assert!(catalog_manager
.register_table(request.clone())
@@ -105,7 +103,7 @@ mod tests {
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "test_table".to_string(),
table_id: 43,
table: Arc::new(NumbersTable::new(43)),
table: NumbersTable::table(43),
})
.await
.unwrap_err();
@@ -124,7 +122,7 @@ mod tests {
rt.block_on(async { create_local_catalog_manager().await.unwrap() });
let catalog_manager = Arc::new(catalog_manager);
let succeed: Arc<Mutex<Option<TableRef>>> = Arc::new(Mutex::new(None));
let succeed = Arc::new(Mutex::new(None));
let mut handles = Vec::with_capacity(8);
for i in 0..8 {
@@ -132,20 +130,21 @@ mod tests {
let succeed = succeed.clone();
let handle = rt.spawn(async move {
let table_id = 42 + i;
let table = Arc::new(NumbersTable::new(table_id));
let table = NumbersTable::table(table_id);
let table_info = table.table_info();
let req = RegisterTableRequest {
catalog: DEFAULT_CATALOG_NAME.to_string(),
schema: DEFAULT_SCHEMA_NAME.to_string(),
table_name: "test_table".to_string(),
table_id,
table: table.clone(),
table,
};
match catalog.register_table(req).await {
Ok(res) => {
if res {
let mut succeed = succeed.lock().await;
info!("Successfully registered table: {}", table_id);
*succeed = Some(table);
*succeed = Some(table_info);
}
}
Err(_) => {
@@ -161,7 +160,7 @@ mod tests {
handle.await.unwrap();
}
let guard = succeed.lock().await;
let table = guard.as_ref().unwrap();
let table_info = guard.as_ref().unwrap();
let table_registered = catalog_manager
.table(DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, "test_table")
.await
@@ -169,7 +168,7 @@ mod tests {
.unwrap();
assert_eq!(
table_registered.table_info().ident.table_id,
table.table_info().ident.table_id
table_info.ident.table_id
);
});
}

@@ -21,7 +21,6 @@ mod tests {
use std::sync::Arc;
use std::time::Duration;
use catalog::error::Error;
use catalog::remote::mock::MockTableEngine;
use catalog::remote::region_alive_keeper::RegionAliveKeepers;
use catalog::remote::{CachedMetaKvBackend, RemoteCatalogManager};
@@ -29,12 +28,13 @@ mod tests {
use common_catalog::consts::{
DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, INFORMATION_SCHEMA_NAME, MITO_ENGINE,
};
use common_meta::helper::{CatalogKey, CatalogValue, SchemaKey, SchemaValue};
use common_meta::helper::CatalogValue;
use common_meta::ident::TableIdent;
use common_meta::key::catalog_name::CatalogNameKey;
use common_meta::key::TableMetadataManager;
use common_meta::kv_backend::memory::MemoryKvBackend;
use common_meta::kv_backend::KvBackend;
use common_meta::rpc::store::{CompareAndPutRequest, PutRequest, RangeRequest};
use common_meta::rpc::store::{CompareAndPutRequest, PutRequest};
use datatypes::schema::RawSchema;
use table::engine::manager::{MemoryTableEngineManager, TableEngineManagerRef};
use table::engine::{EngineContext, TableEngineRef};
@@ -54,79 +54,43 @@ mod tests {
}
}
#[tokio::test]
async fn test_backend() {
let backend = MemoryKvBackend::<Error>::default();
let default_catalog_key = CatalogKey {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
}
.to_string();
let req = PutRequest::new()
.with_key(default_catalog_key.as_bytes())
.with_value(CatalogValue.as_bytes().unwrap());
backend.put(req).await.unwrap();
let schema_key = SchemaKey {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
schema_name: DEFAULT_SCHEMA_NAME.to_string(),
}
.to_string();
let req = PutRequest::new()
.with_key(schema_key.as_bytes())
.with_value(SchemaValue.as_bytes().unwrap());
backend.put(req).await.unwrap();
let req = RangeRequest::new().with_prefix(b"__c-".to_vec());
let res = backend
.range(req)
.await
.unwrap()
.kvs
.into_iter()
.map(|kv| String::from_utf8_lossy(kv.key()).to_string());
assert_eq!(
vec!["__c-greptime".to_string()],
res.into_iter().collect::<Vec<_>>()
);
}
#[tokio::test]
async fn test_cached_backend() {
let backend = CachedMetaKvBackend::wrap(Arc::new(MemoryKvBackend::default()));
let default_catalog_key = CatalogKey {
catalog_name: DEFAULT_CATALOG_NAME.to_string(),
}
.to_string();
let default_catalog_key = CatalogNameKey::new(DEFAULT_CATALOG_NAME).to_string();
let req = PutRequest::new()
.with_key(default_catalog_key.as_bytes())
.with_value(CatalogValue.as_bytes().unwrap());
backend.put(req).await.unwrap();
let ret = backend.get(b"__c-greptime").await.unwrap();
let ret = backend.get(b"__catalog_name/greptime").await.unwrap();
let _ = ret.unwrap();
let req = CompareAndPutRequest::new()
.with_key(b"__c-greptime".to_vec())
.with_key(b"__catalog_name/greptime".to_vec())
.with_expect(CatalogValue.as_bytes().unwrap())
.with_value(b"123".to_vec());
let _ = backend.compare_and_put(req).await.unwrap();
let ret = backend.get(b"__c-greptime").await.unwrap();
let ret = backend.get(b"__catalog_name/greptime").await.unwrap();
assert_eq!(b"123", ret.as_ref().unwrap().value.as_slice());
let req = PutRequest::new()
.with_key(b"__c-greptime".to_vec())
.with_key(b"__catalog_name/greptime".to_vec())
.with_value(b"1234".to_vec());
let _ = backend.put(req).await;
let ret = backend.get(b"__c-greptime").await.unwrap();
let ret = backend.get(b"__catalog_name/greptime").await.unwrap();
assert_eq!(b"1234", ret.unwrap().value.as_slice());
backend.delete(b"__c-greptime", false).await.unwrap();
backend
.delete(b"__catalog_name/greptime", false)
.await
.unwrap();
let ret = backend.get(b"__c-greptime").await.unwrap();
let ret = backend.get(b"__catalog_name/greptime").await.unwrap();
assert!(ret.is_none());
}
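
The rewritten test also pins down the key-layout migration: the opaque "__c-"/"__s-" prefixes are replaced by readable, slash-separated keys. In test form (grounded in the lookups above):

use common_meta::key::catalog_name::CatalogNameKey;

#[test]
fn catalog_key_renders_slash_separated() {
    assert_eq!(
        CatalogNameKey::new("greptime").to_string(),
        "__catalog_name/greptime"
    );
}
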
@@ -134,12 +98,12 @@ mod tests {
let backend = Arc::new(MemoryKvBackend::default());
let req = PutRequest::new()
.with_key(b"__c-greptime".to_vec())
.with_key(b"__catalog_name/greptime".to_vec())
.with_value(b"".to_vec());
backend.put(req).await.unwrap();
let req = PutRequest::new()
.with_key(b"__s-greptime-public".to_vec())
.with_key(b"__schema_name/greptime-public".to_vec())
.with_value(b"".to_vec());
backend.put(req).await.unwrap();
@@ -156,7 +120,6 @@ mod tests {
let catalog_manager = RemoteCatalogManager::new(
engine_manager.clone(),
node_id,
cached_backend.clone(),
region_alive_keepers.clone(),
Arc::new(TableMetadataManager::new(cached_backend)),
);
@@ -433,7 +396,7 @@ mod tests {
assert!(catalog_manager.register_table(request).await.unwrap());
let keeper = region_alive_keepers
.find_keeper(&table_before)
.find_keeper(table_before.table_id)
.await
.unwrap();
let deadline = keeper.deadline(0).await.unwrap();
@@ -472,7 +435,7 @@ mod tests {
assert!(catalog_manager.register_table(request).await.unwrap());
let keeper = region_alive_keepers
.find_keeper(&table_after)
.find_keeper(table_after.table_id)
.await
.unwrap();
let deadline = keeper.deadline(0).await.unwrap();
@@ -480,7 +443,7 @@ mod tests {
assert!(deadline <= Instant::now() + Duration::from_secs(20));
let keeper = region_alive_keepers
.find_keeper(&table_before)
.find_keeper(table_before.table_id)
.await
.unwrap();
let deadline = keeper.deadline(0).await.unwrap();

@@ -22,6 +22,7 @@ common-telemetry = { workspace = true }
common-time = { workspace = true }
datafusion.workspace = true
datatypes = { workspace = true }
derive_builder.workspace = true
enum_dispatch = "0.3"
futures-util.workspace = true
moka = { version = "0.9", features = ["future"] }

@@ -17,6 +17,7 @@ use std::sync::Arc;
use api::v1::greptime_database_client::GreptimeDatabaseClient;
use api::v1::health_check_client::HealthCheckClient;
use api::v1::prometheus_gateway_client::PrometheusGatewayClient;
use api::v1::region::region_client::RegionClient as PbRegionClient;
use api::v1::HealthCheckRequest;
use arrow_flight::flight_service_client::FlightServiceClient;
use common_grpc::channel_manager::ChannelManager;
@@ -82,11 +83,6 @@ impl Client {
Default::default()
}
pub fn with_manager(channel_manager: ChannelManager) -> Self {
let inner = Arc::new(Inner::with_manager(channel_manager));
Self { inner }
}
pub fn with_urls<U, A>(urls: A) -> Self
where
U: AsRef<str>,
@@ -157,6 +153,11 @@ impl Client {
})
}
pub(crate) fn raw_region_client(&self) -> Result<PbRegionClient<Channel>> {
let (_, channel) = self.find_channel()?;
Ok(PbRegionClient::new(channel))
}
pub fn make_prometheus_gateway_client(&self) -> Result<PrometheusGatewayClient<Channel>> {
let (_, channel) = self.find_channel()?;
Ok(PrometheusGatewayClient::new(channel))

@@ -17,20 +17,23 @@ use api::v1::ddl_request::Expr as DdlExpr;
use api::v1::greptime_request::Request;
use api::v1::query_request::Query;
use api::v1::{
AlterExpr, AuthHeader, CompactTableExpr, CreateTableExpr, DdlRequest, DeleteRequest,
AlterExpr, AuthHeader, CompactTableExpr, CreateTableExpr, DdlRequest, DeleteRequests,
DropTableExpr, FlushTableExpr, GreptimeRequest, InsertRequests, PromRangeQuery, QueryRequest,
RequestHeader, TruncateTableExpr,
RequestHeader, RowInsertRequests, TruncateTableExpr,
};
use arrow_flight::{FlightData, Ticket};
use arrow_flight::Ticket;
use async_stream::stream;
use common_error::ext::{BoxedError, ErrorExt};
use common_grpc::flight::{flight_messages_to_recordbatches, FlightDecoder, FlightMessage};
use common_grpc::flight::{FlightDecoder, FlightMessage};
use common_query::Output;
use common_recordbatch::error::ExternalSnafu;
use common_recordbatch::RecordBatchStreamAdaptor;
use common_telemetry::{logging, timer};
use futures_util::{TryFutureExt, TryStreamExt};
use futures_util::StreamExt;
use prost::Message;
use snafu::{ensure, ResultExt};
use crate::error::{ConvertFlightDataSnafu, IllegalFlightMessagesSnafu, ServerSnafu};
use crate::error::{ConvertFlightDataSnafu, Error, IllegalFlightMessagesSnafu, ServerSnafu};
use crate::{error, from_grpc_response, metrics, Client, Result, StreamInserter};
#[derive(Clone, Debug, Default)]
@@ -112,6 +115,11 @@ impl Database {
self.handle(Request::Inserts(requests)).await
}
pub async fn row_insert(&self, requests: RowInsertRequests) -> Result<u32> {
let _timer = timer!(metrics::METRIC_GRPC_INSERT);
self.handle(Request::RowInserts(requests)).await
}
pub fn streaming_inserter(&self) -> Result<StreamInserter> {
self.streaming_inserter_with_channel_size(65536)
}
@@ -132,9 +140,9 @@ impl Database {
Ok(stream_inserter)
}
pub async fn delete(&self, request: DeleteRequest) -> Result<u32> {
pub async fn delete(&self, request: DeleteRequests) -> Result<u32> {
let _timer = timer!(metrics::METRIC_GRPC_DELETE);
self.handle(Request::Delete(request)).await
self.handle(Request::Deletes(request)).await
}
async fn handle(&self, request: Request) -> Result<u32> {
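
The hunk above also reflects the write-path API shift: deletes now travel as the plural DeleteRequests, and a row-format insert entry point is added next to the column-format one. A hypothetical call-site sketch (request construction elided via Default):

async fn write_example(db: &Database) -> Result<()> {
    let rows = RowInsertRequests::default();
    let _affected = db.row_insert(rows).await?;

    let deletes = DeleteRequests::default();
    let _affected = db.delete(deletes).await?;
    Ok(())
}
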
@@ -283,55 +291,81 @@ impl Database {
let mut client = self.client.make_flight_client()?;
let flight_data: Vec<FlightData> = client
.mut_inner()
.do_get(request)
.and_then(|response| response.into_inner().try_collect())
.await
.map_err(|e| {
let tonic_code = e.code();
let e: error::Error = e.into();
let code = e.status_code();
let msg = e.to_string();
ServerSnafu { code, msg }
.fail::<()>()
.map_err(BoxedError::new)
.context(error::FlightGetSnafu {
tonic_code,
addr: client.addr(),
})
.map_err(|error| {
logging::error!(
"Failed to do Flight get, addr: {}, code: {}, source: {}",
client.addr(),
tonic_code,
error
);
error
})
.unwrap_err()
})?;
let decoder = &mut FlightDecoder::default();
let flight_messages = flight_data
.into_iter()
.map(|x| decoder.try_decode(x).context(ConvertFlightDataSnafu))
.collect::<Result<Vec<_>>>()?;
let output = if let Some(FlightMessage::AffectedRows(rows)) = flight_messages.get(0) {
ensure!(
flight_messages.len() == 1,
IllegalFlightMessagesSnafu {
reason: "Expect 'AffectedRows' Flight messages to be one and only!"
}
let response = client.mut_inner().do_get(request).await.map_err(|e| {
let tonic_code = e.code();
let e: error::Error = e.into();
let code = e.status_code();
let msg = e.to_string();
let error = Error::FlightGet {
tonic_code,
addr: client.addr().to_string(),
source: BoxedError::new(ServerSnafu { code, msg }.build()),
};
logging::error!(
"Failed to do Flight get, addr: {}, code: {}, source: {}",
client.addr(),
tonic_code,
error
);
Output::AffectedRows(*rows)
} else {
let recordbatches = flight_messages_to_recordbatches(flight_messages)
.context(ConvertFlightDataSnafu)?;
Output::RecordBatches(recordbatches)
error
})?;
let flight_data_stream = response.into_inner();
let mut decoder = FlightDecoder::default();
let mut flight_message_stream = flight_data_stream.map(move |flight_data| {
flight_data
.map_err(Error::from)
.and_then(|data| decoder.try_decode(data).context(ConvertFlightDataSnafu))
});
let Some(first_flight_message) = flight_message_stream.next().await else {
return IllegalFlightMessagesSnafu {
reason: "Expect the response not to be empty",
}
.fail();
};
Ok(output)
let first_flight_message = first_flight_message?;
match first_flight_message {
FlightMessage::AffectedRows(rows) => {
ensure!(
flight_message_stream.next().await.is_none(),
IllegalFlightMessagesSnafu {
reason: "Expect 'AffectedRows' Flight messages to be the one and the only!"
}
);
Ok(Output::AffectedRows(rows))
}
FlightMessage::Recordbatch(_) => IllegalFlightMessagesSnafu {
reason: "The first flight message cannot be a RecordBatch message",
}
.fail(),
FlightMessage::Schema(schema) => {
let stream = Box::pin(stream!({
while let Some(flight_message) = flight_message_stream.next().await {
let flight_message = flight_message
.map_err(BoxedError::new)
.context(ExternalSnafu)?;
let FlightMessage::Recordbatch(record_batch) = flight_message else {
yield IllegalFlightMessagesSnafu {reason: "A Schema message must be succeeded exclusively by a set of RecordBatch messages"}
.fail()
.map_err(BoxedError::new)
.context(ExternalSnafu);
break;
};
yield Ok(record_batch);
}
}));
let record_batch_stream = RecordBatchStreamAdaptor {
schema,
stream,
output_ordering: None,
};
Ok(Output::Stream(Box::pin(record_batch_stream)))
}
}
}
}
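
Summarizing the protocol the rewritten do_get enforces on the decoded stream: the first message decides everything. An illustrative validator over already-decoded messages (not part of the change):

fn is_valid_sequence(messages: &[FlightMessage]) -> bool {
    match messages.first() {
        // An AffectedRows message must be the one and only message.
        Some(FlightMessage::AffectedRows(_)) => messages.len() == 1,
        // A Schema message may only be followed by Recordbatch messages.
        Some(FlightMessage::Schema(_)) => messages[1..]
            .iter()
            .all(|m| matches!(m, FlightMessage::Recordbatch(_))),
        // Empty or RecordBatch-first responses are illegal.
        _ => false,
    }
}
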
@@ -342,106 +376,11 @@ pub struct FlightContext {
#[cfg(test)]
mod tests {
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::auth_header::AuthScheme;
use api::v1::{AuthHeader, Basic, Column};
use common_grpc::select::{null_mask, values};
use common_grpc_expr::column_to_vector;
use datatypes::prelude::{Vector, VectorRef};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
Int16Vector, Int32Vector, Int64Vector, Int8Vector, StringVector, UInt16Vector,
UInt32Vector, UInt64Vector, UInt8Vector,
};
use api::v1::{AuthHeader, Basic};
use crate::database::FlightContext;
#[test]
fn test_column_to_vector() {
let mut column = create_test_column(Arc::new(BooleanVector::from(vec![true])));
column.datatype = -100;
let result = column_to_vector(&column, 1);
assert!(result.is_err());
assert_eq!(
result.unwrap_err().to_string(),
"Column datatype error, source: Unknown proto column datatype: -100"
);
macro_rules! test_with_vector {
($vector: expr) => {
let vector = Arc::new($vector);
let column = create_test_column(vector.clone());
let result = column_to_vector(&column, vector.len() as u32).unwrap();
assert_eq!(result, vector as VectorRef);
};
}
test_with_vector!(BooleanVector::from(vec![Some(true), None, Some(false)]));
test_with_vector!(Int8Vector::from(vec![Some(i8::MIN), None, Some(i8::MAX)]));
test_with_vector!(Int16Vector::from(vec![
Some(i16::MIN),
None,
Some(i16::MAX)
]));
test_with_vector!(Int32Vector::from(vec![
Some(i32::MIN),
None,
Some(i32::MAX)
]));
test_with_vector!(Int64Vector::from(vec![
Some(i64::MIN),
None,
Some(i64::MAX)
]));
test_with_vector!(UInt8Vector::from(vec![Some(u8::MIN), None, Some(u8::MAX)]));
test_with_vector!(UInt16Vector::from(vec![
Some(u16::MIN),
None,
Some(u16::MAX)
]));
test_with_vector!(UInt32Vector::from(vec![
Some(u32::MIN),
None,
Some(u32::MAX)
]));
test_with_vector!(UInt64Vector::from(vec![
Some(u64::MIN),
None,
Some(u64::MAX)
]));
test_with_vector!(Float32Vector::from(vec![
Some(f32::MIN),
None,
Some(f32::MAX)
]));
test_with_vector!(Float64Vector::from(vec![
Some(f64::MIN),
None,
Some(f64::MAX)
]));
test_with_vector!(BinaryVector::from(vec![
Some(b"".to_vec()),
None,
Some(b"hello".to_vec())
]));
test_with_vector!(StringVector::from(vec![Some(""), None, Some("foo"),]));
test_with_vector!(DateVector::from(vec![Some(1), None, Some(3)]));
test_with_vector!(DateTimeVector::from(vec![Some(4), None, Some(6)]));
}
fn create_test_column(vector: VectorRef) -> Column {
let wrapper: ColumnDataTypeWrapper = vector.data_type().try_into().unwrap();
Column {
column_name: "test".to_string(),
semantic_type: 1,
values: Some(values(&[vector.clone()]).unwrap()),
null_mask: null_mask(&[vector.clone()], vector.len()),
datatype: wrapper.datatype() as i32,
}
}
#[test]
fn test_flight_ctx() {
let mut ctx = FlightContext::default();


@@ -13,11 +13,10 @@
// limitations under the License.
use std::any::Any;
use std::str::FromStr;
use common_error::ext::{BoxedError, ErrorExt};
use common_error::status_code::StatusCode;
use common_error::{INNER_ERROR_CODE, INNER_ERROR_MSG};
use common_error::{GREPTIME_ERROR_CODE, GREPTIME_ERROR_MSG};
use snafu::{Location, Snafu};
use tonic::{Code, Status};
@@ -107,11 +106,18 @@ impl From<Status> for Error {
.and_then(|v| String::from_utf8(v.as_bytes().to_vec()).ok())
}
let code = get_metadata_value(&e, INNER_ERROR_CODE)
.and_then(|s| StatusCode::from_str(&s).ok())
let code = get_metadata_value(&e, GREPTIME_ERROR_CODE)
.and_then(|s| {
if let Ok(code) = s.parse::<u32>() {
StatusCode::from_u32(code)
} else {
None
}
})
.unwrap_or(StatusCode::Unknown);
let msg = get_metadata_value(&e, INNER_ERROR_MSG).unwrap_or(e.to_string());
let msg =
get_metadata_value(&e, GREPTIME_ERROR_MSG).unwrap_or_else(|| e.message().to_string());
Self::Server { code, msg }
}
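The conversion above means the server transports GreptimeDB's own status through gRPC metadata under x-greptime-err-code (a numeric status code) and x-greptime-err-msg, falling back to the tonic message when they are absent. A hedged round-trip sketch (the numeric value 1003 for StatusCode::Internal is an assumption for illustration):

use tonic::Status;

// Sketch: a tonic Status carrying GreptimeDB metadata converts into Error::Server.
let mut status = Status::internal("fallback message, unused when metadata is set");
let metadata = status.metadata_mut();
metadata.insert(GREPTIME_ERROR_CODE, "1003".parse().unwrap()); // assumed: Internal
metadata.insert(GREPTIME_ERROR_MSG, "flush failed".parse().unwrap());
let err: Error = status.into();
// Expected: Error::Server { code: StatusCode::Internal, msg: "flush failed" }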


@@ -18,6 +18,7 @@ mod database;
pub mod error;
pub mod load_balance;
mod metrics;
pub mod region;
mod stream_insert;
pub use api;


@@ -25,3 +25,4 @@ pub const METRIC_GRPC_FLUSH_TABLE: &str = "grpc.flush_table";
pub const METRIC_GRPC_COMPACT_TABLE: &str = "grpc.compact_table";
pub const METRIC_GRPC_TRUNCATE_TABLE: &str = "grpc.truncate_table";
pub const METRIC_GRPC_DO_GET: &str = "grpc.do_get";
pub(crate) const METRIC_REGION_REQUEST_GRPC: &str = "grpc.region_request";

src/client/src/region.rs

@@ -0,0 +1,146 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::region::{region_request, RegionRequest, RegionRequestHeader, RegionResponse};
use api::v1::ResponseHeader;
use common_error::status_code::StatusCode;
use common_telemetry::timer;
use snafu::OptionExt;
use crate::error::{IllegalDatabaseResponseSnafu, Result, ServerSnafu};
use crate::{metrics, Client};
type AffectedRows = u64;
#[derive(Debug)]
pub struct RegionRequester {
trace_id: Option<u64>,
span_id: Option<u64>,
client: Client,
}
impl RegionRequester {
pub fn new(client: Client) -> Self {
// TODO(LFC): Pass in trace_id and span_id from some context when we have it.
Self {
trace_id: None,
span_id: None,
client,
}
}
pub async fn handle(self, request: region_request::Body) -> Result<AffectedRows> {
let request_type = request.as_ref().to_string();
let request = RegionRequest {
header: Some(RegionRequestHeader {
trace_id: self.trace_id,
span_id: self.span_id,
}),
body: Some(request),
};
let _timer = timer!(
metrics::METRIC_REGION_REQUEST_GRPC,
&[("request_type", request_type)]
);
let mut client = self.client.raw_region_client()?;
let RegionResponse {
header,
affected_rows,
} = client.handle(request).await?.into_inner();
check_response_header(header)?;
Ok(affected_rows)
}
}
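A hedged usage sketch of the requester (Client::with_urls and the Inserts body variant are assumed from the surrounding client API, not shown in this diff):

// Sketch: send an insert to a region server and read back the affected rows.
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let affected = RegionRequester::new(client)
    .handle(region_request::Body::Inserts(inserts))
    .await?;
println!("affected rows: {affected}");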
fn check_response_header(header: Option<ResponseHeader>) -> Result<()> {
let status = header
.and_then(|header| header.status)
.context(IllegalDatabaseResponseSnafu {
err_msg: "either response header or status is missing",
})?;
if StatusCode::is_success(status.status_code) {
Ok(())
} else {
let code =
StatusCode::from_u32(status.status_code).context(IllegalDatabaseResponseSnafu {
err_msg: format!("unknown server status: {:?}", status),
})?;
ServerSnafu {
code,
msg: status.err_msg,
}
.fail()
}
}
#[cfg(test)]
mod test {
use api::v1::Status as PbStatus;
use super::*;
use crate::Error::{IllegalDatabaseResponse, Server};
#[test]
fn test_check_response_header() {
let result = check_response_header(None);
assert!(matches!(
result.unwrap_err(),
IllegalDatabaseResponse { .. }
));
let result = check_response_header(Some(ResponseHeader { status: None }));
assert!(matches!(
result.unwrap_err(),
IllegalDatabaseResponse { .. }
));
let result = check_response_header(Some(ResponseHeader {
status: Some(PbStatus {
status_code: StatusCode::Success as u32,
err_msg: "".to_string(),
}),
}));
assert!(result.is_ok());
let result = check_response_header(Some(ResponseHeader {
status: Some(PbStatus {
status_code: u32::MAX,
err_msg: "".to_string(),
}),
}));
assert!(matches!(
result.unwrap_err(),
IllegalDatabaseResponse { .. }
));
let result = check_response_header(Some(ResponseHeader {
status: Some(PbStatus {
status_code: StatusCode::Internal as u32,
err_msg: "blabla".to_string(),
}),
}));
let Server { code, msg } = result.unwrap_err() else {
unreachable!()
};
assert_eq!(code, StatusCode::Internal);
assert_eq!(msg, "blabla");
}
}


@@ -16,6 +16,7 @@ use api::v1::greptime_database_client::GreptimeDatabaseClient;
use api::v1::greptime_request::Request;
use api::v1::{
AuthHeader, GreptimeRequest, GreptimeResponse, InsertRequest, InsertRequests, RequestHeader,
RowInsertRequest, RowInsertRequests,
};
use tokio::sync::mpsc;
use tokio::task::JoinHandle;
@@ -84,6 +85,18 @@ impl StreamInserter {
})
}
pub async fn row_insert(&self, requests: Vec<RowInsertRequest>) -> Result<()> {
let inserts = RowInsertRequests { inserts: requests };
let request = self.to_rpc_request(Request::RowInserts(inserts));
self.sender.send(request).await.map_err(|e| {
error::ClientStreamingSnafu {
err_msg: e.to_string(),
}
.build()
})
}
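A hedged sketch of the new row-based path on the insert stream (database, request_a, and request_b are assumed to exist; streaming_inserter is the assumed constructor on the database handle):

// Sketch: queue row-based inserts on the stream, then close it to collect
// the total affected rows reported by the server.
let inserter = database.streaming_inserter()?;
inserter.row_insert(vec![request_a]).await?;
inserter.row_insert(vec![request_b]).await?;
let affected: u32 = inserter.finish().await?;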
pub async fn finish(self) -> Result<u32> {
drop(self.sender);


@@ -17,6 +17,7 @@ metrics-process = ["servers/metrics-process"]
[dependencies]
anymap = "1.0.0-beta.2"
async-trait.workspace = true
auth.workspace = true
catalog = { workspace = true }
chrono.workspace = true
clap = { version = "3.1", features = ["derive"] }
@@ -41,6 +42,7 @@ meta-srv = { workspace = true }
metrics.workspace = true
nu-ansi-term = "0.46"
partition = { workspace = true }
prost.workspace = true
query = { workspace = true }
rand.workspace = true
rustyline = "10.1"


@@ -16,6 +16,8 @@ mod bench;
mod cmd;
mod helper;
mod repl;
// TODO(weny): Remove it
#[allow(deprecated)]
mod upgrade;
use async_trait::async_trait;


@@ -12,53 +12,30 @@
// See the License for the specific language governing permissions and
// limitations under the License.
mod datanode_table;
mod table_info;
mod table_name;
mod table_region;
use std::collections::BTreeMap;
use std::future::Future;
use std::sync::Arc;
use std::time::{Duration, Instant};
use std::time::Duration;
use async_trait::async_trait;
use clap::Parser;
use common_meta::key::table_region::RegionDistribution;
use common_meta::key::{TableMetadataManager, TableMetadataManagerRef};
use common_meta::peer::Peer;
use common_meta::rpc::router::{Region, RegionRoute};
use common_meta::table_name::TableName;
use common_telemetry::info;
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, RawSchema};
use meta_srv::service::store::etcd::EtcdStore;
use meta_srv::service::store::kv::KvBackendAdapter;
use rand::prelude::SliceRandom;
use rand::Rng;
use table::metadata::{RawTableInfo, RawTableMeta, TableId, TableIdent, TableType};
use crate::cli::bench::datanode_table::DatanodeTableBencher;
use crate::cli::bench::table_info::TableInfoBencher;
use crate::cli::bench::table_name::TableNameBencher;
use crate::cli::bench::table_region::TableRegionBencher;
use self::metadata::TableMetadataBencher;
use crate::cli::{Instance, Tool};
use crate::error::Result;
async fn bench<F, Fut>(desc: &str, f: F, count: u32)
where
F: Fn(u32) -> Fut,
Fut: Future<Output = ()>,
{
let mut total = Duration::default();
for i in 1..=count {
let start = Instant::now();
f(i).await;
total += start.elapsed();
}
let cost = total.as_millis() as f64 / count as f64;
info!("{desc}, average operation cost: {cost:.2} ms");
}
mod metadata;
async fn bench_self_recorded<F, Fut>(desc: &str, f: F, count: u32)
where
@@ -107,31 +84,11 @@ struct BenchTableMetadata {
#[async_trait]
impl Tool for BenchTableMetadata {
async fn do_work(&self) -> Result<()> {
info!("Start benching table name manager ...");
TableNameBencher::new(self.table_metadata_manager.table_name_manager(), self.count)
.start()
.await;
info!("Start benching table info manager ...");
TableInfoBencher::new(self.table_metadata_manager.table_info_manager(), self.count)
.start()
.await;
info!("Start benching table region manager ...");
TableRegionBencher::new(
self.table_metadata_manager.table_region_manager(),
self.count,
)
.start()
.await;
info!("Start benching datanode table manager ...");
DatanodeTableBencher::new(
self.table_metadata_manager.datanode_table_manager(),
self.count,
)
.start()
.await;
let bencher = TableMetadataBencher::new(self.table_metadata_manager.clone(), self.count);
bencher.bench_create().await;
bencher.bench_get().await;
bencher.bench_rename().await;
bencher.bench_delete().await;
Ok(())
}
}
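For reference, the surviving bench_self_recorded helper averages the Durations returned by the closure, so each iteration times only the operation itself. A minimal sketch (build_input and do_create are hypothetical):

// Sketch: per-iteration setup stays untimed; only the measured span counts.
bench_self_recorded(
    "demo: create 10 table metadata entries",
    |i| async move {
        let input = build_input(i); // hypothetical, untimed setup
        let start = Instant::now();
        do_create(input).await; // hypothetical operation under test
        start.elapsed()
    },
    10,
)
.await;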
@@ -184,16 +141,25 @@ fn create_table_info(table_id: TableId, table_name: TableName) -> RawTableInfo {
}
}
fn create_region_distribution() -> RegionDistribution {
    let mut regions = (1..=100).collect::<Vec<u32>>();
    regions.shuffle(&mut rand::thread_rng());
    let mut region_distribution = RegionDistribution::new();
    for datanode_id in 0..10 {
        region_distribution.insert(
            datanode_id as u64,
            regions[datanode_id * 10..(datanode_id + 1) * 10].to_vec(),
        );
    }
    region_distribution
}
fn create_region_routes() -> Vec<RegionRoute> {
    let mut regions = Vec::with_capacity(100);
    let mut rng = rand::thread_rng();
    for region_id in 0..64u64 {
        regions.push(RegionRoute {
            region: Region {
                id: region_id.into(),
                name: String::new(),
                partition: None,
                attrs: BTreeMap::new(),
            },
            leader_peer: Some(Peer {
                id: rng.gen_range(0..10),
                addr: String::new(),
            }),
            follower_peers: vec![],
        });
    }
    regions
}
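The bench data now models placement as explicit RegionRoutes rather than a datanode-to-regions map; when a RegionDistribution is still needed, it can be derived from the routes. A hedged sketch (assuming RegionDistribution behaves like a BTreeMap<u64, Vec<u32>> and RegionId exposes region_number()):

// Sketch: fold leader peers back into a datanode -> region-numbers map.
fn to_distribution(routes: &[RegionRoute]) -> BTreeMap<u64, Vec<u32>> {
    let mut dist: BTreeMap<u64, Vec<u32>> = BTreeMap::new();
    for route in routes {
        if let Some(leader) = &route.leader_peer {
            dist.entry(leader.id)
                .or_default()
                .push(route.region.id.region_number());
        }
    }
    dist
}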


@@ -1,131 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_meta::key::datanode_table::{DatanodeTableKey, DatanodeTableManager};
use super::bench;
pub struct DatanodeTableBencher<'a> {
datanode_table_manager: &'a DatanodeTableManager,
count: u32,
}
impl<'a> DatanodeTableBencher<'a> {
pub fn new(datanode_table_manager: &'a DatanodeTableManager, count: u32) -> Self {
Self {
datanode_table_manager,
count,
}
}
pub async fn start(&self) {
self.bench_create().await;
self.bench_get().await;
self.bench_move_region().await;
self.bench_tables().await;
self.bench_remove().await;
}
async fn bench_create(&self) {
let desc = format!(
"DatanodeTableBencher: create {} datanode table keys",
self.count
);
bench(
&desc,
|i| async move {
self.datanode_table_manager
.create(1, i, vec![1, 2, 3, 4])
.await
.unwrap();
},
self.count,
)
.await;
}
async fn bench_get(&self) {
let desc = format!(
"DatanodeTableBencher: get {} datanode table keys",
self.count
);
bench(
&desc,
|i| async move {
let key = DatanodeTableKey::new(1, i);
assert!(self
.datanode_table_manager
.get(&key)
.await
.unwrap()
.is_some());
},
self.count,
)
.await;
}
async fn bench_move_region(&self) {
let desc = format!(
"DatanodeTableBencher: move {} datanode table regions",
self.count
);
bench(
&desc,
|i| async move {
self.datanode_table_manager
.move_region(1, 2, i, 1)
.await
.unwrap();
},
self.count,
)
.await;
}
async fn bench_tables(&self) {
let desc = format!(
"DatanodeTableBencher: list {} datanode table keys",
self.count
);
bench(
&desc,
|_| async move {
assert!(!self
.datanode_table_manager
.tables(1)
.await
.unwrap()
.is_empty());
},
self.count,
)
.await;
}
async fn bench_remove(&self) {
let desc = format!(
"DatanodeTableBencher: remove {} datanode table keys",
self.count
);
bench(
&desc,
|i| async move {
self.datanode_table_manager.remove(1, i).await.unwrap();
},
self.count,
)
.await;
}
}


@@ -0,0 +1,136 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::time::Instant;
use common_meta::key::TableMetadataManagerRef;
use common_meta::table_name::TableName;
use super::{bench_self_recorded, create_region_routes, create_table_info};
pub struct TableMetadataBencher {
table_metadata_manager: TableMetadataManagerRef,
count: u32,
}
impl TableMetadataBencher {
pub fn new(table_metadata_manager: TableMetadataManagerRef, count: u32) -> Self {
Self {
table_metadata_manager,
count,
}
}
pub async fn bench_create(&self) {
let desc = format!(
"TableMetadataBencher: creating {} table metadata",
self.count
);
bench_self_recorded(
&desc,
|i| async move {
let table_name = format!("bench_table_name_{}", i);
let table_name = TableName::new("bench_catalog", "bench_schema", table_name);
let table_info = create_table_info(i, table_name);
let region_routes = create_region_routes();
let start = Instant::now();
self.table_metadata_manager
.create_table_metadata(table_info, region_routes)
.await
.unwrap();
start.elapsed()
},
self.count,
)
.await;
}
pub async fn bench_get(&self) {
let desc = format!(
"TableMetadataBencher: getting {} table info and region routes",
self.count
);
bench_self_recorded(
&desc,
|i| async move {
let start = Instant::now();
self.table_metadata_manager
.get_full_table_info(i)
.await
.unwrap();
start.elapsed()
},
self.count,
)
.await;
}
pub async fn bench_delete(&self) {
let desc = format!(
"TableMetadataBencher: deleting {} table metadata",
self.count
);
bench_self_recorded(
&desc,
|i| async move {
let (table_info, table_route) = self
.table_metadata_manager
.get_full_table_info(i)
.await
.unwrap();
let start = Instant::now();
let _ = self
.table_metadata_manager
.delete_table_metadata(&table_info.unwrap(), &table_route.unwrap())
.await;
start.elapsed()
},
self.count,
)
.await;
}
pub async fn bench_rename(&self) {
let desc = format!("TableMetadataBencher: renaming {} table", self.count);
bench_self_recorded(
&desc,
|i| async move {
let (table_info, _) = self
.table_metadata_manager
.get_full_table_info(i)
.await
.unwrap();
let new_table_name = format!("renamed_{}", i);
let start = Instant::now();
let _ = self
.table_metadata_manager
.rename_table(table_info.unwrap(), new_table_name)
.await;
start.elapsed()
},
self.count,
)
.await;
}
}


@@ -1,111 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::time::Instant;
use common_meta::key::table_info::TableInfoManager;
use common_meta::table_name::TableName;
use super::{bench, bench_self_recorded, create_table_info};
pub struct TableInfoBencher<'a> {
table_info_manager: &'a TableInfoManager,
count: u32,
}
impl<'a> TableInfoBencher<'a> {
pub fn new(table_info_manager: &'a TableInfoManager, count: u32) -> Self {
Self {
table_info_manager,
count,
}
}
pub async fn start(&self) {
self.bench_create().await;
self.bench_get().await;
self.bench_compare_and_put().await;
self.bench_remove().await;
}
async fn bench_create(&self) {
let desc = format!("TableInfoBencher: create {} table infos", self.count);
bench(
&desc,
|i| async move {
let table_name = format!("bench_table_name_{}", i);
let table_name = TableName::new("bench_catalog", "bench_schema", table_name);
let table_info = create_table_info(i, table_name);
self.table_info_manager
.create(i, &table_info)
.await
.unwrap();
},
self.count,
)
.await;
}
async fn bench_get(&self) {
let desc = format!("TableInfoBencher: get {} table infos", self.count);
bench(
&desc,
|i| async move {
assert!(self.table_info_manager.get(i).await.unwrap().is_some());
},
self.count,
)
.await;
}
async fn bench_compare_and_put(&self) {
let desc = format!(
"TableInfoBencher: compare_and_put {} table infos",
self.count
);
bench_self_recorded(
&desc,
|i| async move {
let table_info_value = self.table_info_manager.get(i).await.unwrap().unwrap();
let mut new_table_info = table_info_value.table_info.clone();
new_table_info.ident.version += 1;
let start = Instant::now();
self.table_info_manager
.compare_and_put(i, Some(table_info_value), new_table_info)
.await
.unwrap()
.unwrap();
start.elapsed()
},
self.count,
)
.await;
}
async fn bench_remove(&self) {
let desc = format!("TableInfoBencher: remove {} table infos", self.count);
bench(
&desc,
|i| async move {
self.table_info_manager.remove(i).await.unwrap();
},
self.count,
)
.await;
}
}


@@ -1,131 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use common_meta::key::table_name::{TableNameKey, TableNameManager};
use super::bench;
pub struct TableNameBencher<'a> {
table_name_manager: &'a TableNameManager,
count: u32,
}
impl<'a> TableNameBencher<'a> {
pub fn new(table_name_manager: &'a TableNameManager, count: u32) -> Self {
Self {
table_name_manager,
count,
}
}
pub async fn start(&self) {
self.bench_create().await;
self.bench_rename().await;
self.bench_get().await;
self.bench_tables().await;
self.bench_remove().await;
}
async fn bench_create(&self) {
let desc = format!("TableNameBencher: create {} table names", self.count);
bench(
&desc,
|i| async move {
let table_name = format!("bench_table_name_{}", i);
let table_name_key = create_table_name_key(&table_name);
self.table_name_manager
.create(&table_name_key, i)
.await
.unwrap();
},
self.count,
)
.await;
}
async fn bench_rename(&self) {
let desc = format!("TableNameBencher: rename {} table names", self.count);
bench(
&desc,
|i| async move {
let table_name = format!("bench_table_name_{}", i);
let new_table_name = format!("bench_table_name_new_{}", i);
let table_name_key = create_table_name_key(&table_name);
self.table_name_manager
.rename(table_name_key, i, &new_table_name)
.await
.unwrap();
},
self.count,
)
.await;
}
async fn bench_get(&self) {
let desc = format!("TableNameBencher: get {} table names", self.count);
bench(
&desc,
|i| async move {
let table_name = format!("bench_table_name_new_{}", i);
let table_name_key = create_table_name_key(&table_name);
assert!(self
.table_name_manager
.get(table_name_key)
.await
.unwrap()
.is_some());
},
self.count,
)
.await;
}
async fn bench_tables(&self) {
let desc = format!("TableNameBencher: list all {} table names", self.count);
bench(
&desc,
|_| async move {
assert!(!self
.table_name_manager
.tables("bench_catalog", "bench_schema")
.await
.unwrap()
.is_empty());
},
self.count,
)
.await;
}
async fn bench_remove(&self) {
let desc = format!("TableNameBencher: remove {} table names", self.count);
bench(
&desc,
|i| async move {
let table_name = format!("bench_table_name_new_{}", i);
let table_name_key = create_table_name_key(&table_name);
self.table_name_manager
.remove(table_name_key)
.await
.unwrap();
},
self.count,
)
.await;
}
}
fn create_table_name_key(table_name: &str) -> TableNameKey {
TableNameKey::new("bench_catalog", "bench_schema", table_name)
}


@@ -1,112 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::time::Instant;
use common_meta::key::table_region::TableRegionManager;
use super::{bench, bench_self_recorded, create_region_distribution};
pub struct TableRegionBencher<'a> {
table_region_manager: &'a TableRegionManager,
count: u32,
}
impl<'a> TableRegionBencher<'a> {
pub fn new(table_region_manager: &'a TableRegionManager, count: u32) -> Self {
Self {
table_region_manager,
count,
}
}
pub async fn start(&self) {
self.bench_create().await;
self.bench_get().await;
self.bench_compare_and_put().await;
self.bench_remove().await;
}
async fn bench_create(&self) {
let desc = format!("TableRegionBencher: create {} table regions", self.count);
bench_self_recorded(
&desc,
|i| async move {
let region_distribution = create_region_distribution();
let start = Instant::now();
self.table_region_manager
.create(i, &region_distribution)
.await
.unwrap();
start.elapsed()
},
self.count,
)
.await;
}
async fn bench_get(&self) {
let desc = format!("TableRegionBencher: get {} table regions", self.count);
bench(
&desc,
|i| async move {
assert!(self.table_region_manager.get(i).await.unwrap().is_some());
},
self.count,
)
.await;
}
async fn bench_compare_and_put(&self) {
let desc = format!(
"TableRegionBencher: compare_and_put {} table regions",
self.count
);
bench_self_recorded(
&desc,
|i| async move {
let table_region_value = self.table_region_manager.get(i).await.unwrap().unwrap();
let new_region_distribution = create_region_distribution();
let start = Instant::now();
self.table_region_manager
.compare_and_put(i, Some(table_region_value), new_region_distribution)
.await
.unwrap()
.unwrap();
start.elapsed()
},
self.count,
)
.await;
}
async fn bench_remove(&self) {
let desc = format!("TableRegionBencher: remove {} table regions", self.count);
bench(
&desc,
|i| async move {
assert!(self.table_region_manager.remove(i).await.unwrap().is_some());
},
self.count,
)
.await;
}
}


@@ -16,6 +16,7 @@ use std::sync::Arc;
use async_trait::async_trait;
use clap::Parser;
use client::api::v1::meta::TableRouteValue;
use common_meta::error as MetaError;
use common_meta::helper::{CatalogKey as v1CatalogKey, SchemaKey as v1SchemaKey, TableGlobalValue};
use common_meta::key::catalog_name::{CatalogNameKey, CatalogNameValue};
@@ -23,9 +24,11 @@ use common_meta::key::datanode_table::{DatanodeTableKey, DatanodeTableValue};
use common_meta::key::schema_name::{SchemaNameKey, SchemaNameValue};
use common_meta::key::table_info::{TableInfoKey, TableInfoValue};
use common_meta::key::table_name::{TableNameKey, TableNameValue};
use common_meta::key::table_region::{RegionDistribution, TableRegionKey, TableRegionValue};
use common_meta::key::TableMetaKey;
use common_meta::key::table_region::{TableRegionKey, TableRegionValue};
use common_meta::key::table_route::{NextTableRouteKey, TableRouteValue as NextTableRouteValue};
use common_meta::key::{RegionDistribution, TableMetaKey};
use common_meta::range_stream::PaginationStream;
use common_meta::rpc::router::TableRoute;
use common_meta::rpc::store::{BatchDeleteRequest, BatchPutRequest, PutRequest, RangeRequest};
use common_meta::rpc::KeyValue;
use common_meta::util::get_prefix_end_key;
@@ -34,6 +37,7 @@ use etcd_client::Client;
use futures::TryStreamExt;
use meta_srv::service::store::etcd::EtcdStore;
use meta_srv::service::store::kv::{KvBackendAdapter, KvStoreRef};
use prost::Message;
use snafu::ResultExt;
use crate::cli::{Instance, Tool};
@@ -45,6 +49,15 @@ pub struct UpgradeCommand {
etcd_addr: String,
#[clap(long)]
dryrun: bool,
#[clap(long)]
skip_table_global_keys: bool,
#[clap(long)]
skip_catalog_keys: bool,
#[clap(long)]
skip_schema_keys: bool,
#[clap(long)]
skip_table_route_keys: bool,
}
impl UpgradeCommand {
@@ -57,6 +70,10 @@ impl UpgradeCommand {
let tool = MigrateTableMetadata {
etcd_store: EtcdStore::with_etcd_client(client),
dryrun: self.dryrun,
skip_catalog_keys: self.skip_catalog_keys,
skip_table_global_keys: self.skip_table_global_keys,
skip_schema_keys: self.skip_schema_keys,
skip_table_route_keys: self.skip_table_route_keys,
};
Ok(Instance::Tool(Box::new(tool)))
}
@@ -65,15 +82,32 @@ impl UpgradeCommand {
struct MigrateTableMetadata {
etcd_store: KvStoreRef,
dryrun: bool,
skip_table_global_keys: bool,
skip_catalog_keys: bool,
skip_schema_keys: bool,
skip_table_route_keys: bool,
}
#[async_trait]
impl Tool for MigrateTableMetadata {
// Migrates the database's metadata from 0.3 to 0.4.
async fn do_work(&self) -> Result<()> {
self.migrate_table_global_values().await?;
self.migrate_catalog_keys().await?;
self.migrate_schema_keys().await?;
if !self.skip_table_global_keys {
self.migrate_table_global_values().await?;
}
if !self.skip_catalog_keys {
self.migrate_catalog_keys().await?;
}
if !self.skip_schema_keys {
self.migrate_schema_keys().await?;
}
if !self.skip_table_route_keys {
self.migrate_table_route_keys().await?;
}
Ok(())
}
}
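Operationally, a failed or partial 0.3-to-0.4 migration can now be resumed by skipping key families that already went through. A hedged sketch in terms of this diff's fields (etcd_store construction elided):

// Sketch: resume a migration, skipping families finished in a previous run.
let tool = MigrateTableMetadata {
    etcd_store,
    dryrun: false,
    skip_table_global_keys: true, // migrated previously
    skip_catalog_keys: true,      // migrated previously
    skip_schema_keys: false,
    skip_table_route_keys: false,
};
tool.do_work().await?;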
@@ -81,6 +115,64 @@ impl Tool for MigrateTableMetadata {
const PAGE_SIZE: usize = 1000;
impl MigrateTableMetadata {
async fn migrate_table_route_keys(&self) -> Result<()> {
let key = b"__meta_table_route".to_vec();
let range_end = get_prefix_end_key(&key);
let mut keys = Vec::new();
info!("Start scanning key from: {}", String::from_utf8_lossy(&key));
let mut stream = PaginationStream::new(
KvBackendAdapter::wrap(self.etcd_store.clone()),
RangeRequest::new().with_range(key, range_end),
PAGE_SIZE,
Arc::new(|kv: KeyValue| {
let value =
TableRouteValue::decode(&kv.value[..]).context(MetaError::DecodeProtoSnafu)?;
Ok((kv.key, value))
}),
);
while let Some((key, value)) = stream.try_next().await.context(error::IterStreamSnafu)? {
let table_id = self.migrate_table_route_key(value).await?;
keys.push(key);
keys.push(TableRegionKey::new(table_id).as_raw_key())
}
info!("Total migrated TableRouteKeys: {}", keys.len() / 2);
self.delete_migrated_keys(keys).await;
Ok(())
}
async fn migrate_table_route_key(&self, value: TableRouteValue) -> Result<u32> {
let table_route = TableRoute::try_from_raw(
&value.peers,
value.table_route.expect("expected table_route"),
)
.unwrap();
let new_table_value = NextTableRouteValue::new(table_route.region_routes);
let table_id = table_route.table.id as u32;
let new_key = NextTableRouteKey::new(table_id);
info!("Creating '{new_key}'");
if self.dryrun {
info!("Dryrun: do nothing");
} else {
self.etcd_store
.put(
PutRequest::new()
.with_key(new_key.as_raw_key())
.with_value(new_table_value.try_as_raw_value().unwrap()),
)
.await
.unwrap();
}
Ok(table_id)
}
async fn migrate_schema_keys(&self) -> Result<()> {
// The schema key prefix.
let key = b"__s".to_vec();
@@ -113,7 +205,7 @@ impl MigrateTableMetadata {
async fn migrate_schema_key(&self, key: &v1SchemaKey) -> Result<()> {
let new_key = SchemaNameKey::new(&key.catalog_name, &key.schema_name);
let schema_name_value = SchemaNameValue;
let schema_name_value = SchemaNameValue::default();
info!("Creating '{new_key}'");
@@ -220,7 +312,7 @@ impl MigrateTableMetadata {
async fn delete_migrated_keys(&self, keys: Vec<Vec<u8>>) {
for keys in keys.chunks(PAGE_SIZE) {
info!("Deleting {} TableGlobalKeys", keys.len());
info!("Deleting {} keys", keys.len());
let req = BatchDeleteRequest {
keys: keys.to_vec(),
prev_kv: false,


@@ -229,7 +229,6 @@ mod tests {
[storage.manifest]
checkpoint_margin = 9
gc_duration = '7s'
checkpoint_on_startup = true
compress = true
[logging]
@@ -289,7 +288,6 @@ mod tests {
RegionManifestConfig {
checkpoint_margin: Some(9),
gc_duration: Some(Duration::from_secs(7)),
checkpoint_on_startup: true,
compress: true
},
options.storage.manifest,
@@ -383,9 +381,6 @@ mod tests {
max_files_in_level0 = 7
max_purge_tasks = 32
[storage.manifest]
checkpoint_on_startup = true
[logging]
level = "debug"
dir = "/tmp/greptimedb/test/logs"


@@ -80,7 +80,7 @@ pub enum Error {
#[snafu(display("Illegal auth config: {}", source))]
IllegalAuthConfig {
location: Location,
source: servers::auth::Error,
source: auth::error::Error,
},
#[snafu(display("Unsupported selector type, {} source: {}", selector_type, source))]


@@ -14,16 +14,16 @@
use std::sync::Arc;
use auth::UserProviderRef;
use clap::Parser;
use common_base::Plugins;
use common_telemetry::logging;
use frontend::frontend::FrontendOptions;
use frontend::instance::{FrontendInstance, Instance as FeInstance};
use frontend::service_config::{InfluxdbOptions, PrometheusOptions};
use frontend::service_config::InfluxdbOptions;
use meta_client::MetaClientOptions;
use servers::auth::UserProviderRef;
use servers::tls::{TlsMode, TlsOption};
use servers::{auth, Mode};
use servers::Mode;
use snafu::ResultExt;
use crate::error::{self, IllegalAuthConfigSnafu, Result, StartCatalogManagerSnafu};
@@ -99,8 +99,6 @@ pub struct StartCommand {
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
prom_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
#[clap(long)]
opentsdb_addr: Option<String>,
@@ -171,10 +169,6 @@ impl StartCommand {
}
}
if let Some(addr) = &self.prom_addr {
opts.prometheus_options = Some(PrometheusOptions { addr: addr.clone() });
}
if let Some(addr) = &self.postgres_addr {
if let Some(postgres_opts) = &mut opts.postgres_options {
postgres_opts.addr = addr.clone();
@@ -236,10 +230,10 @@ mod tests {
use std::io::Write;
use std::time::Duration;
use auth::{Identity, Password, UserProviderRef};
use common_base::readable_size::ReadableSize;
use common_test_util::temp_dir::create_named_temp_file;
use frontend::service_config::GrpcOptions;
use servers::auth::{Identity, Password, UserProviderRef};
use super::*;
use crate::options::ENV_VAR_SEP;
@@ -248,7 +242,6 @@ mod tests {
fn test_try_from_start_command() {
let command = StartCommand {
http_addr: Some("127.0.0.1:1234".to_string()),
prom_addr: Some("127.0.0.1:4444".to_string()),
mysql_addr: Some("127.0.0.1:5678".to_string()),
postgres_addr: Some("127.0.0.1:5432".to_string()),
opentsdb_addr: Some("127.0.0.1:4321".to_string()),
@@ -276,10 +269,6 @@ mod tests {
opts.opentsdb_options.as_ref().unwrap().addr,
"127.0.0.1:4321"
);
assert_eq!(
opts.prometheus_options.as_ref().unwrap().addr,
"127.0.0.1:4444"
);
let default_opts = FrontendOptions::default();
assert_eq!(


@@ -91,9 +91,9 @@ struct StartCommand {
#[clap(short, long)]
selector: Option<String>,
#[clap(long)]
use_memory_store: bool,
use_memory_store: Option<bool>,
#[clap(long)]
disable_region_failover: bool,
disable_region_failover: Option<bool>,
#[clap(long)]
http_addr: Option<String>,
#[clap(long)]
@@ -136,9 +136,13 @@ impl StartCommand {
.context(error::UnsupportedSelectorTypeSnafu { selector_type })?;
}
opts.use_memory_store = self.use_memory_store;
if let Some(use_memory_store) = self.use_memory_store {
opts.use_memory_store = use_memory_store;
}
opts.disable_region_failover = self.disable_region_failover;
if let Some(disable_region_failover) = self.disable_region_failover {
opts.disable_region_failover = disable_region_failover;
}
if let Some(http_addr) = &self.http_addr {
opts.http_opts.addr = http_addr.clone();
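The switch from bool to Option<bool> above distinguishes "flag not given" (None, keep the config-file value) from an explicit override; a bare bool flag would always stamp false over opts. A minimal sketch of the pattern (MetaSrvOptions stands in for the real options struct):

// Sketch: Option<bool> lets the CLI override the config only when the flag
// is actually passed, e.g. `--use-memory-store true`.
#[derive(Parser)]
struct Demo {
    #[clap(long)]
    use_memory_store: Option<bool>,
}

fn apply(cli: &Demo, opts: &mut MetaSrvOptions) {
    if let Some(v) = cli.use_memory_store {
        opts.use_memory_store = v; // untouched when the flag is absent
    }
}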


@@ -201,17 +201,6 @@ mod tests {
.join(ENV_VAR_SEP),
Some("42s"),
),
(
// storage.manifest.checkpoint_on_startup = true
[
env_prefix.to_string(),
"storage".to_uppercase(),
"manifest".to_uppercase(),
"checkpoint_on_startup".to_uppercase(),
]
.join(ENV_VAR_SEP),
Some("true"),
),
(
// wal.dir = /other/wal/dir
[
@@ -253,7 +242,6 @@ mod tests {
opts.storage.manifest.gc_duration,
Some(Duration::from_secs(42))
);
assert!(opts.storage.manifest.checkpoint_on_startup);
assert_eq!(
opts.meta_client_options.unwrap().metasrv_addrs,
vec![


@@ -24,7 +24,6 @@ use frontend::frontend::FrontendOptions;
use frontend::instance::{FrontendInstance, Instance as FeInstance};
use frontend::service_config::{
GrpcOptions, InfluxdbOptions, MysqlOptions, OpentsdbOptions, PostgresOptions, PromStoreOptions,
PrometheusOptions,
};
use serde::{Deserialize, Serialize};
use servers::http::HttpOptions;
@@ -91,7 +90,6 @@ pub struct StandaloneOptions {
pub opentsdb_options: Option<OpentsdbOptions>,
pub influxdb_options: Option<InfluxdbOptions>,
pub prom_store_options: Option<PromStoreOptions>,
pub prometheus_options: Option<PrometheusOptions>,
pub wal: WalConfig,
pub storage: StorageConfig,
pub procedure: ProcedureConfig,
@@ -111,7 +109,6 @@ impl Default for StandaloneOptions {
opentsdb_options: Some(OpentsdbOptions::default()),
influxdb_options: Some(InfluxdbOptions::default()),
prom_store_options: Some(PromStoreOptions::default()),
prometheus_options: Some(PrometheusOptions::default()),
wal: WalConfig::default(),
storage: StorageConfig::default(),
procedure: ProcedureConfig::default(),
@@ -131,7 +128,6 @@ impl StandaloneOptions {
opentsdb_options: self.opentsdb_options,
influxdb_options: self.influxdb_options,
prom_store_options: self.prom_store_options,
prometheus_options: self.prometheus_options,
meta_client_options: None,
logging: self.logging,
..Default::default()
@@ -193,8 +189,6 @@ struct StartCommand {
#[clap(long)]
mysql_addr: Option<String>,
#[clap(long)]
prom_addr: Option<String>,
#[clap(long)]
postgres_addr: Option<String>,
#[clap(long)]
opentsdb_addr: Option<String>,
@@ -271,10 +265,6 @@ impl StartCommand {
}
}
if let Some(addr) = &self.prom_addr {
opts.prometheus_options = Some(PrometheusOptions { addr: addr.clone() })
}
if let Some(addr) = &self.postgres_addr {
if let Some(postgres_opts) = &mut opts.postgres_options {
postgres_opts.addr = addr.clone();
@@ -345,9 +335,9 @@ mod tests {
use std::io::Write;
use std::time::Duration;
use auth::{Identity, Password, UserProviderRef};
use common_base::readable_size::ReadableSize;
use common_test_util::temp_dir::create_named_temp_file;
use servers::auth::{Identity, Password, UserProviderRef};
use servers::Mode;
use super::*;
@@ -408,7 +398,6 @@ mod tests {
[storage.manifest]
checkpoint_margin = 9
gc_duration = '7s'
checkpoint_on_startup = true
[http_options]
addr = "127.0.0.1:4000"


@@ -35,8 +35,14 @@ pub const INFORMATION_SCHEMA_TABLES_TABLE_ID: u32 = 3;
pub const INFORMATION_SCHEMA_COLUMNS_TABLE_ID: u32 = 4;
pub const MITO_ENGINE: &str = "mito";
pub const MITO2_ENGINE: &str = "mito2";
pub fn default_engine() -> &'static str {
MITO_ENGINE
}
pub const IMMUTABLE_FILE_ENGINE: &str = "file";
pub const SEMANTIC_TYPE_PRIMARY_KEY: &str = "PRIMARY KEY";
pub const SEMANTIC_TYPE_PRIMARY_KEY: &str = "TAG";
pub const SEMANTIC_TYPE_FIELD: &str = "FIELD";
pub const SEMANTIC_TYPE_TIME_INDEX: &str = "TIME INDEX";
pub const SEMANTIC_TYPE_TIME_INDEX: &str = "TIMESTAMP";


@@ -27,7 +27,7 @@ orc-rust = "0.2"
paste = "1.0"
regex = "1.7"
snafu.workspace = true
strum = { version = "0.21", features = ["derive"] }
strum.workspace = true
tokio-util.workspace = true
tokio.workspace = true
url = "2.3"


@@ -6,4 +6,4 @@ license.workspace = true
[dependencies]
snafu = { version = "0.7", features = ["backtraces"] }
strum = { version = "0.24", features = ["std", "derive"] }
strum.workspace = true


@@ -17,7 +17,7 @@ pub mod format;
pub mod mock;
pub mod status_code;
pub const INNER_ERROR_CODE: &str = "INNER_ERROR_CODE";
pub const INNER_ERROR_MSG: &str = "INNER_ERROR_MSG";
pub const GREPTIME_ERROR_CODE: &str = "x-greptime-err-code";
pub const GREPTIME_ERROR_MSG: &str = "x-greptime-err-msg";
pub use snafu;


@@ -84,6 +84,8 @@ pub enum StatusCode {
InvalidAuthHeader = 7004,
/// Illegal request to connect catalog-schema
AccessDenied = 7005,
/// User is not authorized to perform the operation
PermissionDenied = 7006,
// ====== End of auth related status code =====
}
@@ -120,7 +122,8 @@ impl StatusCode {
| StatusCode::UserPasswordMismatch
| StatusCode::AuthHeaderNotFound
| StatusCode::InvalidAuthHeader
| StatusCode::AccessDenied => false,
| StatusCode::AccessDenied
| StatusCode::PermissionDenied => false,
}
}
@@ -151,7 +154,8 @@ impl StatusCode {
| StatusCode::UserPasswordMismatch
| StatusCode::AuthHeaderNotFound
| StatusCode::InvalidAuthHeader
| StatusCode::AccessDenied => false,
| StatusCode::AccessDenied
| StatusCode::PermissionDenied => false,
}
}
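PermissionDenied is classified alongside the other auth failures: a client-side error that is neither retryable nor worth a server-side error log. The two match blocks above belong to StatusCode's classification helpers, whose names the hunk headers truncate; is_retryable and should_log_error are assumed here:

// Sketch, under the assumed helper names:
assert!(!StatusCode::PermissionDenied.is_retryable());
assert!(!StatusCode::PermissionDenied.should_log_error());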


@@ -50,10 +50,11 @@ impl GreptimeDBTelemetryTask {
}
pub fn start(&self, runtime: Runtime) -> Result<()> {
print_anonymous_usage_data_disclaimer();
match self {
GreptimeDBTelemetryTask::Enable(task) => task.start(runtime),
GreptimeDBTelemetryTask::Enable(task) => {
print_anonymous_usage_data_disclaimer();
task.start(runtime)
}
GreptimeDBTelemetryTask::Disable => Ok(()),
}
}
@@ -412,6 +413,6 @@ mod tests {
let uuid = default_get_uuid(&Some(working_home.clone()));
assert!(uuid.is_some());
assert_eq!(uuid, default_get_uuid(&Some(working_home.clone())));
assert_eq!(uuid, default_get_uuid(&Some(working_home.clone())));
assert_eq!(uuid, default_get_uuid(&Some(working_home)));
}
}


@@ -23,7 +23,11 @@ use table::requests::DeleteRequest;
use crate::error::{ColumnDataTypeSnafu, IllegalDeleteRequestSnafu, Result};
use crate::insert::add_values_to_builder;
pub fn to_table_delete_request(request: GrpcDeleteRequest) -> Result<DeleteRequest> {
pub fn to_table_delete_request(
catalog_name: &str,
schema_name: &str,
request: GrpcDeleteRequest,
) -> Result<DeleteRequest> {
let row_count = request.row_count as usize;
let mut key_column_values = HashMap::with_capacity(request.key_columns.len());
@@ -52,7 +56,12 @@ pub fn to_table_delete_request(request: GrpcDeleteRequest) -> Result<DeleteReque
);
}
Ok(DeleteRequest { key_column_values })
Ok(DeleteRequest {
catalog_name: catalog_name.to_string(),
schema_name: schema_name.to_string(),
table_name: request.table_name,
key_column_values,
})
}
#[cfg(test)]
@@ -94,8 +103,12 @@ mod tests {
row_count: 3,
};
let mut request = to_table_delete_request(grpc_request).unwrap();
let mut request =
to_table_delete_request("foo_catalog", "foo_schema", grpc_request).unwrap();
assert_eq!(request.catalog_name, "foo_catalog");
assert_eq!(request.schema_name, "foo_schema");
assert_eq!(request.table_name, "foo");
assert_eq!(
Arc::new(Int32Vector::from_slice(vec![1, 2, 3])) as VectorRef,
request.key_column_values.remove("id").unwrap()


@@ -47,6 +47,9 @@ pub enum Error {
location: Location,
},
#[snafu(display("Duplicated column name in gRPC requests, name: {}", name,))]
DuplicatedColumnName { name: String, location: Location },
#[snafu(display("Missing timestamp column, msg: {}", msg))]
MissingTimestampColumn { msg: String, location: Location },
@@ -101,9 +104,9 @@ impl ErrorExt for Error {
Error::IllegalDeleteRequest { .. } => StatusCode::InvalidArguments,
Error::ColumnDataType { .. } => StatusCode::Internal,
Error::DuplicatedTimestampColumn { .. } | Error::MissingTimestampColumn { .. } => {
StatusCode::InvalidArguments
}
Error::DuplicatedTimestampColumn { .. }
| Error::DuplicatedColumnName { .. }
| Error::MissingTimestampColumn { .. } => StatusCode::InvalidArguments,
Error::InvalidColumnProto { .. } => StatusCode::InvalidArguments,
Error::CreateVector { .. } => StatusCode::InvalidArguments,
Error::MissingField { .. } => StatusCode::InvalidArguments,


@@ -12,233 +12,31 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::{HashMap, HashSet};
use std::sync::Arc;
use std::collections::HashMap;
use api::helper;
use api::helper::ColumnDataTypeWrapper;
use api::v1::column::Values;
use api::v1::{
AddColumn, AddColumns, Column, ColumnDataType, ColumnDef, CreateTableExpr,
InsertRequest as GrpcInsertRequest, SemanticType,
};
use api::v1::{AddColumns, Column, CreateTableExpr, InsertRequest as GrpcInsertRequest};
use common_base::BitVec;
use common_time::time::Time;
use common_time::timestamp::Timestamp;
use common_time::{Date, DateTime, Interval};
use datatypes::data_type::{ConcreteDataType, DataType};
use datatypes::prelude::{ValueRef, VectorRef};
use datatypes::scalars::ScalarVector;
use datatypes::prelude::VectorRef;
use datatypes::schema::SchemaRef;
use datatypes::types::{
Int16Type, Int8Type, IntervalType, TimeType, TimestampType, UInt16Type, UInt8Type,
};
use datatypes::value::Value;
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
Int32Vector, Int64Vector, IntervalDayTimeVector, IntervalMonthDayNanoVector,
IntervalYearMonthVector, PrimitiveVector, StringVector, TimeMicrosecondVector,
TimeMillisecondVector, TimeNanosecondVector, TimeSecondVector, TimestampMicrosecondVector,
TimestampMillisecondVector, TimestampNanosecondVector, TimestampSecondVector, UInt32Vector,
UInt64Vector,
};
use snafu::{ensure, OptionExt, ResultExt};
use snafu::{ensure, ResultExt};
use table::engine::TableReference;
use table::metadata::TableId;
use table::requests::InsertRequest;
use crate::error::{
ColumnAlreadyExistsSnafu, ColumnDataTypeSnafu, CreateVectorSnafu,
DuplicatedTimestampColumnSnafu, InvalidColumnProtoSnafu, MissingTimestampColumnSnafu, Result,
ColumnAlreadyExistsSnafu, ColumnDataTypeSnafu, CreateVectorSnafu, Result,
UnexpectedValuesLengthSnafu,
};
const TAG_SEMANTIC_TYPE: i32 = SemanticType::Tag as i32;
const TIMESTAMP_SEMANTIC_TYPE: i32 = SemanticType::Timestamp as i32;
#[inline]
fn build_column_def(column_name: &str, datatype: i32, nullable: bool) -> ColumnDef {
ColumnDef {
name: column_name.to_string(),
datatype,
is_nullable: nullable,
default_constraint: vec![],
}
}
use crate::util;
use crate::util::ColumnExpr;
pub fn find_new_columns(schema: &SchemaRef, columns: &[Column]) -> Result<Option<AddColumns>> {
let mut columns_to_add = Vec::default();
let mut new_columns: HashSet<String> = HashSet::default();
for Column {
column_name,
semantic_type,
datatype,
..
} in columns
{
if schema.column_schema_by_name(column_name).is_none() && !new_columns.contains(column_name)
{
let column_def = Some(build_column_def(column_name, *datatype, true));
columns_to_add.push(AddColumn {
column_def,
is_key: *semantic_type == TAG_SEMANTIC_TYPE,
location: None,
});
let _ = new_columns.insert(column_name.to_string());
}
}
if columns_to_add.is_empty() {
Ok(None)
} else {
Ok(Some(AddColumns {
add_columns: columns_to_add,
}))
}
}
pub fn column_to_vector(column: &Column, rows: u32) -> Result<VectorRef> {
let wrapper = ColumnDataTypeWrapper::try_new(column.datatype).context(ColumnDataTypeSnafu)?;
let column_datatype = wrapper.datatype();
let rows = rows as usize;
let mut vector = ConcreteDataType::from(wrapper).create_mutable_vector(rows);
if let Some(values) = &column.values {
let values = collect_column_values(column_datatype, values);
let mut values_iter = values.into_iter();
let null_mask = BitVec::from_slice(&column.null_mask);
let mut nulls_iter = null_mask.iter().by_vals().fuse();
for i in 0..rows {
if let Some(true) = nulls_iter.next() {
vector.push_null();
} else {
let value_ref = values_iter
.next()
.with_context(|| InvalidColumnProtoSnafu {
err_msg: format!(
"value not found at position {} of column {}",
i, &column.column_name
),
})?;
vector
.try_push_value_ref(value_ref)
.context(CreateVectorSnafu)?;
}
}
} else {
(0..rows).for_each(|_| vector.push_null());
}
Ok(vector.to_vector())
}
fn collect_column_values(column_datatype: ColumnDataType, values: &Values) -> Vec<ValueRef> {
macro_rules! collect_values {
($value: expr, $mapper: expr) => {
$value.iter().map($mapper).collect::<Vec<ValueRef>>()
};
}
match column_datatype {
ColumnDataType::Boolean => collect_values!(values.bool_values, |v| ValueRef::from(*v)),
ColumnDataType::Int8 => collect_values!(values.i8_values, |v| ValueRef::from(*v as i8)),
ColumnDataType::Int16 => {
collect_values!(values.i16_values, |v| ValueRef::from(*v as i16))
}
ColumnDataType::Int32 => {
collect_values!(values.i32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Int64 => {
collect_values!(values.i64_values, |v| ValueRef::from(*v))
}
ColumnDataType::Uint8 => {
collect_values!(values.u8_values, |v| ValueRef::from(*v as u8))
}
ColumnDataType::Uint16 => {
collect_values!(values.u16_values, |v| ValueRef::from(*v as u16))
}
ColumnDataType::Uint32 => {
collect_values!(values.u32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Uint64 => {
collect_values!(values.u64_values, |v| ValueRef::from(*v))
}
ColumnDataType::Float32 => collect_values!(values.f32_values, |v| ValueRef::from(*v)),
ColumnDataType::Float64 => collect_values!(values.f64_values, |v| ValueRef::from(*v)),
ColumnDataType::Binary => {
collect_values!(values.binary_values, |v| ValueRef::from(v.as_slice()))
}
ColumnDataType::String => {
collect_values!(values.string_values, |v| ValueRef::from(v.as_str()))
}
ColumnDataType::Date => {
collect_values!(values.date_values, |v| ValueRef::Date(Date::new(*v)))
}
ColumnDataType::Datetime => {
collect_values!(values.datetime_values, |v| ValueRef::DateTime(
DateTime::new(*v)
))
}
ColumnDataType::TimestampSecond => {
collect_values!(values.ts_second_values, |v| ValueRef::Timestamp(
Timestamp::new_second(*v)
))
}
ColumnDataType::TimestampMillisecond => {
collect_values!(values.ts_millisecond_values, |v| ValueRef::Timestamp(
Timestamp::new_millisecond(*v)
))
}
ColumnDataType::TimestampMicrosecond => {
collect_values!(values.ts_millisecond_values, |v| ValueRef::Timestamp(
Timestamp::new_microsecond(*v)
))
}
ColumnDataType::TimestampNanosecond => {
collect_values!(values.ts_millisecond_values, |v| ValueRef::Timestamp(
Timestamp::new_nanosecond(*v)
))
}
ColumnDataType::TimeSecond => {
collect_values!(values.time_second_values, |v| ValueRef::Time(
Time::new_second(*v)
))
}
ColumnDataType::TimeMillisecond => {
collect_values!(values.time_millisecond_values, |v| ValueRef::Time(
Time::new_millisecond(*v)
))
}
ColumnDataType::TimeMicrosecond => {
collect_values!(values.time_millisecond_values, |v| ValueRef::Time(
Time::new_microsecond(*v)
))
}
ColumnDataType::TimeNanosecond => {
collect_values!(values.time_millisecond_values, |v| ValueRef::Time(
Time::new_nanosecond(*v)
))
}
ColumnDataType::IntervalYearMonth => {
collect_values!(values.interval_year_month_values, |v| {
ValueRef::Interval(Interval::from_i32(*v))
})
}
ColumnDataType::IntervalDayTime => {
collect_values!(values.interval_day_time_values, |v| {
ValueRef::Interval(Interval::from_i64(*v))
})
}
ColumnDataType::IntervalMonthDayNano => {
collect_values!(values.interval_month_day_nano_values, |v| {
ValueRef::Interval(Interval::from_month_day_nano(
v.months,
v.days,
v.nanoseconds,
))
})
}
}
let column_exprs = ColumnExpr::from_columns(columns);
util::extract_new_columns(schema, column_exprs)
}
/// Try to build create table request from insert data.
@@ -250,70 +48,15 @@ pub fn build_create_expr_from_insertion(
columns: &[Column],
engine: &str,
) -> Result<CreateTableExpr> {
let mut new_columns: HashSet<String> = HashSet::default();
let mut column_defs = Vec::default();
let mut primary_key_indices = Vec::default();
let mut timestamp_index = usize::MAX;
for Column {
column_name,
semantic_type,
datatype,
..
} in columns
{
if !new_columns.contains(column_name) {
let mut is_nullable = true;
match *semantic_type {
TAG_SEMANTIC_TYPE => primary_key_indices.push(column_defs.len()),
TIMESTAMP_SEMANTIC_TYPE => {
ensure!(
timestamp_index == usize::MAX,
DuplicatedTimestampColumnSnafu {
exists: &columns[timestamp_index].column_name,
duplicated: column_name,
}
);
timestamp_index = column_defs.len();
// Timestamp column must not be null.
is_nullable = false;
}
_ => {}
}
let column_def = build_column_def(column_name, *datatype, is_nullable);
column_defs.push(column_def);
let _ = new_columns.insert(column_name.to_string());
}
}
ensure!(
timestamp_index != usize::MAX,
MissingTimestampColumnSnafu { msg: table_name }
);
let timestamp_field_name = columns[timestamp_index].column_name.clone();
let primary_keys = primary_key_indices
.iter()
.map(|idx| columns[*idx].column_name.clone())
.collect::<Vec<_>>();
let expr = CreateTableExpr {
catalog_name: catalog_name.to_string(),
schema_name: schema_name.to_string(),
table_name: table_name.to_string(),
desc: "Created on insertion".to_string(),
column_defs,
time_index: timestamp_field_name,
primary_keys,
create_if_not_exists: true,
table_options: Default::default(),
table_id: table_id.map(|id| api::v1::TableId { id }),
region_numbers: vec![0], // TODO:(hl): region number should be allocated by frontend
engine: engine.to_string(),
};
Ok(expr)
let table_name = TableReference::full(catalog_name, schema_name, table_name);
let column_exprs = ColumnExpr::from_columns(columns);
util::build_create_table_expr(
table_id,
&table_name,
column_exprs,
engine,
"Created on insertion",
)
}
pub fn to_table_insert_request(
@@ -364,10 +107,10 @@ pub(crate) fn add_values_to_builder(
null_mask: Vec<u8>,
) -> Result<VectorRef> {
if null_mask.is_empty() {
Ok(values_to_vector(&data_type, values))
Ok(helper::pb_values_to_vector_ref(&data_type, values))
} else {
let builder = &mut data_type.create_mutable_vector(row_count);
let values = convert_values(&data_type, values);
let values = helper::pb_values_to_values(&data_type, values);
let null_mask = BitVec::from_vec(null_mask);
ensure!(
null_mask.count_ones() + values.len() == row_count,
@@ -392,231 +135,6 @@ pub(crate) fn add_values_to_builder(
}
}
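The ensure! above encodes the wire invariant: set bits in the null mask plus concrete values must equal the row count, since every row is either a null (one mask bit) or exactly one value. A tiny worked sketch with the same BitVec API the function uses (Lsb0 bit order assumed):

// Sketch: 5 rows, values [1, 2, 3], nulls at row indices 1 and 3.
let null_mask = BitVec::from_vec(vec![0b0000_1010u8]);
let values_len = 3;
assert_eq!(null_mask.count_ones() + values_len, 5); // 2 nulls + 3 values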
fn values_to_vector(data_type: &ConcreteDataType, values: Values) -> VectorRef {
match data_type {
ConcreteDataType::Boolean(_) => Arc::new(BooleanVector::from(values.bool_values)),
ConcreteDataType::Int8(_) => Arc::new(PrimitiveVector::<Int8Type>::from_iter_values(
values.i8_values.into_iter().map(|x| x as i8),
)),
ConcreteDataType::Int16(_) => Arc::new(PrimitiveVector::<Int16Type>::from_iter_values(
values.i16_values.into_iter().map(|x| x as i16),
)),
ConcreteDataType::Int32(_) => Arc::new(Int32Vector::from_vec(values.i32_values)),
ConcreteDataType::Int64(_) => Arc::new(Int64Vector::from_vec(values.i64_values)),
ConcreteDataType::UInt8(_) => Arc::new(PrimitiveVector::<UInt8Type>::from_iter_values(
values.u8_values.into_iter().map(|x| x as u8),
)),
ConcreteDataType::UInt16(_) => Arc::new(PrimitiveVector::<UInt16Type>::from_iter_values(
values.u16_values.into_iter().map(|x| x as u16),
)),
ConcreteDataType::UInt32(_) => Arc::new(UInt32Vector::from_vec(values.u32_values)),
ConcreteDataType::UInt64(_) => Arc::new(UInt64Vector::from_vec(values.u64_values)),
ConcreteDataType::Float32(_) => Arc::new(Float32Vector::from_vec(values.f32_values)),
ConcreteDataType::Float64(_) => Arc::new(Float64Vector::from_vec(values.f64_values)),
ConcreteDataType::Binary(_) => Arc::new(BinaryVector::from(values.binary_values)),
ConcreteDataType::String(_) => Arc::new(StringVector::from_vec(values.string_values)),
ConcreteDataType::Date(_) => Arc::new(DateVector::from_vec(values.date_values)),
ConcreteDataType::DateTime(_) => Arc::new(DateTimeVector::from_vec(values.datetime_values)),
ConcreteDataType::Timestamp(unit) => match unit {
TimestampType::Second(_) => {
Arc::new(TimestampSecondVector::from_vec(values.ts_second_values))
}
TimestampType::Millisecond(_) => Arc::new(TimestampMillisecondVector::from_vec(
values.ts_millisecond_values,
)),
TimestampType::Microsecond(_) => Arc::new(TimestampMicrosecondVector::from_vec(
values.ts_microsecond_values,
)),
TimestampType::Nanosecond(_) => Arc::new(TimestampNanosecondVector::from_vec(
values.ts_nanosecond_values,
)),
},
ConcreteDataType::Time(unit) => match unit {
TimeType::Second(_) => Arc::new(TimeSecondVector::from_iter_values(
values.time_second_values.iter().map(|x| *x as i32),
)),
TimeType::Millisecond(_) => Arc::new(TimeMillisecondVector::from_iter_values(
values.time_millisecond_values.iter().map(|x| *x as i32),
)),
TimeType::Microsecond(_) => Arc::new(TimeMicrosecondVector::from_vec(
values.time_microsecond_values,
)),
TimeType::Nanosecond(_) => Arc::new(TimeNanosecondVector::from_vec(
values.time_nanosecond_values,
)),
},
ConcreteDataType::Interval(unit) => match unit {
IntervalType::YearMonth(_) => Arc::new(IntervalYearMonthVector::from_vec(
values.interval_year_month_values,
)),
IntervalType::DayTime(_) => Arc::new(IntervalDayTimeVector::from_vec(
values.interval_day_time_values,
)),
IntervalType::MonthDayNano(_) => {
Arc::new(IntervalMonthDayNanoVector::from_iter_values(
values.interval_month_day_nano_values.iter().map(|x| {
Interval::from_month_day_nano(x.months, x.days, x.nanoseconds).to_i128()
}),
))
}
},
ConcreteDataType::Null(_) | ConcreteDataType::List(_) | ConcreteDataType::Dictionary(_) => {
unreachable!()
}
}
}
fn convert_values(data_type: &ConcreteDataType, values: Values) -> Vec<Value> {
// TODO(fys): use macros to optimize code
match data_type {
ConcreteDataType::Int64(_) => values
.i64_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Float64(_) => values
.f64_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::String(_) => values
.string_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Boolean(_) => values
.bool_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Int8(_) => values
.i8_values
.into_iter()
// Safety: the i32 here only stores i8 data, so casting to i8 is safe.
.map(|val| (val as i8).into())
.collect(),
ConcreteDataType::Int16(_) => values
.i16_values
.into_iter()
// Safety: the i32 here only stores i16 data, so casting to i16 is safe.
.map(|val| (val as i16).into())
.collect(),
ConcreteDataType::Int32(_) => values
.i32_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::UInt8(_) => values
.u8_values
.into_iter()
// Safety: the i32 here only stores u8 data, so casting to u8 is safe.
.map(|val| (val as u8).into())
.collect(),
ConcreteDataType::UInt16(_) => values
.u16_values
.into_iter()
// Safety: the i32 here only stores u16 data, so casting to u16 is safe.
.map(|val| (val as u16).into())
.collect(),
ConcreteDataType::UInt32(_) => values
.u32_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::UInt64(_) => values
.u64_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Float32(_) => values
.f32_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::Binary(_) => values
.binary_values
.into_iter()
.map(|val| val.into())
.collect(),
ConcreteDataType::DateTime(_) => values
.datetime_values
.into_iter()
.map(|v| Value::DateTime(v.into()))
.collect(),
ConcreteDataType::Date(_) => values
.date_values
.into_iter()
.map(|v| Value::Date(v.into()))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Second(_)) => values
.ts_second_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_second(v)))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Millisecond(_)) => values
.ts_millisecond_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_millisecond(v)))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Microsecond(_)) => values
.ts_microsecond_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_microsecond(v)))
.collect(),
ConcreteDataType::Timestamp(TimestampType::Nanosecond(_)) => values
.ts_nanosecond_values
.into_iter()
.map(|v| Value::Timestamp(Timestamp::new_nanosecond(v)))
.collect(),
ConcreteDataType::Time(TimeType::Second(_)) => values
.time_second_values
.into_iter()
.map(|v| Value::Time(Time::new_second(v)))
.collect(),
ConcreteDataType::Time(TimeType::Millisecond(_)) => values
.time_millisecond_values
.into_iter()
.map(|v| Value::Time(Time::new_millisecond(v)))
.collect(),
ConcreteDataType::Time(TimeType::Microsecond(_)) => values
.time_microsecond_values
.into_iter()
.map(|v| Value::Time(Time::new_microsecond(v)))
.collect(),
ConcreteDataType::Time(TimeType::Nanosecond(_)) => values
.time_nanosecond_values
.into_iter()
.map(|v| Value::Time(Time::new_nanosecond(v)))
.collect(),
ConcreteDataType::Interval(IntervalType::YearMonth(_)) => values
.interval_year_month_values
.into_iter()
.map(|v| Value::Interval(Interval::from_i32(v)))
.collect(),
ConcreteDataType::Interval(IntervalType::DayTime(_)) => values
.interval_day_time_values
.into_iter()
.map(|v| Value::Interval(Interval::from_i64(v)))
.collect(),
ConcreteDataType::Interval(IntervalType::MonthDayNano(_)) => values
.interval_month_day_nano_values
.into_iter()
.map(|v| {
Value::Interval(Interval::from_month_day_nano(
v.months,
v.days,
v.nanoseconds,
))
})
.collect(),
ConcreteDataType::Null(_) | ConcreteDataType::List(_) | ConcreteDataType::Dictionary(_) => {
unreachable!()
}
}
}
fn is_null(null_mask: &BitVec, idx: usize) -> Option<bool> {
null_mask.get(idx).as_deref().copied()
}
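A minimal sketch of the mask semantics (an illustration, not part of this change; it assumes the Lsb0 bit order that `common_base::BitVec` uses in the tests below): bit `i` of the mask marks row `i` as null, and an out-of-range index yields `None`.
use common_base::BitVec;
fn null_mask_sketch() {
    // 0b0000_0101 marks rows 0 and 2 as null (Lsb0 bit order assumed).
    let mask = BitVec::from_slice(&[0b0000_0101u8]);
    assert_eq!(Some(true), is_null(&mask, 0));
    assert_eq!(Some(false), is_null(&mask, 1));
    assert_eq!(Some(true), is_null(&mask, 2));
    assert_eq!(None, is_null(&mask, 8)); // past the end of the mask
}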
@@ -635,12 +153,7 @@ mod tests {
use common_time::timestamp::{TimeUnit, Timestamp};
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
use datatypes::types::{
IntervalDayTimeType, IntervalMonthDayNanoType, IntervalYearMonthType, TimeMillisecondType,
TimeSecondType, TimeType, TimestampMillisecondType, TimestampSecondType, TimestampType,
};
use datatypes::value::Value;
use paste::paste;
use snafu::ResultExt;
use super::*;
@@ -881,280 +394,6 @@ mod tests {
assert_eq!(Value::Timestamp(Timestamp::new_millisecond(101)), ts.get(1));
}
macro_rules! test_convert_values {
($grpc_data_type: ident, $values: expr, $concrete_data_type: ident, $expected_ret: expr) => {
paste! {
#[test]
fn [<test_convert_ $grpc_data_type _values>]() {
let values = Values {
[<$grpc_data_type _values>]: $values,
..Default::default()
};
let data_type = ConcreteDataType::[<$concrete_data_type _datatype>]();
let result = convert_values(&data_type, values);
assert_eq!(
$expected_ret,
result
);
}
}
};
}
test_convert_values!(
i8,
vec![1_i32, 2, 3],
int8,
vec![Value::Int8(1), Value::Int8(2), Value::Int8(3)]
);
test_convert_values!(
u8,
vec![1_u32, 2, 3],
uint8,
vec![Value::UInt8(1), Value::UInt8(2), Value::UInt8(3)]
);
test_convert_values!(
i16,
vec![1_i32, 2, 3],
int16,
vec![Value::Int16(1), Value::Int16(2), Value::Int16(3)]
);
test_convert_values!(
u16,
vec![1_u32, 2, 3],
uint16,
vec![Value::UInt16(1), Value::UInt16(2), Value::UInt16(3)]
);
test_convert_values!(
i32,
vec![1, 2, 3],
int32,
vec![Value::Int32(1), Value::Int32(2), Value::Int32(3)]
);
test_convert_values!(
u32,
vec![1, 2, 3],
uint32,
vec![Value::UInt32(1), Value::UInt32(2), Value::UInt32(3)]
);
test_convert_values!(
i64,
vec![1, 2, 3],
int64,
vec![Value::Int64(1), Value::Int64(2), Value::Int64(3)]
);
test_convert_values!(
u64,
vec![1, 2, 3],
uint64,
vec![Value::UInt64(1), Value::UInt64(2), Value::UInt64(3)]
);
test_convert_values!(
f32,
vec![1.0, 2.0, 3.0],
float32,
vec![
Value::Float32(1.0.into()),
Value::Float32(2.0.into()),
Value::Float32(3.0.into())
]
);
test_convert_values!(
f64,
vec![1.0, 2.0, 3.0],
float64,
vec![
Value::Float64(1.0.into()),
Value::Float64(2.0.into()),
Value::Float64(3.0.into())
]
);
test_convert_values!(
string,
vec!["1".to_string(), "2".to_string(), "3".to_string()],
string,
vec![
Value::String("1".into()),
Value::String("2".into()),
Value::String("3".into())
]
);
test_convert_values!(
binary,
vec!["1".into(), "2".into(), "3".into()],
binary,
vec![
Value::Binary(b"1".to_vec().into()),
Value::Binary(b"2".to_vec().into()),
Value::Binary(b"3".to_vec().into())
]
);
test_convert_values!(
date,
vec![1, 2, 3],
date,
vec![
Value::Date(1.into()),
Value::Date(2.into()),
Value::Date(3.into())
]
);
test_convert_values!(
datetime,
vec![1.into(), 2.into(), 3.into()],
datetime,
vec![
Value::DateTime(1.into()),
Value::DateTime(2.into()),
Value::DateTime(3.into())
]
);
#[test]
fn test_convert_timestamp_values() {
// second
let actual = convert_values(
&ConcreteDataType::Timestamp(TimestampType::Second(TimestampSecondType)),
Values {
ts_second_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Timestamp(Timestamp::new_second(1_i64)),
Value::Timestamp(Timestamp::new_second(2_i64)),
Value::Timestamp(Timestamp::new_second(3_i64)),
];
assert_eq!(expect, actual);
// millisecond
let actual = convert_values(
&ConcreteDataType::Timestamp(TimestampType::Millisecond(TimestampMillisecondType)),
Values {
ts_millisecond_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Timestamp(Timestamp::new_millisecond(1_i64)),
Value::Timestamp(Timestamp::new_millisecond(2_i64)),
Value::Timestamp(Timestamp::new_millisecond(3_i64)),
];
assert_eq!(expect, actual);
}
#[test]
fn test_convert_time_values() {
// second
let actual = convert_values(
&ConcreteDataType::Time(TimeType::Second(TimeSecondType)),
Values {
time_second_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Time(Time::new_second(1_i64)),
Value::Time(Time::new_second(2_i64)),
Value::Time(Time::new_second(3_i64)),
];
assert_eq!(expect, actual);
// millisecond
let actual = convert_values(
&ConcreteDataType::Time(TimeType::Millisecond(TimeMillisecondType)),
Values {
time_millisecond_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Time(Time::new_millisecond(1_i64)),
Value::Time(Time::new_millisecond(2_i64)),
Value::Time(Time::new_millisecond(3_i64)),
];
assert_eq!(expect, actual);
}
#[test]
fn test_convert_interval_values() {
// year_month
let actual = convert_values(
&ConcreteDataType::Interval(IntervalType::YearMonth(IntervalYearMonthType)),
Values {
interval_year_month_values: vec![1_i32, 2_i32, 3_i32],
..Default::default()
},
);
let expect = vec![
Value::Interval(Interval::from_year_month(1_i32)),
Value::Interval(Interval::from_year_month(2_i32)),
Value::Interval(Interval::from_year_month(3_i32)),
];
assert_eq!(expect, actual);
// day_time
let actual = convert_values(
&ConcreteDataType::Interval(IntervalType::DayTime(IntervalDayTimeType)),
Values {
interval_day_time_values: vec![1_i64, 2_i64, 3_i64],
..Default::default()
},
);
let expect = vec![
Value::Interval(Interval::from_i64(1_i64)),
Value::Interval(Interval::from_i64(2_i64)),
Value::Interval(Interval::from_i64(3_i64)),
];
assert_eq!(expect, actual);
// month_day_nano
let actual = convert_values(
&ConcreteDataType::Interval(IntervalType::MonthDayNano(IntervalMonthDayNanoType)),
Values {
interval_month_day_nano_values: vec![
IntervalMonthDayNano {
months: 1,
days: 2,
nanoseconds: 3,
},
IntervalMonthDayNano {
months: 5,
days: 6,
nanoseconds: 7,
},
IntervalMonthDayNano {
months: 9,
days: 10,
nanoseconds: 11,
},
],
..Default::default()
},
);
let expect = vec![
Value::Interval(Interval::from_month_day_nano(1, 2, 3)),
Value::Interval(Interval::from_month_day_nano(5, 6, 7)),
Value::Interval(Interval::from_month_day_nano(9, 10, 11)),
];
assert_eq!(expect, actual);
}
#[test]
fn test_is_null() {
let null_mask = BitVec::from_slice(&[0b0000_0001, 0b0000_1000]);
@@ -1178,7 +417,7 @@ mod tests {
};
let host_column = Column {
column_name: "host".to_string(),
semantic_type: TAG_SEMANTIC_TYPE,
semantic_type: SemanticType::Tag as i32,
values: Some(host_vals),
null_mask: vec![0],
datatype: ColumnDataType::String as i32,
@@ -1248,7 +487,7 @@ mod tests {
};
let ts_column = Column {
column_name: "ts".to_string(),
semantic_type: TIMESTAMP_SEMANTIC_TYPE,
semantic_type: SemanticType::Timestamp as i32,
values: Some(ts_vals),
null_mask: vec![0],
datatype: ColumnDataType::TimestampMillisecond as i32,


@@ -16,6 +16,7 @@ mod alter;
pub mod delete;
pub mod error;
pub mod insert;
pub mod util;
pub use alter::{alter_expr_to_request, create_expr_to_request, create_table_schema};
pub use insert::{build_create_expr_from_insertion, column_to_vector, find_new_columns};
pub use insert::{build_create_expr_from_insertion, find_new_columns};


@@ -0,0 +1,188 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashSet;
use api::v1::{
AddColumn, AddColumns, Column, ColumnDef, ColumnSchema, CreateTableExpr, SemanticType,
};
use datatypes::schema::Schema;
use snafu::{ensure, OptionExt};
use table::engine::TableReference;
use table::metadata::TableId;
use crate::error::{
DuplicatedColumnNameSnafu, DuplicatedTimestampColumnSnafu, MissingTimestampColumnSnafu, Result,
};
pub struct ColumnExpr<'a> {
pub column_name: &'a str,
pub datatype: i32,
pub semantic_type: i32,
}
impl<'a> ColumnExpr<'a> {
#[inline]
pub fn from_columns(columns: &'a [Column]) -> Vec<Self> {
columns.iter().map(Self::from).collect()
}
#[inline]
pub fn from_column_schemas(schemas: &'a [ColumnSchema]) -> Vec<Self> {
schemas.iter().map(Self::from).collect()
}
}
impl<'a> From<&'a Column> for ColumnExpr<'a> {
fn from(column: &'a Column) -> Self {
Self {
column_name: &column.column_name,
datatype: column.datatype,
semantic_type: column.semantic_type,
}
}
}
impl<'a> From<&'a ColumnSchema> for ColumnExpr<'a> {
fn from(schema: &'a ColumnSchema) -> Self {
Self {
column_name: &schema.column_name,
datatype: schema.datatype,
semantic_type: schema.semantic_type,
}
}
}
pub fn build_create_table_expr(
table_id: Option<TableId>,
table_name: &TableReference<'_>,
column_exprs: Vec<ColumnExpr>,
engine: &str,
desc: &str,
) -> Result<CreateTableExpr> {
// Check for duplicate column names and raise an error if any are found.
//
// The HashSet costs extra memory but gives O(1) membership checks.
//
// Iterating over `column_exprs` in a separate pass keeps the loop small
// (which CPUs prefer) and avoids cloning the name Strings.
let mut distinct_names = HashSet::with_capacity(column_exprs.len());
for ColumnExpr { column_name, .. } in &column_exprs {
ensure!(
distinct_names.insert(*column_name),
DuplicatedColumnNameSnafu { name: *column_name }
);
}
let mut column_defs = Vec::with_capacity(column_exprs.len());
let mut primary_keys = Vec::default();
let mut time_index = None;
for ColumnExpr {
column_name,
datatype,
semantic_type,
} in column_exprs
{
let mut is_nullable = true;
match semantic_type {
v if v == SemanticType::Tag as i32 => primary_keys.push(column_name.to_string()),
v if v == SemanticType::Timestamp as i32 => {
ensure!(
time_index.is_none(),
DuplicatedTimestampColumnSnafu {
exists: time_index.unwrap(),
duplicated: column_name,
}
);
time_index = Some(column_name.to_string());
// The timestamp column must not be nullable.
is_nullable = false;
}
_ => {}
}
let column_def = ColumnDef {
name: column_name.to_string(),
datatype,
is_nullable,
default_constraint: vec![],
};
column_defs.push(column_def);
}
let time_index = time_index.context(MissingTimestampColumnSnafu {
msg: format!("table is {}", table_name.table),
})?;
let expr = CreateTableExpr {
catalog_name: table_name.catalog.to_string(),
schema_name: table_name.schema.to_string(),
table_name: table_name.table.to_string(),
desc: desc.to_string(),
column_defs,
time_index,
primary_keys,
create_if_not_exists: true,
table_options: Default::default(),
table_id: table_id.map(|id| api::v1::TableId { id }),
// TODO(hl): region number should be allocated by frontend
region_numbers: vec![0],
engine: engine.to_string(),
};
Ok(expr)
}
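A usage sketch follows; the module paths, the "mito" engine name, and the bindings are illustrative assumptions, not part of this change. Tag columns land in `primary_keys`, and the timestamp column becomes the non-nullable time index.
use api::v1::{ColumnDataType, SemanticType};
use table::engine::TableReference;
fn create_expr_sketch() -> Result<()> {
    let table_ref = TableReference::full("greptime", "public", "metrics");
    let columns = vec![
        ColumnExpr {
            column_name: "host",
            datatype: ColumnDataType::String as i32,
            semantic_type: SemanticType::Tag as i32,
        },
        ColumnExpr {
            column_name: "ts",
            datatype: ColumnDataType::TimestampMillisecond as i32,
            semantic_type: SemanticType::Timestamp as i32,
        },
    ];
    let expr = build_create_table_expr(None, &table_ref, columns, "mito", "demo table")?;
    assert_eq!("ts", expr.time_index); // forced to be non-nullable
    assert_eq!(vec!["host".to_string()], expr.primary_keys); // tags become primary keys
    Ok(())
}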
pub fn extract_new_columns(
schema: &Schema,
column_exprs: Vec<ColumnExpr>,
) -> Result<Option<AddColumns>> {
let columns_to_add = column_exprs
.into_iter()
.filter(|expr| schema.column_schema_by_name(expr.column_name).is_none())
.map(|expr| {
let is_key = expr.semantic_type == SemanticType::Tag as i32;
let column_def = Some(ColumnDef {
name: expr.column_name.to_string(),
datatype: expr.datatype,
is_nullable: true,
default_constraint: vec![],
});
AddColumn {
column_def,
is_key,
location: None,
}
})
.collect::<Vec<_>>();
if columns_to_add.is_empty() {
Ok(None)
} else {
let mut distinct_names = HashSet::with_capacity(columns_to_add.len());
for add_column in &columns_to_add {
let name = add_column.column_def.as_ref().unwrap().name.as_str();
ensure!(
distinct_names.insert(name),
DuplicatedColumnNameSnafu { name }
);
}
Ok(Some(AddColumns {
add_columns: columns_to_add,
}))
}
}
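To illustrate the filter, a sketch under the same assumed paths: columns already present in the schema are dropped, and `None` signals there is nothing to add.
use api::v1::{ColumnDataType, SemanticType};
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
fn new_columns_sketch() -> Result<()> {
    let schema = SchemaBuilder::try_from(vec![ColumnSchema::new(
        "host",
        ConcreteDataType::string_datatype(),
        true,
    )])
    .unwrap()
    .build()
    .unwrap();
    let exprs = vec![
        ColumnExpr {
            column_name: "host", // already in the schema: filtered out
            datatype: ColumnDataType::String as i32,
            semantic_type: SemanticType::Tag as i32,
        },
        ColumnExpr {
            column_name: "cpu", // missing: becomes an AddColumn
            datatype: ColumnDataType::Float64 as i32,
            semantic_type: SemanticType::Field as i32,
        },
    ];
    let added = extract_new_columns(&schema, exprs)?.unwrap();
    assert_eq!(1, added.add_columns.len());
    Ok(())
}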


@@ -14,6 +14,7 @@ common-error = { workspace = true }
common-recordbatch = { workspace = true }
common-runtime = { workspace = true }
common-telemetry = { workspace = true }
common-time = { workspace = true }
dashmap = "5.4"
datafusion.workspace = true
datatypes = { workspace = true }


@@ -75,6 +75,9 @@ pub enum Error {
location: Location,
source: datatypes::error::Error,
},
#[snafu(display("Not supported: {}", feat))]
NotSupported { feat: String },
}
impl ErrorExt for Error {
@@ -83,7 +86,8 @@ impl ErrorExt for Error {
Error::InvalidTlsConfig { .. }
| Error::InvalidConfigFilePath { .. }
| Error::TypeMismatch { .. }
| Error::InvalidFlightData { .. } => StatusCode::InvalidArguments,
| Error::InvalidFlightData { .. }
| Error::NotSupported { .. } => StatusCode::InvalidArguments,
Error::CreateChannel { .. }
| Error::Conversion { .. }


@@ -18,9 +18,11 @@ use std::fmt::Display;
use api::helper::values_with_capacity;
use api::v1::{Column, ColumnDataType, SemanticType};
use common_base::BitVec;
use common_time::timestamp::TimeUnit;
use snafu::ensure;
use crate::error::{Result, TypeMismatchSnafu};
use crate::Error;
type ColumnName = String;
@@ -259,6 +261,24 @@ impl Display for Precision {
}
}
impl TryFrom<Precision> for TimeUnit {
type Error = Error;
fn try_from(precision: Precision) -> std::result::Result<Self, Self::Error> {
Ok(match precision {
Precision::Second => TimeUnit::Second,
Precision::Millisecond => TimeUnit::Millisecond,
Precision::Microsecond => TimeUnit::Microsecond,
Precision::Nanosecond => TimeUnit::Nanosecond,
_ => {
return Err(Error::NotSupported {
feat: format!("convert {precision} into TimeUnit"),
})
}
})
}
}
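A quick illustration of the fallible conversion (`TimeUnit` is already imported in this file; its derived `PartialEq` is assumed):
fn precision_sketch() {
    let unit = TimeUnit::try_from(Precision::Millisecond).unwrap();
    assert_eq!(TimeUnit::Millisecond, unit);
    // Any Precision variant without a TimeUnit counterpart yields Error::NotSupported.
}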
#[cfg(test)]
mod tests {
use api::v1::{ColumnDataType, SemanticType};


@@ -15,6 +15,7 @@ common-telemetry = { workspace = true }
common-time = { workspace = true }
etcd-client.workspace = true
futures.workspace = true
humantime-serde.workspace = true
lazy_static.workspace = true
prost.workspace = true
regex.workspace = true


@@ -24,6 +24,12 @@ use table::metadata::TableId;
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
#[snafu(display("Failed to decode protobuf, source: {}", source))]
DecodeProto {
location: Location,
source: prost::DecodeError,
},
#[snafu(display("Failed to encode object into json, source: {}", source))]
EncodeJson {
location: Location,
@@ -48,6 +54,13 @@ pub enum Error {
location: Location,
},
#[snafu(display("Failed to parse value {} into key {}", value, key))]
ParseOption {
key: String,
value: String,
location: Location,
},
#[snafu(display("Corrupted table route data, err: {}", err_msg))]
RouteInfoCorrupted { err_msg: String, location: Location },
@@ -145,6 +158,7 @@ impl ErrorExt for Error {
IllegalServerState { .. } | EtcdTxnOpResponse { .. } => StatusCode::Internal,
SerdeJson { .. }
| ParseOption { .. }
| RouteInfoCorrupted { .. }
| InvalidProtoMsg { .. }
| InvalidTableMetadata { .. }
@@ -164,7 +178,8 @@ impl ErrorExt for Error {
EncodeJson { .. }
| DecodeJson { .. }
| PayloadNotExist { .. }
| ConvertRawKey { .. } => StatusCode::Unexpected,
| ConvertRawKey { .. }
| DecodeProto { .. } => StatusCode::Unexpected,
MetaSrv { source, .. } => source.status_code(),


@@ -67,6 +67,7 @@ impl TableGlobalValue {
}
}
#[deprecated(since = "0.4.0", note = "Please use the CatalogNameKey instead")]
pub struct CatalogKey {
pub catalog_name: String,
}
@@ -95,6 +96,7 @@ impl CatalogKey {
#[derive(Debug, Serialize, Deserialize)]
pub struct CatalogValue;
#[deprecated(since = "0.4.0", note = "Please use the SchemaNameKey instead")]
pub struct SchemaKey {
pub catalog_name: String,
pub schema_name: String,


@@ -17,6 +17,7 @@ use std::fmt::{Display, Formatter};
use api::v1::meta::{TableIdent as RawTableIdent, TableName};
use serde::{Deserialize, Serialize};
use snafu::OptionExt;
use table::engine::TableReference;
use crate::error::{Error, InvalidProtoMsgSnafu};
@@ -29,6 +30,12 @@ pub struct TableIdent {
pub engine: String,
}
impl TableIdent {
pub fn table_ref(&self) -> TableReference {
TableReference::full(&self.catalog, &self.schema, &self.table)
}
}
impl Display for TableIdent {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(


@@ -36,10 +36,6 @@
//! - The value is a [TableNameValue] struct; it contains the table id.
//! - Used in the table name to table id lookup.
//!
//! 6. Table region key: `__table_region/{table_id}`
//! - The value is a [TableRegionValue] struct; it contains the region distribution of the
//! table in the Datanodes.
//!
//! All keys have related managers. The managers take care of the serialization and deserialization
//! of keys and values, and the interaction with the underlying KV store backend.
//!
@@ -52,24 +48,35 @@ pub mod datanode_table;
pub mod schema_name;
pub mod table_info;
pub mod table_name;
// TODO(weny): remove it.
#[allow(deprecated)]
pub mod table_region;
mod table_route;
// TODO(weny): remove it.
#[allow(deprecated)]
pub mod table_route;
use std::collections::BTreeMap;
use std::sync::Arc;
use datanode_table::{DatanodeTableKey, DatanodeTableManager, DatanodeTableValue};
use lazy_static::lazy_static;
use regex::Regex;
use snafu::ResultExt;
use snafu::{ensure, OptionExt, ResultExt};
use store_api::storage::RegionNumber;
use table::metadata::{RawTableInfo, TableId};
use table_info::{TableInfoKey, TableInfoManager, TableInfoValue};
use table_name::{TableNameKey, TableNameManager, TableNameValue};
use table_region::{TableRegionKey, TableRegionManager, TableRegionValue};
use self::catalog_name::CatalogNameValue;
use self::schema_name::SchemaNameValue;
use crate::error::{InvalidTableMetadataSnafu, Result, SerdeJsonSnafu};
use self::catalog_name::{CatalogManager, CatalogNameValue};
use self::schema_name::{SchemaManager, SchemaNameValue};
use self::table_route::{TableRouteManager, TableRouteValue};
use crate::error::{self, Result, SerdeJsonSnafu};
#[allow(deprecated)]
pub use crate::key::table_route::{TableRouteKey, TABLE_ROUTE_PREFIX};
use crate::kv_backend::txn::Txn;
use crate::kv_backend::KvBackendRef;
use crate::rpc::router::{region_distribution, RegionRoute};
use crate::DatanodeId;
pub const REMOVED_PREFIX: &str = "__removed";
@@ -82,6 +89,8 @@ const TABLE_REGION_KEY_PREFIX: &str = "__table_region";
const CATALOG_NAME_KEY_PREFIX: &str = "__catalog_name";
const SCHEMA_NAME_KEY_PREFIX: &str = "__schema_name";
pub type RegionDistribution = BTreeMap<DatanodeId, Vec<RegionNumber>>;
lazy_static! {
static ref DATANODE_TABLE_KEY_PATTERN: Regex =
Regex::new(&format!("^{DATANODE_TABLE_KEY_PREFIX}/([0-9]+)/([0-9]+)$")).unwrap();
@@ -123,8 +132,25 @@ pub type TableMetadataManagerRef = Arc<TableMetadataManager>;
pub struct TableMetadataManager {
table_name_manager: TableNameManager,
table_info_manager: TableInfoManager,
table_region_manager: TableRegionManager,
datanode_table_manager: DatanodeTableManager,
catalog_manager: CatalogManager,
schema_manager: SchemaManager,
table_route_manager: TableRouteManager,
kv_backend: KvBackendRef,
}
macro_rules! ensure_values {
($got:expr, $expected_value:expr, $name:expr) => {
ensure!(
$got == $expected_value,
error::UnexpectedSnafu {
err_msg: format!(
"Reads the different value: {:?} during {}, expected: {:?}",
$got, $name, $expected_value
)
}
);
};
}
impl TableMetadataManager {
@@ -132,8 +158,11 @@ impl TableMetadataManager {
TableMetadataManager {
table_name_manager: TableNameManager::new(kv_backend.clone()),
table_info_manager: TableInfoManager::new(kv_backend.clone()),
table_region_manager: TableRegionManager::new(kv_backend.clone()),
datanode_table_manager: DatanodeTableManager::new(kv_backend),
datanode_table_manager: DatanodeTableManager::new(kv_backend.clone()),
catalog_manager: CatalogManager::new(kv_backend.clone()),
schema_manager: SchemaManager::new(kv_backend.clone()),
table_route_manager: TableRouteManager::new(kv_backend.clone()),
kv_backend,
}
}
@@ -145,15 +174,295 @@ impl TableMetadataManager {
&self.table_info_manager
}
pub fn table_region_manager(&self) -> &TableRegionManager {
&self.table_region_manager
}
pub fn datanode_table_manager(&self) -> &DatanodeTableManager {
&self.datanode_table_manager
}
pub fn catalog_manager(&self) -> &CatalogManager {
&self.catalog_manager
}
pub fn schema_manager(&self) -> &SchemaManager {
&self.schema_manager
}
pub fn table_route_manager(&self) -> &TableRouteManager {
&self.table_route_manager
}
pub async fn get_full_table_info(
&self,
table_id: TableId,
) -> Result<(Option<TableInfoValue>, Option<TableRouteValue>)> {
let (get_table_route_txn, table_route_decoder) =
self.table_route_manager.build_get_txn(table_id);
let (get_table_info_txn, table_info_decoder) =
self.table_info_manager.build_get_txn(table_id);
let txn = Txn::merge_all(vec![get_table_route_txn, get_table_info_txn]);
let r = self.kv_backend.txn(txn).await?;
let table_info_value = table_info_decoder(&r.responses)?;
let table_route_value = table_route_decoder(&r.responses)?;
Ok((table_info_value, table_route_value))
}
/// Creates metadata for a table and returns an error if different metadata already exists.
/// The caller MUST ensure it has exclusive access to the `TableNameKey`.
pub async fn create_table_metadata(
&self,
mut table_info: RawTableInfo,
region_routes: Vec<RegionRoute>,
) -> Result<()> {
let region_numbers = region_routes
.iter()
.map(|region| region.region.id.region_number())
.collect::<Vec<_>>();
table_info.meta.region_numbers = region_numbers;
let table_id = table_info.ident.table_id;
// Creates table name.
let table_name = TableNameKey::new(
&table_info.catalog_name,
&table_info.schema_name,
&table_info.name,
);
let create_table_name_txn = self
.table_name_manager()
.build_create_txn(&table_name, table_id)?;
// Creates table info.
let table_info_value = TableInfoValue::new(table_info);
let (create_table_info_txn, on_create_table_info_failure) = self
.table_info_manager()
.build_create_txn(table_id, &table_info_value)?;
// Creates datanode table key value pairs.
let distribution = region_distribution(&region_routes)?;
let create_datanode_table_txn = self
.datanode_table_manager()
.build_create_txn(table_id, distribution)?;
// Creates table route.
let table_route_value = TableRouteValue::new(region_routes);
let (create_table_route_txn, on_create_table_route_failure) = self
.table_route_manager()
.build_create_txn(table_id, &table_route_value)?;
let txn = Txn::merge_all(vec![
create_table_name_txn,
create_table_info_txn,
create_datanode_table_txn,
create_table_route_txn,
]);
let r = self.kv_backend.txn(txn).await?;
// Checks whether metadata was already created.
if !r.succeeded {
let remote_table_info =
on_create_table_info_failure(&r.responses)?.context(error::UnexpectedSnafu {
err_msg: "Reads the empty table info during the create table metadata",
})?;
let remote_table_route =
on_create_table_route_failure(&r.responses)?.context(error::UnexpectedSnafu {
err_msg: "Reads the empty table route during the create table metadata",
})?;
let op_name = "the creating table metadata";
ensure_values!(remote_table_info, table_info_value, op_name);
ensure_values!(remote_table_route, table_route_value, op_name);
}
Ok(())
}
/// Deletes the metadata of a table.
/// The caller MUST ensure it has exclusive access to the `TableNameKey`.
pub async fn delete_table_metadata(
&self,
table_info_value: &TableInfoValue,
table_route_value: &TableRouteValue,
) -> Result<()> {
let table_info = &table_info_value.table_info;
let table_id = table_info.ident.table_id;
// Deletes table name.
let table_name = TableNameKey::new(
&table_info.catalog_name,
&table_info.schema_name,
&table_info.name,
);
let delete_table_name_txn = self
.table_name_manager()
.build_delete_txn(&table_name, table_id)?;
// Deletes table info.
let delete_table_info_txn = self
.table_info_manager()
.build_delete_txn(table_id, table_info_value)?;
// Deletes datanode table key value pairs.
let distribution = region_distribution(&table_route_value.region_routes)?;
let delete_datanode_txn = self
.datanode_table_manager()
.build_delete_txn(table_id, distribution)?;
// Deletes table route.
let delete_table_route_txn = self
.table_route_manager()
.build_delete_txn(table_id, table_route_value)?;
let txn = Txn::merge_all(vec![
delete_table_name_txn,
delete_table_info_txn,
delete_datanode_txn,
delete_table_route_txn,
]);
// This transaction always succeeds.
let _ = self.kv_backend.txn(txn).await?;
Ok(())
}
/// Renames the table and returns an error if different metadata already exists.
/// The caller MUST ensure it has exclusive access to both the old and new `TableNameKey`s,
/// and the new `TableNameKey` MUST be vacant.
pub async fn rename_table(
&self,
current_table_info_value: TableInfoValue,
new_table_name: String,
) -> Result<()> {
let current_table_info = &current_table_info_value.table_info;
let table_id = current_table_info.ident.table_id;
let table_name_key = TableNameKey::new(
&current_table_info.catalog_name,
&current_table_info.schema_name,
&current_table_info.name,
);
let new_table_name_key = TableNameKey::new(
&current_table_info.catalog_name,
&current_table_info.schema_name,
&new_table_name,
);
// Updates table name.
let update_table_name_txn = self.table_name_manager().build_update_txn(
&table_name_key,
&new_table_name_key,
table_id,
)?;
let new_table_info_value = current_table_info_value.with_update(move |table_info| {
table_info.name = new_table_name;
});
// Updates table info.
let (update_table_info_txn, on_update_table_info_failure) = self
.table_info_manager()
.build_update_txn(table_id, &current_table_info_value, &new_table_info_value)?;
let txn = Txn::merge_all(vec![update_table_name_txn, update_table_info_txn]);
let r = self.kv_backend.txn(txn).await?;
// Checks whether metadata was already updated.
if !r.succeeded {
let remote_table_info =
on_update_table_info_failure(&r.responses)?.context(error::UnexpectedSnafu {
err_msg: "Reads the empty table info during the rename table metadata",
})?;
let op_name = "the renaming table metadata";
ensure_values!(remote_table_info, new_table_info_value, op_name);
}
Ok(())
}
/// Updates the table info and returns an error if different metadata already exists.
pub async fn update_table_info(
&self,
current_table_info_value: TableInfoValue,
new_table_info: RawTableInfo,
) -> Result<()> {
let table_id = current_table_info_value.table_info.ident.table_id;
let new_table_info_value = current_table_info_value.update(new_table_info);
// Updates table info.
let (update_table_info_txn, on_update_table_info_failure) = self
.table_info_manager()
.build_update_txn(table_id, &current_table_info_value, &new_table_info_value)?;
let r = self.kv_backend.txn(update_table_info_txn).await?;
// Checks whether metadata was already updated.
if !r.succeeded {
let remote_table_info =
on_update_table_info_failure(&r.responses)?.context(error::UnexpectedSnafu {
err_msg: "Reads the empty table info during the updating table info",
})?;
let op_name = "the updating table info";
ensure_values!(remote_table_info, new_table_info_value, op_name);
}
Ok(())
}
pub async fn update_table_route(
&self,
table_id: TableId,
current_table_route_value: TableRouteValue,
new_region_routes: Vec<RegionRoute>,
) -> Result<()> {
// Updates the datanode table key value pairs.
let current_region_distribution =
region_distribution(&current_table_route_value.region_routes)?;
let new_region_distribution = region_distribution(&new_region_routes)?;
let update_datanode_table_txn = self.datanode_table_manager().build_update_txn(
table_id,
current_region_distribution,
new_region_distribution,
)?;
// Updates the table_route.
let new_table_route_value = current_table_route_value.update(new_region_routes);
let (update_table_route_txn, on_update_table_route_failure) = self
.table_route_manager()
.build_update_txn(table_id, &current_table_route_value, &new_table_route_value)?;
let txn = Txn::merge_all(vec![update_datanode_table_txn, update_table_route_txn]);
let r = self.kv_backend.txn(txn).await?;
// Checks whether metadata was already updated.
if !r.succeeded {
let remote_table_route =
on_update_table_route_failure(&r.responses)?.context(error::UnexpectedSnafu {
err_msg: "Reads the empty table route during the updating table route",
})?;
let op_name = "the updating table route";
ensure_values!(remote_table_route, new_table_route_value, op_name);
}
Ok(())
}
}
#[macro_export]
macro_rules! impl_table_meta_key {
($($val_ty: ty), *) => {
$(
@@ -166,28 +475,36 @@ macro_rules! impl_table_meta_key {
}
}
impl_table_meta_key!(
TableNameKey<'_>,
TableInfoKey,
TableRegionKey,
DatanodeTableKey
);
impl_table_meta_key!(TableNameKey<'_>, TableInfoKey, DatanodeTableKey);
#[macro_export]
macro_rules! impl_table_meta_value {
($($val_ty: ty), *) => {
$(
impl $val_ty {
pub fn try_from_raw_value(raw_value: Vec<u8>) -> Result<Self> {
let raw_value = String::from_utf8(raw_value).map_err(|e| {
InvalidTableMetadataSnafu { err_msg: e.to_string() }.build()
})?;
serde_json::from_str(&raw_value).context(SerdeJsonSnafu)
pub fn try_from_raw_value(raw_value: &[u8]) -> Result<Self> {
serde_json::from_slice(raw_value).context(SerdeJsonSnafu)
}
pub fn try_as_raw_value(&self) -> Result<Vec<u8>> {
serde_json::to_string(self)
.map(|x| x.into_bytes())
.context(SerdeJsonSnafu)
serde_json::to_vec(self).context(SerdeJsonSnafu)
}
}
)*
}
}
#[macro_export]
macro_rules! impl_optional_meta_value {
($($val_ty: ty), *) => {
$(
impl $val_ty {
pub fn try_from_raw_value(raw_value: &[u8]) -> Result<Option<Self>> {
serde_json::from_slice(raw_value).context(SerdeJsonSnafu)
}
pub fn try_as_raw_value(&self) -> Result<Vec<u8>> {
serde_json::to_vec(self).context(SerdeJsonSnafu)
}
}
)*
@@ -195,17 +512,35 @@ macro_rules! impl_table_meta_value {
}
impl_table_meta_value! {
CatalogNameValue,
SchemaNameValue,
TableNameValue,
TableInfoValue,
TableRegionValue,
DatanodeTableValue
DatanodeTableValue,
TableRouteValue
}
impl_optional_meta_value! {
CatalogNameValue,
SchemaNameValue
}
#[cfg(test)]
mod tests {
use crate::key::to_removed_key;
use std::collections::BTreeMap;
use std::sync::Arc;
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
use futures::TryStreamExt;
use table::metadata::{RawTableInfo, TableInfo, TableInfoBuilder, TableMetaBuilder};
use super::datanode_table::DatanodeTableKey;
use crate::key::table_info::TableInfoValue;
use crate::key::table_name::TableNameKey;
use crate::key::table_route::TableRouteValue;
use crate::key::{to_removed_key, TableMetadataManager};
use crate::kv_backend::memory::MemoryKvBackend;
use crate::peer::Peer;
use crate::rpc::router::{region_distribution, Region, RegionRoute};
#[test]
fn test_to_removed_key() {
@@ -213,4 +548,361 @@ mod tests {
let removed = "__removed-test_key";
assert_eq!(removed, to_removed_key(key));
}
fn new_test_table_info(region_numbers: impl Iterator<Item = u32>) -> TableInfo {
let column_schemas = vec![
ColumnSchema::new("col1", ConcreteDataType::int32_datatype(), true),
ColumnSchema::new(
"ts",
ConcreteDataType::timestamp_millisecond_datatype(),
false,
)
.with_time_index(true),
ColumnSchema::new("col2", ConcreteDataType::int32_datatype(), true),
];
let schema = SchemaBuilder::try_from(column_schemas)
.unwrap()
.version(123)
.build()
.unwrap();
let meta = TableMetaBuilder::default()
.schema(Arc::new(schema))
.primary_key_indices(vec![0])
.engine("engine")
.next_column_id(3)
.region_numbers(region_numbers.collect::<Vec<_>>())
.build()
.unwrap();
TableInfoBuilder::default()
.table_id(10)
.table_version(5)
.name("mytable")
.meta(meta)
.build()
.unwrap()
}
fn new_test_region_route() -> RegionRoute {
new_region_route(1, 2)
}
fn new_region_route(region_id: u64, datanode: u64) -> RegionRoute {
RegionRoute {
region: Region {
id: region_id.into(),
name: "r1".to_string(),
partition: None,
attrs: BTreeMap::new(),
},
leader_peer: Some(Peer::new(datanode, "a2")),
follower_peers: vec![],
}
}
#[tokio::test]
async fn test_create_table_metadata() {
let mem_kv = Arc::new(MemoryKvBackend::default());
let table_metadata_manager = TableMetadataManager::new(mem_kv);
let region_route = new_test_region_route();
let region_routes = vec![region_route.clone()];
let table_info: RawTableInfo =
new_test_table_info(region_routes.iter().map(|r| r.region.id.region_number())).into();
// creates metadata.
table_metadata_manager
.create_table_metadata(table_info.clone(), region_routes.clone())
.await
.unwrap();
// if metadata was already created, it should be ok.
table_metadata_manager
.create_table_metadata(table_info.clone(), region_routes.clone())
.await
.unwrap();
let mut modified_region_routes = region_routes.clone();
modified_region_routes.push(region_route.clone());
// if different remote metadata already exists, it should return an error.
assert!(table_metadata_manager
.create_table_metadata(table_info.clone(), modified_region_routes)
.await
.is_err());
let (remote_table_info, remote_table_route) = table_metadata_manager
.get_full_table_info(10)
.await
.unwrap();
assert_eq!(remote_table_info.unwrap().table_info, table_info);
assert_eq!(remote_table_route.unwrap().region_routes, region_routes);
}
#[tokio::test]
async fn test_delete_table_metadata() {
let mem_kv = Arc::new(MemoryKvBackend::default());
let table_metadata_manager = TableMetadataManager::new(mem_kv);
let region_route = new_test_region_route();
let region_routes = vec![region_route.clone()];
let table_info: RawTableInfo =
new_test_table_info(region_routes.iter().map(|r| r.region.id.region_number())).into();
let table_id = table_info.ident.table_id;
let datanode_id = 2;
let table_route_value = TableRouteValue::new(region_routes.clone());
// creates metadata.
table_metadata_manager
.create_table_metadata(table_info.clone(), region_routes.clone())
.await
.unwrap();
let table_info_value = TableInfoValue::new(table_info.clone());
// deletes metadata.
table_metadata_manager
.delete_table_metadata(&table_info_value, &table_route_value)
.await
.unwrap();
// if metadata was already deleted, it should be ok.
table_metadata_manager
.delete_table_metadata(&table_info_value, &table_route_value)
.await
.unwrap();
assert!(table_metadata_manager
.table_info_manager()
.get(table_id)
.await
.unwrap()
.is_none());
assert!(table_metadata_manager
.table_route_manager()
.get(table_id)
.await
.unwrap()
.is_none());
assert!(table_metadata_manager
.datanode_table_manager()
.tables(datanode_id)
.try_collect::<Vec<_>>()
.await
.unwrap()
.is_empty());
// Checks removed values
let removed_table_info = table_metadata_manager
.table_info_manager()
.get_removed(table_id)
.await
.unwrap()
.unwrap();
assert_eq!(removed_table_info.table_info, table_info);
let removed_table_route = table_metadata_manager
.table_route_manager()
.get_removed(table_id)
.await
.unwrap()
.unwrap();
assert_eq!(removed_table_route.region_routes, region_routes);
}
#[tokio::test]
async fn test_rename_table() {
let mem_kv = Arc::new(MemoryKvBackend::default());
let table_metadata_manager = TableMetadataManager::new(mem_kv);
let region_route = new_test_region_route();
let region_routes = vec![region_route.clone()];
let table_info: RawTableInfo =
new_test_table_info(region_routes.iter().map(|r| r.region.id.region_number())).into();
let table_id = table_info.ident.table_id;
// creates metadata.
table_metadata_manager
.create_table_metadata(table_info.clone(), region_routes.clone())
.await
.unwrap();
let new_table_name = "another_name".to_string();
let table_info_value = TableInfoValue::new(table_info.clone());
table_metadata_manager
.rename_table(table_info_value.clone(), new_table_name.clone())
.await
.unwrap();
// if the remote metadata was already updated, it should be ok.
table_metadata_manager
.rename_table(table_info_value.clone(), new_table_name.clone())
.await
.unwrap();
let mut modified_table_info = table_info.clone();
modified_table_info.name = "hi".to_string();
let modified_table_info_value = table_info_value.update(modified_table_info);
// if the table_info_value is wrong, it should return an error.
// The ABA problem.
assert!(table_metadata_manager
.rename_table(modified_table_info_value.clone(), new_table_name.clone())
.await
.is_err());
let old_table_name = TableNameKey::new(
&table_info.catalog_name,
&table_info.schema_name,
&table_info.name,
);
let new_table_name = TableNameKey::new(
&table_info.catalog_name,
&table_info.schema_name,
&new_table_name,
);
assert!(table_metadata_manager
.table_name_manager()
.get(old_table_name)
.await
.unwrap()
.is_none());
assert_eq!(
table_metadata_manager
.table_name_manager()
.get(new_table_name)
.await
.unwrap()
.unwrap()
.table_id(),
table_id
);
}
#[tokio::test]
async fn test_update_table_info() {
let mem_kv = Arc::new(MemoryKvBackend::default());
let table_metadata_manager = TableMetadataManager::new(mem_kv);
let region_route = new_test_region_route();
let region_routes = vec![region_route.clone()];
let table_info: RawTableInfo =
new_test_table_info(region_routes.iter().map(|r| r.region.id.region_number())).into();
let table_id = table_info.ident.table_id;
// creates metadata.
table_metadata_manager
.create_table_metadata(table_info.clone(), region_routes.clone())
.await
.unwrap();
let mut new_table_info = table_info.clone();
new_table_info.name = "hi".to_string();
let current_table_info_value = TableInfoValue::new(table_info.clone());
// should be ok.
table_metadata_manager
.update_table_info(current_table_info_value.clone(), new_table_info.clone())
.await
.unwrap();
// if the table info was already updated, it should be ok.
table_metadata_manager
.update_table_info(current_table_info_value.clone(), new_table_info.clone())
.await
.unwrap();
// updated table_info should equal the `new_table_info`
let updated_table_info = table_metadata_manager
.table_info_manager()
.get(table_id)
.await
.unwrap()
.unwrap();
assert_eq!(updated_table_info.table_info, new_table_info);
let mut wrong_table_info = table_info.clone();
wrong_table_info.name = "wrong".to_string();
let wrong_table_info_value = current_table_info_value.update(wrong_table_info);
// if the current_table_info_value is wrong, it should return an error.
// The ABA problem.
assert!(table_metadata_manager
.update_table_info(wrong_table_info_value, new_table_info)
.await
.is_err())
}
async fn assert_datanode_table(
table_metadata_manager: &TableMetadataManager,
table_id: u32,
region_routes: &[RegionRoute],
) {
let region_distribution = region_distribution(region_routes).unwrap();
for (datanode, regions) in region_distribution {
let got = table_metadata_manager
.datanode_table_manager()
.get(&DatanodeTableKey::new(datanode, table_id))
.await
.unwrap()
.unwrap();
assert_eq!(got.regions, regions)
}
}
#[tokio::test]
async fn test_update_table_route() {
let mem_kv = Arc::new(MemoryKvBackend::default());
let table_metadata_manager = TableMetadataManager::new(mem_kv);
let region_route = new_test_region_route();
let region_routes = vec![region_route.clone()];
let table_info: RawTableInfo =
new_test_table_info(region_routes.iter().map(|r| r.region.id.region_number())).into();
let table_id = table_info.ident.table_id;
let current_table_route_value = TableRouteValue::new(region_routes.clone());
// creates metadata.
table_metadata_manager
.create_table_metadata(table_info.clone(), region_routes.clone())
.await
.unwrap();
assert_datanode_table(&table_metadata_manager, table_id, &region_routes).await;
let new_region_routes = vec![
new_region_route(1, 1),
new_region_route(2, 2),
new_region_route(3, 3),
];
// it should be ok.
table_metadata_manager
.update_table_route(
table_id,
current_table_route_value.clone(),
new_region_routes.clone(),
)
.await
.unwrap();
assert_datanode_table(&table_metadata_manager, table_id, &new_region_routes).await;
// if the table route was already updated, it should be ok.
table_metadata_manager
.update_table_route(
table_id,
current_table_route_value.clone(),
new_region_routes.clone(),
)
.await
.unwrap();
let current_table_route_value = current_table_route_value.update(new_region_routes.clone());
let new_region_routes = vec![new_region_route(2, 4), new_region_route(5, 5)];
// it should be ok.
table_metadata_manager
.update_table_route(
table_id,
current_table_route_value.clone(),
new_region_routes.clone(),
)
.await
.unwrap();
assert_datanode_table(&table_metadata_manager, table_id, &new_region_routes).await;
// if the current_table_route_value is wrong, it should return an error.
// The ABA problem.
let wrong_table_route_value = current_table_route_value.update(vec![
new_region_route(1, 1),
new_region_route(2, 2),
new_region_route(3, 3),
new_region_route(4, 4),
]);
assert!(table_metadata_manager
.update_table_route(table_id, wrong_table_route_value, new_region_routes)
.await
.is_err());
}
}


@@ -15,6 +15,7 @@
use std::fmt::Display;
use std::sync::Arc;
use common_catalog::consts::DEFAULT_CATALOG_NAME;
use futures::stream::BoxStream;
use futures::StreamExt;
use serde::{Deserialize, Serialize};
@@ -32,6 +33,14 @@ pub struct CatalogNameKey<'a> {
pub catalog: &'a str,
}
impl<'a> Default for CatalogNameKey<'a> {
fn default() -> Self {
Self {
catalog: DEFAULT_CATALOG_NAME,
}
}
}
#[derive(Debug, Serialize, Deserialize)]
pub struct CatalogNameValue;
@@ -103,6 +112,12 @@ impl CatalogManager {
Ok(())
}
pub async fn exist(&self, catalog: CatalogNameKey<'_>) -> Result<bool> {
let raw_key = catalog.as_raw_key();
Ok(self.kv_backend.get(&raw_key).await?.is_some())
}
pub async fn catalog_names(&self) -> BoxStream<'static, Result<String>> {
let start_key = CatalogNameKey::range_start_key();
let req = RangeRequest::new().with_prefix(start_key.as_bytes());
@@ -121,6 +136,7 @@ impl CatalogManager {
#[cfg(test)]
mod tests {
use super::*;
use crate::kv_backend::memory::MemoryKvBackend;
#[test]
fn test_serialization() {
@@ -132,4 +148,19 @@ mod tests {
assert_eq!(key, parsed);
}
#[tokio::test]
async fn test_key_exist() {
let manager = CatalogManager::new(Arc::new(MemoryKvBackend::default()));
let catalog_key = CatalogNameKey::new("my-catalog");
manager.create(catalog_key).await.unwrap();
assert!(manager.exist(catalog_key).await.unwrap());
let wrong_catalog_key = CatalogNameKey::new("my-wrong");
assert!(!manager.exist(wrong_catalog_key).await.unwrap());
}
}

View File

@@ -12,17 +12,24 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use futures::stream::BoxStream;
use futures::StreamExt;
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt};
use snafu::OptionExt;
use store_api::storage::RegionNumber;
use table::metadata::TableId;
use super::{DATANODE_TABLE_KEY_PATTERN, DATANODE_TABLE_KEY_PREFIX};
use crate::error::{InvalidTableMetadataSnafu, MoveRegionSnafu, Result, UnexpectedSnafu};
use crate::key::{to_removed_key, TableMetaKey};
use crate::kv_backend::txn::{Compare, CompareOp, Txn, TxnOp};
use crate::error::{InvalidTableMetadataSnafu, Result};
use crate::key::{
RegionDistribution, TableMetaKey, DATANODE_TABLE_KEY_PATTERN, DATANODE_TABLE_KEY_PREFIX,
};
use crate::kv_backend::txn::{Txn, TxnOp};
use crate::kv_backend::KvBackendRef;
use crate::rpc::store::{BatchGetRequest, CompareAndPutRequest, MoveValueRequest, RangeRequest};
use crate::range_stream::{PaginationStream, DEFAULT_PAGE_SIZE};
use crate::rpc::store::RangeRequest;
use crate::rpc::KeyValue;
use crate::DatanodeId;
pub struct DatanodeTableKey {
@@ -42,7 +49,10 @@ impl DatanodeTableKey {
format!("{}/{datanode_id}", DATANODE_TABLE_KEY_PREFIX)
}
#[allow(unused)]
pub fn range_start_key(datanode_id: DatanodeId) -> String {
format!("{}/", Self::prefix(datanode_id))
}
pub fn strip_table_id(raw_key: &[u8]) -> Result<TableId> {
let key = String::from_utf8(raw_key.to_vec()).map_err(|e| {
InvalidTableMetadataSnafu {
@@ -88,6 +98,13 @@ impl DatanodeTableValue {
}
}
/// Decodes a `KeyValue` into `((), DatanodeTableValue)`.
pub fn datanode_table_value_decoder(kv: KeyValue) -> Result<((), DatanodeTableValue)> {
let value = DatanodeTableValue::try_from_raw_value(&kv.value)?;
Ok(((), value))
}
pub struct DatanodeTableManager {
kv_backend: KvBackendRef,
}
@@ -101,330 +118,114 @@ impl DatanodeTableManager {
self.kv_backend
.get(&key.as_raw_key())
.await?
.map(|kv| DatanodeTableValue::try_from_raw_value(kv.value))
.map(|kv| DatanodeTableValue::try_from_raw_value(&kv.value))
.transpose()
}
/// Create DatanodeTable key and value. If the key already exists, check if the value is the same.
pub async fn create(
pub fn tables(
&self,
datanode_id: DatanodeId,
table_id: TableId,
regions: Vec<RegionNumber>,
) -> Result<()> {
let key = DatanodeTableKey::new(datanode_id, table_id);
let val = DatanodeTableValue::new(table_id, regions.clone());
let req = CompareAndPutRequest::new()
.with_key(key.as_raw_key())
.with_value(val.try_as_raw_value()?);
) -> BoxStream<'static, Result<DatanodeTableValue>> {
let start_key = DatanodeTableKey::range_start_key(datanode_id);
let req = RangeRequest::new().with_prefix(start_key.as_bytes());
let resp = self.kv_backend.compare_and_put(req).await?;
if !resp.success {
let Some(curr) = resp
.prev_kv
.map(|kv| DatanodeTableValue::try_from_raw_value(kv.value))
.transpose()?
else {
return UnexpectedSnafu {
err_msg: format!("compare_and_put expect None but failed with current value None, key: {key}, val: {val:?}"),
}.fail();
};
let stream = PaginationStream::new(
self.kv_backend.clone(),
req,
DEFAULT_PAGE_SIZE,
Arc::new(datanode_table_value_decoder),
);
ensure!(
curr.table_id == table_id && curr.regions == regions,
UnexpectedSnafu {
err_msg: format!("current value '{curr:?}' already existed for key '{key}', {val:?} is not set"),
}
);
}
Ok(())
Box::pin(stream.map(|kv| kv.map(|kv| kv.1)))
}
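Callers can drain the paginated stream with `futures::TryStreamExt`, as the tests below do; a sketch with assumed bindings (the extra import is not part of this file):
use futures::TryStreamExt;
async fn tables_sketch(manager: &DatanodeTableManager, datanode_id: DatanodeId) -> Result<()> {
    // Collect every DatanodeTableValue hosted by this datanode.
    let values: Vec<DatanodeTableValue> = manager.tables(datanode_id).try_collect().await?;
    println!("datanode {datanode_id} hosts {} tables", values.len());
    Ok(())
}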
pub async fn remove(&self, datanode_id: DatanodeId, table_id: TableId) -> Result<()> {
let key = DatanodeTableKey::new(datanode_id, table_id);
let removed_key = to_removed_key(&String::from_utf8_lossy(&key.as_raw_key()));
let req = MoveValueRequest::new(key.as_raw_key(), removed_key.as_bytes());
let _ = self.kv_backend.move_value(req).await?;
Ok(())
}
pub async fn move_region(
/// Builds the transaction to create the datanode table key-value pairs. It takes effect only when the primary-key comparison succeeds.
pub fn build_create_txn(
&self,
from_datanode: DatanodeId,
to_datanode: DatanodeId,
table_id: TableId,
region: RegionNumber,
) -> Result<()> {
let from_key = DatanodeTableKey::new(from_datanode, table_id);
let to_key = DatanodeTableKey::new(to_datanode, table_id);
let mut kvs = self
.kv_backend
.batch_get(BatchGetRequest {
keys: vec![from_key.as_raw_key(), to_key.as_raw_key()],
distribution: RegionDistribution,
) -> Result<Txn> {
let txns = distribution
.into_iter()
.map(|(datanode_id, regions)| {
let key = DatanodeTableKey::new(datanode_id, table_id);
let val = DatanodeTableValue::new(table_id, regions);
Ok(TxnOp::Put(key.as_raw_key(), val.try_as_raw_value()?))
})
.await?
.kvs;
.collect::<Result<Vec<_>>>()?;
ensure!(
!kvs.is_empty(),
MoveRegionSnafu {
table_id,
region,
err_msg: format!("DatanodeTableKey not found for Datanode {from_datanode}"),
}
);
let mut from_value = DatanodeTableValue::try_from_raw_value(kvs.remove(0).value)?;
let txn = Txn::new().and_then(txns);
ensure!(
from_value.regions.contains(&region),
MoveRegionSnafu {
table_id,
region,
err_msg: format!("target region not found in Datanode {from_datanode}"),
}
);
let to_value = if !kvs.is_empty() {
Some(DatanodeTableValue::try_from_raw_value(kvs.remove(0).value)?)
} else {
None
};
if let Some(v) = to_value.as_ref() {
ensure!(
!v.regions.contains(&region),
MoveRegionSnafu {
table_id,
region,
err_msg: format!("target region already existed in Datanode {to_datanode}"),
}
);
}
let compares = vec![
Compare::with_value(
from_key.as_raw_key(),
CompareOp::Equal,
from_value.try_as_raw_value()?,
),
Compare::new(
to_key.as_raw_key(),
CompareOp::Equal,
to_value
.as_ref()
.map(|x| x.try_as_raw_value())
.transpose()?,
),
];
let mut operations = Vec::with_capacity(2);
from_value.regions.retain(|x| *x != region);
if from_value.regions.is_empty() {
operations.push(TxnOp::Delete(from_key.as_raw_key()));
} else {
from_value.version += 1;
operations.push(TxnOp::Put(
from_key.as_raw_key(),
from_value.try_as_raw_value()?,
));
}
if let Some(mut v) = to_value {
v.regions.push(region);
v.version += 1;
operations.push(TxnOp::Put(to_key.as_raw_key(), v.try_as_raw_value()?));
} else {
let v = DatanodeTableValue::new(table_id, vec![region]);
operations.push(TxnOp::Put(to_key.as_raw_key(), v.try_as_raw_value()?));
}
let txn = Txn::new().when(compares).and_then(operations);
let resp = self.kv_backend.txn(txn).await?;
ensure!(
resp.succeeded,
MoveRegionSnafu {
table_id,
region,
err_msg: format!("txn failed with responses: {:?}", resp.responses),
}
);
Ok(())
Ok(txn)
}
pub async fn tables(&self, datanode_id: DatanodeId) -> Result<Vec<DatanodeTableValue>> {
let prefix = DatanodeTableKey::prefix(datanode_id);
let req = RangeRequest::new().with_prefix(prefix.as_bytes());
let resp = self.kv_backend.range(req).await?;
let table_ids = resp
.kvs
.into_iter()
.map(|kv| DatanodeTableValue::try_from_raw_value(kv.value))
/// Builds the transaction to update the datanode table key-value pairs. It takes effect only when the primary-key comparison succeeds.
pub(crate) fn build_update_txn(
&self,
table_id: TableId,
current_region_distribution: RegionDistribution,
new_region_distribution: RegionDistribution,
) -> Result<Txn> {
let mut opts = Vec::new();
// Removes the old datanode table key value pairs
for current_datanode in current_region_distribution.keys() {
if !new_region_distribution.contains_key(current_datanode) {
let key = DatanodeTableKey::new(*current_datanode, table_id);
let raw_key = key.as_raw_key();
opts.push(TxnOp::Delete(raw_key))
}
}
for (datanode, regions) in new_region_distribution.into_iter() {
if let Some(current_region) = current_region_distribution.get(&datanode) {
// Update only when the region set changed.
if *current_region != regions {
let key = DatanodeTableKey::new(datanode, table_id);
let raw_key = key.as_raw_key();
let val = DatanodeTableValue::new(table_id, regions).try_as_raw_value()?;
opts.push(TxnOp::Put(raw_key, val));
}
} else {
// New datanodes
let key = DatanodeTableKey::new(datanode, table_id);
let raw_key = key.as_raw_key();
let val = DatanodeTableValue::new(table_id, regions).try_as_raw_value()?;
opts.push(TxnOp::Put(raw_key, val));
}
}
let txn = Txn::new().and_then(opts);
Ok(txn)
}
/// Builds the transaction to delete the datanode table key-value pairs. It takes effect only when the primary-key comparison succeeds.
pub fn build_delete_txn(
&self,
table_id: TableId,
distribution: RegionDistribution,
) -> Result<Txn> {
let txns = distribution
.into_keys()
.map(|datanode_id| {
let key = DatanodeTableKey::new(datanode_id, table_id);
let raw_key = key.as_raw_key();
Ok(TxnOp::Delete(raw_key))
})
.collect::<Result<Vec<_>>>()?;
Ok(table_ids)
let txn = Txn::new().and_then(txns);
Ok(txn)
}
}
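Because these builders return plain `Txn`s instead of executing them, `TableMetadataManager` can merge them with the table-name, table-info, and table-route transactions and commit everything in one round trip. A sketch with assumed bindings and imports:
use crate::rpc::router::{region_distribution, RegionRoute};
async fn compose_sketch(
    manager: &DatanodeTableManager,
    kv_backend: &KvBackendRef,
    table_id: TableId,
    region_routes: &[RegionRoute],
) -> Result<()> {
    let distribution = region_distribution(region_routes)?;
    let create_txn = manager.build_create_txn(table_id, distribution)?;
    // Merge with other metadata txns before executing atomically.
    let txn = Txn::merge_all(vec![create_txn]);
    let _ = kv_backend.txn(txn).await?;
    Ok(())
}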
#[cfg(test)]
mod tests {
use std::sync::Arc;
use super::*;
use crate::kv_backend::memory::MemoryKvBackend;
use crate::kv_backend::KvBackend;
#[tokio::test]
async fn test_move_region() {
let manager = DatanodeTableManager::new(Arc::new(MemoryKvBackend::default()));
let result = manager.move_region(1, 2, 1, 1).await;
assert!(result.unwrap_err().to_string().contains(
"Failed to move region 1 in table 1, err: DatanodeTableKey not found for Datanode 1"
));
assert!(manager.create(1, 1, vec![1, 2, 3]).await.is_ok());
let result = manager.move_region(1, 2, 1, 100).await;
assert!(result.unwrap_err().to_string().contains(
"Failed to move region 100 in table 1, err: target region not found in Datanode 1"
));
// Move region 1 from datanode 1 to datanode 2.
// Note that the DatanodeTableValue is not existed for datanode 2 now.
assert!(manager.move_region(1, 2, 1, 1).await.is_ok());
let value = manager
.get(&DatanodeTableKey::new(1, 1))
.await
.unwrap()
.unwrap();
assert_eq!(
value,
DatanodeTableValue {
table_id: 1,
regions: vec![2, 3],
version: 1,
}
);
let value = manager
.get(&DatanodeTableKey::new(2, 1))
.await
.unwrap()
.unwrap();
assert_eq!(
value,
DatanodeTableValue {
table_id: 1,
regions: vec![1],
version: 0,
}
);
// Move region 2 from datanode 1 to datanode 2.
assert!(manager.move_region(1, 2, 1, 2).await.is_ok());
let value = manager
.get(&DatanodeTableKey::new(1, 1))
.await
.unwrap()
.unwrap();
assert_eq!(
value,
DatanodeTableValue {
table_id: 1,
regions: vec![3],
version: 2,
}
);
let value = manager
.get(&DatanodeTableKey::new(2, 1))
.await
.unwrap()
.unwrap();
assert_eq!(
value,
DatanodeTableValue {
table_id: 1,
regions: vec![1, 2],
version: 1,
}
);
// Move region 3 (the last region) from datanode 1 to datanode 2.
assert!(manager.move_region(1, 2, 1, 3).await.is_ok());
let value = manager.get(&DatanodeTableKey::new(1, 1)).await.unwrap();
assert!(value.is_none());
let value = manager
.get(&DatanodeTableKey::new(2, 1))
.await
.unwrap()
.unwrap();
assert_eq!(
value,
DatanodeTableValue {
table_id: 1,
regions: vec![1, 2, 3],
version: 2,
}
);
}
#[tokio::test]
async fn test_datanode_table_value_manager() {
let backend = Arc::new(MemoryKvBackend::default());
let manager = DatanodeTableManager::new(backend.clone());
assert!(manager.create(1, 1, vec![1, 2, 3]).await.is_ok());
assert!(manager.create(1, 2, vec![4, 5, 6]).await.is_ok());
assert!(manager.create(2, 1, vec![4, 5, 6]).await.is_ok());
assert!(manager.create(2, 2, vec![1, 2, 3]).await.is_ok());
// If the value is the same, "create" can be called again.
assert!(manager.create(2, 2, vec![1, 2, 3]).await.is_ok());
let err_msg = manager
.create(1, 1, vec![4, 5, 6])
.await
.unwrap_err()
.to_string();
assert!(err_msg.contains("Unexpected: current value 'DatanodeTableValue { table_id: 1, regions: [1, 2, 3], version: 0 }' already existed for key '__dn_table/1/1', DatanodeTableValue { table_id: 1, regions: [4, 5, 6], version: 0 } is not set"));
let to_be_removed_key = DatanodeTableKey::new(2, 1);
let expected_value = DatanodeTableValue {
table_id: 1,
regions: vec![4, 5, 6],
version: 0,
};
let value = manager.get(&to_be_removed_key).await.unwrap().unwrap();
assert_eq!(value, expected_value);
assert!(manager.remove(2, 1).await.is_ok());
assert!(manager.get(&to_be_removed_key).await.unwrap().is_none());
let kv = backend
.get(b"__removed-__dn_table/2/1")
.await
.unwrap()
.unwrap();
assert_eq!(b"__removed-__dn_table/2/1", kv.key());
let value = DatanodeTableValue::try_from_raw_value(kv.value).unwrap();
assert_eq!(value, expected_value);
let values = manager.tables(1).await.unwrap();
assert_eq!(values.len(), 2);
assert_eq!(
values[0],
DatanodeTableValue {
table_id: 1,
regions: vec![1, 2, 3],
version: 0,
}
);
assert_eq!(
values[1],
DatanodeTableValue {
table_id: 2,
regions: vec![4, 5, 6],
version: 0,
}
);
}
#[test]
fn test_serde() {
@@ -445,7 +246,7 @@ mod tests {
let raw_value = value.try_as_raw_value().unwrap();
assert_eq!(raw_value, literal);
let actual = DatanodeTableValue::try_from_raw_value(literal.to_vec()).unwrap();
let actual = DatanodeTableValue::try_from_raw_value(literal).unwrap();
assert_eq!(actual, value);
}


@@ -12,29 +12,69 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashMap;
use std::fmt::Display;
use std::sync::Arc;
use std::time::Duration;
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use futures::stream::BoxStream;
use futures::StreamExt;
use humantime_serde::re::humantime;
use serde::{Deserialize, Serialize};
use snafu::{OptionExt, ResultExt};
use crate::error::{self, Error, InvalidTableMetadataSnafu, Result};
use crate::error::{self, Error, InvalidTableMetadataSnafu, ParseOptionSnafu, Result};
use crate::key::{TableMetaKey, SCHEMA_NAME_KEY_PATTERN, SCHEMA_NAME_KEY_PREFIX};
use crate::kv_backend::KvBackendRef;
use crate::range_stream::{PaginationStream, DEFAULT_PAGE_SIZE};
use crate::rpc::store::{PutRequest, RangeRequest};
use crate::rpc::KeyValue;
const OPT_KEY_TTL: &str = "ttl";
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SchemaNameKey<'a> {
pub catalog: &'a str,
pub schema: &'a str,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct SchemaNameValue;
impl<'a> Default for SchemaNameKey<'a> {
fn default() -> Self {
Self {
catalog: DEFAULT_CATALOG_NAME,
schema: DEFAULT_SCHEMA_NAME,
}
}
}
#[derive(Debug, Default, Clone, PartialEq, Serialize, Deserialize)]
pub struct SchemaNameValue {
#[serde(default)]
#[serde(with = "humantime_serde")]
pub ttl: Option<Duration>,
}
impl TryFrom<&HashMap<String, String>> for SchemaNameValue {
type Error = Error;
fn try_from(value: &HashMap<String, String>) -> std::result::Result<Self, Self::Error> {
let ttl = value
.get(OPT_KEY_TTL)
.map(|ttl_str| {
ttl_str.parse::<humantime::Duration>().map_err(|_| {
ParseOptionSnafu {
key: OPT_KEY_TTL,
value: ttl_str.clone(),
}
.build()
})
})
.transpose()?
.map(|ttl| ttl.into());
Ok(Self { ttl })
}
}
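For reference, `humantime` parses human-readable duration strings such as "10s" or "1h 30m"; a minimal standalone sketch of the conversion the impl above performs (using the `humantime` crate directly):
use std::time::Duration;
let parsed: humantime::Duration = "10s".parse().unwrap();
let ttl: Duration = parsed.into();
assert_eq!(ttl, Duration::from_secs(10));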
impl<'a> SchemaNameKey<'a> {
pub fn new(catalog: &'a str, schema: &'a str) -> Self {
@@ -62,7 +102,7 @@ impl TableMetaKey for SchemaNameKey<'_> {
}
}
/// Decoder `KeyValue` to ({schema},())
/// Decodes `KeyValue` to ({schema},())
pub fn schema_decoder(kv: KeyValue) -> Result<(String, ())> {
let str = std::str::from_utf8(&kv.key).context(error::ConvertRawKeySnafu)?;
let schema_name = SchemaNameKey::try_from(str)?;
@@ -98,17 +138,35 @@ impl SchemaManager {
}
/// Creates `SchemaNameKey`.
pub async fn create(&self, schema: SchemaNameKey<'_>) -> Result<()> {
pub async fn create(
&self,
schema: SchemaNameKey<'_>,
value: Option<SchemaNameValue>,
) -> Result<()> {
let raw_key = schema.as_raw_key();
let req = PutRequest::new()
.with_key(raw_key)
.with_value(SchemaNameValue.try_as_raw_value()?);
.with_value(value.unwrap_or_default().try_as_raw_value()?);
self.kv_backend.put(req).await?;
Ok(())
}
pub async fn exist(&self, schema: SchemaNameKey<'_>) -> Result<bool> {
let raw_key = schema.as_raw_key();
Ok(self.kv_backend.get(&raw_key).await?.is_some())
}
pub async fn get(&self, schema: SchemaNameKey<'_>) -> Result<Option<SchemaNameValue>> {
let raw_key = schema.as_raw_key();
let value = self.kv_backend.get(&raw_key).await?;
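// Legacy values may be the literal JSON "null", so `try_from_raw_value` returns a
// `Result<Option<_>>` here. The inner `transpose` plus `and_then` collapse the nested
// `Option`s, and the outer `transpose` yields the final `Result<Option<SchemaNameValue>>`.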
value
.and_then(|v| SchemaNameValue::try_from_raw_value(v.value.as_ref()).transpose())
.transpose()
}
/// Returns a schema stream that lists all schemas belonging to the target `catalog`.
pub async fn schema_names(&self, catalog: &str) -> BoxStream<'static, Result<String>> {
let start_key = SchemaNameKey::range_start_key(catalog);
@@ -127,16 +185,44 @@ impl SchemaManager {
#[cfg(test)]
mod tests {
use super::*;
use crate::kv_backend::memory::MemoryKvBackend;
#[test]
fn test_serialization() {
let key = SchemaNameKey::new("my-catalog", "my-schema");
assert_eq!(key.to_string(), "__schema_name/my-catalog/my-schema");
let parsed: SchemaNameKey<'_> = "__schema_name/my-catalog/my-schema".try_into().unwrap();
assert_eq!(key, parsed);
let value = SchemaNameValue {
ttl: Some(Duration::from_secs(10)),
};
let mut opts: HashMap<String, String> = HashMap::new();
opts.insert("ttl".to_string(), "10s".to_string());
let from_value = SchemaNameValue::try_from(&opts).unwrap();
assert_eq!(value, from_value);
let parsed = SchemaNameValue::try_from_raw_value("{\"ttl\":\"10s\"}".as_bytes()).unwrap();
assert_eq!(Some(value), parsed);
let none = SchemaNameValue::try_from_raw_value("null".as_bytes()).unwrap();
assert!(none.is_none());
let err_empty = SchemaNameValue::try_from_raw_value("".as_bytes());
assert!(err_empty.is_err());
}
#[tokio::test]
async fn test_key_exist() {
let manager = SchemaManager::new(Arc::new(MemoryKvBackend::default()));
let schema_key = SchemaNameKey::new("my-catalog", "my-schema");
manager.create(schema_key, None).await.unwrap();
assert!(manager.exist(schema_key).await.unwrap());
let wrong_schema_key = SchemaNameKey::new("my-catalog", "my-wrong");
assert!(!manager.exist(wrong_schema_key).await.unwrap());
}
}


@@ -13,14 +13,15 @@
// limitations under the License.
use serde::{Deserialize, Serialize};
use snafu::ensure;
use table::engine::TableReference;
use table::metadata::{RawTableInfo, TableId};
use super::TABLE_INFO_KEY_PREFIX;
use crate::error::{Result, UnexpectedSnafu};
use crate::error::Result;
use crate::key::{to_removed_key, TableMetaKey};
use crate::kv_backend::txn::{Compare, CompareOp, Txn, TxnOp, TxnOpResponse};
use crate::kv_backend::KvBackendRef;
use crate::rpc::store::{CompareAndPutRequest, MoveValueRequest};
use crate::table_name::TableName;
pub struct TableInfoKey {
table_id: TableId,
@@ -51,6 +52,41 @@ impl TableInfoValue {
version: 0,
}
}
pub(crate) fn update(&self, new_table_info: RawTableInfo) -> Self {
Self {
table_info: new_table_info,
version: self.version + 1,
}
}
pub(crate) fn with_update<F>(&self, update: F) -> Self
where
F: FnOnce(&mut RawTableInfo),
{
let mut new_table_info = self.table_info.clone();
update(&mut new_table_info);
Self {
table_info: new_table_info,
version: self.version + 1,
}
}
pub fn table_ref(&self) -> TableReference {
TableReference::full(
&self.table_info.catalog_name,
&self.table_info.schema_name,
&self.table_info.name,
)
}
pub fn table_name(&self) -> TableName {
TableName {
catalog_name: self.table_info.catalog_name.to_string(),
schema_name: self.table_info.schema_name.to_string(),
table_name: self.table_info.name.to_string(),
}
}
}
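A hedged usage sketch of the two update helpers above; both produce a new value with the version bumped (variable names illustrative; the `name` field on `RawTableInfo` appears in the serialized table info later in this diff):
// Replace the whole table info:
let replaced = value.update(new_table_info);
assert_eq!(replaced.version, value.version + 1);
// Edit a copy in place:
let renamed = value.with_update(|info| info.name = "renamed".to_string());
assert_eq!(renamed.version, value.version + 1);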
pub struct TableInfoManager {
@@ -62,203 +98,152 @@ impl TableInfoManager {
Self { kv_backend }
}
pub(crate) fn build_get_txn(
&self,
table_id: TableId,
) -> (
Txn,
impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableInfoValue>>,
) {
let key = TableInfoKey::new(table_id);
let raw_key = key.as_raw_key();
let txn = Txn::new().and_then(vec![TxnOp::Get(raw_key.clone())]);
(txn, Self::build_decode_fn(raw_key))
}
/// Builds a create table info transaction. It expects that `__table_info/{table_id}` is not occupied.
pub(crate) fn build_create_txn(
&self,
table_id: TableId,
table_info_value: &TableInfoValue,
) -> Result<(
Txn,
impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableInfoValue>>,
)> {
let key = TableInfoKey::new(table_id);
let raw_key = key.as_raw_key();
let txn = Txn::new()
.when(vec![Compare::with_not_exist_value(
raw_key.clone(),
CompareOp::Equal,
)])
.and_then(vec![TxnOp::Put(
raw_key.clone(),
table_info_value.try_as_raw_value()?,
)])
.or_else(vec![TxnOp::Get(raw_key.clone())]);
Ok((txn, Self::build_decode_fn(raw_key)))
}
/// Builds an update table info transaction; it expects the remote value to equal `current_table_info_value`.
/// It retrieves the latest value if the comparison fails.
pub(crate) fn build_update_txn(
&self,
table_id: TableId,
current_table_info_value: &TableInfoValue,
new_table_info_value: &TableInfoValue,
) -> Result<(
Txn,
impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableInfoValue>>,
)> {
let key = TableInfoKey::new(table_id);
let raw_key = key.as_raw_key();
let raw_value = current_table_info_value.try_as_raw_value()?;
let txn = Txn::new()
.when(vec![Compare::with_value(
raw_key.clone(),
CompareOp::Equal,
raw_value,
)])
.and_then(vec![TxnOp::Put(
raw_key.clone(),
new_table_info_value.try_as_raw_value()?,
)])
.or_else(vec![TxnOp::Get(raw_key.clone())]);
Ok((txn, Self::build_decode_fn(raw_key)))
}
/// Builds a delete table info transaction.
pub(crate) fn build_delete_txn(
&self,
table_id: TableId,
table_info_value: &TableInfoValue,
) -> Result<Txn> {
let key = TableInfoKey::new(table_id);
let raw_key = key.as_raw_key();
let raw_value = table_info_value.try_as_raw_value()?;
let removed_key = to_removed_key(&String::from_utf8_lossy(&raw_key));
let txn = Txn::new().and_then(vec![
TxnOp::Delete(raw_key),
TxnOp::Put(removed_key.into_bytes(), raw_value),
]);
Ok(txn)
}
fn build_decode_fn(
raw_key: Vec<u8>,
) -> impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableInfoValue>> {
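// The returned closure scans the txn responses for the Get on `raw_key`; for the
// create/update txns a failed compare triggers the `or_else` Get, so callers
// receive the current remote value and can inspect it or retry.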
move |kvs: &Vec<TxnOpResponse>| {
kvs.iter()
.filter_map(|resp| {
if let TxnOpResponse::ResponseGet(r) = resp {
Some(r)
} else {
None
}
})
.flat_map(|r| &r.kvs)
.find(|kv| kv.key == raw_key)
.map(|kv| TableInfoValue::try_from_raw_value(&kv.value))
.transpose()
}
}
#[cfg(test)]
pub async fn get_removed(&self, table_id: TableId) -> Result<Option<TableInfoValue>> {
let key = TableInfoKey::new(table_id).to_string();
let removed_key = to_removed_key(&key).into_bytes();
self.kv_backend
.get(&removed_key)
.await?
.map(|x| TableInfoValue::try_from_raw_value(&x.value))
.transpose()
}
pub async fn get(&self, table_id: TableId) -> Result<Option<TableInfoValue>> {
let key = TableInfoKey::new(table_id);
let raw_key = key.as_raw_key();
self.kv_backend
.get(&raw_key)
.await?
.map(|x| TableInfoValue::try_from_raw_value(x.value))
.map(|x| TableInfoValue::try_from_raw_value(&x.value))
.transpose()
}
/// Create TableInfo key and value. If the key already exists, check if the value is the same.
pub async fn create(&self, table_id: TableId, table_info: &RawTableInfo) -> Result<()> {
let result = self
.compare_and_put(table_id, None, table_info.clone())
.await?;
if let Err(curr) = result {
let Some(curr) = curr else {
return UnexpectedSnafu {
err_msg: format!("compare_and_put expect None but failed with current value None, table_id: {table_id}, table_info: {table_info:?}"),
}.fail();
};
ensure!(
&curr.table_info == table_info,
UnexpectedSnafu {
err_msg: format!(
"TableInfoValue for table {table_id} is updated before it is created!"
)
}
)
}
Ok(())
}
/// Compares and puts the value of a key. `expect` is the expected value; if the backend's
/// current value associated with the key is the same as `expect`, the value is updated to `table_info`.
///
/// - If the compare-and-set operation successfully updates the value, this method returns `Ok(Ok(()))`.
/// - If the associated value is not the same as `expect`, no value is updated and
/// `Ok(Err(Option<TableInfoValue>))` is returned; the `Option<TableInfoValue>` is
/// the current associated value of the key.
/// - If any error happens during the operation, an `Err(Error)` is returned.
pub async fn compare_and_put(
&self,
table_id: TableId,
expect: Option<TableInfoValue>,
table_info: RawTableInfo,
) -> Result<std::result::Result<(), Option<TableInfoValue>>> {
let key = TableInfoKey::new(table_id);
let raw_key = key.as_raw_key();
let (expect, version) = if let Some(x) = expect {
(x.try_as_raw_value()?, x.version + 1)
} else {
(vec![], 0)
};
let value = TableInfoValue {
table_info,
version,
};
let raw_value = value.try_as_raw_value()?;
let req = CompareAndPutRequest::new()
.with_key(raw_key)
.with_expect(expect)
.with_value(raw_value);
let resp = self.kv_backend.compare_and_put(req).await?;
Ok(if resp.success {
Ok(())
} else {
Err(resp
.prev_kv
.map(|x| TableInfoValue::try_from_raw_value(x.value))
.transpose()?)
})
}
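A hypothetical caller-side retry loop built on the CAS contract documented above (illustrative only, not part of this diff):
async fn put_with_retry(
manager: &TableInfoManager,
table_id: TableId,
table_info: RawTableInfo,
) -> Result<()> {
loop {
// Re-read the current value and use it as the CAS expectation.
let expect = manager.get(table_id).await?;
match manager.compare_and_put(table_id, expect, table_info.clone()).await? {
Ok(()) => return Ok(()), // written at the expected version
Err(_current) => continue, // lost the race; re-read and retry
}
}
}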
pub async fn remove(&self, table_id: TableId) -> Result<()> {
let key = TableInfoKey::new(table_id).as_raw_key();
let removed_key = to_removed_key(&String::from_utf8_lossy(&key));
let req = MoveValueRequest::new(key, removed_key.as_bytes());
self.kv_backend.move_value(req).await?;
Ok(())
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::{ColumnSchema, RawSchema, Schema};
use table::metadata::{RawTableMeta, TableIdent, TableType};
use super::*;
use crate::kv_backend::memory::MemoryKvBackend;
use crate::kv_backend::KvBackend;
use crate::rpc::store::PutRequest;
#[test]
fn test_deserialization_compatibility() {
let s = r#"{"version":1,"table_info":{"ident":{"table_id":8714,"version":0},"name":"go_gc_duration_seconds","desc":"Created on insertion","catalog_name":"e87lehzy63d4cloud_docs_test","schema_name":"public","meta":{"schema":{"column_schemas":[{"name":"instance","data_type":{"String":null},"is_nullable":true,"is_time_index":false,"default_constraint":null,"metadata":{}},{"name":"job","data_type":{"String":null},"is_nullable":true,"is_time_index":false,"default_constraint":null,"metadata":{}},{"name":"quantile","data_type":{"String":null},"is_nullable":true,"is_time_index":false,"default_constraint":null,"metadata":{}},{"name":"greptime_timestamp","data_type":{"Timestamp":{"Millisecond":null}},"is_nullable":false,"is_time_index":true,"default_constraint":null,"metadata":{"greptime:time_index":"true"}},{"name":"greptime_value","data_type":{"Float64":{}},"is_nullable":true,"is_time_index":false,"default_constraint":null,"metadata":{}}],"timestamp_index":3,"version":0},"primary_key_indices":[0,1,2],"value_indices":[],"engine":"mito","next_column_id":5,"region_numbers":[],"engine_options":{},"options":{"write_buffer_size":null,"ttl":null,"extra_options":{}},"created_on":"1970-01-01T00:00:00Z"},"table_type":"Base"}}"#;
let v = TableInfoValue::try_from_raw_value(s.as_bytes().to_vec()).unwrap();
let v = TableInfoValue::try_from_raw_value(s.as_bytes()).unwrap();
assert!(v.table_info.meta.partition_key_indices.is_empty());
}
#[tokio::test]
async fn test_table_info_manager() {
let backend = Arc::new(MemoryKvBackend::default());
for i in 1..=3 {
let key = TableInfoKey::new(i).as_raw_key();
let val = TableInfoValue {
table_info: new_table_info(i),
version: 1,
}
.try_as_raw_value()
.unwrap();
let req = PutRequest::new().with_key(key).with_value(val);
backend.put(req).await.unwrap();
}
let manager = TableInfoManager::new(backend.clone());
assert!(manager.create(99, &new_table_info(99)).await.is_ok());
assert!(manager.create(99, &new_table_info(99)).await.is_ok());
let result = manager.create(99, &new_table_info(88)).await;
let err_msg = result.unwrap_err().to_string();
assert!(err_msg
.contains("Unexpected: TableInfoValue for table 99 is updated before it is created!"));
let val = manager.get(1).await.unwrap().unwrap();
assert_eq!(
val,
TableInfoValue {
table_info: new_table_info(1),
version: 1,
}
);
assert!(manager.get(4).await.unwrap().is_none());
// test cas failed, current value is not set
let table_info = new_table_info(4);
let result = manager
.compare_and_put(
4,
Some(TableInfoValue {
table_info: table_info.clone(),
version: 0,
}),
table_info.clone(),
)
.await
.unwrap();
assert!(result.unwrap_err().is_none());
let result = manager
.compare_and_put(4, None, table_info.clone())
.await
.unwrap();
assert!(result.is_ok());
// test cas failed, the new table info is not set
let new_table_info = new_table_info(4);
let result = manager
.compare_and_put(4, None, new_table_info.clone())
.await
.unwrap();
let actual = result.unwrap_err().unwrap();
assert_eq!(
actual,
TableInfoValue {
table_info: table_info.clone(),
version: 0,
}
);
// test cas success
let result = manager
.compare_and_put(4, Some(actual), new_table_info.clone())
.await
.unwrap();
assert!(result.is_ok());
assert!(manager.remove(4).await.is_ok());
let kv = backend
.get(b"__removed-__table_info/4")
.await
.unwrap()
.unwrap();
assert_eq!(b"__removed-__table_info/4", kv.key.as_slice());
let value = TableInfoValue::try_from_raw_value(kv.value).unwrap();
assert_eq!(value.table_info, new_table_info);
assert_eq!(value.version, 1);
}
#[test]
fn test_key_serde() {
let key = TableInfoKey::new(42);
@@ -273,7 +258,7 @@ mod tests {
version: 1,
};
let serialized = value.try_as_raw_value().unwrap();
let deserialized = TableInfoValue::try_from_raw_value(serialized).unwrap();
let deserialized = TableInfoValue::try_from_raw_value(&serialized).unwrap();
assert_eq!(value, deserialized);
}


@@ -15,19 +15,16 @@
use std::sync::Arc;
use serde::{Deserialize, Serialize};
use snafu::{ensure, OptionExt};
use snafu::OptionExt;
use table::metadata::TableId;
use super::{TABLE_NAME_KEY_PATTERN, TABLE_NAME_KEY_PREFIX};
use crate::error::{
Error, InvalidTableMetadataSnafu, RenameTableSnafu, Result, TableAlreadyExistsSnafu,
TableNotExistSnafu, UnexpectedSnafu,
};
use crate::error::{Error, InvalidTableMetadataSnafu, Result};
use crate::key::{to_removed_key, TableMetaKey};
use crate::kv_backend::memory::MemoryKvBackend;
use crate::kv_backend::txn::{Compare, CompareOp, Txn, TxnOp};
use crate::kv_backend::txn::{Txn, TxnOp};
use crate::kv_backend::KvBackendRef;
use crate::rpc::store::{CompareAndPutRequest, MoveValueRequest, RangeRequest};
use crate::rpc::store::RangeRequest;
use crate::table_name::TableName;
#[derive(Debug, Clone, Copy)]
@@ -150,95 +147,57 @@ impl TableNameManager {
Self { kv_backend }
}
/// Create TableName key and value. If the key already exists, check if the value is the same.
pub async fn create(&self, key: &TableNameKey<'_>, table_id: TableId) -> Result<()> {
/// Builds a create table name transaction. It only executes when the primary key comparisons succeed.
pub(crate) fn build_create_txn(
&self,
key: &TableNameKey<'_>,
table_id: TableId,
) -> Result<Txn> {
let raw_key = key.as_raw_key();
let value = TableNameValue::new(table_id);
let raw_value = value.try_as_raw_value()?;
let req = CompareAndPutRequest::new()
.with_key(raw_key)
.with_value(raw_value);
let result = self.kv_backend.compare_and_put(req).await?;
if !result.success {
let Some(curr) = result
.prev_kv
.map(|x| TableNameValue::try_from_raw_value(x.value))
.transpose()?
else {
return UnexpectedSnafu {
err_msg: format!("compare_and_put expect None but failed with current value None, key: {key}, value: {value:?}"),
}.fail();
};
ensure!(
curr.table_id == table_id,
TableAlreadyExistsSnafu {
table_id: curr.table_id
}
);
}
Ok(())
let txn = Txn::new().and_then(vec![TxnOp::Put(raw_key, raw_value)]);
Ok(txn)
}
/// Rename a TableNameKey to a new table name. Will check whether the TableNameValue matches the
/// `expected_table_id` first. Can be executed again if the first invocation is successful.
pub async fn rename(
/// Builds an update table name transaction. It only executes when the primary key comparisons succeed.
pub(crate) fn build_update_txn(
&self,
key: TableNameKey<'_>,
expected_table_id: TableId,
new_table_name: &str,
) -> Result<()> {
let new_key = TableNameKey::new(key.catalog, key.schema, new_table_name);
key: &TableNameKey<'_>,
new_key: &TableNameKey<'_>,
table_id: TableId,
) -> Result<Txn> {
let raw_key = key.as_raw_key();
let new_raw_key = new_key.as_raw_key();
let value = TableNameValue::new(table_id);
let raw_value = value.try_as_raw_value()?;
if let Some(value) = self.get(key).await? {
ensure!(
value.table_id == expected_table_id,
RenameTableSnafu {
reason: format!(
"the input table name '{}' and id '{expected_table_id}' not match",
Into::<TableName>::into(key)
),
}
);
let txn = Txn::new().and_then(vec![
TxnOp::Delete(raw_key),
TxnOp::Put(new_raw_key, raw_value),
]);
Ok(txn)
}
let txn = Txn::new()
.when(vec![
Compare::with_value(
key.as_raw_key(),
CompareOp::Equal,
value.try_as_raw_value()?,
),
Compare::with_not_exist_value(new_key.as_raw_key(), CompareOp::Equal),
])
.and_then(vec![
TxnOp::Delete(key.as_raw_key()),
TxnOp::Put(new_key.as_raw_key(), value.try_as_raw_value()?),
]);
/// Builds a delete table name transaction. It only executes when the primary key comparisons succeed.
pub(crate) fn build_delete_txn(
&self,
key: &TableNameKey<'_>,
table_id: TableId,
) -> Result<Txn> {
let raw_key = key.as_raw_key();
let value = TableNameValue::new(table_id);
let raw_value = value.try_as_raw_value()?;
let removed_key = to_removed_key(&String::from_utf8_lossy(&raw_key));
let resp = self.kv_backend.txn(txn).await?;
ensure!(
resp.succeeded,
RenameTableSnafu {
reason: format!("txn failed with response: {:?}", resp.responses)
}
);
} else {
let Some(value) = self.get(new_key).await? else {
// If we can't get the table by its original name, nor by its altered
// name, then the table must not have existed in the first place.
return TableNotExistSnafu {
table_name: TableName::from(key).to_string(),
}
.fail();
};
let txn = Txn::new().and_then(vec![
TxnOp::Delete(raw_key),
TxnOp::Put(removed_key.into_bytes(), raw_value),
]);
ensure!(
value.table_id == expected_table_id,
TableAlreadyExistsSnafu {
table_id: value.table_id
}
);
}
Ok(())
Ok(txn)
}
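A hedged sketch of driving one of these builders end to end (hypothetical caller; `manager` and `kv_backend` assumed in scope):
let txn = manager.build_delete_txn(&key, table_id)?;
let resp = kv_backend.txn(txn).await?;
// On success the name key is deleted and its value is parked under the
// `__removed-` prefix, mirroring the behavior of the removed `remove` method.
assert!(resp.succeeded);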
pub async fn get(&self, key: TableNameKey<'_>) -> Result<Option<TableNameValue>> {
@@ -246,10 +205,15 @@ impl TableNameManager {
self.kv_backend
.get(&raw_key)
.await?
.map(|x| TableNameValue::try_from_raw_value(x.value))
.map(|x| TableNameValue::try_from_raw_value(&x.value))
.transpose()
}
pub async fn exists(&self, key: TableNameKey<'_>) -> Result<bool> {
let raw_key = key.as_raw_key();
self.kv_backend.exists(&raw_key).await
}
pub async fn tables(
&self,
catalog: &str,
@@ -263,95 +227,17 @@ impl TableNameManager {
for kv in resp.kvs {
res.push((
TableNameKey::strip_table_name(kv.key())?,
TableNameValue::try_from_raw_value(kv.value)?,
TableNameValue::try_from_raw_value(&kv.value)?,
))
}
Ok(res)
}
pub async fn remove(&self, key: TableNameKey<'_>) -> Result<()> {
let raw_key = key.as_raw_key();
let removed_key = to_removed_key(&String::from_utf8_lossy(&raw_key));
let req = MoveValueRequest::new(raw_key, removed_key.as_bytes());
let _ = self.kv_backend.move_value(req).await?;
Ok(())
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use super::*;
use crate::kv_backend::memory::MemoryKvBackend;
use crate::kv_backend::KvBackend;
#[tokio::test]
async fn test_table_name_manager() {
let backend = Arc::new(MemoryKvBackend::default());
let manager = TableNameManager::new(backend.clone());
for i in 1..=3 {
let table_name = format!("table_{}", i);
let key = TableNameKey::new("my_catalog", "my_schema", &table_name);
assert!(manager.create(&key, i).await.is_ok());
}
let key = TableNameKey::new("my_catalog", "my_schema", "my_table");
assert!(manager.create(&key, 99).await.is_ok());
assert!(manager.create(&key, 99).await.is_ok());
let result = manager.create(&key, 9).await;
let err_msg = result.unwrap_err().to_string();
assert!(err_msg.contains("Table already exists, table_id: 99"));
let value = manager.get(key).await.unwrap().unwrap();
assert_eq!(value.table_id(), 99);
let not_existed = TableNameKey::new("x", "y", "z");
assert!(manager.get(not_existed).await.unwrap().is_none());
assert!(manager.remove(key).await.is_ok());
let kv = backend
.get(b"__removed-__table_name/my_catalog/my_schema/my_table")
.await
.unwrap()
.unwrap();
let value = TableNameValue::try_from_raw_value(kv.value).unwrap();
assert_eq!(value.table_id(), 99);
let key = TableNameKey::new("my_catalog", "my_schema", "table_1");
assert!(manager.rename(key, 1, "table_1_new").await.is_ok());
assert!(manager.rename(key, 1, "table_1_new").await.is_ok());
let result = manager.rename(key, 2, "table_1_new").await;
let err_msg = result.unwrap_err().to_string();
assert!(err_msg.contains("Table already exists, table_id: 1"));
let result = manager
.rename(
TableNameKey::new("my_catalog", "my_schema", "table_2"),
22,
"table_2_new",
)
.await;
let err_msg = result.unwrap_err().to_string();
assert!(err_msg.contains("Failed to rename table, reason: the input table name 'my_catalog.my_schema.table_2' and id '22' not match"));
let result = manager.rename(not_existed, 1, "zz").await;
let err_msg = result.unwrap_err().to_string();
assert!(err_msg.contains("Table does not exist, table_name: x.y.z"));
let tables = manager.tables("my_catalog", "my_schema").await.unwrap();
assert_eq!(tables.len(), 3);
assert_eq!(
tables,
vec![
("table_1_new".to_string(), TableNameValue::new(1)),
("table_2".to_string(), TableNameValue::new(2)),
("table_3".to_string(), TableNameValue::new(3))
]
)
}
#[test]
fn test_strip_table_name() {
@@ -397,9 +283,6 @@ mod tests {
let literal = br#"{"table_id":1}"#;
assert_eq!(value.try_as_raw_value().unwrap(), literal);
assert_eq!(
TableNameValue::try_from_raw_value(literal.to_vec()).unwrap(),
value
);
assert_eq!(TableNameValue::try_from_raw_value(literal).unwrap(), value);
}
}


@@ -15,19 +15,21 @@
use std::collections::BTreeMap;
use serde::{Deserialize, Serialize};
use snafu::ensure;
use snafu::ResultExt;
use store_api::storage::RegionNumber;
use table::metadata::TableId;
use super::TABLE_REGION_KEY_PREFIX;
use crate::error::{Result, UnexpectedSnafu};
use crate::key::{to_removed_key, TableMetaKey};
use crate::kv_backend::KvBackendRef;
use crate::rpc::store::{CompareAndPutRequest, MoveValueRequest};
use crate::DatanodeId;
use crate::error::{Result, SerdeJsonSnafu};
use crate::key::TableMetaKey;
use crate::{impl_table_meta_key, impl_table_meta_value, DatanodeId};
pub type RegionDistribution = BTreeMap<DatanodeId, Vec<RegionNumber>>;
#[deprecated(
since = "0.4.0",
note = "Please use the TableRouteManager's get_region_distribution method instead"
)]
pub struct TableRegionKey {
table_id: TableId,
}
@@ -44,6 +46,12 @@ impl TableMetaKey for TableRegionKey {
}
}
impl_table_meta_key! {TableRegionKey}
#[deprecated(
since = "0.4.0",
note = "Please use the TableRouteManager's get_region_distribution method instead"
)]
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct TableRegionValue {
pub region_distribution: RegionDistribution,
@@ -59,196 +67,12 @@ impl TableRegionValue {
}
}
pub struct TableRegionManager {
kv_backend: KvBackendRef,
}
impl TableRegionManager {
pub fn new(kv_backend: KvBackendRef) -> Self {
Self { kv_backend }
}
pub async fn get(&self, table_id: TableId) -> Result<Option<TableRegionValue>> {
let key = TableRegionKey::new(table_id);
let raw_key = key.as_raw_key();
self.kv_backend
.get(&raw_key)
.await?
.map(|x| TableRegionValue::try_from_raw_value(x.value))
.transpose()
}
/// Create TableRegion key and value. If the key already exists, check if the value is the same.
pub async fn create(
&self,
table_id: TableId,
region_distribution: &RegionDistribution,
) -> Result<()> {
let result = self
.compare_and_put(table_id, None, region_distribution.clone())
.await?;
if let Err(curr) = result {
let Some(curr) = curr else {
return UnexpectedSnafu {
err_msg: format!("compare_and_put expect None but failed with current value None, table_id: {table_id}, region_distribution: {region_distribution:?}"),
}.fail();
};
ensure!(
&curr.region_distribution == region_distribution,
UnexpectedSnafu {
err_msg: format!(
"TableRegionValue for table {table_id} is updated before it is created!"
)
}
)
}
Ok(())
}
/// Compares and puts the value of a key. `expect` is the expected value; if the backend's
/// current value associated with the key is the same as `expect`, the value is updated to `region_distribution`.
///
/// - If the compare-and-set operation successfully updates the value, this method returns `Ok(Ok(()))`.
/// - If the associated value is not the same as `expect`, no value is updated and
/// `Ok(Err(Option<TableRegionValue>))` is returned; the `Option<TableRegionValue>` is
/// the current associated value of the key.
/// - If any error happens during the operation, an `Err(Error)` is returned.
pub async fn compare_and_put(
&self,
table_id: TableId,
expect: Option<TableRegionValue>,
region_distribution: RegionDistribution,
) -> Result<std::result::Result<(), Option<TableRegionValue>>> {
let key = TableRegionKey::new(table_id);
let raw_key = key.as_raw_key();
let (expect, version) = if let Some(x) = expect {
(x.try_as_raw_value()?, x.version + 1)
} else {
(vec![], 0)
};
let value = TableRegionValue {
region_distribution,
version,
};
let raw_value = value.try_as_raw_value()?;
let req = CompareAndPutRequest::new()
.with_key(raw_key)
.with_expect(expect)
.with_value(raw_value);
let resp = self.kv_backend.compare_and_put(req).await?;
Ok(if resp.success {
Ok(())
} else {
Err(resp
.prev_kv
.map(|x| TableRegionValue::try_from_raw_value(x.value))
.transpose()?)
})
}
pub async fn remove(&self, table_id: TableId) -> Result<Option<TableRegionValue>> {
let key = TableRegionKey::new(table_id).as_raw_key();
let remove_key = to_removed_key(&String::from_utf8_lossy(&key));
let req = MoveValueRequest::new(key, remove_key.as_bytes());
let resp = self.kv_backend.move_value(req).await?;
resp.0
.map(|x| TableRegionValue::try_from_raw_value(x.value))
.transpose()
}
}
impl_table_meta_value! {TableRegionValue}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use super::*;
use crate::kv_backend::memory::MemoryKvBackend;
use crate::kv_backend::KvBackend;
#[tokio::test]
async fn test_table_region_manager() {
let backend = Arc::new(MemoryKvBackend::default());
let manager = TableRegionManager::new(backend.clone());
let region_distribution =
RegionDistribution::from([(1, vec![1, 2, 3]), (2, vec![4, 5, 6])]);
let new_region_distribution =
RegionDistribution::from([(1, vec![4, 5, 6]), (2, vec![1, 2, 3])]);
let result = manager
.compare_and_put(1, None, region_distribution.clone())
.await
.unwrap();
assert!(result.is_ok());
let curr = manager
.compare_and_put(1, None, new_region_distribution.clone())
.await
.unwrap()
.unwrap_err()
.unwrap();
assert_eq!(
curr,
TableRegionValue {
region_distribution: region_distribution.clone(),
version: 0
}
);
assert!(manager
.compare_and_put(1, Some(curr), new_region_distribution.clone())
.await
.unwrap()
.is_ok());
assert!(manager.create(99, &region_distribution).await.is_ok());
assert!(manager.create(99, &region_distribution).await.is_ok());
let result = manager.create(99, &new_region_distribution).await;
let err_msg = result.unwrap_err().to_string();
assert!(err_msg.contains("TableRegionValue for table 99 is updated before it is created!"));
let value = manager.get(1).await.unwrap().unwrap();
assert_eq!(
value,
TableRegionValue {
region_distribution: new_region_distribution.clone(),
version: 1
}
);
let value = manager.get(99).await.unwrap().unwrap();
assert_eq!(
value,
TableRegionValue {
region_distribution,
version: 0
}
);
assert!(manager.get(2).await.unwrap().is_none());
let value = manager.remove(1).await.unwrap().unwrap();
assert_eq!(
value,
TableRegionValue {
region_distribution: new_region_distribution.clone(),
version: 1
}
);
assert!(manager.remove(123).await.unwrap().is_none());
let kv = backend
.get(b"__removed-__table_region/1")
.await
.unwrap()
.unwrap();
assert_eq!(b"__removed-__table_region/1", kv.key.as_slice());
let value = TableRegionValue::try_from_raw_value(kv.value).unwrap();
assert_eq!(value.region_distribution, new_region_distribution);
assert_eq!(value.version, 1);
}
#[test]
fn test_serde() {
@@ -264,7 +88,7 @@ mod tests {
assert_eq!(value.try_as_raw_value().unwrap(), literal);
assert_eq!(
TableRegionValue::try_from_raw_value(literal.to_vec()).unwrap(),
TableRegionValue::try_from_raw_value(literal).unwrap(),
value,
);
}


@@ -15,12 +15,212 @@
use std::fmt::Display;
use api::v1::meta::TableName;
use serde::{Deserialize, Serialize};
use table::metadata::TableId;
use crate::key::to_removed_key;
use crate::error::Result;
use crate::key::{to_removed_key, RegionDistribution, TableMetaKey};
use crate::kv_backend::txn::{Compare, CompareOp, Txn, TxnOp, TxnOpResponse};
use crate::kv_backend::KvBackendRef;
use crate::rpc::router::{region_distribution, RegionRoute};
pub const TABLE_ROUTE_PREFIX: &str = "__meta_table_route";
pub const NEXT_TABLE_ROUTE_PREFIX: &str = "__table_route";
// TODO(weny): Rename it to TableRouteKey.
pub struct NextTableRouteKey {
pub table_id: TableId,
}
impl NextTableRouteKey {
pub fn new(table_id: TableId) -> Self {
Self { table_id }
}
}
#[derive(Debug, PartialEq, Serialize, Deserialize, Clone)]
pub struct TableRouteValue {
pub region_routes: Vec<RegionRoute>,
version: u64,
}
impl TableRouteValue {
pub fn new(region_routes: Vec<RegionRoute>) -> Self {
Self {
region_routes,
version: 0,
}
}
pub fn update(&self, region_routes: Vec<RegionRoute>) -> Self {
Self {
region_routes,
version: self.version + 1,
}
}
}
impl TableMetaKey for NextTableRouteKey {
fn as_raw_key(&self) -> Vec<u8> {
self.to_string().into_bytes()
}
}
impl Display for NextTableRouteKey {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}/{}", NEXT_TABLE_ROUTE_PREFIX, self.table_id)
}
}
pub struct TableRouteManager {
kv_backend: KvBackendRef,
}
impl TableRouteManager {
pub fn new(kv_backend: KvBackendRef) -> Self {
Self { kv_backend }
}
pub(crate) fn build_get_txn(
&self,
table_id: TableId,
) -> (
Txn,
impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableRouteValue>>,
) {
let key = NextTableRouteKey::new(table_id);
let raw_key = key.as_raw_key();
let txn = Txn::new().and_then(vec![TxnOp::Get(raw_key.clone())]);
(txn, Self::build_decode_fn(raw_key))
}
/// Builds a create table route transaction. It expects that `__table_route/{table_id}` is not occupied.
pub(crate) fn build_create_txn(
&self,
table_id: TableId,
table_route_value: &TableRouteValue,
) -> Result<(
Txn,
impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableRouteValue>>,
)> {
let key = NextTableRouteKey::new(table_id);
let raw_key = key.as_raw_key();
let txn = Txn::new()
.when(vec![Compare::with_not_exist_value(
raw_key.clone(),
CompareOp::Equal,
)])
.and_then(vec![TxnOp::Put(
raw_key.clone(),
table_route_value.try_as_raw_value()?,
)])
.or_else(vec![TxnOp::Get(raw_key.clone())]);
Ok((txn, Self::build_decode_fn(raw_key)))
}
/// Builds an update table route transaction; it expects the remote value to equal `current_table_route_value`.
/// It retrieves the latest value if the comparison fails.
pub(crate) fn build_update_txn(
&self,
table_id: TableId,
current_table_route_value: &TableRouteValue,
new_table_route_value: &TableRouteValue,
) -> Result<(
Txn,
impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableRouteValue>>,
)> {
let key = NextTableRouteKey::new(table_id);
let raw_key = key.as_raw_key();
let raw_value = current_table_route_value.try_as_raw_value()?;
let new_raw_value: Vec<u8> = new_table_route_value.try_as_raw_value()?;
let txn = Txn::new()
.when(vec![Compare::with_value(
raw_key.clone(),
CompareOp::Equal,
raw_value,
)])
.and_then(vec![TxnOp::Put(raw_key.clone(), new_raw_value)])
.or_else(vec![TxnOp::Get(raw_key.clone())]);
Ok((txn, Self::build_decode_fn(raw_key)))
}
/// Builds a delete table route transaction; it expects the remote value to equal `table_route_value`.
pub(crate) fn build_delete_txn(
&self,
table_id: TableId,
table_route_value: &TableRouteValue,
) -> Result<Txn> {
let key = NextTableRouteKey::new(table_id);
let raw_key = key.as_raw_key();
let raw_value = table_route_value.try_as_raw_value()?;
let removed_key = to_removed_key(&String::from_utf8_lossy(&raw_key));
let txn = Txn::new().and_then(vec![
TxnOp::Delete(raw_key),
TxnOp::Put(removed_key.into_bytes(), raw_value),
]);
Ok(txn)
}
fn build_decode_fn(
raw_key: Vec<u8>,
) -> impl FnOnce(&Vec<TxnOpResponse>) -> Result<Option<TableRouteValue>> {
move |response: &Vec<TxnOpResponse>| {
response
.iter()
.filter_map(|resp| {
if let TxnOpResponse::ResponseGet(r) = resp {
Some(r)
} else {
None
}
})
.flat_map(|r| &r.kvs)
.find(|kv| kv.key == raw_key)
.map(|kv| TableRouteValue::try_from_raw_value(&kv.value))
.transpose()
}
}
pub async fn get(&self, table_id: TableId) -> Result<Option<TableRouteValue>> {
let key = NextTableRouteKey::new(table_id);
self.kv_backend
.get(&key.as_raw_key())
.await?
.map(|kv| TableRouteValue::try_from_raw_value(&kv.value))
.transpose()
}
#[cfg(test)]
pub async fn get_removed(&self, table_id: TableId) -> Result<Option<TableRouteValue>> {
let key = NextTableRouteKey::new(table_id).to_string();
let removed_key = to_removed_key(&key).into_bytes();
self.kv_backend
.get(&removed_key)
.await?
.map(|x| TableRouteValue::try_from_raw_value(&x.value))
.transpose()
}
pub async fn get_region_distribution(
&self,
table_id: TableId,
) -> Result<Option<RegionDistribution>> {
self.get(table_id)
.await?
.map(|table_route| region_distribution(&table_route.region_routes))
.transpose()
}
}
#[deprecated(since = "0.4.0", note = "Please use the NextTableRouteKey instead")]
#[derive(Copy, Clone)]
pub struct TableRouteKey<'a> {
pub table_id: TableId,
@@ -59,7 +259,8 @@ impl<'a> Display for TableRouteKey<'a> {
#[cfg(test)]
mod tests {
use api::v1::meta::TableName;
use api::v1::meta::TableName as PbTableName;
use super::TableRouteKey;
@@ -87,7 +288,7 @@ mod tests {
#[test]
fn test_with_table_name() {
let table_name = TableName {
let table_name = PbTableName {
catalog_name: "greptime".to_string(),
schema_name: "public".to_string(),
table_name: "demo".to_string(),


@@ -100,6 +100,14 @@ pub struct TxnRequest {
pub failure: Vec<TxnOp>,
}
impl TxnRequest {
pub fn extend(&mut self, other: TxnRequest) {
self.compare.extend(other.compare);
self.success.extend(other.success);
self.failure.extend(other.failure);
}
}
#[derive(Debug, Clone, PartialEq)]
pub enum TxnOpResponse {
ResponsePut(PutResponse),
@@ -121,6 +129,23 @@ pub struct Txn {
}
impl Txn {
pub fn merge_all<T: IntoIterator<Item = Txn>>(values: T) -> Self {
values
.into_iter()
.reduce(|acc, e| acc.merge(e))
.unwrap_or_default()
}
pub fn merge(mut self, other: Txn) -> Self {
self.c_when |= other.c_when;
self.c_then |= other.c_then;
self.c_else |= other.c_else;
self.req.extend(other.req);
self
}
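A minimal usage sketch (hypothetical keys and values) showing how independent per-key transactions combine into one atomic request:
let t1 = Txn::new().and_then(vec![TxnOp::Put(b"k1".to_vec(), b"v1".to_vec())]);
let t2 = Txn::new().and_then(vec![TxnOp::Put(b"k2".to_vec(), b"v2".to_vec())]);
let combined = Txn::merge_all(vec![t1, t2]); // both puts commit together or not at all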
pub fn new() -> Self {
Txn::default()
}


@@ -16,6 +16,8 @@
pub mod error;
pub mod heartbeat;
// TODO(weny): Remove it
#[allow(deprecated)]
pub mod helper;
pub mod ident;
pub mod instruction;


@@ -12,19 +12,21 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::{HashMap, HashSet};
use std::collections::{BTreeMap, HashMap, HashSet};
use api::v1::meta::{
Partition as PbPartition, Peer as PbPeer, Region as PbRegion, RegionRoute as PbRegionRoute,
RouteRequest as PbRouteRequest, RouteResponse as PbRouteResponse, Table as PbTable,
TableId as PbTableId, TableRoute as PbTableRoute, TableRouteValue as PbTableRouteValue,
};
use serde::{Deserialize, Serialize, Serializer};
use serde::ser::SerializeSeq;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use snafu::OptionExt;
use store_api::storage::{RegionId, RegionNumber};
use table::metadata::TableId;
use crate::error::{self, Result};
use crate::key::RegionDistribution;
use crate::peer::Peer;
use crate::rpc::util;
use crate::table_name::TableName;
@@ -72,6 +74,27 @@ impl TryFrom<PbRouteResponse> for RouteResponse {
}
}
pub fn region_distribution(region_routes: &[RegionRoute]) -> Result<RegionDistribution> {
let mut regions_id_map = RegionDistribution::new();
for route in region_routes.iter() {
let node_id = route
.leader_peer
.as_ref()
.context(error::UnexpectedSnafu {
err_msg: "leader not found",
})?
.id;
let region_id = route.region.id.region_number();
regions_id_map.entry(node_id).or_default().push(region_id);
}
for (_, regions) in regions_id_map.iter_mut() {
// id asc
regions.sort()
}
Ok(regions_id_map)
}
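// For intuition (hypothetical routes): regions 1 and 3 led by datanode 1 and region 2
// led by datanode 2 produce RegionDistribution::from([(1, vec![1, 3]), (2, vec![2])]);
// a route with no leader_peer is an error.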
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq)]
pub struct TableRoute {
pub table: Table,
@@ -79,13 +102,47 @@ pub struct TableRoute {
region_leaders: HashMap<RegionNumber, Option<Peer>>,
}
pub fn find_leaders(region_routes: &[RegionRoute]) -> HashSet<Peer> {
region_routes
.iter()
.flat_map(|x| &x.leader_peer)
.cloned()
.collect()
}
pub fn find_leader_regions(region_routes: &[RegionRoute], datanode: &Peer) -> Vec<RegionNumber> {
region_routes
.iter()
.filter_map(|x| {
if let Some(peer) = &x.leader_peer {
if peer == datanode {
return Some(x.region.id.region_number());
}
}
None
})
.collect()
}
pub fn extract_all_peers(region_routes: &[RegionRoute]) -> Vec<Peer> {
let mut peers = region_routes
.iter()
.flat_map(|x| x.leader_peer.iter().chain(x.follower_peers.iter()))
.collect::<HashSet<_>>()
.into_iter()
.cloned()
.collect::<Vec<_>>();
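// The HashSet dedups peers but leaves iteration order unspecified; sort by id
// so callers get a deterministic list.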
peers.sort_by_key(|x| x.id);
peers
}
impl TableRoute {
pub fn new(table: Table, region_routes: Vec<RegionRoute>) -> Self {
let region_leaders = region_routes
.iter()
.map(|x| (x.region.id.region_number(), x.leader_peer.clone()))
.collect::<HashMap<_, _>>();
Self {
table,
region_routes,
@@ -193,25 +250,11 @@ impl TableRoute {
}
pub fn find_leaders(&self) -> HashSet<Peer> {
self.region_routes
.iter()
.flat_map(|x| &x.leader_peer)
.cloned()
.collect()
find_leaders(&self.region_routes)
}
pub fn find_leader_regions(&self, datanode: &Peer) -> Vec<RegionNumber> {
self.region_routes
.iter()
.filter_map(|x| {
if let Some(peer) = &x.leader_peer {
if peer == datanode {
return Some(x.region.id.region_number());
}
}
None
})
.collect()
find_leader_regions(&self.region_routes, datanode)
}
pub fn find_region_leader(&self, region_number: RegionNumber) -> Option<&Peer> {
@@ -223,6 +266,7 @@ impl TableRoute {
impl TryFrom<PbTableRouteValue> for TableRoute {
type Error = error::Error;
fn try_from(pb: PbTableRouteValue) -> Result<Self> {
TableRoute::try_from_raw(
&pb.peers,
@@ -233,11 +277,24 @@ impl TryFrom<PbTableRouteValue> for TableRoute {
}
}
impl TryFrom<TableRoute> for PbTableRouteValue {
type Error = error::Error;
fn try_from(table_route: TableRoute) -> Result<Self> {
let (peers, table_route) = table_route.try_into_raw()?;
Ok(PbTableRouteValue {
peers,
table_route: Some(table_route),
})
}
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq)]
pub struct Table {
pub id: u64,
pub table_name: TableName,
#[serde(serialize_with = "as_utf8")]
#[serde(serialize_with = "as_utf8", deserialize_with = "from_utf8")]
pub table_schema: Vec<u8>,
}
@@ -281,7 +338,7 @@ pub struct Region {
pub id: RegionId,
pub name: String,
pub partition: Option<Partition>,
pub attrs: HashMap<String, String>,
pub attrs: BTreeMap<String, String>,
}
impl From<PbRegion> for Region {
@@ -290,7 +347,7 @@ impl From<PbRegion> for Region {
id: r.id.into(),
name: r.name,
partition: r.partition.map(Into::into),
attrs: r.attrs,
attrs: r.attrs.into_iter().collect::<BTreeMap<_, _>>(),
}
}
}
@@ -301,16 +358,16 @@ impl From<Region> for PbRegion {
id: region.id.into(),
name: region.name,
partition: region.partition.map(Into::into),
attrs: region.attrs,
attrs: region.attrs.into_iter().collect::<HashMap<_, _>>(),
}
}
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq)]
pub struct Partition {
#[serde(serialize_with = "as_utf8_vec")]
#[serde(serialize_with = "as_utf8_vec", deserialize_with = "from_utf8_vec")]
pub column_list: Vec<Vec<u8>>,
#[serde(serialize_with = "as_utf8_vec")]
#[serde(serialize_with = "as_utf8_vec", deserialize_with = "from_utf8_vec")]
pub value_list: Vec<Vec<u8>>,
}
@@ -322,19 +379,37 @@ fn as_utf8<S: Serializer>(val: &[u8], serializer: S) -> std::result::Result<S::O
)
}
pub fn from_utf8<'de, D>(deserializer: D) -> std::result::Result<Vec<u8>, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
Ok(s.into_bytes())
}
fn as_utf8_vec<S: Serializer>(
val: &[Vec<u8>],
serializer: S,
) -> std::result::Result<S::Ok, S::Error> {
serializer.serialize_str(
val.iter()
.map(|v| {
String::from_utf8(v.clone()).unwrap_or_else(|_| "<unknown-not-UTF8>".to_string())
})
.collect::<Vec<String>>()
.join(",")
.as_str(),
)
let mut seq = serializer.serialize_seq(Some(val.len()))?;
for v in val {
seq.serialize_element(&String::from_utf8_lossy(v))?;
}
seq.end()
}
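// Serializing as a real JSON sequence (instead of the old comma-joined string) keeps
// values that themselves contain ',' round-trippable; `test_de_serialize_partition`
// below exercises exactly that case.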
pub fn from_utf8_vec<'de, D>(deserializer: D) -> std::result::Result<Vec<Vec<u8>>, D::Error>
where
D: Deserializer<'de>,
{
let values = Vec::<String>::deserialize(deserializer)?;
let values = values
.into_iter()
.map(|value| value.into_bytes())
.collect::<Vec<_>>();
Ok(values)
}
impl From<Partition> for PbPartition {
@@ -365,6 +440,19 @@ mod tests {
use super::*;
#[test]
fn test_de_serialize_partition() {
let p = Partition {
column_list: vec![b"a".to_vec(), b"b".to_vec()],
value_list: vec![b"hi".to_vec(), b",".to_vec()],
};
let output = serde_json::to_string(&p).unwrap();
let got: Partition = serde_json::from_str(&output).unwrap();
assert_eq!(got, p);
}
#[test]
fn test_route_request_trans() {
let req = RouteRequest {
@@ -513,7 +601,7 @@ mod tests {
id: 1.into(),
name: "r1".to_string(),
partition: None,
attrs: HashMap::new(),
attrs: BTreeMap::new(),
},
leader_peer: Some(Peer::new(2, "a2")),
follower_peers: vec![Peer::new(1, "a1"), Peer::new(3, "a3")],
@@ -523,7 +611,7 @@ mod tests {
id: 2.into(),
name: "r2".to_string(),
partition: None,
attrs: HashMap::new(),
attrs: BTreeMap::new(),
},
leader_peer: Some(Peer::new(1, "a1")),
follower_peers: vec![Peer::new(2, "a2"), Peer::new(3, "a3")],


@@ -612,9 +612,9 @@ impl TryFrom<PbCompareAndPutResponse> for CompareAndPutResponse {
}
impl CompareAndPutResponse {
pub fn handle<R, E, F>(&self, f: F) -> std::result::Result<R, E>
pub fn handle<R, E, F>(self, f: F) -> std::result::Result<R, E>
where
F: FnOnce(&Self) -> std::result::Result<R, E>,
F: FnOnce(Self) -> std::result::Result<R, E>,
{
f(self)
}

Some files were not shown because too many files have changed in this diff.