Compare commits

..

45 Commits

Author SHA1 Message Date
Lei, HUANG
a8630cdb38 fix: clippy errors 2022-12-15 18:12:05 +08:00
Ruihang Xia
0f3dcc1b38 fix: Fix All The Tests! (#752)
* fix: Fix several tests compile errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: some compile errors in tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: compile errors in frontend tests

* fix: compile errors in frontend tests

* test: Fix tests in api and common-query

* test: Fix test in sql crate

* fix: resolve substrait error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* chore: add more test

* test: Fix tests in servers

* fix instance_test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* test: Fix tests in tests-integration

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
2022-12-15 17:47:14 +08:00
evenyag
7c696dae08 Merge branch 'develop' into replace-arrow2 2022-12-15 15:29:35 +08:00
Yingwen
142dee41d6 fix: Fix compiler errors in script crate (#749)
* fix: Fix compiler errors in state.rs

* fix: fix compiler errors in state

* feat: upgrade sqlparser to 0.26

* fix: fix datafusion engine compiler errors

* fix: Fix some tests in query crate

* fix: Fix all warnings in tests

* feat: Remove `Type` from timestamp's type name

* fix: fix query tests

Now datafusion already supports median, so this commit also remove the
median function

* style: Fix clippy

* feat: Remove RecordBatch::pretty_print

* chore: Address CR comments

* feat: Add column_by_name to RecordBatch

* feat: modify select_from_rb

* feat: Fix some compiler errors in vector.rs

* feat: Fix more compiler errors in vector.rs

* fix: fix table.rs

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: Fix compiler errors in coprocessor

* fix: Fix some compiler errors

* fix: Fix compiler errors in script

* chore: Remove unused imports and format code

* test: disable interval tests

* test: Fix test_compile_execute test

* style: Fix clippy

* feat: Support interval

* feat: Add RecordBatch::columns and fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-15 14:20:35 +08:00
Lei, HUANG
ce6d1cb7d1 fix: frontend compile errors (#747)
fix: fix compile errors in frontend
2022-12-14 18:30:16 +08:00
Yingwen
dbb3034ecb fix: Fix compiler errors in query crate (#746)
* fix: Fix compiler errors in state.rs

* fix: fix compiler errors in state

* feat: upgrade sqlparser to 0.26

* fix: fix datafusion engine compiler errors

* fix: Fix some tests in query crate

* fix: Fix all warnings in tests

* feat: Remove `Type` from timestamp's type name

* fix: fix query tests

Now datafusion already supports median, so this commit also remove the
median function

* style: Fix clippy

* feat: Remove RecordBatch::pretty_print

* chore: Address CR comments

* Update src/query/src/query_engine/state.rs

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-14 17:42:07 +08:00
Lei, HUANG
652d59a643 fix: remove unwrap 2022-12-13 17:51:14 +08:00
Lei, HUANG
fa971c6513 fix: errors in optimzer 2022-12-13 17:44:37 +08:00
evenyag
36c929e1a7 fix: Fix imports in optimizer.rs 2022-12-13 17:27:44 +08:00
Ruihang Xia
a712382fba Merge pull request #745
* fix nyc-taxi and util

* Merge branch 'replace-arrow2' into fix-others

* fix substrait

* fix warnings and error in test
2022-12-13 16:59:28 +08:00
Yingwen
4b644aa482 fix: Fix compiler errors in catalog and mito crates (#742)
* fix: Fix compiler errors in mito

* fix: Fix compiler errors in catalog crate

* style: Fix clippy

* chore: Fix use
2022-12-13 15:53:55 +08:00
Lei, HUANG
4defde055c feat: upgrade storage crate to arrow and parquet offcial impl (#738)
* fix: compile erros

* fix: parquet reader and writer

* fix: parquet reader and writer

* fix: WriteBatch IPC encode/decode

* fix: clippy errors in storage subcrate

* chore: remove suspicious unwrap

* fix: some cr comments

* fix: CR comments

* fix: CR comments
2022-12-13 11:58:50 +08:00
Ruihang Xia
95b2d8654f fix: pre-cast to avoid tremendous match arms (#734)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-09 17:20:03 +08:00
Ruihang Xia
42fdc7251a fix: Fix common grpc expr (#730)
* fix compile errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* rename fn names

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix styles

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix wranings in common-time

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-09 14:24:04 +08:00
Ruihang Xia
d0892bf0b7 fix: Fix compile error in server subcrate (#727)
* fix: Fix compile error in server subcrate

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unused type alias

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* explicitly panic

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/storage/src/sst/parquet.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-12-08 20:27:53 +08:00
Ruihang Xia
fff530cb50 fix common record batch
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-08 17:58:53 +08:00
Yingwen
b936d8b18a fix: Fix common::grpc compiler errors (#722)
* fix: Fix common::grpc compiler errors

This commit refactors RecordBatch and holds vectors in the RecordBatch
struct, so we don't need to cast the array to vector when doing
serialization or iterating the batch.

Now we use the vector API instead of the arrow API in grpc crate.

* chore: Address CR comments
2022-12-08 17:51:20 +08:00
Lei, HUANG
1bde1ba399 fix: row group pruning (#725)
* fix: row group pruning

* chore: use macro to simplify stats implemetation

* fxi: CR comments

* fix: row group metadata length mismatch

* fix: simplify code
2022-12-08 17:44:04 +08:00
Ruihang Xia
3687bc7346 fix: Fix tests and clippy for common-function subcrate (#726)
* further fixing

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix all compile errors in common function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* revert test changes

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-08 17:01:54 +08:00
Ruihang Xia
587bdc9800 fix: fix other compile error in common-function (#719)
* further fixing

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix all compile errors in common function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-08 11:38:07 +08:00
Yingwen
58c26def6b fix: fix argmin/percentile/clip/interp/scipy_stats_norm_pdf errors (#718)
fix: fix argmin/percentile/clip/interp/scipy_stats_norm_pdf compiler errors
2022-12-07 19:55:07 +08:00
Ruihang Xia
6f3baf96b0 fix: fix compile error for mean/polyval/pow/interp ops (#717)
* fix: fix compile error for mean/polyval/pow/interp ops

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* simplify type bounds

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-07 16:38:43 +08:00
Yingwen
a898f846d1 fix: Fix compiler errors in argmax/rate/median/norm_cdf (#716)
* fix: Fix compiler errors in argmax/rate/median/norm_cdf

* chore: Address CR comments
2022-12-07 15:28:27 +08:00
Ruihang Xia
a562199455 Revert "fix: fix compile error for mean/polyval/pow/interp ops"
This reverts commit fb0b4eb826.
2022-12-07 15:13:58 +08:00
Ruihang Xia
fb0b4eb826 fix: fix compile error for mean/polyval/pow/interp ops
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-07 15:12:28 +08:00
Yingwen
2ba99259e1 feat: Implements diff accumulator using WrapperType (#715)
* feat: Remove usage of opaque error from common::recordbatch

* feat: Remove opaque error from common::query

* feat: Fix diff compiler errors

Now common_function just use common_query's Error and Result. Adds
a LargestType associated type to LogicalPrimitiveType to get the largest
type a logical primitive type can cast to.

* feat: Remove LargestType from NativeType trait

* chore: Update comments

* feat: Restrict Scalar::RefType of WrapperType to itself

Add trait bound `for<'a> Scalar<RefType<'a> = Self>` to WrapperType

* chore: Address CR comments

* chore: Format codes
2022-12-07 11:13:24 +08:00
Ruihang Xia
551cde23b1 Merge branch 'dev' into replace-arrow2 2022-12-07 10:50:27 +08:00
Yingwen
653906d4fa fix: Fix common::query compiler errors (#713)
* feat: Move conversion to ScalarValue to value.rs

* fix: Fix common::query compiler errors

This commit also make InnerError pub(crate)
2022-12-06 16:45:54 +08:00
Ruihang Xia
829ff491c4 fix: common-query subcrate (#712)
* fix: record batch adapter

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix error enum

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-06 16:32:52 +08:00
Yingwen
b32438e78c feat: Fix some compiler errors in common::query (#710)
* feat: Fix some compiler errors in common::query

* feat: test_collect use vectors api
2022-12-06 15:32:12 +08:00
Lei, HUANG
0ccb8b4302 chore: delete datatypes based on arrow2 2022-12-06 15:01:57 +08:00
Lei, HUANG
b48ae21b71 fix: api crate (#708)
* fix: rename ConcreteDataType::timestamp_millis_type to ConcreteDataType::timestamp_millisecond_type. fix other warnings regarding timestamp

* fix: revert changes in datatypes2

* fix: helper
2022-12-06 14:56:59 +08:00
evenyag
3c0adb00f3 feat: Fix recordbatch test compiling issue 2022-12-06 12:03:06 +08:00
evenyag
8c66b7d000 feat: Fix common::recordbatch compiler errors 2022-12-06 11:55:19 +08:00
evenyag
99371fd31b chore: sort Cargo.toml 2022-12-06 11:39:15 +08:00
evenyag
fe505fecfd feat: Make recordbatch compile 2022-12-06 11:38:59 +08:00
evenyag
cc1ec26416 feat: Switch to datatypes2 2022-12-05 20:30:47 +08:00
Ruihang Xia
504059a699 chore: fix wrong merge commit
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-12-05 20:11:22 +08:00
Ruihang Xia
7151deb4ed Merge branch 'dev' into replace-arrow2 2022-12-05 20:10:37 +08:00
Ruihang Xia
d0686f9c19 Merge branch 'replace-arrow2' of github.com:GreptimeTeam/greptimedb into replace-arrow2 2022-11-21 17:43:40 +08:00
Ruihang Xia
221f3e9d2e Merge branch 'dev' into replace-arrow2 2022-11-21 17:42:15 +08:00
evenyag
61c4a3691a chore: update dep of binary vector 2022-11-21 15:55:07 +08:00
evenyag
d7626fd6af feat: arrow_array switch to arrow 2022-11-21 15:39:41 +08:00
Ruihang Xia
e3201a4705 chore: replace one last datafusion dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-21 14:29:59 +08:00
Ruihang Xia
571a84d91b chore: kick off. change datafusion/arrow/parquet to target version
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-21 14:19:39 +08:00
304 changed files with 5825 additions and 6451 deletions

View File

@@ -24,7 +24,7 @@ on:
name: Code coverage
env:
RUST_TOOLCHAIN: nightly-2022-12-20
RUST_TOOLCHAIN: nightly-2022-07-14
jobs:
coverage:
@@ -34,11 +34,6 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: KyleMayes/install-llvm-action@v1
with:
version: "14.0"
- name: Install toolchain
uses: dtolnay/rust-toolchain@master
with:
@@ -53,7 +48,6 @@ jobs:
- name: Collect coverage data
run: cargo llvm-cov nextest --workspace --lcov --output-path lcov.info
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=lld"
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}

View File

@@ -23,7 +23,7 @@ on:
name: CI
env:
RUST_TOOLCHAIN: nightly-2022-12-20
RUST_TOOLCHAIN: nightly-2022-07-14
jobs:
typos:
@@ -41,8 +41,6 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
@@ -83,8 +81,6 @@ jobs:
# path: ./llvm
# key: llvm
# - uses: arduino/setup-protoc@v1
# with:
# repo-token: ${{ secrets.GITHUB_TOKEN }}
# - uses: KyleMayes/install-llvm-action@v1
# with:
# version: "14.0"
@@ -118,8 +114,6 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
@@ -137,8 +131,6 @@ jobs:
steps:
- uses: actions/checkout@v3
- uses: arduino/setup-protoc@v1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}

View File

@@ -10,7 +10,7 @@ on:
name: Release
env:
RUST_TOOLCHAIN: nightly-2022-12-20
RUST_TOOLCHAIN: nightly-2022-07-14
# FIXME(zyy17): Would be better to use `gh release list -L 1 | cut -f 3` to get the latest release version tag, but for a long time, we will stay at 'v0.1.0-alpha-*'.
SCHEDULED_BUILD_VERSION_PREFIX: v0.1.0-alpha

647
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -39,31 +39,5 @@ members = [
"tests/runner",
]
[workspace.package]
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[workspace.dependencies]
arrow = "29.0"
arrow-flight = "29.0"
arrow-schema = { version = "29.0", features = ["serde"] }
async-stream = "0.3"
async-trait = "0.1"
# TODO(LFC): Use released Datafusion when it officially dpendent on Arrow 29.0
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-expr = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-optimizer = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-physical-expr = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
datafusion-sql = { git = "https://github.com/apache/arrow-datafusion.git", rev = "4917235a398ae20145c87d20984e6367dc1a0c1e" }
futures = "0.3"
parquet = "29.0"
paste = "1.0"
serde = { version = "1.0", features = ["derive"] }
snafu = { version = "0.7", features = ["backtraces"] }
sqlparser = "0.28"
tokio = { version = "1", features = ["full"] }
[profile.release]
debug = true

View File

@@ -1,14 +1,14 @@
[package]
name = "benchmarks"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
arrow.workspace = true
arrow = "26.0.0"
clap = { version = "4.0", features = ["derive"] }
client = { path = "../src/client" }
indicatif = "0.17.1"
itertools = "0.10.5"
parquet.workspace = true
tokio.workspace = true
parquet = "26.0.0"
tokio = { version = "1.21", features = ["full"] }

View File

@@ -15,6 +15,7 @@
//! Use the taxi trip records from New York City dataset to bench. You can download the dataset from
//! [here](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page).
#![feature(once_cell)]
#![allow(clippy::print_stdout)]
use std::collections::HashMap;
@@ -25,9 +26,10 @@ use arrow::array::{ArrayRef, PrimitiveArray, StringArray, TimestampNanosecondArr
use arrow::datatypes::{DataType, Float64Type, Int64Type};
use arrow::record_batch::RecordBatch;
use clap::Parser;
use client::admin::Admin;
use client::api::v1::column::Values;
use client::api::v1::{Column, ColumnDataType, ColumnDef, CreateTableExpr, InsertRequest, TableId};
use client::{Client, Database};
use client::api::v1::{Column, ColumnDataType, ColumnDef, CreateExpr, InsertExpr};
use client::{Client, Database, Select};
use indicatif::{MultiProgress, ProgressBar, ProgressStyle};
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;
use tokio::task::JoinSet;
@@ -92,14 +94,14 @@ async fn write_data(
.unwrap();
let progress_bar = mpb.add(ProgressBar::new(row_num as _));
progress_bar.set_style(pb_style);
progress_bar.set_message(format!("{path:?}"));
progress_bar.set_message(format!("{:?}", path));
let mut total_rpc_elapsed_ms = 0;
for record_batch in record_batch_reader {
let record_batch = record_batch.unwrap();
let (columns, row_count) = convert_record_batch(record_batch);
let request = InsertRequest {
let insert_expr = InsertExpr {
schema_name: "public".to_string(),
table_name: TABLE_NAME.to_string(),
region_number: 0,
@@ -107,13 +109,16 @@ async fn write_data(
row_count,
};
let now = Instant::now();
db.insert(request).await.unwrap();
db.insert(insert_expr).await.unwrap();
let elapsed = now.elapsed();
total_rpc_elapsed_ms += elapsed.as_millis();
progress_bar.inc(row_count as _);
}
progress_bar.finish_with_message(format!("file {path:?} done in {total_rpc_elapsed_ms}ms",));
progress_bar.finish_with_message(format!(
"file {:?} done in {}ms",
path, total_rpc_elapsed_ms
));
total_rpc_elapsed_ms
}
@@ -214,126 +219,126 @@ fn build_values(column: &ArrayRef) -> Values {
}
}
fn create_table_expr() -> CreateTableExpr {
CreateTableExpr {
catalog_name: CATALOG_NAME.to_string(),
schema_name: SCHEMA_NAME.to_string(),
fn create_table_expr() -> CreateExpr {
CreateExpr {
catalog_name: Some(CATALOG_NAME.to_string()),
schema_name: Some(SCHEMA_NAME.to_string()),
table_name: TABLE_NAME.to_string(),
desc: "".to_string(),
desc: None,
column_defs: vec![
ColumnDef {
name: "VendorID".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "tpep_pickup_datetime".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "tpep_dropoff_datetime".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "passenger_count".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "trip_distance".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "RatecodeID".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "store_and_fwd_flag".to_string(),
datatype: ColumnDataType::String as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "PULocationID".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "DOLocationID".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "payment_type".to_string(),
datatype: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "fare_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "extra".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "mta_tax".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "tip_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "tolls_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "improvement_surcharge".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "total_amount".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "congestion_surcharge".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "airport_fee".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
],
time_index: "tpep_pickup_datetime".to_string(),
@@ -341,7 +346,7 @@ fn create_table_expr() -> CreateTableExpr {
create_if_not_exists: false,
table_options: Default::default(),
region_ids: vec![0],
table_id: Some(TableId { id: 0 }),
table_id: Some(0),
}
}
@@ -350,23 +355,25 @@ fn query_set() -> HashMap<String, String> {
ret.insert(
"count_all".to_string(),
format!("SELECT COUNT(*) FROM {TABLE_NAME};"),
format!("SELECT COUNT(*) FROM {};", TABLE_NAME),
);
ret.insert(
"fare_amt_by_passenger".to_string(),
format!("SELECT passenger_count, MIN(fare_amount), MAX(fare_amount), SUM(fare_amount) FROM {TABLE_NAME} GROUP BY passenger_count")
format!("SELECT passenger_count, MIN(fare_amount), MAX(fare_amount), SUM(fare_amount) FROM {} GROUP BY passenger_count",TABLE_NAME)
);
ret
}
async fn do_write(args: &Args, db: &Database) {
async fn do_write(args: &Args, client: &Client) {
let admin = Admin::new("admin", client.clone());
let mut file_list = get_file_list(args.path.clone().expect("Specify data path in argument"));
let mut write_jobs = JoinSet::new();
let create_table_result = db.create(create_table_expr()).await;
println!("Create table result: {create_table_result:?}");
let create_table_result = admin.create(create_table_expr()).await;
println!("Create table result: {:?}", create_table_result);
let progress_bar_style = ProgressStyle::with_template(
"[{elapsed_precise}] {bar:60.cyan/blue} {pos:>7}/{len:7} {msg}",
@@ -380,7 +387,7 @@ async fn do_write(args: &Args, db: &Database) {
let batch_size = args.batch_size;
for _ in 0..args.thread_num {
if let Some(path) = file_list.pop() {
let db = db.clone();
let db = Database::new(DATABASE_NAME, client.clone());
let mpb = multi_progress_bar.clone();
let pb_style = progress_bar_style.clone();
write_jobs.spawn(async move { write_data(batch_size, &db, path, mpb, pb_style).await });
@@ -389,7 +396,7 @@ async fn do_write(args: &Args, db: &Database) {
while write_jobs.join_next().await.is_some() {
file_progress.inc(1);
if let Some(path) = file_list.pop() {
let db = db.clone();
let db = Database::new(DATABASE_NAME, client.clone());
let mpb = multi_progress_bar.clone();
let pb_style = progress_bar_style.clone();
write_jobs.spawn(async move { write_data(batch_size, &db, path, mpb, pb_style).await });
@@ -399,10 +406,10 @@ async fn do_write(args: &Args, db: &Database) {
async fn do_query(num_iter: usize, db: &Database) {
for (query_name, query) in query_set() {
println!("Running query: {query}");
println!("Running query: {}", query);
for i in 0..num_iter {
let now = Instant::now();
let _res = db.sql(&query).await.unwrap();
let _res = db.select(Select::Sql(query.clone())).await.unwrap();
let elapsed = now.elapsed();
println!(
"query {}, iteration {}: {}ms",
@@ -424,13 +431,13 @@ fn main() {
.unwrap()
.block_on(async {
let client = Client::with_urls(vec![&args.endpoint]);
let db = Database::new(DATABASE_NAME, client);
if !args.skip_write {
do_write(&args, &db).await;
do_write(&args, &client).await;
}
if !args.skip_read {
let db = Database::new(DATABASE_NAME, client.clone());
do_query(args.iter_num, &db).await;
}
})

View File

@@ -24,8 +24,6 @@ RUN cargo build --release
# TODO(zyy17): Maybe should use the more secure container image.
FROM ubuntu:22.04 as base
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install ca-certificates
WORKDIR /greptime
COPY --from=builder /greptimedb/target/release/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH

View File

@@ -1,7 +1,5 @@
FROM ubuntu:22.04
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install ca-certificates
ARG TARGETARCH
ADD $TARGETARCH/greptime /greptime/bin/

Binary file not shown.

Before

Width:  |  Height:  |  Size: 34 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 58 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 35 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 46 KiB

View File

@@ -1,175 +0,0 @@
---
Feature Name: "promql-in-rust"
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/596
Date: 2022-12-20
Author: "Ruihang Xia <waynestxia@gmail.com>"
---
Rewrite PromQL in Rust
----------------------
# Summary
A Rust native implementation of PromQL, for GreptimeDB.
# Motivation
Prometheus and its query language PromQL prevails in the cloud-native observability area, which is an important scenario for time series database like GreptimeDB. We already have support for its remote read and write protocols. Users can now integrate GreptimeDB as the storage backend to existing Prometheus deployment, but cannot run PromQL query directly on GreptimeDB like SQL.
This RFC proposes to add support for PromQL. Because it was created in Go, we can't use the existing code easily. For interoperability, performance and extendability, porting its logic to Rust is a good choice.
# Details
## Overview
One of the goals is to make use of our existing basic operators, execution model and runtime to reduce the work. So the entire proposal is built on top of Apache Arrow DataFusion. The rewrote PromQL logic is manifested as `Expr` or `Execution Plan` in DataFusion. And both the intermediate data structure and the result is in the format of `Arrow`'s `RecordBatch`.
The following sections are organized in a top-down manner. Starts with evaluation procedure. Then introduces the building blocks of our new PromQL operation. Follows by an explanation of data model. And end with an example logic plan.
*This RFC is heavily related to Prometheus and PromQL. It won't repeat some basic concepts of them.*
## Evaluation
The original implementation is like an interpreter of parsed PromQL AST. It has two characteristics: (1) Operations are evaluated in place after they are parsed to AST. And some key parameters are separated from the AST because they do not present in the query, but come from other places like another field in the HTTP payload. (2) calculation is performed per timestamp. You can see this pattern many times:
```go
for ts := ev.startTimestamp; ts <= ev.endTimestamp; ts += ev.interval {}
```
These bring out two differences in the proposed implementation. First, to make it more general and clear, the evaluation procedure is reorganized into serval phases (and is the same as DataFusion's). And second, data are evaluated by time series (corresponding to "columnar calculation", if think timestamp as row number).
```
Logic
Query AST Plan
─────────► Parser ───────► Logical ────────► Physical ────┐
Planner Planner │
◄───────────────────────────── Executor ◄────────────────┘
Evaluation Result Execution
Plan
```
- Parser
Provided by [`promql-parser`](https://github.com/GreptimeTeam/promql-parser) crate. Same as the original implementation.
- Logical Planner
Generates a logical plan with all the needed parameters. It should accept something like `EvalStmt` in Go's implementation, which contains query time range, evaluation interval and lookback range.
Another important thing done here is assembling the logic plan, with all the operations baked into logically. Like what's the filter and time range to read, how the data then flows through a selector into a binary operation, etc. Or what's the output schema of every single step. The generated logic plan is deterministic without variables, and can be `EXPLAIN`ed clearly.
- Physical Planner
This step converts a logic plan into evaluatable execution plan. There are not many special things like the previous step. Except when a query is going to be executed distributedly. In this case, a logic plan will be divided into serval parts and sent to serval nodes. One physical planner only sees its own part.
- Executor
As its name shows, this step calculates data to result. And all new calculation logic, the implementation of PromQL in rust, is placed here. And the rewrote functions are using `RecordBatch` and `Array` from `Arrow` as the intermediate data structure.
Each "batch" contains only data from single time series. This is from the underlying storage implementation. Though it's not a requirement of this RFC, having this property can simplify some functions.
Another thing to mention is the rewrote functions don't aware of timestamp or value columns, they are defined only based on the input data types. For example, `increase()` function in PromQL calculates the unbiased delta of data, its implementation here only does this single thing. Let's compare the signature of two implementations:
- Go
```go
func funcIncrease(vals []parser.Value, args parser.Expressions) Vector {}
```
- Rust
```rust
fn prom_increase(input: Array) -> Array {}
```
Some unimportant parameters are omitted. The original Go version only writes the logic for `Point`'s value, either float or histogram. But the proposed rewritten one accepts a generic `Array` as input, which can be any type that suits, from `i8` to `u64` to `TimestampNanosecond`.
## Plan and Expression
They are structures to express logic from PromQL. The proposed implementation is built on top of DataFusion, thus our plan and expression are in form of `ExtensionPlan` and `ScalarUDF`. The only difference between them in this context is the return type: plan returns a record batch while expression returns a single column.
This RFC proposes to add four new plans, they are fundamental building blocks that mainly handle data selection logic in PromQL, for the following calculation expressions.
- `SeriesNormalize`
Sort data inside one series on the timestamp column, and bias "offset" if has. This plan usually comes after `TableScan` (or `TableScan` and `Filter`) plan.
- `VectorManipulator` and `MatrixManipulator`
Corresponding to `InstantSelector` and `RangeSelector`. We don't calculate timestamp by timestamp, thus use "vector" instead of "instant", this image shows the difference. And "matrix" is another name for "range vector", for not confused with our "vector". The following section will detail how they are implemented using Arrow.
![instant_and_vector](instant-and-vector.png)
Due to "interval" parameter in PromQL, data after "selector" (or "manipulator" here) are usually shorter than input. And we have to modify the entire record batch to shorten both timestamp, value and tag columns. So they are formed as plan.
- `PromAggregator`
The carrier of aggregator expressions. This should not be very different from the DataFusion built-in `Aggregate` plan, except PromQL can use "group without" to do reverse selection.
PromQL has around 70 expressions and functions. But luckily we can reuse lots of them from DataFusion. Like unary expression, binary expression and aggregator. We only need to implement those PromQL-specific expressions, like `rate` or `percentile`. The following table lists some typical functions in PromQL, and their signature in the proposed implementation. Other function should be the same.
| Name | In Param(s) | Out Param(s) | Explain |
|-------------------- |------------------------------------------------------ |-------------- |-------------------- |
| instant_delta | Matrix T | Array T | idelta in PromQL |
| increase | Matrix T | Array T | increase in PromQL |
| extrapolate_factor | - Matrix T<br>- Array Timestamp<br>- Array Timestamp | Array T | * |
*: *`extrapolate_factor` is one of the "dark sides" in PromQL. In short it's a translation of this [paragraph](https://github.com/prometheus/prometheus/blob/0372e259baf014bbade3134fd79bcdfd8cbdef2c/promql/functions.go#L134-L159)*
To reuse those common calculation logic, we can break them into serval expressions, and assemble in the logic planning phase. Like `rate()` in PromQL can be represented as `increase / extrapolate_factor`.
## Data Model
This part explains how data is represented. Following the data model in GreptimeDB, all the data are stored as table, with tag columns, timestamp column and value column. Table to record batch is very straightforward. So an instant vector can be thought of as a row (though as said before, we don't use instant vectors) in the table. Given four basic types in PromQL: scalar, string, instant vector and range vector, only the last "range vector" need some tricks to adapt our columnar calculation.
Range vector is some sort of matrix, it's consisted of small one-dimension vectors, with each being an input of range function. And, applying range function to a range vector can be thought of kind of convolution.
![range-vector-with-matrix](range-vector-with-matrix.png)
(Left is an illustration of range vector. Notice the Y-axis has no meaning, it's just put different pieces separately. The right side is an imagined "matrix" as range function. Multiplying the left side to it can get a one-dimension "matrix" with four elements. That's the evaluation result of a range vector.)
To adapt this range vector to record batch, it should be represented by a column. This RFC proposes to use `DictionaryArray` from Arrow to represent range vector, or `Matrix`. This is "misusing" `DictionaryArray` to ship some additional information about an array. Because the range vector is sliding over one series, we only need to know the `offset` and `length` of each slides to reconstruct the matrix from an array:
![matrix-from-array](matrix-from-array.png)
The length is not fixed, it depends on the input's timestamp. An PoC implementation of `Matrix` and `increase()` can be found in [this repo](https://github.com/waynexia/corroding-prometheus).
## Example
The logic plan of PromQL query
```promql
# start: 2022-12-20T10:00:00
# end: 2022-12-21T10:00:00
# interval: 1m
# lookback: 30s
sum (rate(request_duration[5m])) by (idc)
```
looks like
<!-- title: 'PromAggregator: \naggr = sum, column = idc'
operator: prom
inputs:
- title: 'Matrix Manipulator: \ninterval = 1m, range = 5m, expr = div(increase(value), extrapolate_factor(timestamp))'
operator: prom
inputs:
- title: 'Series Normalize: \noffset = 0'
operator: prom
inputs:
- title: 'Filter: \ntimetamp > 2022-12-20T10:00:00 && timestamp < 2022-12-21T10:00:00'
operator: filter
inputs:
- title: 'Table Scan: \ntable = request_duration, timetamp > 2022-12-20T10:00:00 && timestamp < 2022-12-21T10:00:00'
operator: scan -->
![example](example.png)
# Drawbacks
Human-being is always error-prone. It's harder to endeavor to rewrite from the ground and requires more attention to ensure correctness, than translate line-by-line. And, since the evaluator's architecture are different, it might be painful to catch up with PromQL's breaking update (if any) in the future.
Misusing Arrow's DictionaryVector as Matrix is another point. This hack needs some `unsafe` function call to bypass Arrow's check. And though Arrow's API is stable, this is still an undocumented behavior.
# Alternatives
There are a few alternatives we've considered:
- Wrap the existing PromQL's implementation via FFI, and import it to GreptimeDB.
- Translate its evaluator engine line-by-line, rather than rewrite one.
- Integrate the Prometheus server into GreptimeDB via RPC, making it a detached execution engine for PromQL.
The first and second options are making a separate execution engine in GreptimeDB, they may alleviate the pain during rewriting, but will have negative impacts to afterward evolve like resource management. And introduce another deploy component in the last option will bring a complex deploy architecture.
And all of them are more or less redundant in data transportation that affects performance and resources. The proposed built-in executing procedure is also easy to integrate and expose to the existing SQL interface GreptimeDB currently provides. Some concepts in PromQL like sliding windows (range vector in PromQL) are very convenient and ergonomic in analyzing series data. This makes it not only a PromQL evaluator, but also an enhancement to our query system.

View File

@@ -1 +1 @@
nightly-2022-12-20
nightly-2022-07-14

View File

@@ -1,11 +1,11 @@
[package]
name = "api"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
arrow-flight.workspace = true
common-base = { path = "../common/base" }
common-error = { path = "../common/error" }
common-time = { path = "../common/time" }

View File

@@ -20,6 +20,7 @@ fn main() {
.file_descriptor_set_path(default_out_dir.join("greptime_fd.bin"))
.compile(
&[
"greptime/v1/select.proto",
"greptime/v1/greptime.proto",
"greptime/v1/meta/common.proto",
"greptime/v1/meta/heartbeat.proto",

View File

@@ -5,35 +5,50 @@ package greptime.v1;
import "greptime/v1/column.proto";
import "greptime/v1/common.proto";
// "Data Definition Language" requests, that create, modify or delete the database structures but not the data.
// `DdlRequest` could carry more information than plain SQL, for example, the "table_id" in `CreateTableExpr`.
// So create a new DDL expr if you need it.
message DdlRequest {
message AdminRequest {
string name = 1;
repeated AdminExpr exprs = 2;
}
message AdminResponse {
repeated AdminResult results = 1;
}
message AdminExpr {
ExprHeader header = 1;
oneof expr {
CreateDatabaseExpr create_database = 1;
CreateTableExpr create_table = 2;
CreateExpr create = 2;
AlterExpr alter = 3;
DropTableExpr drop_table = 4;
CreateDatabaseExpr create_database = 4;
DropTableExpr drop_table = 5;
}
}
message CreateTableExpr {
string catalog_name = 1;
string schema_name = 2;
message AdminResult {
ResultHeader header = 1;
oneof result {
MutateResult mutate = 2;
}
}
// TODO(hl): rename to CreateTableExpr
message CreateExpr {
optional string catalog_name = 1;
optional string schema_name = 2;
string table_name = 3;
string desc = 4;
optional string desc = 4;
repeated ColumnDef column_defs = 5;
string time_index = 6;
repeated string primary_keys = 7;
bool create_if_not_exists = 8;
map<string, string> table_options = 9;
TableId table_id = 10;
optional uint32 table_id = 10;
repeated uint32 region_ids = 11;
}
message AlterExpr {
string catalog_name = 1;
string schema_name = 2;
optional string catalog_name = 1;
optional string schema_name = 2;
string table_name = 3;
oneof kind {
AddColumns add_columns = 4;
@@ -47,11 +62,6 @@ message DropTableExpr {
string table_name = 3;
}
message CreateDatabaseExpr {
//TODO(hl): maybe rename to schema_name?
string database_name = 1;
}
message AddColumns {
repeated AddColumn add_columns = 1;
}
@@ -69,6 +79,7 @@ message DropColumn {
string name = 1;
}
message TableId {
uint32 id = 1;
message CreateDatabaseExpr {
//TODO(hl): maybe rename to schema_name?
string database_name = 1;
}

View File

@@ -59,7 +59,7 @@ message ColumnDef {
string name = 1;
ColumnDataType datatype = 2;
bool is_nullable = 3;
bytes default_constraint = 4;
optional bytes default_constraint = 4;
}
enum ColumnDataType {

View File

@@ -6,8 +6,17 @@ message RequestHeader {
string tenant = 1;
}
message ExprHeader {
uint32 version = 1;
}
message ResultHeader {
uint32 version = 1;
uint32 code = 2;
string err_msg = 3;
}
message MutateResult {
uint32 success = 1;
uint32 failure = 2;
}

View File

@@ -2,7 +2,6 @@ syntax = "proto3";
package greptime.v1;
import "greptime/v1/ddl.proto";
import "greptime/v1/column.proto";
import "greptime/v1/common.proto";
@@ -16,21 +15,24 @@ message DatabaseResponse {
}
message ObjectExpr {
oneof request {
InsertRequest insert = 1;
QueryRequest query = 2;
DdlRequest ddl = 3;
ExprHeader header = 1;
oneof expr {
InsertExpr insert = 2;
SelectExpr select = 3;
UpdateExpr update = 4;
DeleteExpr delete = 5;
}
}
message QueryRequest {
oneof query {
// TODO(fys): Only support sql now, and will support promql etc in the future
message SelectExpr {
oneof expr {
string sql = 1;
bytes logical_plan = 2;
}
}
message InsertRequest {
message InsertExpr {
string schema_name = 1;
string table_name = 2;
@@ -39,18 +41,26 @@ message InsertRequest {
// The row_count of all columns, which include null and non-null values.
//
// Note: the row_count of all columns in a InsertRequest must be same.
// Note: the row_count of all columns in a InsertExpr must be same.
uint32 row_count = 4;
// The region number of current insert request.
uint32 region_number = 5;
}
// TODO(jiachun)
message UpdateExpr {}
// TODO(jiachun)
message DeleteExpr {}
message ObjectResult {
ResultHeader header = 1;
repeated bytes flight_data = 2;
oneof result {
SelectResult select = 2;
MutateResult mutate = 3;
}
}
message FlightDataExt {
uint32 affected_rows = 1;
message SelectResult {
bytes raw_data = 1;
}

View File

@@ -2,6 +2,7 @@ syntax = "proto3";
package greptime.v1;
import "greptime/v1/admin.proto";
import "greptime/v1/common.proto";
import "greptime/v1/database.proto";
@@ -11,9 +12,11 @@ service Greptime {
message BatchRequest {
RequestHeader header = 1;
repeated DatabaseRequest databases = 2;
repeated AdminRequest admins = 2;
repeated DatabaseRequest databases = 3;
}
message BatchResponse {
repeated DatabaseResponse databases = 1;
repeated AdminResponse admins = 1;
repeated DatabaseResponse databases = 2;
}

View File

@@ -0,0 +1,10 @@
syntax = "proto3";
package greptime.v1.codec;
import "greptime/v1/column.proto";
message SelectResult {
repeated Column columns = 1;
uint32 row_count = 2;
}

View File

@@ -12,19 +12,30 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use arrow_flight::FlightData;
use prost::Message;
use common_error::prelude::ErrorExt;
use crate::v1::{ObjectResult, ResultHeader};
use crate::v1::codec::SelectResult;
use crate::v1::{
admin_result, object_result, AdminResult, MutateResult, ObjectResult, ResultHeader,
SelectResult as SelectResultRaw,
};
pub const PROTOCOL_VERSION: u32 = 1;
pub type Success = u32;
pub type Failure = u32;
#[derive(Default)]
pub struct ObjectResultBuilder {
version: u32,
code: u32,
err_msg: Option<String>,
flight_data: Option<Vec<FlightData>>,
result: Option<Body>,
}
pub enum Body {
Mutate((Success, Failure)),
Select(SelectResult),
}
impl ObjectResultBuilder {
@@ -51,8 +62,13 @@ impl ObjectResultBuilder {
self
}
pub fn flight_data(mut self, flight_data: Vec<FlightData>) -> Self {
self.flight_data = Some(flight_data);
pub fn mutate_result(mut self, success: u32, failure: u32) -> Self {
self.result = Some(Body::Mutate((success, failure)));
self
}
pub fn select_result(mut self, select_result: SelectResult) -> Self {
self.result = Some(Body::Select(select_result));
self
}
@@ -63,24 +79,92 @@ impl ObjectResultBuilder {
err_msg: self.err_msg.unwrap_or_default(),
});
let flight_data = if let Some(flight_data) = self.flight_data {
flight_data
.into_iter()
.map(|x| x.encode_to_vec())
.collect::<Vec<Vec<u8>>>()
} else {
vec![]
let result = match self.result {
Some(Body::Mutate((success, failure))) => {
Some(object_result::Result::Mutate(MutateResult {
success,
failure,
}))
}
Some(Body::Select(select)) => Some(object_result::Result::Select(SelectResultRaw {
raw_data: select.into(),
})),
None => None,
};
ObjectResult {
header,
flight_data,
ObjectResult { header, result }
}
}
pub fn build_err_result(err: &impl ErrorExt) -> ObjectResult {
ObjectResultBuilder::new()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build()
}
#[derive(Debug)]
pub struct AdminResultBuilder {
version: u32,
code: u32,
err_msg: Option<String>,
mutate: Option<(Success, Failure)>,
}
impl AdminResultBuilder {
pub fn status_code(mut self, code: u32) -> Self {
self.code = code;
self
}
pub fn err_msg(mut self, err_msg: String) -> Self {
self.err_msg = Some(err_msg);
self
}
pub fn mutate_result(mut self, success: u32, failure: u32) -> Self {
self.mutate = Some((success, failure));
self
}
pub fn build(self) -> AdminResult {
let header = Some(ResultHeader {
version: self.version,
code: self.code,
err_msg: self.err_msg.unwrap_or_default(),
});
let result = if let Some((success, failure)) = self.mutate {
Some(admin_result::Result::Mutate(MutateResult {
success,
failure,
}))
} else {
None
};
AdminResult { header, result }
}
}
impl Default for AdminResultBuilder {
fn default() -> Self {
Self {
version: PROTOCOL_VERSION,
code: 0,
err_msg: None,
mutate: None,
}
}
}
#[cfg(test)]
mod tests {
use common_error::status_code::StatusCode;
use super::*;
use crate::error::UnknownColumnDataTypeSnafu;
use crate::v1::{object_result, MutateResult};
#[test]
fn test_object_result_builder() {
@@ -88,10 +172,32 @@ mod tests {
.version(101)
.status_code(500)
.err_msg("Failed to read this file!".to_string())
.mutate_result(100, 20)
.build();
let header = obj_result.header.unwrap();
assert_eq!(101, header.version);
assert_eq!(500, header.code);
assert_eq!("Failed to read this file!", header.err_msg);
let result = obj_result.result.unwrap();
assert_eq!(
object_result::Result::Mutate(MutateResult {
success: 100,
failure: 20,
}),
result
);
}
#[test]
fn test_build_err_result() {
let err = UnknownColumnDataTypeSnafu { datatype: 1 }.build();
let err_result = build_err_result(&err);
let header = err_result.header.unwrap();
let result = err_result.result;
assert_eq!(PROTOCOL_VERSION, header.version);
assert_eq!(StatusCode::InvalidArguments as u32, header.code);
assert!(result.is_none());
}
}

View File

@@ -15,6 +15,7 @@
pub use prost::DecodeError;
use prost::Message;
use crate::v1::codec::SelectResult;
use crate::v1::meta::TableRouteValue;
macro_rules! impl_convert_with_bytes {
@@ -35,4 +36,80 @@ macro_rules! impl_convert_with_bytes {
};
}
impl_convert_with_bytes!(SelectResult);
impl_convert_with_bytes!(TableRouteValue);
#[cfg(test)]
mod tests {
use std::ops::Deref;
use crate::v1::codec::*;
use crate::v1::{column, Column};
const SEMANTIC_TAG: i32 = 0;
#[test]
fn test_convert_select_result() {
let select_result = mock_select_result();
let bytes: Vec<u8> = select_result.into();
let result: SelectResult = bytes.deref().try_into().unwrap();
assert_eq!(8, result.row_count);
assert_eq!(1, result.columns.len());
let column = &result.columns[0];
assert_eq!("foo", column.column_name);
assert_eq!(SEMANTIC_TAG, column.semantic_type);
assert_eq!(vec![1], column.null_mask);
assert_eq!(
vec![2, 3, 4, 5, 6, 7, 8],
column.values.as_ref().unwrap().i32_values
);
}
#[should_panic]
#[test]
fn test_convert_select_result_wrong() {
let select_result = mock_select_result();
let mut bytes: Vec<u8> = select_result.into();
// modify some bytes
bytes[0] = 0b1;
bytes[1] = 0b1;
let result: SelectResult = bytes.deref().try_into().unwrap();
assert_eq!(8, result.row_count);
assert_eq!(1, result.columns.len());
let column = &result.columns[0];
assert_eq!("foo", column.column_name);
assert_eq!(SEMANTIC_TAG, column.semantic_type);
assert_eq!(vec![1], column.null_mask);
assert_eq!(
vec![2, 3, 4, 5, 6, 7, 8],
column.values.as_ref().unwrap().i32_values
);
}
fn mock_select_result() -> SelectResult {
let values = column::Values {
i32_values: vec![2, 3, 4, 5, 6, 7, 8],
..Default::default()
};
let null_mask = vec![1];
let column = Column {
column_name: "foo".to_string(),
semantic_type: SEMANTIC_TAG,
values: Some(values),
null_mask,
..Default::default()
};
SelectResult {
columns: vec![column],
row_count: 8,
}
}
}

View File

@@ -17,5 +17,9 @@ tonic::include_proto!("greptime.v1");
pub const GREPTIME_FD_SET: &[u8] = tonic::include_file_descriptor_set!("greptime_fd");
pub mod codec {
tonic::include_proto!("greptime.v1.codec");
}
mod column_def;
pub mod meta;

View File

@@ -23,13 +23,12 @@ impl ColumnDef {
pub fn try_as_column_schema(&self) -> Result<ColumnSchema> {
let data_type = ColumnDataTypeWrapper::try_new(self.datatype)?;
let constraint = if self.default_constraint.is_empty() {
None
} else {
Some(
ColumnDefaultConstraint::try_from(self.default_constraint.as_slice())
let constraint = match &self.default_constraint {
None => None,
Some(v) => Some(
ColumnDefaultConstraint::try_from(&v[..])
.context(error::ConvertColumnDefaultConstraintSnafu { column: &self.name })?,
)
),
};
ColumnSchema::new(&self.name, data_type.into(), self.is_nullable)

View File

@@ -1,13 +1,14 @@
[package]
name = "catalog"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
api = { path = "../api" }
arc-swap = "1.0"
async-stream.workspace = true
async-stream = "0.3"
async-trait = "0.1"
backoff = { version = "0.4", features = ["tokio"] }
common-catalog = { path = "../common/catalog" }
@@ -18,7 +19,7 @@ common-recordbatch = { path = "../common/recordbatch" }
common-runtime = { path = "../common/runtime" }
common-telemetry = { path = "../common/telemetry" }
common-time = { path = "../common/time" }
datafusion.workspace = true
datafusion = "14.0.0"
datatypes = { path = "../datatypes" }
futures = "0.3"
futures-util = "0.3"
@@ -30,7 +31,7 @@ serde_json = "1.0"
snafu = { version = "0.7", features = ["backtraces"] }
storage = { path = "../storage" }
table = { path = "../table" }
tokio.workspace = true
tokio = { version = "1.18", features = ["full"] }
[dev-dependencies]
chrono = "0.4"
@@ -39,4 +40,4 @@ mito = { path = "../mito", features = ["test"] }
object-store = { path = "../object-store" }
storage = { path = "../storage" }
tempdir = "0.3"
tokio.workspace = true
tokio = { version = "1.0", features = ["full"] }

View File

@@ -33,38 +33,48 @@ const ALPHANUMERICS_NAME_PATTERN: &str = "[a-zA-Z_][a-zA-Z0-9_]*";
lazy_static! {
static ref CATALOG_KEY_PATTERN: Regex = Regex::new(&format!(
"^{CATALOG_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})$"
"^{}-({})$",
CATALOG_KEY_PREFIX, ALPHANUMERICS_NAME_PATTERN
))
.unwrap();
}
lazy_static! {
static ref SCHEMA_KEY_PATTERN: Regex = Regex::new(&format!(
"^{SCHEMA_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})$"
"^{}-({})-({})$",
SCHEMA_KEY_PREFIX, ALPHANUMERICS_NAME_PATTERN, ALPHANUMERICS_NAME_PATTERN
))
.unwrap();
}
lazy_static! {
static ref TABLE_GLOBAL_KEY_PATTERN: Regex = Regex::new(&format!(
"^{TABLE_GLOBAL_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})$"
"^{}-({})-({})-({})$",
TABLE_GLOBAL_KEY_PREFIX,
ALPHANUMERICS_NAME_PATTERN,
ALPHANUMERICS_NAME_PATTERN,
ALPHANUMERICS_NAME_PATTERN
))
.unwrap();
}
lazy_static! {
static ref TABLE_REGIONAL_KEY_PATTERN: Regex = Regex::new(&format!(
"^{TABLE_REGIONAL_KEY_PREFIX}-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})-({ALPHANUMERICS_NAME_PATTERN})-([0-9]+)$"
"^{}-({})-({})-({})-([0-9]+)$",
TABLE_REGIONAL_KEY_PREFIX,
ALPHANUMERICS_NAME_PATTERN,
ALPHANUMERICS_NAME_PATTERN,
ALPHANUMERICS_NAME_PATTERN
))
.unwrap();
}
pub fn build_catalog_prefix() -> String {
format!("{CATALOG_KEY_PREFIX}-")
format!("{}-", CATALOG_KEY_PREFIX)
}
pub fn build_schema_prefix(catalog_name: impl AsRef<str>) -> String {
format!("{SCHEMA_KEY_PREFIX}-{}-", catalog_name.as_ref())
format!("{}-{}-", SCHEMA_KEY_PREFIX, catalog_name.as_ref())
}
pub fn build_table_global_prefix(
@@ -72,7 +82,8 @@ pub fn build_table_global_prefix(
schema_name: impl AsRef<str>,
) -> String {
format!(
"{TABLE_GLOBAL_KEY_PREFIX}-{}-{}-",
"{}-{}-{}-",
TABLE_GLOBAL_KEY_PREFIX,
catalog_name.as_ref(),
schema_name.as_ref()
)
@@ -367,7 +378,7 @@ mod tests {
table_info,
};
let serialized = serde_json::to_string(&value).unwrap();
let deserialized = TableGlobalValue::parse(serialized).unwrap();
let deserialized = TableGlobalValue::parse(&serialized).unwrap();
assert_eq!(value, deserialized);
}
}

View File

@@ -157,7 +157,7 @@ pub struct RegisterSchemaRequest {
/// Formats table fully-qualified name
pub fn format_full_table_name(catalog: &str, schema: &str, table: &str) -> String {
format!("{catalog}.{schema}.{table}")
format!("{}.{}.{}", catalog, schema, table)
}
pub trait CatalogProviderFactory {
@@ -187,7 +187,8 @@ pub(crate) async fn handle_system_table_request<'a, M: CatalogManager>(
.await
.with_context(|_| CreateTableSnafu {
table_info: format!(
"{catalog_name}.{schema_name}.{table_name}, id: {table_id}",
"{}.{}.{}, id: {}",
catalog_name, schema_name, table_name, table_id,
),
})?;
manager
@@ -199,7 +200,7 @@ pub(crate) async fn handle_system_table_request<'a, M: CatalogManager>(
table: table.clone(),
})
.await?;
info!("Created and registered system table: {table_name}");
info!("Created and registered system table: {}", table_name);
table
};
if let Some(hook) = req.open_hook {

View File

@@ -338,7 +338,7 @@ impl CatalogManager for LocalCatalogManager {
let schema = catalog
.schema(schema_name)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{catalog_name}.{schema_name}"),
schema_info: format!("{}.{}", catalog_name, schema_name),
})?;
{
@@ -452,7 +452,7 @@ impl CatalogManager for LocalCatalogManager {
let schema = catalog
.schema(schema_name)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{catalog_name}.{schema_name}"),
schema_info: format!("{}.{}", catalog_name, schema_name),
})?;
schema.table(table_name)
}

View File

@@ -331,7 +331,10 @@ impl RemoteCatalogManager {
.open_table(&context, request)
.await
.with_context(|_| OpenTableSnafu {
table_info: format!("{catalog_name}.{schema_name}.{table_name}, id:{table_id}"),
table_info: format!(
"{}.{}.{}, id:{}",
catalog_name, schema_name, table_name, table_id
),
})? {
Some(table) => {
info!(
@@ -352,7 +355,7 @@ impl RemoteCatalogManager {
.clone()
.try_into()
.context(InvalidTableSchemaSnafu {
table_info: format!("{catalog_name}.{schema_name}.{table_name}"),
table_info: format!("{}.{}.{}", catalog_name, schema_name, table_name,),
schema: meta.schema.clone(),
})?;
let req = CreateTableRequest {
@@ -474,7 +477,7 @@ impl CatalogManager for RemoteCatalogManager {
let schema = catalog
.schema(schema_name)?
.with_context(|| SchemaNotFoundSnafu {
schema_info: format!("{catalog_name}.{schema_name}"),
schema_info: format!("{}.{}", catalog_name, schema_name),
})?;
schema.table(table_name)
}

View File

@@ -61,7 +61,7 @@ impl Table for SystemCatalogTable {
async fn scan(
&self,
_projection: Option<&Vec<usize>>,
_projection: &Option<Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> table::Result<PhysicalPlanRef> {
@@ -129,7 +129,7 @@ impl SystemCatalogTable {
let ctx = SessionContext::new();
let scan = self
.table
.scan(full_projection, &[], None)
.scan(&full_projection, &[], None)
.await
.context(error::SystemCatalogTableScanSnafu)?;
let stream = scan
@@ -197,7 +197,7 @@ pub fn build_table_insert_request(full_table_name: String, table_id: TableId) ->
}
pub fn build_schema_insert_request(catalog_name: String, schema_name: String) -> InsertRequest {
let full_schema_name = format!("{catalog_name}.{schema_name}");
let full_schema_name = format!("{}.{}", catalog_name, schema_name);
build_insert_request(
EntryType::Schema,
full_schema_name.as_bytes(),
@@ -390,7 +390,7 @@ mod tests {
if let Entry::Catalog(e) = entry {
assert_eq!("some_catalog", e.catalog_name);
} else {
panic!("Unexpected type: {entry:?}");
panic!("Unexpected type: {:?}", entry);
}
}
@@ -407,7 +407,7 @@ mod tests {
assert_eq!("some_catalog", e.catalog_name);
assert_eq!("some_schema", e.schema_name);
} else {
panic!("Unexpected type: {entry:?}");
panic!("Unexpected type: {:?}", entry);
}
}
@@ -426,7 +426,7 @@ mod tests {
assert_eq!("some_table", e.table_name);
assert_eq!(42, e.table_id);
} else {
panic!("Unexpected type: {entry:?}");
panic!("Unexpected type: {:?}", entry);
}
}

View File

@@ -77,7 +77,7 @@ impl Table for Tables {
async fn scan(
&self,
_projection: Option<&Vec<usize>>,
_projection: &Option<Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> table::error::Result<PhysicalPlanRef> {
@@ -370,7 +370,7 @@ mod tests {
.unwrap();
let tables = Tables::new(catalog_list, "test_engine".to_string());
let tables_stream = tables.scan(None, &[], None).await.unwrap();
let tables_stream = tables.scan(&None, &[], None).await.unwrap();
let session_ctx = SessionContext::new();
let mut tables_stream = tables_stream.execute(0, session_ctx.task_ctx()).unwrap();

View File

@@ -69,7 +69,8 @@ mod tests {
assert!(
err.to_string()
.contains("Table `greptime.public.test_table` already exists"),
"Actual error message: {err}",
"Actual error message: {}",
err
);
}

View File

@@ -189,10 +189,10 @@ impl TableEngine for MockTableEngine {
unimplemented!()
}
fn get_table(
fn get_table<'a>(
&self,
_ctx: &EngineContext,
table_ref: &TableReference,
table_ref: &'a TableReference,
) -> table::Result<Option<TableRef>> {
futures::executor::block_on(async {
Ok(self
@@ -204,7 +204,7 @@ impl TableEngine for MockTableEngine {
})
}
fn table_exists(&self, _ctx: &EngineContext, table_ref: &TableReference) -> bool {
fn table_exists<'a>(&self, _ctx: &EngineContext, table_ref: &'a TableReference) -> bool {
futures::executor::block_on(async {
self.tables
.read()

View File

@@ -1,12 +1,13 @@
[package]
name = "client"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
api = { path = "../api" }
async-stream.workspace = true
async-stream = "0.3"
common-base = { path = "../common/base" }
common-error = { path = "../common/error" }
common-grpc = { path = "../common/grpc" }
@@ -14,18 +15,18 @@ common-grpc-expr = { path = "../common/grpc-expr" }
common-query = { path = "../common/query" }
common-recordbatch = { path = "../common/recordbatch" }
common-time = { path = "../common/time" }
datafusion.workspace = true
datafusion = "14.0.0"
datatypes = { path = "../datatypes" }
enum_dispatch = "0.3"
parking_lot = "0.12"
rand = "0.8"
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }
tonic = "0.8"
[dev-dependencies]
datanode = { path = "../datanode" }
substrait = { path = "../common/substrait" }
tokio.workspace = true
tokio = { version = "1.0", features = ["full"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }

View File

@@ -0,0 +1,106 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::*;
use client::{Client, Database};
fn main() {
tracing::subscriber::set_global_default(tracing_subscriber::FmtSubscriber::builder().finish())
.unwrap();
run();
}
#[tokio::main]
async fn run() {
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let db = Database::new("greptime", client);
let (columns, row_count) = insert_data();
let expr = InsertExpr {
schema_name: "public".to_string(),
table_name: "demo".to_string(),
region_number: 0,
columns,
row_count,
};
db.insert(expr).await.unwrap();
}
fn insert_data() -> (Vec<Column>, u32) {
const SEMANTIC_TAG: i32 = 0;
const SEMANTIC_FIELD: i32 = 1;
const SEMANTIC_TS: i32 = 2;
let row_count = 4;
let host_vals = column::Values {
string_values: vec![
"host1".to_string(),
"host2".to_string(),
"host3".to_string(),
"host4".to_string(),
],
..Default::default()
};
let host_column = Column {
column_name: "host".to_string(),
semantic_type: SEMANTIC_TAG,
values: Some(host_vals),
null_mask: vec![0],
..Default::default()
};
let cpu_vals = column::Values {
f64_values: vec![0.31, 0.41, 0.2],
..Default::default()
};
let cpu_column = Column {
column_name: "cpu".to_string(),
semantic_type: SEMANTIC_FIELD,
values: Some(cpu_vals),
null_mask: vec![2],
..Default::default()
};
let mem_vals = column::Values {
f64_values: vec![0.1, 0.2, 0.3],
..Default::default()
};
let mem_column = Column {
column_name: "memory".to_string(),
semantic_type: SEMANTIC_FIELD,
values: Some(mem_vals),
null_mask: vec![4],
..Default::default()
};
let ts_vals = column::Values {
i64_values: vec![100, 101, 102, 103],
..Default::default()
};
let ts_column = Column {
column_name: "ts".to_string(),
semantic_type: SEMANTIC_TS,
values: Some(ts_vals),
null_mask: vec![0],
..Default::default()
};
(
vec![host_column, cpu_column, mem_column, ts_column],
row_count,
)
}

View File

@@ -12,7 +12,8 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::{ColumnDataType, ColumnDef, CreateTableExpr, TableId};
use api::v1::{ColumnDataType, ColumnDef, CreateExpr};
use client::admin::Admin;
use client::{Client, Database};
use prost_09::Message;
use substrait_proto::protobuf::plan_rel::RelType as PlanRelType;
@@ -32,41 +33,41 @@ fn main() {
async fn run() {
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let create_table_expr = CreateTableExpr {
catalog_name: "greptime".to_string(),
schema_name: "public".to_string(),
let create_table_expr = CreateExpr {
catalog_name: Some("greptime".to_string()),
schema_name: Some("public".to_string()),
table_name: "test_logical_dist_exec".to_string(),
desc: "".to_string(),
desc: None,
column_defs: vec![
ColumnDef {
name: "timestamp".to_string(),
datatype: ColumnDataType::TimestampMillisecond as i32,
is_nullable: false,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "key".to_string(),
datatype: ColumnDataType::Uint64 as i32,
is_nullable: false,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "value".to_string(),
datatype: ColumnDataType::Uint64 as i32,
is_nullable: false,
default_constraint: vec![],
default_constraint: None,
},
],
time_index: "timestamp".to_string(),
primary_keys: vec!["key".to_string()],
create_if_not_exists: false,
table_options: Default::default(),
table_id: Some(TableId { id: 1024 }),
table_id: Some(1024),
region_ids: vec![0],
};
let db = Database::new("create table", client.clone());
let result = db.create(create_table_expr).await.unwrap();
let admin = Admin::new("create table", client.clone());
let result = admin.create(create_table_expr).await.unwrap();
event!(Level::INFO, "create table result: {:#?}", result);
let logical = mock_logical_plan();

View File

@@ -12,6 +12,23 @@
// See the License for the specific language governing permissions and
// limitations under the License.
mod normalize;
use client::{Client, Database, Select};
use tracing::{event, Level};
pub use normalize::{SeriesNormalize, SeriesNormalizeExec, SeriesNormalizeStream};
fn main() {
tracing::subscriber::set_global_default(tracing_subscriber::FmtSubscriber::builder().finish())
.unwrap();
run();
}
#[tokio::main]
async fn run() {
let client = Client::with_urls(vec!["127.0.0.1:3001"]);
let db = Database::new("greptime", client);
let sql = Select::Sql("select * from demo".to_string());
let result = db.select(sql).await.unwrap();
event!(Level::INFO, "result: {:#?}", result);
}

137
src/client/src/admin.rs Normal file
View File

@@ -0,0 +1,137 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::*;
use common_error::prelude::StatusCode;
use common_query::Output;
use snafu::prelude::*;
use crate::database::PROTOCOL_VERSION;
use crate::{error, Client, Result};
#[derive(Clone, Debug)]
pub struct Admin {
name: String,
client: Client,
}
impl Admin {
pub fn new(name: impl Into<String>, client: Client) -> Self {
Self {
name: name.into(),
client,
}
}
pub async fn create(&self, expr: CreateExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::Create(expr)),
};
self.do_request(expr).await
}
pub async fn do_request(&self, expr: AdminExpr) -> Result<AdminResult> {
// `remove(0)` is safe because of `do_requests`'s invariants.
Ok(self.do_requests(vec![expr]).await?.remove(0))
}
pub async fn alter(&self, expr: AlterExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::Alter(expr)),
};
self.do_request(expr).await
}
pub async fn drop_table(&self, expr: DropTableExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::DropTable(expr)),
};
self.do_request(expr).await
}
/// Invariants: the lengths of input vec (`Vec<AdminExpr>`) and output vec (`Vec<AdminResult>`) are equal.
async fn do_requests(&self, exprs: Vec<AdminExpr>) -> Result<Vec<AdminResult>> {
let expr_count = exprs.len();
let req = AdminRequest {
name: self.name.clone(),
exprs,
};
let resp = self.client.admin(req).await?;
let results = resp.results;
ensure!(
results.len() == expr_count,
error::MissingResultSnafu {
name: "admin_results",
expected: expr_count,
actual: results.len(),
}
);
Ok(results)
}
pub async fn create_database(&self, expr: CreateDatabaseExpr) -> Result<AdminResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = AdminExpr {
header: Some(header),
expr: Some(admin_expr::Expr::CreateDatabase(expr)),
};
Ok(self.do_requests(vec![expr]).await?.remove(0))
}
}
pub fn admin_result_to_output(admin_result: AdminResult) -> Result<Output> {
let header = admin_result.header.context(error::MissingHeaderSnafu)?;
if !StatusCode::is_success(header.code) {
return error::DatanodeSnafu {
code: header.code,
msg: header.err_msg,
}
.fail();
}
let result = admin_result.result.context(error::MissingResultSnafu {
name: "result".to_string(),
expected: 1_usize,
actual: 0_usize,
})?;
let output = match result {
admin_result::Result::Mutate(mutate) => {
if mutate.failure != 0 {
return error::MutateFailureSnafu {
failure: mutate.failure,
}
.fail();
}
Output::AffectedRows(mutate.success as usize)
}
};
Ok(output)
}

View File

@@ -104,6 +104,20 @@ impl Client {
self.inner.set_peers(urls);
}
pub async fn admin(&self, req: AdminRequest) -> Result<AdminResponse> {
let req = BatchRequest {
admins: vec![req],
..Default::default()
};
let mut res = self.batch(req).await?;
res.admins.pop().context(error::MissingResultSnafu {
name: "admins",
expected: 1_usize,
actual: 0_usize,
})
}
pub async fn database(&self, req: DatabaseRequest) -> Result<DatabaseResponse> {
let req = BatchRequest {
databases: vec![req],

View File

@@ -12,22 +12,27 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::ddl_request::Expr as DdlExpr;
use std::sync::Arc;
use api::v1::codec::SelectResult as GrpcSelectResult;
use api::v1::column::SemanticType;
use api::v1::{
object_expr, query_request, AlterExpr, CreateTableExpr, DatabaseRequest, DdlRequest,
DropTableExpr, InsertRequest, ObjectExpr, ObjectResult as GrpcObjectResult, QueryRequest,
object_expr, object_result, select_expr, DatabaseRequest, ExprHeader, InsertExpr,
MutateResult as GrpcMutateResult, ObjectExpr, ObjectResult as GrpcObjectResult, SelectExpr,
};
use common_error::status_code::StatusCode;
use common_grpc::flight::{
flight_messages_to_recordbatches, raw_flight_data_to_message, FlightMessage,
};
use common_grpc_expr::column_to_vector;
use common_query::Output;
use common_recordbatch::RecordBatches;
use common_recordbatch::{RecordBatch, RecordBatches};
use datatypes::prelude::*;
use datatypes::schema::{ColumnSchema, Schema};
use snafu::{ensure, OptionExt, ResultExt};
use crate::error::{ConvertFlightDataSnafu, DatanodeSnafu, IllegalFlightMessagesSnafu};
use crate::error::{ColumnToVectorSnafu, ConvertSchemaSnafu, DatanodeSnafu, DecodeSelectSnafu};
use crate::{error, Client, Result};
pub const PROTOCOL_VERSION: u32 = 1;
#[derive(Clone, Debug)]
pub struct Database {
name: String,
@@ -46,63 +51,65 @@ impl Database {
&self.name
}
pub async fn insert(&self, request: InsertRequest) -> Result<RpcOutput> {
pub async fn insert(&self, insert: InsertExpr) -> Result<ObjectResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = ObjectExpr {
request: Some(object_expr::Request::Insert(request)),
header: Some(header),
expr: Some(object_expr::Expr::Insert(insert)),
};
self.object(expr).await?.try_into()
}
pub async fn sql(&self, sql: &str) -> Result<RpcOutput> {
let query = QueryRequest {
query: Some(query_request::Query::Sql(sql.to_string())),
pub async fn batch_insert(&self, insert_exprs: Vec<InsertExpr>) -> Result<Vec<ObjectResult>> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
self.do_query(query).await
let obj_exprs = insert_exprs
.into_iter()
.map(|expr| ObjectExpr {
header: Some(header.clone()),
expr: Some(object_expr::Expr::Insert(expr)),
})
.collect();
self.objects(obj_exprs)
.await?
.into_iter()
.map(|result| result.try_into())
.collect()
}
pub async fn logical_plan(&self, logical_plan: Vec<u8>) -> Result<RpcOutput> {
let query = QueryRequest {
query: Some(query_request::Query::LogicalPlan(logical_plan)),
pub async fn select(&self, expr: Select) -> Result<ObjectResult> {
let select_expr = match expr {
Select::Sql(sql) => SelectExpr {
expr: Some(select_expr::Expr::Sql(sql)),
},
};
self.do_query(query).await
self.do_select(select_expr).await
}
async fn do_query(&self, request: QueryRequest) -> Result<RpcOutput> {
pub async fn logical_plan(&self, logical_plan: Vec<u8>) -> Result<ObjectResult> {
let select_expr = SelectExpr {
expr: Some(select_expr::Expr::LogicalPlan(logical_plan)),
};
self.do_select(select_expr).await
}
async fn do_select(&self, select_expr: SelectExpr) -> Result<ObjectResult> {
let header = ExprHeader {
version: PROTOCOL_VERSION,
};
let expr = ObjectExpr {
request: Some(object_expr::Request::Query(request)),
header: Some(header),
expr: Some(object_expr::Expr::Select(select_expr)),
};
let obj_result = self.object(expr).await?;
obj_result.try_into()
}
pub async fn create(&self, expr: CreateTableExpr) -> Result<RpcOutput> {
let expr = ObjectExpr {
request: Some(object_expr::Request::Ddl(DdlRequest {
expr: Some(DdlExpr::CreateTable(expr)),
})),
};
self.object(expr).await?.try_into()
}
pub async fn alter(&self, expr: AlterExpr) -> Result<RpcOutput> {
let expr = ObjectExpr {
request: Some(object_expr::Request::Ddl(DdlRequest {
expr: Some(DdlExpr::Alter(expr)),
})),
};
self.object(expr).await?.try_into()
}
pub async fn drop_table(&self, expr: DropTableExpr) -> Result<RpcOutput> {
let expr = ObjectExpr {
request: Some(object_expr::Request::Ddl(DdlRequest {
expr: Some(DdlExpr::DropTable(expr)),
})),
};
self.object(expr).await?.try_into()
}
pub async fn object(&self, expr: ObjectExpr) -> Result<GrpcObjectResult> {
let res = self.objects(vec![expr]).await?.pop().unwrap();
Ok(res)
@@ -132,12 +139,12 @@ impl Database {
}
#[derive(Debug)]
pub enum RpcOutput {
RecordBatches(RecordBatches),
AffectedRows(usize),
pub enum ObjectResult {
Select(GrpcSelectResult),
Mutate(GrpcMutateResult),
}
impl TryFrom<api::v1::ObjectResult> for RpcOutput {
impl TryFrom<api::v1::ObjectResult> for ObjectResult {
type Error = error::Error;
fn try_from(object_result: api::v1::ObjectResult) -> std::result::Result<Self, Self::Error> {
@@ -150,50 +157,92 @@ impl TryFrom<api::v1::ObjectResult> for RpcOutput {
.fail();
}
let flight_messages = raw_flight_data_to_message(object_result.flight_data)
.context(ConvertFlightDataSnafu)?;
let obj_result = object_result.result.context(error::MissingResultSnafu {
name: "result".to_string(),
expected: 1_usize,
actual: 0_usize,
})?;
Ok(match obj_result {
object_result::Result::Select(select) => {
let result = (*select.raw_data).try_into().context(DecodeSelectSnafu)?;
ObjectResult::Select(result)
}
object_result::Result::Mutate(mutate) => ObjectResult::Mutate(mutate),
})
}
}
let output = if let Some(FlightMessage::AffectedRows(rows)) = flight_messages.get(0) {
ensure!(
flight_messages.len() == 1,
IllegalFlightMessagesSnafu {
reason: "Expect 'AffectedRows' Flight messages to be one and only!"
pub enum Select {
Sql(String),
}
impl TryFrom<ObjectResult> for Output {
type Error = error::Error;
fn try_from(value: ObjectResult) -> Result<Self> {
let output = match value {
ObjectResult::Select(select) => {
let vectors = select
.columns
.iter()
.map(|column| {
column_to_vector(column, select.row_count).context(ColumnToVectorSnafu)
})
.collect::<Result<Vec<VectorRef>>>()?;
let column_schemas = select
.columns
.iter()
.zip(vectors.iter())
.map(|(column, vector)| {
let datatype = vector.data_type();
// nullable or not, does not affect the output
let mut column_schema =
ColumnSchema::new(&column.column_name, datatype, true);
if column.semantic_type == SemanticType::Timestamp as i32 {
column_schema = column_schema.with_time_index(true);
}
column_schema
})
.collect::<Vec<ColumnSchema>>();
let schema = Arc::new(Schema::try_new(column_schemas).context(ConvertSchemaSnafu)?);
let recordbatches = if vectors.is_empty() {
RecordBatches::try_new(schema, vec![])
} else {
RecordBatch::new(schema, vectors)
.and_then(|batch| RecordBatches::try_new(batch.schema.clone(), vec![batch]))
}
);
RpcOutput::AffectedRows(*rows)
} else {
let recordbatches = flight_messages_to_recordbatches(flight_messages)
.context(ConvertFlightDataSnafu)?;
RpcOutput::RecordBatches(recordbatches)
.context(error::CreateRecordBatchesSnafu)?;
Output::RecordBatches(recordbatches)
}
ObjectResult::Mutate(mutate) => {
if mutate.failure != 0 {
return error::MutateFailureSnafu {
failure: mutate.failure,
}
.fail();
}
Output::AffectedRows(mutate.success as usize)
}
};
Ok(output)
}
}
impl From<RpcOutput> for Output {
fn from(value: RpcOutput) -> Self {
match value {
RpcOutput::AffectedRows(x) => Output::AffectedRows(x),
RpcOutput::RecordBatches(x) => Output::RecordBatches(x),
}
}
}
#[cfg(test)]
mod tests {
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::Column;
use common_grpc::select::{null_mask, values};
use common_grpc_expr::column_to_vector;
use datatypes::prelude::{Vector, VectorRef};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
Int16Vector, Int32Vector, Int64Vector, Int8Vector, StringVector, UInt16Vector,
UInt32Vector, UInt64Vector, UInt8Vector,
};
use super::*;
#[test]
fn test_column_to_vector() {
let mut column = create_test_column(Arc::new(BooleanVector::from(vec![true])));

View File

@@ -13,15 +13,19 @@
// limitations under the License.
use std::any::Any;
use std::sync::Arc;
use api::serde::DecodeError;
use common_error::prelude::*;
use datafusion::physical_plan::ExecutionPlan;
#[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
#[snafu(display("Illegal Flight messages, reason: {}", reason))]
IllegalFlightMessages {
reason: String,
#[snafu(display("Connect failed to {}, source: {}", url, source))]
ConnectFailed {
url: String,
source: tonic::transport::Error,
backtrace: Backtrace,
},
@@ -42,21 +46,34 @@ pub enum Error {
backtrace: Backtrace,
},
#[snafu(display("Fail to decode select result, source: {}", source))]
DecodeSelect { source: DecodeError },
#[snafu(display("Error occurred on the data node, code: {}, msg: {}", code, msg))]
Datanode { code: u32, msg: String },
#[snafu(display("Failed to convert FlightData, source: {}", source))]
ConvertFlightData {
#[snafu(display("Failed to encode physical plan: {:?}, source: {}", physical, source))]
EncodePhysical {
physical: Arc<dyn ExecutionPlan>,
#[snafu(backtrace)]
source: common_grpc::Error,
},
#[snafu(display("Mutate result has failure {}", failure))]
MutateFailure { failure: u32, backtrace: Backtrace },
#[snafu(display("Column datatype error, source: {}", source))]
ColumnDataType {
#[snafu(backtrace)]
source: api::error::Error,
},
#[snafu(display("Failed to create RecordBatches, source: {}", source))]
CreateRecordBatches {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
#[snafu(display("Illegal GRPC client state: {}", err_msg))]
IllegalGrpcClientState {
err_msg: String,
@@ -66,6 +83,12 @@ pub enum Error {
#[snafu(display("Missing required field in protobuf, field: {}", field))]
MissingField { field: String, backtrace: Backtrace },
#[snafu(display("Failed to convert schema, source: {}", source))]
ConvertSchema {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
#[snafu(display(
"Failed to create gRPC channel, peer address: {}, source: {}",
addr,
@@ -76,6 +99,12 @@ pub enum Error {
#[snafu(backtrace)]
source: common_grpc::error::Error,
},
#[snafu(display("Failed to convert column to vector, source: {}", source))]
ColumnToVector {
#[snafu(backtrace)]
source: common_grpc_expr::error::Error,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -83,17 +112,21 @@ pub type Result<T> = std::result::Result<T, Error>;
impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::IllegalFlightMessages { .. }
Error::ConnectFailed { .. }
| Error::MissingResult { .. }
| Error::MissingHeader { .. }
| Error::TonicStatus { .. }
| Error::DecodeSelect { .. }
| Error::Datanode { .. }
| Error::EncodePhysical { .. }
| Error::MutateFailure { .. }
| Error::ColumnDataType { .. }
| Error::MissingField { .. } => StatusCode::Internal,
Error::CreateChannel { source, .. } | Error::ConvertFlightData { source } => {
source.status_code()
}
Error::ConvertSchema { source } => source.status_code(),
Error::CreateRecordBatches { source } => source.status_code(),
Error::CreateChannel { source, .. } => source.status_code(),
Error::IllegalGrpcClientState { .. } => StatusCode::Unexpected,
Error::ColumnToVector { source, .. } => source.status_code(),
}
}

View File

@@ -12,6 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
pub mod admin;
mod client;
mod database;
mod error;
@@ -20,5 +21,5 @@ pub mod load_balance;
pub use api;
pub use self::client::Client;
pub use self::database::{Database, RpcOutput};
pub use self::database::{Database, ObjectResult, Select};
pub use self::error::{Error, Result};

View File

@@ -1,9 +1,9 @@
[package]
name = "cmd"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
default-run = "greptime"
license = "Apache-2.0"
[[bin]]
name = "greptime"
@@ -18,17 +18,17 @@ common-telemetry = { path = "../common/telemetry", features = [
] }
datanode = { path = "../datanode" }
frontend = { path = "../frontend" }
futures.workspace = true
futures = "0.3"
meta-client = { path = "../meta-client" }
meta-srv = { path = "../meta-srv" }
serde.workspace = true
serde = "1.0"
servers = { path = "../servers" }
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }
tokio = { version = "1.18", features = ["full"] }
toml = "0.5"
[dev-dependencies]
serde.workspace = true
serde = "1.0"
tempdir = "0.3"
[build-dependencies]

View File

@@ -12,8 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use anymap::AnyMap;
use clap::Parser;
use frontend::frontend::{Frontend, FrontendOptions};
use frontend::grpc::GrpcOptions;
@@ -22,7 +21,6 @@ use frontend::instance::Instance;
use frontend::mysql::MysqlOptions;
use frontend::opentsdb::OpentsdbOptions;
use frontend::postgres::PostgresOptions;
use frontend::Plugins;
use meta_client::MetaClientOpts;
use servers::auth::UserProviderRef;
use servers::http::HttpOptions;
@@ -88,21 +86,21 @@ pub struct StartCommand {
impl StartCommand {
async fn run(self) -> Result<()> {
let plugins = Arc::new(load_frontend_plugins(&self.user_provider)?);
let plugins = load_frontend_plugins(&self.user_provider)?;
let opts: FrontendOptions = self.try_into()?;
let mut instance = Instance::try_new_distributed(&opts)
.await
.context(error::StartFrontendSnafu)?;
instance.set_plugins(plugins.clone());
let mut frontend = Frontend::new(opts, instance, plugins);
let mut frontend = Frontend::new(
opts.clone(),
Instance::try_new_distributed(&opts)
.await
.context(error::StartFrontendSnafu)?,
plugins,
);
frontend.start().await.context(error::StartFrontendSnafu)
}
}
pub fn load_frontend_plugins(user_provider: &Option<String>) -> Result<Plugins> {
let mut plugins = Plugins::new();
pub fn load_frontend_plugins(user_provider: &Option<String>) -> Result<AnyMap> {
let mut plugins = AnyMap::new();
if let Some(provider) = user_provider {
let provider = auth::user_provider_from_option(provider).context(IllegalAuthConfigSnafu)?;

View File

@@ -12,8 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::sync::Arc;
use anymap::AnyMap;
use clap::Parser;
use common_telemetry::info;
use datanode::datanode::{Datanode, DatanodeOptions, ObjectStoreConfig};
@@ -26,7 +25,6 @@ use frontend::mysql::MysqlOptions;
use frontend::opentsdb::OpentsdbOptions;
use frontend::postgres::PostgresOptions;
use frontend::prometheus::PrometheusOptions;
use frontend::Plugins;
use serde::{Deserialize, Serialize};
use servers::http::HttpOptions;
use servers::tls::{TlsMode, TlsOption};
@@ -152,7 +150,7 @@ impl StartCommand {
async fn run(self) -> Result<()> {
let enable_memory_catalog = self.enable_memory_catalog;
let config_file = self.config_file.clone();
let plugins = Arc::new(load_frontend_plugins(&self.user_provider)?);
let plugins = load_frontend_plugins(&self.user_provider)?;
let fe_opts = FrontendOptions::try_from(self)?;
let dn_opts: DatanodeOptions = {
let mut opts: StandaloneOptions = if let Some(path) = config_file {
@@ -189,12 +187,11 @@ impl StartCommand {
/// Build frontend instance in standalone mode
async fn build_frontend(
fe_opts: FrontendOptions,
plugins: Arc<Plugins>,
plugins: AnyMap,
datanode_instance: InstanceRef,
) -> Result<Frontend<FeInstance>> {
let mut frontend_instance = FeInstance::new_standalone(datanode_instance.clone());
frontend_instance.set_script_handler(datanode_instance);
frontend_instance.set_plugins(plugins.clone());
Ok(Frontend::new(fe_opts, frontend_instance, plugins))
}
@@ -224,7 +221,8 @@ impl TryFrom<StartCommand> for FrontendOptions {
if addr == datanode_grpc_addr {
return IllegalConfigSnafu {
msg: format!(
"gRPC listen address conflicts with datanode reserved gRPC addr: {datanode_grpc_addr}",
"gRPC listen address conflicts with datanode reserved gRPC addr: {}",
datanode_grpc_addr
),
}
.fail();

View File

@@ -1,8 +1,8 @@
[package]
name = "common-base"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
bitvec = "1.0"
@@ -10,4 +10,4 @@ bytes = { version = "1.1", features = ["serde"] }
common-error = { path = "../error" }
paste = "1.0"
serde = { version = "1.0", features = ["derive"] }
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }

View File

@@ -1,8 +1,8 @@
[package]
name = "common-catalog"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
async-trait = "0.1"
@@ -11,7 +11,7 @@ common-telemetry = { path = "../telemetry" }
datatypes = { path = "../../datatypes" }
lazy_static = "1.4"
regex = "1.6"
serde.workspace = true
serde = "1.0"
serde_json = "1.0"
snafu = { version = "0.7", features = ["backtraces"] }

View File

@@ -1,8 +1,8 @@
[package]
name = "common-error"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
snafu = { version = "0.7", features = ["backtraces"] }

View File

@@ -131,7 +131,7 @@ mod tests {
assert!(ErrorCompat::backtrace(&err).is_some());
let msg = format!("{err:?}");
let msg = format!("{:?}", err);
assert!(msg.contains("\nBacktrace:\n"));
let fmt_msg = format!("{:?}", DebugFormat::new(&err));
assert_eq!(msg, fmt_msg);
@@ -151,7 +151,7 @@ mod tests {
assert!(err.as_any().downcast_ref::<MockError>().is_some());
assert!(err.source().is_some());
let msg = format!("{err:?}");
let msg = format!("{:?}", err);
assert!(msg.contains("\nBacktrace:\n"));
assert!(msg.contains("Caused by"));

View File

@@ -31,11 +31,11 @@ impl<'a, E: ErrorExt + ?Sized> fmt::Debug for DebugFormat<'a, E> {
write!(f, "{}.", self.0)?;
if let Some(source) = self.0.source() {
// Source error use debug format for more verbose info.
write!(f, " Caused by: {source:?}")?;
write!(f, " Caused by: {:?}", source)?;
}
if let Some(backtrace) = self.0.backtrace_opt() {
// Add a newline to separate causes and backtrace.
write!(f, "\nBacktrace:\n{backtrace}")?;
write!(f, "\nBacktrace:\n{}", backtrace)?;
}
Ok(())

View File

@@ -51,7 +51,6 @@ pub enum StatusCode {
TableNotFound = 4001,
TableColumnNotFound = 4002,
TableColumnExists = 4003,
DatabaseNotFound = 4004,
// ====== End of catalog related status code =======
// ====== Begin of storage related status code =====
@@ -87,7 +86,7 @@ impl StatusCode {
impl fmt::Display for StatusCode {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
// The current debug format is suitable to display.
write!(f, "{self:?}")
write!(f, "{:?}", self)
}
}
@@ -96,7 +95,7 @@ mod tests {
use super::*;
fn assert_status_code_display(code: StatusCode, msg: &str) {
let code_msg = format!("{code}");
let code_msg = format!("{}", code);
assert_eq!(msg, code_msg);
}

View File

@@ -1,8 +1,8 @@
[package]
name = "common-function-macro"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[lib]
proc-macro = true
@@ -15,5 +15,5 @@ syn = "1.0"
arc-swap = "1.0"
common-query = { path = "../query" }
datatypes = { path = "../../datatypes" }
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }
static_assertions = "1.1.0"

View File

@@ -1,8 +1,8 @@
[package]
edition = "2021"
name = "common-function"
edition.workspace = true
version.workspace = true
license.workspace = true
version = "0.1.0"
license = "Apache-2.0"
[dependencies]
arc-swap = "1.0"
@@ -11,14 +11,14 @@ common-error = { path = "../error" }
common-function-macro = { path = "../function-macro" }
common-query = { path = "../query" }
common-time = { path = "../time" }
datafusion.workspace = true
datafusion-common = "14.0.0"
datatypes = { path = "../../datatypes" }
libc = "0.2"
num = "0.4"
num-traits = "0.2"
once_cell = "1.10"
paste = "1.0"
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }
statrs = "0.15"
[dev-dependencies]

View File

@@ -343,7 +343,7 @@ mod tests {
Arc::new(Int64Vector::from_vec(fp.clone())),
];
let vector = interp(&args).unwrap();
assert!(matches!(vector.get(0), Value::Float64(v) if v == x[0]));
assert!(matches!(vector.get(0), Value::Float64(v) if v==x[0] as f64));
// x=None output:Null
let input = vec![None, Some(0.0), Some(0.3)];

View File

@@ -127,7 +127,12 @@ mod tests {
assert_eq!(4, vec.len());
for i in 0..4 {
assert_eq!(i == 0 || i == 3, vec.get_data(i).unwrap(), "Failed at {i}",)
assert_eq!(
i == 0 || i == 3,
vec.get_data(i).unwrap(),
"failed at {}",
i
)
}
}
_ => unreachable!(),

View File

@@ -1,12 +1,12 @@
[package]
name = "common-grpc-expr"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
api = { path = "../../api" }
async-trait.workspace = true
async-trait = "0.1"
common-base = { path = "../base" }
common-catalog = { path = "../catalog" }
common-error = { path = "../error" }

View File

@@ -15,7 +15,7 @@
use std::sync::Arc;
use api::v1::alter_expr::Kind;
use api::v1::{AlterExpr, CreateTableExpr, DropColumns};
use api::v1::{AlterExpr, CreateExpr, DropColumns};
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use datatypes::schema::{ColumnSchema, SchemaBuilder, SchemaRef};
use snafu::{ensure, OptionExt, ResultExt};
@@ -29,16 +29,6 @@ use crate::error::{
/// Convert an [`AlterExpr`] to an optional [`AlterTableRequest`]
pub fn alter_expr_to_request(expr: AlterExpr) -> Result<Option<AlterTableRequest>> {
let catalog_name = if expr.catalog_name.is_empty() {
None
} else {
Some(expr.catalog_name)
};
let schema_name = if expr.schema_name.is_empty() {
None
} else {
Some(expr.schema_name)
};
match expr.kind {
Some(Kind::AddColumns(add_columns)) => {
let add_column_requests = add_columns
@@ -67,8 +57,8 @@ pub fn alter_expr_to_request(expr: AlterExpr) -> Result<Option<AlterTableRequest
};
let request = AlterTableRequest {
catalog_name,
schema_name,
catalog_name: expr.catalog_name,
schema_name: expr.schema_name,
table_name: expr.table_name,
alter_kind,
};
@@ -80,8 +70,8 @@ pub fn alter_expr_to_request(expr: AlterExpr) -> Result<Option<AlterTableRequest
};
let request = AlterTableRequest {
catalog_name,
schema_name,
catalog_name: expr.catalog_name,
schema_name: expr.schema_name,
table_name: expr.table_name,
alter_kind,
};
@@ -91,7 +81,7 @@ pub fn alter_expr_to_request(expr: AlterExpr) -> Result<Option<AlterTableRequest
}
}
pub fn create_table_schema(expr: &CreateTableExpr) -> Result<SchemaRef> {
pub fn create_table_schema(expr: &CreateExpr) -> Result<SchemaRef> {
let column_schemas = expr
.column_defs
.iter()
@@ -106,7 +96,7 @@ pub fn create_table_schema(expr: &CreateTableExpr) -> Result<SchemaRef> {
.iter()
.any(|column| column.name == expr.time_index),
MissingTimestampColumnSnafu {
msg: format!("CreateExpr: {expr:?}")
msg: format!("CreateExpr: {:?}", expr)
}
);
@@ -129,10 +119,7 @@ pub fn create_table_schema(expr: &CreateTableExpr) -> Result<SchemaRef> {
))
}
pub fn create_expr_to_request(
table_id: TableId,
expr: CreateTableExpr,
) -> Result<CreateTableRequest> {
pub fn create_expr_to_request(table_id: TableId, expr: CreateExpr) -> Result<CreateTableRequest> {
let schema = create_table_schema(&expr)?;
let primary_key_indices = expr
.primary_keys
@@ -147,19 +134,12 @@ pub fn create_expr_to_request(
})
.collect::<Result<Vec<usize>>>()?;
let mut catalog_name = expr.catalog_name;
if catalog_name.is_empty() {
catalog_name = DEFAULT_CATALOG_NAME.to_string();
}
let mut schema_name = expr.schema_name;
if schema_name.is_empty() {
schema_name = DEFAULT_SCHEMA_NAME.to_string();
}
let desc = if expr.desc.is_empty() {
None
} else {
Some(expr.desc)
};
let catalog_name = expr
.catalog_name
.unwrap_or_else(|| DEFAULT_CATALOG_NAME.to_string());
let schema_name = expr
.schema_name
.unwrap_or_else(|| DEFAULT_SCHEMA_NAME.to_string());
let region_ids = if expr.region_ids.is_empty() {
vec![0]
@@ -172,7 +152,7 @@ pub fn create_expr_to_request(
catalog_name,
schema_name,
table_name: expr.table_name,
desc,
desc: expr.desc,
schema,
region_numbers: region_ids,
primary_key_indices,
@@ -191,8 +171,8 @@ mod tests {
#[test]
fn test_alter_expr_to_request() {
let expr = AlterExpr {
catalog_name: "".to_string(),
schema_name: "".to_string(),
catalog_name: None,
schema_name: None,
table_name: "monitor".to_string(),
kind: Some(Kind::AddColumns(AddColumns {
@@ -201,7 +181,7 @@ mod tests {
name: "mem_usage".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: false,
default_constraint: vec![],
default_constraint: None,
}),
is_key: false,
}],
@@ -228,8 +208,8 @@ mod tests {
#[test]
fn test_drop_column_expr() {
let expr = AlterExpr {
catalog_name: "test_catalog".to_string(),
schema_name: "test_schema".to_string(),
catalog_name: Some("test_catalog".to_string()),
schema_name: Some("test_schema".to_string()),
table_name: "monitor".to_string(),
kind: Some(Kind::DropColumns(DropColumns {

View File

@@ -12,16 +12,14 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::hash_map::Entry;
use std::collections::{HashMap, HashSet};
use std::sync::Arc;
use api::helper::ColumnDataTypeWrapper;
use api::v1::column::{SemanticType, Values};
use api::v1::{
AddColumn, AddColumns, Column, ColumnDataType, ColumnDef, CreateTableExpr,
InsertRequest as GrpcInsertRequest,
};
use api::v1::{AddColumn, AddColumns, Column, ColumnDataType, ColumnDef, CreateExpr};
use common_base::BitVec;
use common_catalog::consts::DEFAULT_CATALOG_NAME;
use common_time::timestamp::Timestamp;
use common_time::{Date, DateTime};
use datatypes::data_type::{ConcreteDataType, DataType};
@@ -32,6 +30,7 @@ use datatypes::vectors::MutableVector;
use snafu::{ensure, OptionExt, ResultExt};
use table::metadata::TableId;
use table::requests::{AddColumnRequest, AlterKind, AlterTableRequest, InsertRequest};
use table::Table;
use crate::error::{
ColumnDataTypeSnafu, ColumnNotFoundSnafu, CreateVectorSnafu, DuplicatedTimestampColumnSnafu,
@@ -46,7 +45,7 @@ fn build_column_def(column_name: &str, datatype: i32, nullable: bool) -> ColumnD
name: column_name.to_string(),
datatype,
is_nullable: nullable,
default_constraint: vec![],
default_constraint: None,
}
}
@@ -155,7 +154,7 @@ fn collect_column_values(column_datatype: ColumnDataType, values: &Values) -> Ve
collect_values!(values.i32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Int64 => {
collect_values!(values.i64_values, |v| ValueRef::from(*v))
collect_values!(values.i64_values, |v| ValueRef::from(*v as i64))
}
ColumnDataType::Uint8 => {
collect_values!(values.u8_values, |v| ValueRef::from(*v as u8))
@@ -167,7 +166,7 @@ fn collect_column_values(column_datatype: ColumnDataType, values: &Values) -> Ve
collect_values!(values.u32_values, |v| ValueRef::from(*v))
}
ColumnDataType::Uint64 => {
collect_values!(values.u64_values, |v| ValueRef::from(*v))
collect_values!(values.u64_values, |v| ValueRef::from(*v as u64))
}
ColumnDataType::Float32 => collect_values!(values.f32_values, |v| ValueRef::from(*v)),
ColumnDataType::Float64 => collect_values!(values.f64_values, |v| ValueRef::from(*v)),
@@ -215,7 +214,7 @@ pub fn build_create_expr_from_insertion(
table_id: Option<TableId>,
table_name: &str,
columns: &[Column],
) -> Result<CreateTableExpr> {
) -> Result<CreateExpr> {
let mut new_columns: HashSet<String> = HashSet::default();
let mut column_defs = Vec::default();
let mut primary_key_indices = Vec::default();
@@ -264,60 +263,67 @@ pub fn build_create_expr_from_insertion(
.map(|idx| columns[*idx].column_name.clone())
.collect::<Vec<_>>();
let expr = CreateTableExpr {
catalog_name: catalog_name.to_string(),
schema_name: schema_name.to_string(),
let expr = CreateExpr {
catalog_name: Some(catalog_name.to_string()),
schema_name: Some(schema_name.to_string()),
table_name: table_name.to_string(),
desc: "Created on insertion".to_string(),
desc: Some("Created on insertion".to_string()),
column_defs,
time_index: timestamp_field_name,
primary_keys,
create_if_not_exists: true,
table_options: Default::default(),
table_id: table_id.map(|id| api::v1::TableId { id }),
table_id,
region_ids: vec![0], // TODO:(hl): region id should be allocated by frontend
};
Ok(expr)
}
pub fn to_table_insert_request(
request: GrpcInsertRequest,
schema: SchemaRef,
pub fn insertion_expr_to_request(
catalog_name: &str,
schema_name: &str,
table_name: &str,
insert_batches: Vec<(Vec<Column>, u32)>,
table: Arc<dyn Table>,
) -> Result<InsertRequest> {
let catalog_name = DEFAULT_CATALOG_NAME;
let schema_name = &request.schema_name;
let table_name = &request.table_name;
let row_count = request.row_count as usize;
let schema = table.schema();
let mut columns_builders = HashMap::with_capacity(schema.column_schemas().len());
let mut columns_values = HashMap::with_capacity(request.columns.len());
for Column {
column_name,
values,
null_mask,
..
} in request.columns
{
let Some(values) = values else { continue };
for (columns, row_count) in insert_batches {
for Column {
column_name,
values,
null_mask,
..
} in columns
{
let values = match values {
Some(vals) => vals,
None => continue,
};
let vector_builder = &mut schema
.column_schema_by_name(&column_name)
.context(ColumnNotFoundSnafu {
column_name: &column_name,
table_name,
})?
.data_type
.create_mutable_vector(row_count);
add_values_to_builder(vector_builder, values, row_count, null_mask)?;
ensure!(
columns_values
.insert(column_name, vector_builder.to_vector())
.is_none(),
IllegalInsertDataSnafu
);
let column = column_name.clone();
let vector_builder = match columns_builders.entry(column) {
Entry::Occupied(entry) => entry.into_mut(),
Entry::Vacant(entry) => {
let column_schema = schema.column_schema_by_name(&column_name).context(
ColumnNotFoundSnafu {
column_name: &column_name,
table_name,
},
)?;
let data_type = &column_schema.data_type;
entry.insert(data_type.create_mutable_vector(row_count as usize))
}
};
add_values_to_builder(vector_builder, values, row_count as usize, null_mask)?;
}
}
let columns_values = columns_builders
.into_iter()
.map(|(column_name, mut vector_builder)| (column_name, vector_builder.to_vector()))
.collect();
Ok(InsertRequest {
catalog_name: catalog_name.to_string(),
@@ -473,7 +479,10 @@ mod tests {
use table::metadata::TableInfoRef;
use table::Table;
use super::*;
use super::{
build_create_expr_from_insertion, convert_values, insertion_expr_to_request, is_null,
TAG_SEMANTIC_TYPE, TIMESTAMP_SEMANTIC_TYPE,
};
use crate::error;
use crate::error::ColumnDataTypeSnafu;
use crate::insert::find_new_columns;
@@ -507,9 +516,9 @@ mod tests {
build_create_expr_from_insertion("", "", table_id, table_name, &insert_batch.0)
.unwrap();
assert_eq!(table_id, create_expr.table_id.map(|x| x.id));
assert_eq!(table_id, create_expr.table_id);
assert_eq!(table_name, create_expr.table_name);
assert_eq!("Created on insertion".to_string(), create_expr.desc);
assert_eq!(Some("Created on insertion".to_string()), create_expr.desc);
assert_eq!(
vec![create_expr.column_defs[0].name.clone()],
create_expr.primary_keys
@@ -619,18 +628,12 @@ mod tests {
}
#[test]
fn test_to_table_insert_request() {
fn test_insertion_expr_to_request() {
let table: Arc<dyn Table> = Arc::new(DemoTable {});
let (columns, row_count) = mock_insert_batch();
let request = GrpcInsertRequest {
schema_name: "public".to_string(),
table_name: "demo".to_string(),
columns,
row_count,
region_number: 0,
};
let insert_req = to_table_insert_request(request, table.schema()).unwrap();
let insert_batches = vec![mock_insert_batch()];
let insert_req =
insertion_expr_to_request("greptime", "public", "demo", insert_batches, table).unwrap();
assert_eq!("greptime", insert_req.catalog_name);
assert_eq!("public", insert_req.schema_name);
@@ -722,7 +725,7 @@ mod tests {
async fn scan(
&self,
_projection: Option<&Vec<usize>>,
_projection: &Option<Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> TableResult<PhysicalPlanRef> {

View File

@@ -1,3 +1,4 @@
#![feature(assert_matches)]
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
@@ -14,9 +15,10 @@
mod alter;
pub mod error;
pub mod insert;
mod insert;
pub use alter::{alter_expr_to_request, create_expr_to_request, create_table_schema};
pub use insert::{
build_alter_table_request, build_create_expr_from_insertion, column_to_vector, find_new_columns,
build_alter_table_request, build_create_expr_from_insertion, column_to_vector,
find_new_columns, insertion_expr_to_request,
};

View File

@@ -1,12 +1,11 @@
[package]
name = "common-grpc"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
api = { path = "../../api" }
arrow-flight.workspace = true
async-trait = "0.1"
common-base = { path = "../base" }
common-error = { path = "../error" }
@@ -14,11 +13,8 @@ common-query = { path = "../query" }
common-recordbatch = { path = "../recordbatch" }
common-runtime = { path = "../runtime" }
dashmap = "5.4"
datafusion.workspace = true
datafusion = "14.0.0"
datatypes = { path = "../../datatypes" }
flatbuffers = "22"
futures = "0.3"
prost = "0.11"
snafu = { version = "0.7", features = ["backtraces"] }
tokio = { version = "1.0", features = ["full"] }
tonic = "0.8"

View File

@@ -26,7 +26,7 @@ async fn do_bench_channel_manager() {
let join = tokio::spawn(async move {
for _ in 0..10000 {
let idx = rand::random::<usize>() % 100;
let ret = m_clone.get(format!("{idx}"));
let ret = m_clone.get(format!("{}", idx));
assert!(ret.is_ok());
}
});

View File

@@ -120,7 +120,7 @@ impl ChannelManager {
fn build_endpoint(&self, addr: &str) -> Result<Endpoint> {
let mut endpoint =
Endpoint::new(format!("http://{addr}")).context(error::CreateChannelSnafu)?;
Endpoint::new(format!("http://{}", addr)).context(error::CreateChannelSnafu)?;
if let Some(dur) = self.config.timeout {
endpoint = endpoint.timeout(dur);

View File

@@ -44,8 +44,8 @@ pub enum Error {
backtrace: Backtrace,
},
#[snafu(display("Failed to create RecordBatch, source: {}", source))]
CreateRecordBatch {
#[snafu(display("Failed to collect RecordBatches, source: {}", source))]
CollectRecordBatches {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
@@ -58,40 +58,15 @@ pub enum Error {
#[snafu(backtrace)]
source: api::error::Error,
},
#[snafu(display("Failed to decode FlightData, source: {}", source))]
DecodeFlightData {
source: api::DecodeError,
backtrace: Backtrace,
},
#[snafu(display("Invalid FlightData, reason: {}", reason))]
InvalidFlightData {
reason: String,
backtrace: Backtrace,
},
#[snafu(display("Failed to convert Arrow Schema, source: {}", source))]
ConvertArrowSchema {
#[snafu(backtrace)]
source: datatypes::error::Error,
},
}
impl ErrorExt for Error {
fn status_code(&self) -> StatusCode {
match self {
Error::MissingField { .. }
| Error::TypeMismatch { .. }
| Error::InvalidFlightData { .. } => StatusCode::InvalidArguments,
Error::CreateChannel { .. }
| Error::Conversion { .. }
| Error::DecodeFlightData { .. } => StatusCode::Internal,
Error::CreateRecordBatch { source } => source.status_code(),
Error::MissingField { .. } | Error::TypeMismatch { .. } => StatusCode::InvalidArguments,
Error::CreateChannel { .. } | Error::Conversion { .. } => StatusCode::Internal,
Error::CollectRecordBatches { source } => source.status_code(),
Error::ColumnDataType { source } => source.status_code(),
Error::ConvertArrowSchema { source } => source.status_code(),
}
}

View File

@@ -1,334 +0,0 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::HashMap;
use std::pin::Pin;
use std::sync::Arc;
use api::result::ObjectResultBuilder;
use api::v1::{FlightDataExt, ObjectResult};
use arrow_flight::utils::{flight_data_from_arrow_batch, flight_data_to_arrow_batch};
use arrow_flight::{FlightData, IpcMessage, SchemaAsIpc};
use common_error::prelude::StatusCode;
use common_recordbatch::{RecordBatch, RecordBatches};
use datatypes::arrow;
use datatypes::arrow::datatypes::Schema as ArrowSchema;
use datatypes::arrow::ipc::{root_as_message, writer, MessageHeader};
use datatypes::schema::{Schema, SchemaRef};
use flatbuffers::FlatBufferBuilder;
use futures::TryStreamExt;
use prost::Message;
use snafu::{OptionExt, ResultExt};
use tonic::codegen::futures_core::Stream;
use tonic::Response;
use crate::error::{
ConvertArrowSchemaSnafu, CreateRecordBatchSnafu, DecodeFlightDataSnafu, InvalidFlightDataSnafu,
Result,
};
type TonicResult<T> = std::result::Result<T, tonic::Status>;
type TonicStream<T> = Pin<Box<dyn Stream<Item = TonicResult<T>> + Send + Sync + 'static>>;
#[derive(Debug, Clone)]
pub enum FlightMessage {
Schema(SchemaRef),
Recordbatch(RecordBatch),
AffectedRows(usize),
}
#[derive(Default)]
pub struct FlightEncoder {
write_options: writer::IpcWriteOptions,
}
impl FlightEncoder {
pub fn encode(&self, flight_message: FlightMessage) -> FlightData {
match flight_message {
FlightMessage::Schema(schema) => {
SchemaAsIpc::new(schema.arrow_schema(), &self.write_options).into()
}
FlightMessage::Recordbatch(recordbatch) => {
let (flight_dictionaries, flight_batch) = flight_data_from_arrow_batch(
recordbatch.df_record_batch(),
&self.write_options,
);
// TODO(LFC): Handle dictionary as FlightData here, when we supported Arrow's Dictionary DataType.
// Currently we don't have a datatype corresponding to Arrow's Dictionary DataType,
// so there won't be any "dictionaries" here. Assert to be sure about it, and
// perform a "testing guard" in case we forgot to handle the possible "dictionaries"
// here in the future.
debug_assert_eq!(flight_dictionaries.len(), 0);
flight_batch
}
FlightMessage::AffectedRows(rows) => {
let ext_data = FlightDataExt {
affected_rows: rows as _,
}
.encode_to_vec();
FlightData::new(None, IpcMessage(build_none_flight_msg()), vec![], ext_data)
}
}
}
}
#[derive(Default)]
pub struct FlightDecoder {
schema: Option<SchemaRef>,
}
impl FlightDecoder {
pub fn try_decode(&mut self, flight_data: FlightData) -> Result<FlightMessage> {
let message = root_as_message(flight_data.data_header.as_slice()).map_err(|e| {
InvalidFlightDataSnafu {
reason: e.to_string(),
}
.build()
})?;
match message.header_type() {
MessageHeader::NONE => {
let ext_data = FlightDataExt::decode(flight_data.data_body.as_slice())
.context(DecodeFlightDataSnafu)?;
Ok(FlightMessage::AffectedRows(ext_data.affected_rows as _))
}
MessageHeader::Schema => {
let arrow_schema = ArrowSchema::try_from(&flight_data).map_err(|e| {
InvalidFlightDataSnafu {
reason: e.to_string(),
}
.build()
})?;
let schema =
Arc::new(Schema::try_from(arrow_schema).context(ConvertArrowSchemaSnafu)?);
self.schema = Some(schema.clone());
Ok(FlightMessage::Schema(schema))
}
MessageHeader::RecordBatch => {
let schema = self.schema.clone().context(InvalidFlightDataSnafu {
reason: "Should have decoded schema first!",
})?;
let arrow_schema = schema.arrow_schema().clone();
let arrow_batch =
flight_data_to_arrow_batch(&flight_data, arrow_schema, &HashMap::new())
.map_err(|e| {
InvalidFlightDataSnafu {
reason: e.to_string(),
}
.build()
})?;
let recordbatch = RecordBatch::try_from_df_record_batch(schema, arrow_batch)
.context(CreateRecordBatchSnafu)?;
Ok(FlightMessage::Recordbatch(recordbatch))
}
other => {
let name = other.variant_name().unwrap_or("UNKNOWN");
InvalidFlightDataSnafu {
reason: format!("Unsupported FlightData type: {name}"),
}
.fail()
}
}
}
}
// TODO(LFC): Remove it once we completely get rid of old GRPC interface.
pub async fn flight_data_to_object_result(
response: Response<TonicStream<FlightData>>,
) -> Result<ObjectResult> {
let stream = response.into_inner();
let result: TonicResult<Vec<FlightData>> = stream.try_collect().await;
match result {
Ok(flight_data) => Ok(ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.flight_data(flight_data)
.build()),
Err(e) => Ok(ObjectResultBuilder::new()
.status_code(StatusCode::Internal as _)
.err_msg(e.to_string())
.build()),
}
}
pub fn raw_flight_data_to_message(raw_data: Vec<Vec<u8>>) -> Result<Vec<FlightMessage>> {
let flight_data = raw_data
.into_iter()
.map(|x| FlightData::decode(x.as_slice()).context(DecodeFlightDataSnafu))
.collect::<Result<Vec<FlightData>>>()?;
let decoder = &mut FlightDecoder::default();
flight_data
.into_iter()
.map(|x| decoder.try_decode(x))
.collect()
}
pub fn flight_messages_to_recordbatches(messages: Vec<FlightMessage>) -> Result<RecordBatches> {
if messages.is_empty() {
Ok(RecordBatches::empty())
} else {
let mut recordbatches = Vec::with_capacity(messages.len() - 1);
let schema = match &messages[0] {
FlightMessage::Schema(schema) => schema.clone(),
_ => {
return InvalidFlightDataSnafu {
reason: "First Flight Message must be schema!",
}
.fail()
}
};
for message in messages.into_iter().skip(1) {
match message {
FlightMessage::Recordbatch(recordbatch) => recordbatches.push(recordbatch),
_ => {
return InvalidFlightDataSnafu {
reason: "Expect the following Flight Messages are all Recordbatches!",
}
.fail()
}
}
}
RecordBatches::try_new(schema, recordbatches).context(CreateRecordBatchSnafu)
}
}
fn build_none_flight_msg() -> Vec<u8> {
let mut builder = FlatBufferBuilder::new();
let mut message = arrow::ipc::MessageBuilder::new(&mut builder);
message.add_version(arrow::ipc::MetadataVersion::V5);
message.add_header_type(MessageHeader::NONE);
message.add_bodyLength(0);
let data = message.finish();
builder.finish(data, None);
builder.finished_data().to_vec()
}
#[cfg(test)]
mod test {
use arrow_flight::utils::batches_to_flight_data;
use datatypes::arrow::datatypes::{DataType, Field};
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::ColumnSchema;
use datatypes::vectors::Int32Vector;
use super::*;
use crate::Error;
#[test]
fn test_try_decode() {
let arrow_schema = ArrowSchema::new(vec![Field::new("n", DataType::Int32, true)]);
let schema = Arc::new(Schema::try_from(arrow_schema.clone()).unwrap());
let batch1 = RecordBatch::new(
schema.clone(),
vec![Arc::new(Int32Vector::from(vec![Some(1), None, Some(3)])) as _],
)
.unwrap();
let batch2 = RecordBatch::new(
schema.clone(),
vec![Arc::new(Int32Vector::from(vec![None, Some(5)])) as _],
)
.unwrap();
let flight_data = batches_to_flight_data(
arrow_schema,
vec![
batch1.clone().into_df_record_batch(),
batch2.clone().into_df_record_batch(),
],
)
.unwrap();
assert_eq!(flight_data.len(), 3);
let [d1, d2, d3] = flight_data.as_slice() else { unreachable!() };
let decoder = &mut FlightDecoder::default();
assert!(decoder.schema.is_none());
let result = decoder.try_decode(d2.clone());
assert!(matches!(result, Err(Error::InvalidFlightData { .. })));
assert!(result
.unwrap_err()
.to_string()
.contains("Should have decoded schema first!"));
let message = decoder.try_decode(d1.clone()).unwrap();
assert!(matches!(message, FlightMessage::Schema(_)));
let FlightMessage::Schema(decoded_schema) = message else { unreachable!() };
assert_eq!(decoded_schema, schema);
assert!(decoder.schema.is_some());
let message = decoder.try_decode(d2.clone()).unwrap();
assert!(matches!(message, FlightMessage::Recordbatch(_)));
let FlightMessage::Recordbatch(actual_batch) = message else { unreachable!() };
assert_eq!(actual_batch, batch1);
let message = decoder.try_decode(d3.clone()).unwrap();
assert!(matches!(message, FlightMessage::Recordbatch(_)));
let FlightMessage::Recordbatch(actual_batch) = message else { unreachable!() };
assert_eq!(actual_batch, batch2);
}
#[test]
fn test_flight_messages_to_recordbatches() {
let schema = Arc::new(Schema::new(vec![ColumnSchema::new(
"m",
ConcreteDataType::int32_datatype(),
true,
)]));
let batch1 = RecordBatch::new(
schema.clone(),
vec![Arc::new(Int32Vector::from(vec![Some(2), None, Some(4)])) as _],
)
.unwrap();
let batch2 = RecordBatch::new(
schema.clone(),
vec![Arc::new(Int32Vector::from(vec![None, Some(6)])) as _],
)
.unwrap();
let recordbatches =
RecordBatches::try_new(schema.clone(), vec![batch1.clone(), batch2.clone()]).unwrap();
let m1 = FlightMessage::Schema(schema);
let m2 = FlightMessage::Recordbatch(batch1);
let m3 = FlightMessage::Recordbatch(batch2);
let result = flight_messages_to_recordbatches(vec![m2.clone(), m1.clone(), m3.clone()]);
assert!(matches!(result, Err(Error::InvalidFlightData { .. })));
assert!(result
.unwrap_err()
.to_string()
.contains("First Flight Message must be schema!"));
let result = flight_messages_to_recordbatches(vec![m1.clone(), m2.clone(), m1.clone()]);
assert!(matches!(result, Err(Error::InvalidFlightData { .. })));
assert!(result
.unwrap_err()
.to_string()
.contains("Expect the following Flight Messages are all Recordbatches!"));
let actual = flight_messages_to_recordbatches(vec![m1, m2, m3]).unwrap();
assert_eq!(actual, recordbatches);
}
}

View File

@@ -14,7 +14,6 @@
pub mod channel_manager;
pub mod error;
pub mod flight;
pub mod select;
pub mod writer;

View File

@@ -12,8 +12,17 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::column::Values;
use api::helper::ColumnDataTypeWrapper;
use api::result::{build_err_result, ObjectResultBuilder};
use api::v1::codec::SelectResult;
use api::v1::column::{SemanticType, Values};
use api::v1::{Column, ObjectResult};
use common_base::BitVec;
use common_error::prelude::ErrorExt;
use common_error::status_code::StatusCode;
use common_query::Output;
use common_recordbatch::{RecordBatches, SendableRecordBatchStream};
use datatypes::schema::SchemaRef;
use datatypes::types::{TimestampType, WrapperType};
use datatypes::vectors::{
BinaryVector, BooleanVector, DateTimeVector, DateVector, Float32Vector, Float64Vector,
@@ -21,9 +30,88 @@ use datatypes::vectors::{
TimestampMillisecondVector, TimestampNanosecondVector, TimestampSecondVector, UInt16Vector,
UInt32Vector, UInt64Vector, UInt8Vector, VectorRef,
};
use snafu::OptionExt;
use snafu::{OptionExt, ResultExt};
use crate::error::{ConversionSnafu, Result};
use crate::error::{self, ConversionSnafu, Result};
pub async fn to_object_result(output: std::result::Result<Output, impl ErrorExt>) -> ObjectResult {
let result = match output {
Ok(Output::AffectedRows(rows)) => Ok(ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build()),
Ok(Output::Stream(stream)) => collect(stream).await,
Ok(Output::RecordBatches(recordbatches)) => build_result(recordbatches),
Err(e) => return build_err_result(&e),
};
match result {
Ok(r) => r,
Err(e) => build_err_result(&e),
}
}
async fn collect(stream: SendableRecordBatchStream) -> Result<ObjectResult> {
let recordbatches = RecordBatches::try_collect(stream)
.await
.context(error::CollectRecordBatchesSnafu)?;
let object_result = build_result(recordbatches)?;
Ok(object_result)
}
fn build_result(recordbatches: RecordBatches) -> Result<ObjectResult> {
let select_result = try_convert(recordbatches)?;
let object_result = ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.select_result(select_result)
.build();
Ok(object_result)
}
#[inline]
fn get_semantic_type(schema: &SchemaRef, idx: usize) -> i32 {
if Some(idx) == schema.timestamp_index() {
SemanticType::Timestamp as i32
} else {
// FIXME(dennis): set primary key's columns semantic type as Tag,
// but we can't get the table's schema here right now.
SemanticType::Field as i32
}
}
fn try_convert(record_batches: RecordBatches) -> Result<SelectResult> {
let schema = record_batches.schema();
let record_batches = record_batches.take();
let row_count: usize = record_batches.iter().map(|r| r.num_rows()).sum();
let schemas = schema.column_schemas();
let mut columns = Vec::with_capacity(schemas.len());
for (idx, column_schema) in schemas.iter().enumerate() {
let column_name = column_schema.name.clone();
let arrays: Vec<_> = record_batches
.iter()
.map(|r| r.column(idx).clone())
.collect();
let column = Column {
column_name,
values: Some(values(&arrays)?),
null_mask: null_mask(&arrays, row_count),
datatype: ColumnDataTypeWrapper::try_from(column_schema.data_type.clone())
.context(error::ColumnDataTypeSnafu)?
.datatype() as i32,
semantic_type: get_semantic_type(&schema, idx),
};
columns.push(column);
}
Ok(SelectResult {
columns,
row_count: row_count as u32,
})
}
pub fn null_mask(arrays: &[VectorRef], row_count: usize) -> Vec<u8> {
let null_count: usize = arrays.iter().map(|a| a.null_count()).sum();
@@ -175,8 +263,34 @@ pub fn values(arrays: &[VectorRef]) -> Result<Values> {
mod tests {
use std::sync::Arc;
use common_recordbatch::{RecordBatch, RecordBatches};
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, Schema};
use super::*;
#[test]
fn test_convert_record_batches_to_select_result() {
let r1 = mock_record_batch();
let schema = r1.schema.clone();
let r2 = mock_record_batch();
let record_batches = vec![r1, r2];
let record_batches = RecordBatches::try_new(schema, record_batches).unwrap();
let s = try_convert(record_batches).unwrap();
let c1 = s.columns.get(0).unwrap();
let c2 = s.columns.get(1).unwrap();
assert_eq!("c1", c1.column_name);
assert_eq!("c2", c2.column_name);
assert_eq!(vec![0b0010_0100], c1.null_mask);
assert_eq!(vec![0b0011_0110], c2.null_mask);
assert_eq!(vec![1, 2, 1, 2], c1.values.as_ref().unwrap().u32_values);
assert_eq!(vec![1, 1], c2.values.as_ref().unwrap().u32_values);
}
#[test]
fn test_convert_arrow_arrays_i32() {
let array = Int32Vector::from(vec![Some(1), Some(2), None, Some(3)]);
@@ -244,4 +358,18 @@ mod tests {
let mask = null_mask(&[a1, a2], 3 + 3);
assert_eq!(vec![0b0010_0000], mask);
}
fn mock_record_batch() -> RecordBatch {
let column_schemas = vec![
ColumnSchema::new("c1", ConcreteDataType::uint32_datatype(), true),
ColumnSchema::new("c2", ConcreteDataType::uint32_datatype(), true),
];
let schema = Arc::new(Schema::try_new(column_schemas).unwrap());
let v1 = Arc::new(UInt32Vector::from(vec![Some(1), Some(2), None]));
let v2 = Arc::new(UInt32Vector::from(vec![Some(1), None, None]));
let columns: Vec<VectorRef> = vec![v1, v2];
RecordBatch::new(schema, columns).unwrap()
}
}

View File

@@ -1,19 +1,19 @@
[package]
name = "common-query"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
async-trait.workspace = true
async-trait = "0.1"
common-error = { path = "../error" }
common-recordbatch = { path = "../recordbatch" }
common-time = { path = "../time" }
datafusion.workspace = true
datafusion-common.workspace = true
datafusion-expr.workspace = true
datafusion = "14.0.0"
datafusion-common = "14.0.0"
datafusion-expr = "14.0.0"
datatypes = { path = "../../datatypes" }
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }
statrs = "0.15"
[dev-dependencies]

View File

@@ -161,7 +161,12 @@ mod tests {
assert_eq!(4, vec.len());
for i in 0..4 {
assert_eq!(i == 0 || i == 3, vec.get_data(i).unwrap(), "Failed at {i}")
assert_eq!(
i == 0 || i == 3,
vec.get_data(i).unwrap(),
"failed at {}",
i
)
}
}
_ => unreachable!(),

View File

@@ -18,7 +18,7 @@ use std::fmt::Debug;
use std::sync::Arc;
use datafusion_common::Result as DfResult;
use datafusion_expr::Accumulator as DfAccumulator;
use datafusion_expr::{Accumulator as DfAccumulator, AggregateState};
use datatypes::arrow::array::ArrayRef;
use datatypes::prelude::*;
use datatypes::vectors::{Helper as VectorHelper, VectorRef};
@@ -126,19 +126,24 @@ impl DfAccumulatorAdaptor {
}
impl DfAccumulator for DfAccumulatorAdaptor {
fn state(&self) -> DfResult<Vec<ScalarValue>> {
fn state(&self) -> DfResult<Vec<AggregateState>> {
let state_values = self.accumulator.state()?;
let state_types = self.creator.state_types()?;
if state_values.len() != state_types.len() {
return error::BadAccumulatorImplSnafu {
err_msg: format!("Accumulator {self:?} returned state values size do not match its state types size."),
err_msg: format!("Accumulator {:?} returned state values size do not match its state types size.", self),
}
.fail()?;
}
Ok(state_values
.into_iter()
.zip(state_types.iter())
.map(|(v, t)| v.try_to_scalar_value(t).context(error::ToScalarValueSnafu))
.map(|(v, t)| {
let scalar = v
.try_to_scalar_value(t)
.context(error::ToScalarValueSnafu)?;
Ok(AggregateState::Scalar(scalar))
})
.collect::<Result<Vec<_>>>()?)
}
@@ -170,9 +175,4 @@ impl DfAccumulator for DfAccumulatorAdaptor {
.map_err(Error::from)?;
Ok(scalar_value)
}
fn size(&self) -> usize {
// TODO(LFC): Implement new "size" method for Accumulator.
0
}
}

View File

@@ -233,7 +233,7 @@ mod test {
async fn scan(
&self,
_ctx: &SessionState,
_projection: Option<&Vec<usize>>,
_projection: &Option<Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> DfResult<Arc<dyn DfPhysicalPlan>> {

View File

@@ -1,15 +1,15 @@
[package]
name = "common-recordbatch"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
common-error = { path = "../error" }
datafusion.workspace = true
datafusion-common.workspace = true
datafusion = "14.0.0"
datafusion-common = "14.0.0"
datatypes = { path = "../../datatypes" }
futures.workspace = true
futures = "0.3"
paste = "1.0"
serde = "1.0"
snafu = { version = "0.7", features = ["backtraces"] }

View File

@@ -121,8 +121,7 @@ impl Stream for RecordBatchStreamAdapter {
enum AsyncRecordBatchStreamAdapterState {
Uninit(FutureStream),
Ready(DfSendableRecordBatchStream),
Failed,
Inited(std::result::Result<DfSendableRecordBatchStream, DataFusionError>),
}
pub struct AsyncRecordBatchStreamAdapter {
@@ -152,26 +151,28 @@ impl Stream for AsyncRecordBatchStreamAdapter {
loop {
match &mut self.state {
AsyncRecordBatchStreamAdapterState::Uninit(stream_future) => {
match ready!(Pin::new(stream_future).poll(cx)) {
Ok(stream) => {
self.state = AsyncRecordBatchStreamAdapterState::Ready(stream);
continue;
}
Err(e) => {
self.state = AsyncRecordBatchStreamAdapterState::Failed;
return Poll::Ready(Some(
Err(e).context(error::InitRecordbatchStreamSnafu),
));
}
};
self.state = AsyncRecordBatchStreamAdapterState::Inited(ready!(Pin::new(
stream_future
)
.poll(cx)));
continue;
}
AsyncRecordBatchStreamAdapterState::Ready(stream) => {
return Poll::Ready(ready!(Pin::new(stream).poll_next(cx)).map(|x| {
let df_record_batch = x.context(error::PollStreamSnafu)?;
RecordBatch::try_from_df_record_batch(self.schema(), df_record_batch)
}))
}
AsyncRecordBatchStreamAdapterState::Failed => return Poll::Ready(None),
AsyncRecordBatchStreamAdapterState::Inited(stream) => match stream {
Ok(stream) => {
return Poll::Ready(ready!(Pin::new(stream).poll_next(cx)).map(|df| {
let df_record_batch = df.context(error::PollStreamSnafu)?;
RecordBatch::try_from_df_record_batch(self.schema(), df_record_batch)
}));
}
Err(e) => {
return Poll::Ready(Some(
error::CreateRecordBatchesSnafu {
reason: format!("Read error {:?} from stream", e),
}
.fail(),
))
}
},
}
}
}
@@ -182,104 +183,3 @@ impl Stream for AsyncRecordBatchStreamAdapter {
(0, None)
}
}
#[cfg(test)]
mod test {
use common_error::mock::MockError;
use common_error::prelude::{BoxedError, StatusCode};
use datatypes::prelude::ConcreteDataType;
use datatypes::schema::ColumnSchema;
use datatypes::vectors::Int32Vector;
use super::*;
use crate::RecordBatches;
#[tokio::test]
async fn test_async_recordbatch_stream_adaptor() {
struct MaybeErrorRecordBatchStream {
items: Vec<Result<RecordBatch>>,
}
impl RecordBatchStream for MaybeErrorRecordBatchStream {
fn schema(&self) -> SchemaRef {
unimplemented!()
}
}
impl Stream for MaybeErrorRecordBatchStream {
type Item = Result<RecordBatch>;
fn poll_next(
mut self: Pin<&mut Self>,
_: &mut Context<'_>,
) -> Poll<Option<Self::Item>> {
if let Some(batch) = self.items.pop() {
Poll::Ready(Some(Ok(batch?)))
} else {
Poll::Ready(None)
}
}
}
fn new_future_stream(
maybe_recordbatches: Result<Vec<Result<RecordBatch>>>,
) -> FutureStream {
Box::pin(async move {
maybe_recordbatches
.map(|items| {
Box::pin(DfRecordBatchStreamAdapter::new(Box::pin(
MaybeErrorRecordBatchStream { items },
))) as _
})
.map_err(|e| DataFusionError::External(Box::new(e)))
})
}
let schema = Arc::new(Schema::new(vec![ColumnSchema::new(
"a",
ConcreteDataType::int32_datatype(),
false,
)]));
let batch1 = RecordBatch::new(
schema.clone(),
vec![Arc::new(Int32Vector::from_slice(&[1])) as _],
)
.unwrap();
let batch2 = RecordBatch::new(
schema.clone(),
vec![Arc::new(Int32Vector::from_slice(&[2])) as _],
)
.unwrap();
let success_stream = new_future_stream(Ok(vec![Ok(batch1.clone()), Ok(batch2.clone())]));
let adapter = AsyncRecordBatchStreamAdapter::new(schema.clone(), success_stream);
let collected = RecordBatches::try_collect(Box::pin(adapter)).await.unwrap();
assert_eq!(
collected,
RecordBatches::try_new(schema.clone(), vec![batch2.clone(), batch1.clone()]).unwrap()
);
let poll_err_stream = new_future_stream(Ok(vec![
Ok(batch1.clone()),
Err(error::Error::External {
source: BoxedError::new(MockError::new(StatusCode::Unknown)),
}),
]));
let adapter = AsyncRecordBatchStreamAdapter::new(schema.clone(), poll_err_stream);
let result = RecordBatches::try_collect(Box::pin(adapter)).await;
assert_eq!(
result.unwrap_err().to_string(),
"Failed to poll stream, source: External error: External error, source: Unknown"
);
let failed_to_init_stream = new_future_stream(Err(error::Error::External {
source: BoxedError::new(MockError::new(StatusCode::Internal)),
}));
let adapter = AsyncRecordBatchStreamAdapter::new(schema.clone(), failed_to_init_stream);
let result = RecordBatches::try_collect(Box::pin(adapter)).await;
assert_eq!(
result.unwrap_err().to_string(),
"Failed to init Recordbatch stream, source: External error: External error, source: Internal"
);
}
}

View File

@@ -64,12 +64,6 @@ pub enum Error {
source: datatypes::arrow::error::ArrowError,
backtrace: Backtrace,
},
#[snafu(display("Failed to init Recordbatch stream, source: {}", source))]
InitRecordbatchStream {
source: datafusion_common::DataFusionError,
backtrace: Backtrace,
},
}
impl ErrorExt for Error {
@@ -80,8 +74,7 @@ impl ErrorExt for Error {
Error::DataTypes { .. }
| Error::CreateRecordBatches { .. }
| Error::PollStream { .. }
| Error::Format { .. }
| Error::InitRecordbatchStream { .. } => StatusCode::Internal,
| Error::Format { .. } => StatusCode::Internal,
Error::External { source } => source.status_code(),

View File

@@ -231,7 +231,8 @@ mod tests {
assert_eq!(
result.unwrap_err().to_string(),
format!(
"Failed to create RecordBatches, reason: expect RecordBatch schema equals {schema1:?}, actual: {schema2:?}",
"Failed to create RecordBatches, reason: expect RecordBatch schema equals {:?}, actual: {:?}",
schema1, schema2
)
);

View File

@@ -151,7 +151,7 @@ impl<'a> RecordBatchRowIterator<'a> {
}
impl<'a> Iterator for RecordBatchRowIterator<'a> {
type Item = Vec<Value>;
type Item = Result<Vec<Value>>;
fn next(&mut self) -> Option<Self::Item> {
if self.row_cursor == self.rows {
@@ -165,7 +165,7 @@ impl<'a> Iterator for RecordBatchRowIterator<'a> {
}
self.row_cursor += 1;
Some(row)
Some(Ok(row))
}
}
}
@@ -227,7 +227,7 @@ mod tests {
let output = serde_json::to_string(&batch).unwrap();
assert_eq!(
r#"{"schema":{"fields":[{"name":"number","data_type":"UInt32","nullable":false,"dict_id":0,"dict_is_ordered":false,"metadata":{}}],"metadata":{"greptime:version":"0"}},"columns":[[0,1,2,3,4,5,6,7,8,9]]}"#,
r#"{"schema":{"fields":[{"name":"number","data_type":"UInt32","nullable":false,"dict_id":0,"dict_is_ordered":false}],"metadata":{"greptime:version":"0"}},"columns":[[0,1,2,3,4,5,6,7,8,9]]}"#,
output
);
}
@@ -256,6 +256,7 @@ mod tests {
record_batch_iter
.next()
.unwrap()
.unwrap()
.into_iter()
.collect::<Vec<Value>>()
);
@@ -265,6 +266,7 @@ mod tests {
record_batch_iter
.next()
.unwrap()
.unwrap()
.into_iter()
.collect::<Vec<Value>>()
);
@@ -274,6 +276,7 @@ mod tests {
record_batch_iter
.next()
.unwrap()
.unwrap()
.into_iter()
.collect::<Vec<Value>>()
);
@@ -283,6 +286,7 @@ mod tests {
record_batch_iter
.next()
.unwrap()
.unwrap()
.into_iter()
.collect::<Vec<Value>>()
);

View File

@@ -1,17 +1,17 @@
[package]
name = "common-runtime"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
common-error = { path = "../error" }
common-telemetry = { path = "../telemetry" }
metrics = "0.20"
once_cell = "1.12"
paste.workspace = true
snafu.workspace = true
tokio.workspace = true
paste = "1.0"
snafu = { version = "0.7", features = ["backtraces"] }
tokio = { version = "1.18", features = ["full"] }
[dev-dependencies]
tokio-test = "0.4"

View File

@@ -1,8 +1,8 @@
[package]
name = "substrait"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
bytes = "1.1"
@@ -10,12 +10,12 @@ catalog = { path = "../../catalog" }
common-catalog = { path = "../catalog" }
common-error = { path = "../error" }
common-telemetry = { path = "../telemetry" }
datafusion.workspace = true
datafusion-expr.workspace = true
datafusion = "14.0.0"
datafusion-expr = "14.0.0"
datatypes = { path = "../../datatypes" }
futures = "0.3"
prost = "0.9"
snafu.workspace = true
snafu = { version = "0.7", features = ["backtraces"] }
table = { path = "../../table" }
[dependencies.substrait_proto]

View File

@@ -16,7 +16,6 @@ use std::collections::VecDeque;
use std::str::FromStr;
use datafusion::common::Column;
use datafusion_expr::expr::Sort;
use datafusion_expr::{expr_fn, lit, Between, BinaryExpr, BuiltinScalarFunction, Expr, Operator};
use datatypes::schema::Schema;
use snafu::{ensure, OptionExt};
@@ -62,7 +61,7 @@ pub(crate) fn to_df_expr(
| RexType::Cast(_)
| RexType::Subquery(_)
| RexType::Enum(_) => UnsupportedExprSnafu {
name: format!("substrait expression {expr_rex_type:?}"),
name: format!("substrait expression {:?}", expr_rex_type),
}
.fail()?,
}
@@ -110,7 +109,7 @@ pub fn convert_scalar_function(
let fn_name = ctx
.find_scalar_fn(anchor)
.with_context(|| InvalidParametersSnafu {
reason: format!("Unregistered scalar function reference: {anchor}"),
reason: format!("Unregistered scalar function reference: {}", anchor),
})?;
// convenient util
@@ -332,19 +331,19 @@ pub fn convert_scalar_function(
// skip Cast and TryCast, is covered in substrait::Cast.
"sort" | "sort_des" => {
ensure_arg_len(1)?;
Expr::Sort(Sort {
Expr::Sort {
expr: Box::new(inputs.pop_front().unwrap()),
asc: false,
nulls_first: false,
})
}
}
"sort_asc" => {
ensure_arg_len(1)?;
Expr::Sort(Sort {
Expr::Sort {
expr: Box::new(inputs.pop_front().unwrap()),
asc: true,
nulls_first: false,
})
}
}
// those are datafusion built-in "scalar functions".
"abs"
@@ -436,7 +435,7 @@ pub fn convert_scalar_function(
// skip Wildcard, unimplemented.
// end other direct expr
_ => UnsupportedExprSnafu {
name: format!("scalar function {fn_name}"),
name: format!("scalar function {}", fn_name),
}
.fail()?,
};
@@ -538,11 +537,11 @@ pub fn expression_from_df_expr(
name: expr.to_string(),
}
.fail()?,
Expr::Sort(Sort {
Expr::Sort {
expr,
asc,
nulls_first: _,
}) => {
} => {
let expr = expression_from_df_expr(ctx, expr, schema)?;
let arguments = utils::expression_to_argument(vec![expr]);
let op_name = if *asc { "sort_asc" } else { "sort_des" };
@@ -578,7 +577,6 @@ pub fn expression_from_df_expr(
| Expr::Exists { .. }
| Expr::InSubquery { .. }
| Expr::ScalarSubquery(..)
| Expr::Placeholder { .. }
| Expr::QualifiedWildcard { .. } => todo!(),
Expr::GroupingSet(_) => UnsupportedExprSnafu {
name: expr.to_string(),
@@ -597,8 +595,8 @@ pub fn convert_column(column: &Column, schema: &Schema) -> Result<FieldReference
schema
.column_index_by_name(column_name)
.with_context(|| MissingFieldSnafu {
field: format!("{column:?}"),
plan: format!("schema: {schema:?}"),
field: format!("{:?}", column),
plan: format!("schema: {:?}", schema),
})?;
Ok(FieldReference {
@@ -648,8 +646,6 @@ mod utils {
Operator::BitwiseShiftRight => "bitwise_shift_right",
Operator::BitwiseShiftLeft => "bitwise_shift_left",
Operator::StringConcat => "string_concat",
Operator::ILike => "i_like",
Operator::NotILike => "not_i_like",
}
}

View File

@@ -19,7 +19,7 @@ use catalog::CatalogManagerRef;
use common_error::prelude::BoxedError;
use common_telemetry::debug;
use datafusion::arrow::datatypes::SchemaRef as ArrowSchemaRef;
use datafusion::common::{DFField, DFSchema};
use datafusion::common::ToDFSchema;
use datafusion::datasource::DefaultTableSource;
use datafusion::physical_plan::project_schema;
use datafusion_expr::{Filter, LogicalPlan, TableScan, TableSource};
@@ -236,7 +236,7 @@ impl DFLogicalSubstraitConvertor {
.map_err(BoxedError::new)
.context(InternalSnafu)?
.context(TableNotFoundSnafu {
name: format!("{catalog_name}.{schema_name}.{table_name}"),
name: format!("{}.{}.{}", catalog_name, schema_name, table_name),
})?;
let adapter = Arc::new(DefaultTableSource::new(Arc::new(
DfTableProviderAdapter::new(table_ref),
@@ -262,26 +262,16 @@ impl DFLogicalSubstraitConvertor {
};
// Calculate the projected schema
let qualified = &format!("{catalog_name}.{schema_name}.{table_name}");
let projected_schema = Arc::new(
project_schema(&stored_schema, projection.as_ref())
.and_then(|x| {
DFSchema::new_with_metadata(
x.fields()
.iter()
.map(|f| DFField::from_qualified(qualified, f.clone()))
.collect(),
x.metadata().clone(),
)
})
.context(DFInternalSnafu)?,
);
let projected_schema = project_schema(&stored_schema, projection.as_ref())
.context(DFInternalSnafu)?
.to_dfschema_ref()
.context(DFInternalSnafu)?;
ctx.set_df_schema(projected_schema.clone());
// TODO(ruihang): Support limit(fetch)
Ok(LogicalPlan::TableScan(TableScan {
table_name: format!("{catalog_name}.{schema_name}.{table_name}"),
table_name: format!("{}.{}.{}", catalog_name, schema_name, table_name),
source: adapter,
projection,
projected_schema,
@@ -395,10 +385,10 @@ impl DFLogicalSubstraitConvertor {
| LogicalPlan::Values(_)
| LogicalPlan::Explain(_)
| LogicalPlan::Analyze(_)
| LogicalPlan::Extension(_)
| LogicalPlan::Prepare(_) => InvalidParametersSnafu {
| LogicalPlan::Extension(_) => InvalidParametersSnafu {
reason: format!(
"Trying to convert DDL/DML plan to substrait proto, plan: {plan:?}",
"Trying to convert DDL/DML plan to substrait proto, plan: {:?}",
plan
),
}
.fail()?,
@@ -572,7 +562,7 @@ mod test {
let proto = convertor.encode(plan.clone()).unwrap();
let tripped_plan = convertor.decode(proto, catalog).unwrap();
assert_eq!(format!("{plan:?}"), format!("{tripped_plan:?}"));
assert_eq!(format!("{:?}", plan), format!("{:?}", tripped_plan));
}
#[tokio::test]
@@ -606,7 +596,8 @@ mod test {
let table_scan_plan = LogicalPlan::TableScan(TableScan {
table_name: format!(
"{DEFAULT_CATALOG_NAME}.{DEFAULT_SCHEMA_NAME}.{DEFAULT_TABLE_NAME}",
"{}.{}.{}",
DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, DEFAULT_TABLE_NAME
),
source: adapter,
projection: Some(projection),

View File

@@ -87,7 +87,7 @@ pub fn to_concrete_type(ty: &SType) -> Result<(ConcreteDataType, bool)> {
| Kind::List(_)
| Kind::Map(_)
| Kind::UserDefinedTypeReference(_) => UnsupportedSubstraitTypeSnafu {
ty: format!("{kind:?}"),
ty: format!("{:?}", kind),
}
.fail(),
}
@@ -154,7 +154,7 @@ pub(crate) fn scalar_value_as_literal_type(v: &ScalarValue) -> Result<LiteralTyp
// TODO(LFC): Implement other conversions: ScalarValue => LiteralType
_ => {
return error::UnsupportedExprSnafu {
name: format!("{v:?}"),
name: format!("{:?}", v),
}
.fail()
}
@@ -177,7 +177,7 @@ pub(crate) fn literal_type_to_scalar_value(t: LiteralType) -> Result<ScalarValue
// TODO(LFC): Implement other conversions: Kind => ScalarValue
_ => {
return error::UnsupportedSubstraitTypeSnafu {
ty: format!("{kind:?}"),
ty: format!("{:?}", kind),
}
.fail()
}
@@ -194,7 +194,7 @@ pub(crate) fn literal_type_to_scalar_value(t: LiteralType) -> Result<ScalarValue
// TODO(LFC): Implement other conversions: LiteralType => ScalarValue
_ => {
return error::UnsupportedSubstraitTypeSnafu {
ty: format!("{t:?}"),
ty: format!("{:?}", t),
}
.fail()
}

View File

@@ -1,8 +1,8 @@
[package]
name = "common-telemetry"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[features]
console = ["console-subscriber"]

View File

@@ -28,7 +28,7 @@ pub fn set_panic_hook() {
let default_hook = panic::take_hook();
panic::set_hook(Box::new(move |panic| {
let backtrace = Backtrace::new();
let backtrace = format!("{backtrace:?}");
let backtrace = format!("{:?}", backtrace);
if let Some(location) = panic.location() {
tracing::error!(
message = %panic,

View File

@@ -1,8 +1,8 @@
[package]
name = "common-time"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[dependencies]
chrono = "0.4"

View File

@@ -1,18 +1,16 @@
[package]
name = "datanode"
version.workspace = true
edition.workspace = true
license.workspace = true
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"
[features]
default = ["python"]
python = ["dep:script"]
[dependencies]
async-stream.workspace = true
async-trait.workspace = true
api = { path = "../api" }
arrow-flight.workspace = true
async-trait = "0.1"
axum = "0.6"
axum-macros = "0.3"
backon = "0.2"
@@ -27,7 +25,7 @@ common-recordbatch = { path = "../common/recordbatch" }
common-runtime = { path = "../common/runtime" }
common-telemetry = { path = "../common/telemetry" }
common-time = { path = "../common/time" }
datafusion.workspace = true
datafusion = "14.0.0"
datatypes = { path = "../datatypes" }
futures = "0.3"
hyper = { version = "0.14", features = ["full"] }
@@ -37,8 +35,6 @@ meta-srv = { path = "../meta-srv", features = ["mock"] }
metrics = "0.20"
mito = { path = "../mito", features = ["test"] }
object-store = { path = "../object-store" }
pin-project = "1.0"
prost = "0.11"
query = { path = "../query" }
script = { path = "../script", features = ["python"], optional = true }
serde = "1.0"
@@ -61,5 +57,5 @@ tower-http = { version = "0.3", features = ["full"] }
axum-test-helper = { git = "https://github.com/sunng87/axum-test-helper.git", branch = "patch-1" }
client = { path = "../client" }
common-query = { path = "../common/query" }
datafusion-common.workspace = true
datafusion-common = "14.0.0"
tempdir = "0.3"

View File

@@ -36,8 +36,11 @@ pub enum Error {
source: substrait::error::Error,
},
#[snafu(display("Incorrect internal state: {}", state))]
IncorrectInternalState { state: String, backtrace: Backtrace },
#[snafu(display("Failed to execute physical plan, source: {}", source))]
ExecutePhysicalPlan {
#[snafu(backtrace)]
source: query::error::Error,
},
#[snafu(display("Failed to create catalog list, source: {}", source))]
NewCatalog {
@@ -152,11 +155,14 @@ pub enum Error {
#[snafu(display("Failed to init backend, config: {:#?}, source: {}", config, source))]
InitBackend {
config: Box<ObjectStoreConfig>,
config: ObjectStoreConfig,
source: object_store::Error,
backtrace: Backtrace,
},
#[snafu(display("Unsupported expr type: {}", name))]
UnsupportedExpr { name: String },
#[snafu(display("Runtime resource error, source: {}", source))]
RuntimeResource {
#[snafu(backtrace)]
@@ -205,6 +211,12 @@ pub enum Error {
source: catalog::error::Error,
},
#[snafu(display("Failed to decode as physical plan, source: {}", source))]
IntoPhysicalPlan {
#[snafu(backtrace)]
source: common_grpc::Error,
},
#[snafu(display("Failed to convert alter expr to request: {}", source))]
AlterExprToRequest {
#[snafu(backtrace)]
@@ -264,6 +276,9 @@ pub enum Error {
source: common_grpc_expr::error::Error,
},
#[snafu(display("Insert batch is empty"))]
EmptyInsertBatch,
#[snafu(display(
"Table id provider not found, cannot execute SQL directly on datanode in distributed mode"
))]
@@ -286,33 +301,6 @@ pub enum Error {
#[snafu(display("Missing node id option in distributed mode"))]
MissingMetasrvOpts { backtrace: Backtrace },
#[snafu(display("Invalid Flight ticket, source: {}", source))]
InvalidFlightTicket {
source: api::DecodeError,
backtrace: Backtrace,
},
#[snafu(display("Missing required field: {}", name))]
MissingRequiredField { name: String, backtrace: Backtrace },
#[snafu(display("Failed to poll recordbatch stream, source: {}", source))]
PollRecordbatchStream {
#[snafu(backtrace)]
source: common_recordbatch::error::Error,
},
#[snafu(display("Invalid FlightData, source: {}", source))]
InvalidFlightData {
#[snafu(backtrace)]
source: common_grpc::Error,
},
#[snafu(display("Failed to do Flight get, source: {}", source))]
FlightGet {
source: tonic::Status,
backtrace: Backtrace,
},
}
pub type Result<T> = std::result::Result<T, Error>;
@@ -322,6 +310,7 @@ impl ErrorExt for Error {
match self {
Error::ExecuteSql { source } => source.status_code(),
Error::DecodeLogicalPlan { source } => source.status_code(),
Error::ExecutePhysicalPlan { source } => source.status_code(),
Error::NewCatalog { source } => source.status_code(),
Error::FindTable { source, .. } => source.status_code(),
Error::CreateTable { source, .. }
@@ -339,11 +328,7 @@ impl ErrorExt for Error {
}
Error::AlterExprToRequest { source, .. }
| Error::CreateExprToRequest { source }
| Error::InsertData { source } => source.status_code(),
Error::InvalidFlightData { source } => source.status_code(),
| Error::CreateExprToRequest { source, .. } => source.status_code(),
Error::CreateSchema { source, .. }
| Error::ConvertSchema { source, .. }
| Error::VectorComputation { source } => source.status_code(),
@@ -366,11 +351,9 @@ impl ErrorExt for Error {
| Error::CreateDir { .. }
| Error::InsertSystemCatalog { .. }
| Error::RegisterSchema { .. }
| Error::Catalog { .. }
| Error::MissingRequiredField { .. }
| Error::FlightGet { .. }
| Error::InvalidFlightTicket { .. }
| Error::IncorrectInternalState { .. } => StatusCode::Internal,
| Error::IntoPhysicalPlan { .. }
| Error::UnsupportedExpr { .. }
| Error::Catalog { .. } => StatusCode::Internal,
Error::InitBackend { .. } => StatusCode::StorageUnavailable,
Error::OpenLogStore { source } => source.status_code(),
@@ -378,12 +361,13 @@ impl ErrorExt for Error {
Error::OpenStorageEngine { source } => source.status_code(),
Error::RuntimeResource { .. } => StatusCode::RuntimeResourcesExhausted,
Error::MetaClientInit { source, .. } => source.status_code(),
Error::InsertData { source, .. } => source.status_code(),
Error::EmptyInsertBatch => StatusCode::InvalidArguments,
Error::TableIdProviderNotFound { .. } => StatusCode::Unsupported,
Error::BumpTableId { source, .. } => source.status_code(),
Error::MissingNodeId { .. } => StatusCode::InvalidArguments,
Error::MissingMetasrvOpts { .. } => StatusCode::InvalidArguments,
Error::StartLogStore { source, .. } => source.status_code(),
Error::PollRecordbatchStream { source } => source.status_code(),
}
}

View File

@@ -48,7 +48,6 @@ use crate::heartbeat::HeartbeatTask;
use crate::script::ScriptExecutor;
use crate::sql::SqlHandler;
mod flight;
mod grpc;
mod script;
mod sql;
@@ -234,7 +233,7 @@ pub(crate) async fn new_fs_object_store(data_dir: &str) -> Result<ObjectStore> {
.context(error::CreateDirSnafu { dir: &data_dir })?;
info!("The file storage directory is: {}", &data_dir);
let atomic_write_dir = format!("{data_dir}/.tmp/");
let atomic_write_dir = format!("{}/.tmp/", data_dir);
let accessor = FsBuilder::default()
.root(&data_dir)

View File

@@ -1,472 +0,0 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
mod stream;
use std::pin::Pin;
use api::v1::ddl_request::Expr as DdlExpr;
use api::v1::object_expr::Request as GrpcRequest;
use api::v1::query_request::Query;
use api::v1::{DdlRequest, InsertRequest, ObjectExpr};
use arrow_flight::flight_service_server::FlightService;
use arrow_flight::{
Action, ActionType, Criteria, Empty, FlightData, FlightDescriptor, FlightInfo,
HandshakeRequest, HandshakeResponse, PutResult, SchemaResult, Ticket,
};
use async_trait::async_trait;
use common_catalog::consts::DEFAULT_CATALOG_NAME;
use common_grpc::flight::{FlightEncoder, FlightMessage};
use common_query::Output;
use futures::Stream;
use prost::Message;
use session::context::QueryContext;
use snafu::{OptionExt, ResultExt};
use tonic::{Request, Response, Streaming};
use crate::error::{
CatalogSnafu, ExecuteSqlSnafu, InsertDataSnafu, InsertSnafu, InvalidFlightTicketSnafu,
MissingRequiredFieldSnafu, Result, TableNotFoundSnafu,
};
use crate::instance::flight::stream::FlightRecordBatchStream;
use crate::instance::Instance;
type TonicResult<T> = std::result::Result<T, tonic::Status>;
type TonicStream<T> = Pin<Box<dyn Stream<Item = TonicResult<T>> + Send + Sync + 'static>>;
#[async_trait]
impl FlightService for Instance {
type HandshakeStream = TonicStream<HandshakeResponse>;
async fn handshake(
&self,
_request: Request<Streaming<HandshakeRequest>>,
) -> TonicResult<Response<Self::HandshakeStream>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
type ListFlightsStream = TonicStream<FlightInfo>;
async fn list_flights(
&self,
_request: Request<Criteria>,
) -> TonicResult<Response<Self::ListFlightsStream>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
async fn get_flight_info(
&self,
_request: Request<FlightDescriptor>,
) -> TonicResult<Response<FlightInfo>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
async fn get_schema(
&self,
_request: Request<FlightDescriptor>,
) -> TonicResult<Response<SchemaResult>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
type DoGetStream = TonicStream<FlightData>;
async fn do_get(&self, request: Request<Ticket>) -> TonicResult<Response<Self::DoGetStream>> {
let ticket = request.into_inner().ticket;
let request = ObjectExpr::decode(ticket.as_slice())
.context(InvalidFlightTicketSnafu)?
.request
.context(MissingRequiredFieldSnafu { name: "request" })?;
let output = match request {
GrpcRequest::Insert(request) => self.handle_insert(request).await?,
GrpcRequest::Query(query_request) => {
let query = query_request
.query
.context(MissingRequiredFieldSnafu { name: "query" })?;
self.handle_query(query).await?
}
GrpcRequest::Ddl(request) => self.handle_ddl(request).await?,
};
let stream = to_flight_data_stream(output);
Ok(Response::new(stream))
}
type DoPutStream = TonicStream<PutResult>;
async fn do_put(
&self,
_request: Request<Streaming<FlightData>>,
) -> TonicResult<Response<Self::DoPutStream>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
type DoExchangeStream = TonicStream<FlightData>;
async fn do_exchange(
&self,
_request: Request<Streaming<FlightData>>,
) -> TonicResult<Response<Self::DoExchangeStream>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
type DoActionStream = TonicStream<arrow_flight::Result>;
async fn do_action(
&self,
_request: Request<Action>,
) -> TonicResult<Response<Self::DoActionStream>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
type ListActionsStream = TonicStream<ActionType>;
async fn list_actions(
&self,
_request: Request<Empty>,
) -> TonicResult<Response<Self::ListActionsStream>> {
Err(tonic::Status::unimplemented("Not yet implemented"))
}
}
impl Instance {
async fn handle_query(&self, query: Query) -> Result<Output> {
Ok(match query {
Query::Sql(sql) => {
let stmt = self
.query_engine
.sql_to_statement(&sql)
.context(ExecuteSqlSnafu)?;
self.execute_stmt(stmt, QueryContext::arc()).await?
}
Query::LogicalPlan(plan) => self.execute_logical(plan).await?,
})
}
pub async fn handle_insert(&self, request: InsertRequest) -> Result<Output> {
let table_name = &request.table_name.clone();
// TODO(LFC): InsertRequest should carry catalog name, too.
let table = self
.catalog_manager
.table(DEFAULT_CATALOG_NAME, &request.schema_name, table_name)
.context(CatalogSnafu)?
.context(TableNotFoundSnafu { table_name })?;
let request = common_grpc_expr::insert::to_table_insert_request(request, table.schema())
.context(InsertDataSnafu)?;
let affected_rows = table
.insert(request)
.await
.context(InsertSnafu { table_name })?;
Ok(Output::AffectedRows(affected_rows))
}
async fn handle_ddl(&self, request: DdlRequest) -> Result<Output> {
let expr = request
.expr
.context(MissingRequiredFieldSnafu { name: "expr" })?;
match expr {
DdlExpr::CreateTable(expr) => self.handle_create(expr).await,
DdlExpr::Alter(expr) => self.handle_alter(expr).await,
DdlExpr::CreateDatabase(expr) => self.handle_create_database(expr).await,
DdlExpr::DropTable(expr) => self.handle_drop_table(expr).await,
}
}
}
fn to_flight_data_stream(output: Output) -> TonicStream<FlightData> {
match output {
Output::Stream(stream) => {
let stream = FlightRecordBatchStream::new(stream);
Box::pin(stream) as _
}
Output::RecordBatches(x) => {
let stream = FlightRecordBatchStream::new(x.as_stream());
Box::pin(stream) as _
}
Output::AffectedRows(rows) => {
let stream = tokio_stream::once(Ok(
FlightEncoder::default().encode(FlightMessage::AffectedRows(rows))
));
Box::pin(stream) as _
}
}
}
#[cfg(test)]
mod test {
use api::v1::column::{SemanticType, Values};
use api::v1::{
alter_expr, AddColumn, AddColumns, AlterExpr, Column, ColumnDataType, ColumnDef,
CreateDatabaseExpr, CreateTableExpr, QueryRequest,
};
use client::RpcOutput;
use common_grpc::flight;
use common_recordbatch::RecordBatches;
use datatypes::prelude::*;
use super::*;
use crate::tests::test_util::{self, MockInstance};
async fn boarding(instance: &MockInstance, ticket: Request<Ticket>) -> RpcOutput {
let response = instance.inner().do_get(ticket).await.unwrap();
let result = flight::flight_data_to_object_result(response)
.await
.unwrap();
result.try_into().unwrap()
}
#[tokio::test(flavor = "multi_thread")]
async fn test_handle_ddl() {
let instance = MockInstance::new("test_handle_ddl").await;
let ticket = Request::new(Ticket {
ticket: ObjectExpr {
request: Some(GrpcRequest::Ddl(DdlRequest {
expr: Some(DdlExpr::CreateDatabase(CreateDatabaseExpr {
database_name: "my_database".to_string(),
})),
})),
}
.encode_to_vec(),
});
let output = boarding(&instance, ticket).await;
assert!(matches!(output, RpcOutput::AffectedRows(1)));
let ticket = Request::new(Ticket {
ticket: ObjectExpr {
request: Some(GrpcRequest::Ddl(DdlRequest {
expr: Some(DdlExpr::CreateTable(CreateTableExpr {
catalog_name: "greptime".to_string(),
schema_name: "my_database".to_string(),
table_name: "my_table".to_string(),
desc: "blabla".to_string(),
column_defs: vec![
ColumnDef {
name: "a".to_string(),
datatype: ColumnDataType::String as i32,
is_nullable: true,
default_constraint: vec![],
},
ColumnDef {
name: "ts".to_string(),
datatype: ColumnDataType::TimestampMillisecond as i32,
is_nullable: false,
default_constraint: vec![],
},
],
time_index: "ts".to_string(),
..Default::default()
})),
})),
}
.encode_to_vec(),
});
let output = boarding(&instance, ticket).await;
assert!(matches!(output, RpcOutput::AffectedRows(1)));
let ticket = Request::new(Ticket {
ticket: ObjectExpr {
request: Some(GrpcRequest::Ddl(DdlRequest {
expr: Some(DdlExpr::Alter(AlterExpr {
catalog_name: "greptime".to_string(),
schema_name: "my_database".to_string(),
table_name: "my_table".to_string(),
kind: Some(alter_expr::Kind::AddColumns(AddColumns {
add_columns: vec![AddColumn {
column_def: Some(ColumnDef {
name: "b".to_string(),
datatype: ColumnDataType::Int32 as i32,
is_nullable: true,
default_constraint: vec![],
}),
is_key: true,
}],
})),
})),
})),
}
.encode_to_vec(),
});
let output = boarding(&instance, ticket).await;
assert!(matches!(output, RpcOutput::AffectedRows(0)));
let output = instance
.inner()
.execute_sql(
"INSERT INTO my_database.my_table (a, b, ts) VALUES ('s', 1, 1672384140000)",
QueryContext::arc(),
)
.await
.unwrap();
assert!(matches!(output, Output::AffectedRows(1)));
let output = instance
.inner()
.execute_sql(
"SELECT ts, a, b FROM my_database.my_table",
QueryContext::arc(),
)
.await
.unwrap();
let Output::Stream(stream) = output else { unreachable!() };
let recordbatches = RecordBatches::try_collect(stream).await.unwrap();
let expected = "\
+---------------------+---+---+
| ts | a | b |
+---------------------+---+---+
| 2022-12-30T07:09:00 | s | 1 |
+---------------------+---+---+";
assert_eq!(recordbatches.pretty_print().unwrap(), expected);
}
#[tokio::test(flavor = "multi_thread")]
async fn test_handle_insert() {
let instance = MockInstance::new("test_handle_insert").await;
test_util::create_test_table(
&instance,
ConcreteDataType::timestamp_millisecond_datatype(),
)
.await
.unwrap();
let insert = InsertRequest {
schema_name: "public".to_string(),
table_name: "demo".to_string(),
columns: vec![
Column {
column_name: "host".to_string(),
values: Some(Values {
string_values: vec![
"host1".to_string(),
"host2".to_string(),
"host3".to_string(),
],
..Default::default()
}),
semantic_type: SemanticType::Tag as i32,
datatype: ColumnDataType::String as i32,
..Default::default()
},
Column {
column_name: "cpu".to_string(),
values: Some(Values {
f64_values: vec![1.0, 3.0],
..Default::default()
}),
null_mask: vec![2],
semantic_type: SemanticType::Field as i32,
datatype: ColumnDataType::Float64 as i32,
},
Column {
column_name: "ts".to_string(),
values: Some(Values {
ts_millisecond_values: vec![1672384140000, 1672384141000, 1672384142000],
..Default::default()
}),
semantic_type: SemanticType::Timestamp as i32,
datatype: ColumnDataType::TimestampMillisecond as i32,
..Default::default()
},
],
row_count: 3,
..Default::default()
};
let ticket = Request::new(Ticket {
ticket: ObjectExpr {
request: Some(GrpcRequest::Insert(insert)),
}
.encode_to_vec(),
});
let output = boarding(&instance, ticket).await;
assert!(matches!(output, RpcOutput::AffectedRows(3)));
let output = instance
.inner()
.execute_sql("SELECT ts, host, cpu FROM demo", QueryContext::arc())
.await
.unwrap();
let Output::Stream(stream) = output else { unreachable!() };
let recordbatches = RecordBatches::try_collect(stream).await.unwrap();
let expected = "\
+---------------------+-------+-----+
| ts | host | cpu |
+---------------------+-------+-----+
| 2022-12-30T07:09:00 | host1 | 1 |
| 2022-12-30T07:09:01 | host2 | |
| 2022-12-30T07:09:02 | host3 | 3 |
+---------------------+-------+-----+";
assert_eq!(recordbatches.pretty_print().unwrap(), expected);
}
#[tokio::test(flavor = "multi_thread")]
async fn test_handle_query() {
let instance = MockInstance::new("test_handle_query").await;
test_util::create_test_table(
&instance,
ConcreteDataType::timestamp_millisecond_datatype(),
)
.await
.unwrap();
let ticket = Request::new(Ticket {
ticket: ObjectExpr {
request: Some(GrpcRequest::Query(QueryRequest {
query: Some(Query::Sql(
"INSERT INTO demo(host, cpu, memory, ts) VALUES \
('host1', 66.6, 1024, 1672201025000),\
('host2', 88.8, 333.3, 1672201026000)"
.to_string(),
)),
})),
}
.encode_to_vec(),
});
let output = boarding(&instance, ticket).await;
assert!(matches!(output, RpcOutput::AffectedRows(2)));
let ticket = Request::new(Ticket {
ticket: ObjectExpr {
request: Some(GrpcRequest::Query(QueryRequest {
query: Some(Query::Sql(
"SELECT ts, host, cpu, memory FROM demo".to_string(),
)),
})),
}
.encode_to_vec(),
});
let response = instance.inner().do_get(ticket).await.unwrap();
let result = flight::flight_data_to_object_result(response)
.await
.unwrap();
let raw_data = result.flight_data;
let messages = flight::raw_flight_data_to_message(raw_data).unwrap();
assert_eq!(messages.len(), 2);
let recordbatch = flight::flight_messages_to_recordbatches(messages).unwrap();
let expected = "\
+---------------------+-------+------+--------+
| ts | host | cpu | memory |
+---------------------+-------+------+--------+
| 2022-12-28T04:17:05 | host1 | 66.6 | 1024 |
| 2022-12-28T04:17:06 | host2 | 88.8 | 333.3 |
+---------------------+-------+------+--------+";
let actual = recordbatch.pretty_print().unwrap();
assert_eq!(actual, expected);
}
}

View File

@@ -1,178 +0,0 @@
// Copyright 2022 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::pin::Pin;
use std::task::{Context, Poll};
use arrow_flight::FlightData;
use common_grpc::flight::{FlightEncoder, FlightMessage};
use common_recordbatch::SendableRecordBatchStream;
use common_telemetry::warn;
use futures::channel::mpsc;
use futures::channel::mpsc::Sender;
use futures::{SinkExt, Stream, StreamExt};
use pin_project::{pin_project, pinned_drop};
use snafu::ResultExt;
use tokio::task::JoinHandle;
use crate::error;
use crate::instance::flight::TonicResult;
#[pin_project(PinnedDrop)]
pub(super) struct FlightRecordBatchStream {
#[pin]
rx: mpsc::Receiver<Result<FlightMessage, tonic::Status>>,
join_handle: JoinHandle<()>,
done: bool,
encoder: FlightEncoder,
}
impl FlightRecordBatchStream {
pub(super) fn new(recordbatches: SendableRecordBatchStream) -> Self {
let (tx, rx) = mpsc::channel::<TonicResult<FlightMessage>>(1);
let join_handle =
common_runtime::spawn_read(
async move { Self::flight_data_stream(recordbatches, tx).await },
);
Self {
rx,
join_handle,
done: false,
encoder: FlightEncoder::default(),
}
}
async fn flight_data_stream(
mut recordbatches: SendableRecordBatchStream,
mut tx: Sender<TonicResult<FlightMessage>>,
) {
let schema = recordbatches.schema();
if let Err(e) = tx.send(Ok(FlightMessage::Schema(schema))).await {
warn!("stop sending Flight data, err: {e}");
return;
}
while let Some(batch_or_err) = recordbatches.next().await {
match batch_or_err {
Ok(recordbatch) => {
if let Err(e) = tx.send(Ok(FlightMessage::Recordbatch(recordbatch))).await {
warn!("stop sending Flight data, err: {e}");
return;
}
}
Err(e) => {
let e = Err(e).context(error::PollRecordbatchStreamSnafu);
if let Err(e) = tx.send(e.map_err(|x| x.into())).await {
warn!("stop sending Flight data, err: {e}");
}
return;
}
}
}
}
}
#[pinned_drop]
impl PinnedDrop for FlightRecordBatchStream {
fn drop(self: Pin<&mut Self>) {
self.join_handle.abort();
}
}
impl Stream for FlightRecordBatchStream {
type Item = TonicResult<FlightData>;
fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
let this = self.project();
if *this.done {
Poll::Ready(None)
} else {
match this.rx.poll_next(cx) {
Poll::Ready(None) => {
*this.done = true;
Poll::Ready(None)
}
Poll::Ready(Some(result)) => match result {
Ok(flight_message) => {
let flight_data = this.encoder.encode(flight_message);
Poll::Ready(Some(Ok(flight_data)))
}
Err(e) => {
*this.done = true;
Poll::Ready(Some(Err(e)))
}
},
Poll::Pending => Poll::Pending,
}
}
}
}
#[cfg(test)]
mod test {
use std::sync::Arc;
use common_grpc::flight::{FlightDecoder, FlightMessage};
use common_recordbatch::{RecordBatch, RecordBatches};
use datatypes::prelude::*;
use datatypes::schema::{ColumnSchema, Schema};
use datatypes::vectors::Int32Vector;
use futures::StreamExt;
use super::*;
#[tokio::test]
async fn test_flight_record_batch_stream() {
let schema = Arc::new(Schema::new(vec![ColumnSchema::new(
"a",
ConcreteDataType::int32_datatype(),
false,
)]));
let v: VectorRef = Arc::new(Int32Vector::from_slice(&[1, 2]));
let recordbatch = RecordBatch::new(schema.clone(), vec![v]).unwrap();
let recordbatches = RecordBatches::try_new(schema.clone(), vec![recordbatch.clone()])
.unwrap()
.as_stream();
let mut stream = FlightRecordBatchStream::new(recordbatches);
let mut raw_data = Vec::with_capacity(2);
raw_data.push(stream.next().await.unwrap().unwrap());
raw_data.push(stream.next().await.unwrap().unwrap());
assert!(stream.next().await.is_none());
assert!(stream.done);
let decoder = &mut FlightDecoder::default();
let mut flight_messages = raw_data
.into_iter()
.map(|x| decoder.try_decode(x).unwrap())
.collect::<Vec<FlightMessage>>();
assert_eq!(flight_messages.len(), 2);
match flight_messages.remove(0) {
FlightMessage::Schema(actual_schema) => {
assert_eq!(actual_schema, schema);
}
_ => unreachable!(),
}
match flight_messages.remove(0) {
FlightMessage::Recordbatch(actual_recordbatch) => {
assert_eq!(actual_recordbatch, recordbatch);
}
_ => unreachable!(),
}
}
}

View File

@@ -12,42 +12,139 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::{CreateDatabaseExpr, ObjectExpr, ObjectResult};
use arrow_flight::flight_service_server::FlightService;
use arrow_flight::Ticket;
use std::sync::Arc;
use api::result::{build_err_result, AdminResultBuilder, ObjectResultBuilder};
use api::v1::{
admin_expr, object_expr, select_expr, AdminExpr, AdminResult, Column, CreateDatabaseExpr,
ObjectExpr, ObjectResult, SelectExpr,
};
use async_trait::async_trait;
use common_error::prelude::BoxedError;
use common_grpc::flight;
use common_catalog::consts::DEFAULT_CATALOG_NAME;
use common_error::ext::ErrorExt;
use common_error::status_code::StatusCode;
use common_grpc::select::to_object_result;
use common_grpc_expr::insertion_expr_to_request;
use common_query::Output;
use prost::Message;
use query::plan::LogicalPlan;
use servers::query_handler::GrpcQueryHandler;
use servers::query_handler::{GrpcAdminHandler, GrpcQueryHandler};
use session::context::QueryContext;
use snafu::prelude::*;
use substrait::{DFLogicalSubstraitConvertor, SubstraitPlan};
use table::requests::CreateDatabaseRequest;
use tonic::Request;
use crate::error::{
DecodeLogicalPlanSnafu, ExecuteSqlSnafu, FlightGetSnafu, InvalidFlightDataSnafu, Result,
CatalogNotFoundSnafu, CatalogSnafu, DecodeLogicalPlanSnafu, EmptyInsertBatchSnafu,
ExecuteSqlSnafu, InsertDataSnafu, InsertSnafu, Result, SchemaNotFoundSnafu, TableNotFoundSnafu,
UnsupportedExprSnafu,
};
use crate::instance::Instance;
impl Instance {
async fn boarding(&self, ticket: Request<Ticket>) -> Result<ObjectResult> {
let response = self.do_get(ticket).await.context(FlightGetSnafu)?;
flight::flight_data_to_object_result(response)
pub async fn execute_grpc_insert(
&self,
catalog_name: &str,
schema_name: &str,
table_name: &str,
insert_batches: Vec<(Vec<Column>, u32)>,
) -> Result<Output> {
let schema_provider = self
.catalog_manager
.catalog(catalog_name)
.context(CatalogSnafu)?
.context(CatalogNotFoundSnafu { name: catalog_name })?
.schema(schema_name)
.context(CatalogSnafu)?
.context(SchemaNotFoundSnafu { name: schema_name })?;
ensure!(!insert_batches.is_empty(), EmptyInsertBatchSnafu);
let table = schema_provider
.table(table_name)
.context(CatalogSnafu)?
.context(TableNotFoundSnafu { table_name })?;
let insert = insertion_expr_to_request(
catalog_name,
schema_name,
table_name,
insert_batches,
table.clone(),
)
.context(InsertDataSnafu)?;
let affected_rows = table
.insert(insert)
.await
.context(InvalidFlightDataSnafu)
.context(InsertSnafu { table_name })?;
Ok(Output::AffectedRows(affected_rows))
}
pub(crate) async fn handle_create_database(&self, expr: CreateDatabaseExpr) -> Result<Output> {
async fn handle_insert(
&self,
catalog_name: &str,
schema_name: &str,
table_name: &str,
insert_batches: Vec<(Vec<Column>, u32)>,
) -> ObjectResult {
match self
.execute_grpc_insert(catalog_name, schema_name, table_name, insert_batches)
.await
{
Ok(Output::AffectedRows(rows)) => ObjectResultBuilder::new()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
Err(err) => {
common_telemetry::error!(err; "Failed to handle insert, catalog name: {}, schema name: {}, table name: {}", catalog_name, schema_name, table_name);
// TODO(fys): failure count
build_err_result(&err)
}
_ => unreachable!(),
}
}
async fn handle_select(&self, select_expr: SelectExpr) -> ObjectResult {
let result = self.do_handle_select(select_expr).await;
to_object_result(result).await
}
async fn do_handle_select(&self, select_expr: SelectExpr) -> Result<Output> {
let expr = select_expr.expr;
match expr {
Some(select_expr::Expr::Sql(sql)) => {
self.execute_sql(&sql, Arc::new(QueryContext::new())).await
}
Some(select_expr::Expr::LogicalPlan(plan)) => self.execute_logical(plan).await,
_ => UnsupportedExprSnafu {
name: format!("{:?}", expr),
}
.fail(),
}
}
async fn execute_create_database(
&self,
create_database_expr: CreateDatabaseExpr,
) -> AdminResult {
let req = CreateDatabaseRequest {
db_name: expr.database_name,
db_name: create_database_expr.database_name,
};
self.sql_handler().create_database(req).await
let result = self.sql_handler.create_database(req).await;
match result {
Ok(Output::AffectedRows(rows)) => AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
Ok(Output::Stream(_)) | Ok(Output::RecordBatches(_)) => unreachable!(),
Err(err) => AdminResultBuilder::default()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
pub(crate) async fn execute_logical(&self, plan_bytes: Vec<u8>) -> Result<Output> {
async fn execute_logical(&self, plan_bytes: Vec<u8>) -> Result<Output> {
let logical_plan = DFLogicalSubstraitConvertor
.decode(plan_bytes.as_slice(), self.catalog_manager.clone())
.context(DecodeLogicalPlanSnafu)?;
@@ -62,15 +159,50 @@ impl Instance {
#[async_trait]
impl GrpcQueryHandler for Instance {
async fn do_query(&self, query: ObjectExpr) -> servers::error::Result<ObjectResult> {
let ticket = Request::new(Ticket {
ticket: query.encode_to_vec(),
});
// TODO(LFC): Temporarily use old GRPC interface here, will get rid of them near the end of Arrow Flight adoption.
self.boarding(ticket)
.await
.map_err(BoxedError::new)
.with_context(|_| servers::error::ExecuteQuerySnafu {
query: format!("{query:?}"),
})
let object_resp = match query.expr {
Some(object_expr::Expr::Insert(insert_expr)) => {
let catalog_name = DEFAULT_CATALOG_NAME;
let schema_name = &insert_expr.schema_name;
let table_name = &insert_expr.table_name;
// TODO(fys): _region_number is for later use.
let _region_number: u32 = insert_expr.region_number;
let insert_batches = vec![(insert_expr.columns, insert_expr.row_count)];
self.handle_insert(catalog_name, schema_name, table_name, insert_batches)
.await
}
Some(object_expr::Expr::Select(select_expr)) => self.handle_select(select_expr).await,
other => {
return servers::error::NotSupportedSnafu {
feat: format!("{:?}", other),
}
.fail();
}
};
Ok(object_resp)
}
}
#[async_trait]
impl GrpcAdminHandler for Instance {
async fn exec_admin_request(&self, expr: AdminExpr) -> servers::error::Result<AdminResult> {
let admin_resp = match expr.expr {
Some(admin_expr::Expr::Create(create_expr)) => self.handle_create(create_expr).await,
Some(admin_expr::Expr::Alter(alter_expr)) => self.handle_alter(alter_expr).await,
Some(admin_expr::Expr::CreateDatabase(create_database_expr)) => {
self.execute_create_database(create_database_expr).await
}
Some(admin_expr::Expr::DropTable(drop_table_expr)) => {
self.handle_drop_table(drop_table_expr).await
}
other => {
return servers::error::NotSupportedSnafu {
feat: format!("{:?}", other),
}
.fail();
}
};
Ok(admin_resp)
}
}

View File

@@ -33,11 +33,12 @@ use crate::metric;
use crate::sql::SqlRequest;
impl Instance {
pub async fn execute_stmt(
&self,
stmt: Statement,
query_ctx: QueryContextRef,
) -> Result<Output> {
pub async fn execute_sql(&self, sql: &str, query_ctx: QueryContextRef) -> Result<Output> {
let stmt = self
.query_engine
.sql_to_statement(sql)
.context(ExecuteSqlSnafu)?;
match stmt {
Statement::Query(_) => {
let logical_plan = self
@@ -152,14 +153,6 @@ impl Instance {
}
}
}
pub async fn execute_sql(&self, sql: &str, query_ctx: QueryContextRef) -> Result<Output> {
let stmt = self
.query_engine
.sql_to_statement(sql)
.context(ExecuteSqlSnafu)?;
self.execute_stmt(stmt, query_ctx).await
}
}
// TODO(LFC): Refactor consideration: move this function to some helper mod,
@@ -187,7 +180,8 @@ fn table_idents_to_full_name(
)),
_ => error::InvalidSqlSnafu {
msg: format!(
"expect table name to be <catalog>.<schema>.<table>, <schema>.<table> or <table>, actual: {obj_name}",
"expect table name to be <catalog>.<schema>.<table>, <schema>.<table> or <table>, actual: {}",
obj_name
),
}.fail(),
}
@@ -199,40 +193,15 @@ impl SqlQueryHandler for Instance {
&self,
query: &str,
query_ctx: QueryContextRef,
) -> Vec<servers::error::Result<Output>> {
let _timer = timer!(metric::METRIC_HANDLE_SQL_ELAPSED);
// we assume sql string has only 1 statement in datanode
let result = self
.execute_sql(query, query_ctx)
.await
.map_err(|e| {
error!(e; "Instance failed to execute sql");
BoxedError::new(e)
})
.context(servers::error::ExecuteQuerySnafu { query });
vec![result]
}
async fn do_statement_query(
&self,
stmt: Statement,
query_ctx: QueryContextRef,
) -> servers::error::Result<Output> {
let _timer = timer!(metric::METRIC_HANDLE_SQL_ELAPSED);
self.execute_stmt(stmt, query_ctx)
self.execute_sql(query, query_ctx)
.await
.map_err(|e| {
error!(e; "Instance failed to execute sql");
BoxedError::new(e)
})
.context(servers::error::ExecuteStatementSnafu)
}
fn is_valid_schema(&self, catalog: &str, schema: &str) -> servers::error::Result<bool> {
self.catalog_manager
.schema(catalog, schema)
.map(|s| s.is_some())
.context(servers::error::CatalogSnafu)
.context(servers::error::ExecuteQuerySnafu { query })
}
}

View File

@@ -40,7 +40,7 @@ impl Services {
pub async fn try_new(instance: InstanceRef, opts: &DatanodeOptions) -> Result<Self> {
let grpc_runtime = Arc::new(
RuntimeBuilder::default()
.worker_threads(opts.rpc_runtime_size)
.worker_threads(opts.rpc_runtime_size as usize)
.thread_name("grpc-io-handlers")
.build()
.context(RuntimeResourceSnafu)?,
@@ -54,7 +54,7 @@ impl Services {
Mode::Distributed => {
let mysql_io_runtime = Arc::new(
RuntimeBuilder::default()
.worker_threads(opts.mysql_runtime_size)
.worker_threads(opts.mysql_runtime_size as usize)
.thread_name("mysql-io-handlers")
.build()
.context(RuntimeResourceSnafu)?,
@@ -69,7 +69,7 @@ impl Services {
};
Ok(Self {
grpc_server: GrpcServer::new(instance, grpc_runtime),
grpc_server: GrpcServer::new(instance.clone(), instance, grpc_runtime),
mysql_server,
})
}

View File

@@ -12,77 +12,144 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use api::v1::{AlterExpr, CreateTableExpr, DropTableExpr};
use std::sync::Arc;
use api::result::AdminResultBuilder;
use api::v1::{AdminResult, AlterExpr, CreateExpr, DropTableExpr};
use common_error::prelude::{ErrorExt, StatusCode};
use common_grpc_expr::{alter_expr_to_request, create_expr_to_request};
use common_query::Output;
use common_telemetry::info;
use common_telemetry::{error, info};
use futures::TryFutureExt;
use session::context::QueryContext;
use snafu::prelude::*;
use table::requests::DropTableRequest;
use crate::error::{
AlterExprToRequestSnafu, BumpTableIdSnafu, CreateExprToRequestSnafu,
IncorrectInternalStateSnafu, Result,
};
use crate::error::{AlterExprToRequestSnafu, BumpTableIdSnafu, CreateExprToRequestSnafu};
use crate::instance::Instance;
use crate::sql::SqlRequest;
impl Instance {
/// Handle gRPC create table requests.
pub(crate) async fn handle_create(&self, expr: CreateTableExpr) -> Result<Output> {
let table_name = format!(
"{}.{}.{}",
expr.catalog_name, expr.schema_name, expr.table_name
);
// TODO(LFC): Revisit table id related feature, add more tests.
// Also merge this mod with mod instance::grpc.
pub(crate) async fn handle_create(&self, expr: CreateExpr) -> AdminResult {
// Respect CreateExpr's table id and region ids if present, or allocate table id
// from local table id provider and set region id to 0.
let table_id = if let Some(table_id) = &expr.table_id {
let table_id = if let Some(table_id) = expr.table_id {
info!(
"Creating table {table_name} with table id {} from Frontend",
table_id.id
"Creating table {:?}.{:?}.{:?} with table id from frontend: {}",
expr.catalog_name, expr.schema_name, expr.table_name, table_id
);
table_id.id
} else {
let provider =
self.table_id_provider
.as_ref()
.context(IncorrectInternalStateSnafu {
state: "Table id provider absent in standalone mode",
})?;
let table_id = provider.next_table_id().await.context(BumpTableIdSnafu)?;
info!("Creating table {table_name} with table id {table_id} from TableIdProvider");
table_id
} else {
match self.table_id_provider.as_ref() {
None => {
return AdminResultBuilder::default()
.status_code(StatusCode::Internal as u32)
.err_msg("Table id provider absent in standalone mode".to_string())
.build();
}
Some(table_id_provider) => {
match table_id_provider
.next_table_id()
.await
.context(BumpTableIdSnafu)
{
Ok(table_id) => {
info!(
"Creating table {:?}.{:?}.{:?} with table id from catalog manager: {}",
&expr.catalog_name, &expr.schema_name, expr.table_name, table_id
);
table_id
}
Err(e) => {
error!(e;"Failed to create table id when creating table: {:?}.{:?}.{:?}", &expr.catalog_name, &expr.schema_name, expr.table_name);
return AdminResultBuilder::default()
.status_code(e.status_code() as u32)
.err_msg(e.to_string())
.build();
}
}
}
}
};
let request = create_expr_to_request(table_id, expr).context(CreateExprToRequestSnafu)?;
self.sql_handler()
.execute(SqlRequest::CreateTable(request), QueryContext::arc())
.await
let request = create_expr_to_request(table_id, expr).context(CreateExprToRequestSnafu);
let result = futures::future::ready(request)
.and_then(|request| {
self.sql_handler().execute(
SqlRequest::CreateTable(request),
Arc::new(QueryContext::new()),
)
})
.await;
match result {
Ok(Output::AffectedRows(rows)) => AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
// Unreachable because we are executing "CREATE TABLE"; otherwise it's an internal bug.
Ok(Output::Stream(_)) | Ok(Output::RecordBatches(_)) => unreachable!(),
Err(err) => AdminResultBuilder::default()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
pub(crate) async fn handle_alter(&self, expr: AlterExpr) -> Result<Output> {
let request = alter_expr_to_request(expr).context(AlterExprToRequestSnafu)?;
let Some(request) = request else { return Ok(Output::AffectedRows(0)) };
pub(crate) async fn handle_alter(&self, expr: AlterExpr) -> AdminResult {
let request = match alter_expr_to_request(expr)
.context(AlterExprToRequestSnafu)
.transpose()
{
None => {
return AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(0, 0)
.build()
}
Some(req) => req,
};
self.sql_handler()
.execute(SqlRequest::Alter(request), QueryContext::arc())
.await
let result = futures::future::ready(request)
.and_then(|request| {
self.sql_handler()
.execute(SqlRequest::Alter(request), Arc::new(QueryContext::new()))
})
.await;
match result {
Ok(Output::AffectedRows(rows)) => AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as u32, 0)
.build(),
Ok(Output::Stream(_)) | Ok(Output::RecordBatches(_)) => unreachable!(),
Err(err) => AdminResultBuilder::default()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
pub(crate) async fn handle_drop_table(&self, expr: DropTableExpr) -> Result<Output> {
pub(crate) async fn handle_drop_table(&self, expr: DropTableExpr) -> AdminResult {
let req = DropTableRequest {
catalog_name: expr.catalog_name,
schema_name: expr.schema_name,
table_name: expr.table_name,
};
self.sql_handler()
.execute(SqlRequest::DropTable(req), QueryContext::arc())
.await
let result = self
.sql_handler()
.execute(SqlRequest::DropTable(req), Arc::new(QueryContext::new()))
.await;
match result {
Ok(Output::AffectedRows(rows)) => AdminResultBuilder::default()
.status_code(StatusCode::Success as u32)
.mutate_result(rows as _, 0)
.build(),
Ok(Output::Stream(_)) | Ok(Output::RecordBatches(_)) => unreachable!(),
Err(err) => AdminResultBuilder::default()
.status_code(err.status_code() as u32)
.err_msg(err.to_string())
.build(),
}
}
}
@@ -90,7 +157,7 @@ impl Instance {
mod tests {
use std::sync::Arc;
use api::v1::{ColumnDataType, ColumnDef, TableId};
use api::v1::{ColumnDataType, ColumnDef};
use common_catalog::consts::MIN_USER_TABLE_ID;
use common_grpc_expr::create_table_schema;
use datatypes::prelude::ConcreteDataType;
@@ -108,7 +175,7 @@ mod tests {
assert_eq!(request.catalog_name, "greptime".to_string());
assert_eq!(request.schema_name, "public".to_string());
assert_eq!(request.table_name, "my-metrics");
assert_eq!(request.desc, Some("blabla little magic fairy".to_string()));
assert_eq!(request.desc, Some("blabla".to_string()));
assert_eq!(request.schema, expected_table_schema());
assert_eq!(request.primary_key_indices, vec![1, 0]);
assert!(request.create_if_not_exists);
@@ -135,7 +202,8 @@ mod tests {
let err_msg = result.unwrap_err().to_string();
assert!(
err_msg.contains("Missing timestamp column"),
"actual: {err_msg}",
"actual: {}",
err_msg
);
}
@@ -146,7 +214,7 @@ mod tests {
name: "a".to_string(),
datatype: 1024,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
};
let result = column_def.try_as_column_schema();
assert!(matches!(
@@ -158,7 +226,7 @@ mod tests {
name: "a".to_string(),
datatype: ColumnDataType::String as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
};
let column_schema = column_def.try_as_column_schema().unwrap();
assert_eq!(column_schema.name, "a");
@@ -170,7 +238,7 @@ mod tests {
name: "a".to_string(),
datatype: ColumnDataType::String as i32,
is_nullable: true,
default_constraint: default_constraint.clone().try_into().unwrap(),
default_constraint: Some(default_constraint.clone().try_into().unwrap()),
};
let column_schema = column_def.try_as_column_schema().unwrap();
assert_eq!(column_schema.name, "a");
@@ -182,46 +250,44 @@ mod tests {
);
}
fn testing_create_expr() -> CreateTableExpr {
fn testing_create_expr() -> CreateExpr {
let column_defs = vec![
ColumnDef {
name: "host".to_string(),
datatype: ColumnDataType::String as i32,
is_nullable: false,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "ts".to_string(),
datatype: ColumnDataType::TimestampMillisecond as i32,
is_nullable: false,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "cpu".to_string(),
datatype: ColumnDataType::Float32 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
ColumnDef {
name: "memory".to_string(),
datatype: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
default_constraint: None,
},
];
CreateTableExpr {
catalog_name: "".to_string(),
schema_name: "".to_string(),
CreateExpr {
catalog_name: None,
schema_name: None,
table_name: "my-metrics".to_string(),
desc: "blabla little magic fairy".to_string(),
desc: Some("blabla".to_string()),
column_defs,
time_index: "ts".to_string(),
primary_keys: vec!["ts".to_string(), "host".to_string()],
create_if_not_exists: true,
table_options: Default::default(),
table_id: Some(TableId {
id: MIN_USER_TABLE_ID,
}),
table_id: Some(MIN_USER_TABLE_ID),
region_ids: vec![0],
}
}

View File

@@ -96,7 +96,7 @@ impl SqlHandler {
result
}
pub(crate) fn get_table(&self, table_ref: &TableReference) -> Result<TableRef> {
pub(crate) fn get_table<'a>(&self, table_ref: &'a TableReference) -> Result<TableRef> {
self.table_engine
.get_table(&EngineContext::default(), table_ref)
.with_context(|_| GetTableSnafu {
@@ -176,7 +176,7 @@ mod tests {
async fn scan(
&self,
_projection: Option<&Vec<usize>>,
_projection: &Option<Vec<usize>>,
_filters: &[Expr],
_limit: Option<usize>,
) -> TableResult<PhysicalPlanRef> {

View File

@@ -61,7 +61,7 @@ impl SqlHandler {
let alter_kind = match alter_table.alter_operation() {
AlterTableOperation::AddConstraint(table_constraint) => {
return error::InvalidSqlSnafu {
msg: format!("unsupported table constraint {table_constraint}"),
msg: format!("unsupported table constraint {}", table_constraint),
}
.fail()
}
@@ -76,13 +76,6 @@ impl SqlHandler {
AlterTableOperation::DropColumn { name } => AlterKind::DropColumns {
names: vec![name.value.clone()],
},
AlterTableOperation::RenameTable { .. } => {
// TODO update proto to support alter table name
return error::InvalidSqlSnafu {
msg: "rename table not unsupported yet".to_string(),
}
.fail();
}
};
Ok(AlterTableRequest {
catalog_name: Some(table_ref.catalog.to_string()),
@@ -140,14 +133,4 @@ mod tests {
_ => unreachable!(),
}
}
#[tokio::test]
async fn test_alter_to_request_with_renaming_table() {
let handler = create_mock_sql_handler().await;
let alter_table = parse_sql("ALTER TABLE test_table RENAME table_t;");
let err = handler
.alter_to_request(alter_table, TableReference::bare("test_table"))
.unwrap_err();
assert_matches!(err, crate::error::Error::InvalidSql { .. });
}
}

View File

@@ -143,7 +143,7 @@ impl SqlHandler {
)?;
} else {
return error::InvalidSqlSnafu {
msg: format!("Cannot recognize named UNIQUE constraint: {name}"),
msg: format!("Cannot recognize named UNIQUE constraint: {}", name),
}
.fail();
}
@@ -158,7 +158,8 @@ impl SqlHandler {
} else {
return error::InvalidSqlSnafu {
msg: format!(
"Unrecognized non-primary unnamed UNIQUE constraint: {name:?}",
"Unrecognized non-primary unnamed UNIQUE constraint: {:?}",
name
),
}
.fail();
@@ -166,7 +167,7 @@ impl SqlHandler {
}
_ => {
return ConstraintNotSupportedSnafu {
constraint: format!("{c:?}"),
constraint: format!("{:?}", c),
}
.fail();
}

View File

@@ -21,11 +21,17 @@ use datatypes::data_type::ConcreteDataType;
use datatypes::vectors::{Int64Vector, StringVector, UInt64Vector, VectorRef};
use session::context::QueryContext;
use crate::tests::test_util::{self, MockInstance};
use crate::instance::Instance;
use crate::tests::test_util;
#[tokio::test(flavor = "multi_thread")]
async fn test_create_database_and_insert_query() {
let instance = MockInstance::new("create_database_and_insert_query").await;
common_telemetry::init_default_ut_logging();
let (opts, _guard) =
test_util::create_tmp_dir_and_datanode_opts("create_database_and_insert_query");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
let output = execute_sql(&instance, "create database test").await;
assert!(matches!(output, Output::AffectedRows(1)));
@@ -71,7 +77,12 @@ async fn test_create_database_and_insert_query() {
}
#[tokio::test(flavor = "multi_thread")]
async fn test_issue477_same_table_name_in_different_databases() {
let instance = MockInstance::new("test_issue477_same_table_name_in_different_databases").await;
common_telemetry::init_default_ut_logging();
let (opts, _guard) =
test_util::create_tmp_dir_and_datanode_opts("create_database_and_insert_query");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
// Create database a and b
let output = execute_sql(&instance, "create database a").await;
@@ -138,7 +149,7 @@ async fn test_issue477_same_table_name_in_different_databases() {
.await;
}
async fn assert_query_result(instance: &MockInstance, sql: &str, ts: i64, host: &str) {
async fn assert_query_result(instance: &Instance, sql: &str, ts: i64, host: &str) {
let query_output = execute_sql(instance, sql).await;
match query_output {
Output::Stream(s) => {
@@ -158,11 +169,16 @@ async fn assert_query_result(instance: &MockInstance, sql: &str, ts: i64, host:
}
}
async fn setup_test_instance(test_name: &str) -> MockInstance {
let instance = MockInstance::new(test_name).await;
async fn setup_test_instance(test_name: &str) -> Instance {
common_telemetry::init_default_ut_logging();
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts(test_name);
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
test_util::create_test_table(
&instance,
instance.catalog_manager(),
instance.sql_handler(),
ConcreteDataType::timestamp_millisecond_datatype(),
)
.await
@@ -187,11 +203,19 @@ async fn test_execute_insert() {
#[tokio::test(flavor = "multi_thread")]
async fn test_execute_insert_query_with_i64_timestamp() {
let instance = MockInstance::new("insert_query_i64_timestamp").await;
common_telemetry::init_default_ut_logging();
test_util::create_test_table(&instance, ConcreteDataType::int64_datatype())
.await
.unwrap();
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts("insert_query_i64_timestamp");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
test_util::create_test_table(
instance.catalog_manager(),
instance.sql_handler(),
ConcreteDataType::int64_datatype(),
)
.await
.unwrap();
let output = execute_sql(
&instance,
@@ -238,7 +262,9 @@ async fn test_execute_insert_query_with_i64_timestamp() {
#[tokio::test(flavor = "multi_thread")]
async fn test_execute_query() {
let instance = MockInstance::new("execute_query").await;
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts("execute_query");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
let output = execute_sql(&instance, "select sum(number) from numbers limit 20").await;
match output {
@@ -258,7 +284,10 @@ async fn test_execute_query() {
#[tokio::test(flavor = "multi_thread")]
async fn test_execute_show_databases_tables() {
let instance = MockInstance::new("execute_show_databases_tables").await;
let (opts, _guard) =
test_util::create_tmp_dir_and_datanode_opts("execute_show_databases_tables");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
let output = execute_sql(&instance, "show databases").await;
match output {
@@ -302,7 +331,8 @@ async fn test_execute_show_databases_tables() {
// creat a table
test_util::create_test_table(
&instance,
instance.catalog_manager(),
instance.sql_handler(),
ConcreteDataType::timestamp_millisecond_datatype(),
)
.await
@@ -337,7 +367,11 @@ async fn test_execute_show_databases_tables() {
#[tokio::test(flavor = "multi_thread")]
pub async fn test_execute_create() {
let instance = MockInstance::new("execute_create").await;
common_telemetry::init_default_ut_logging();
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts("execute_create");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
let output = execute_sql(
&instance,
@@ -446,16 +480,19 @@ async fn test_alter_table() {
}
async fn test_insert_with_default_value_for_type(type_name: &str) {
let instance = MockInstance::new("execute_create").await;
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts("execute_create");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
let create_sql = format!(
r#"create table test_table(
host string,
ts {type_name} DEFAULT CURRENT_TIMESTAMP,
ts {} DEFAULT CURRENT_TIMESTAMP,
cpu double default 0,
TIME INDEX (ts),
PRIMARY KEY(host)
) engine=mito with(regions=1);"#,
type_name
);
let output = execute_sql(&instance, &create_sql).await;
assert!(matches!(output, Output::AffectedRows(1)));
@@ -491,13 +528,17 @@ async fn test_insert_with_default_value_for_type(type_name: &str) {
#[tokio::test(flavor = "multi_thread")]
async fn test_insert_with_default_value() {
common_telemetry::init_default_ut_logging();
test_insert_with_default_value_for_type("timestamp").await;
test_insert_with_default_value_for_type("bigint").await;
}
#[tokio::test(flavor = "multi_thread")]
async fn test_use_database() {
let instance = MockInstance::new("test_use_database").await;
let (opts, _guard) = test_util::create_tmp_dir_and_datanode_opts("use_database");
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
let output = execute_sql(&instance, "create database db1").await;
assert!(matches!(output, Output::AffectedRows(1)));
@@ -554,11 +595,11 @@ async fn test_use_database() {
check_output_stream(output, expected).await;
}
async fn execute_sql(instance: &MockInstance, sql: &str) -> Output {
async fn execute_sql(instance: &Instance, sql: &str) -> Output {
execute_sql_in_db(instance, sql, DEFAULT_SCHEMA_NAME).await
}
async fn execute_sql_in_db(instance: &MockInstance, sql: &str, db: &str) -> Output {
async fn execute_sql_in_db(instance: &Instance, sql: &str, db: &str) -> Output {
let query_ctx = Arc::new(QueryContext::with_current_schema(db.to_string()));
instance.inner().execute_sql(sql, query_ctx).await.unwrap()
instance.execute_sql(sql, query_ctx).await.unwrap()
}

View File

@@ -15,6 +15,7 @@
use std::collections::HashMap;
use std::sync::Arc;
use catalog::CatalogManagerRef;
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, MIN_USER_TABLE_ID};
use datatypes::data_type::ConcreteDataType;
use datatypes::schema::{ColumnSchema, SchemaBuilder};
@@ -29,37 +30,18 @@ use tempdir::TempDir;
use crate::datanode::{DatanodeOptions, ObjectStoreConfig};
use crate::error::{CreateTableSnafu, Result};
use crate::instance::Instance;
use crate::sql::SqlHandler;
pub(crate) struct MockInstance {
instance: Instance,
_guard: TestGuard,
}
impl MockInstance {
pub(crate) async fn new(name: &str) -> Self {
let (opts, _guard) = create_tmp_dir_and_datanode_opts(name);
let instance = Instance::with_mock_meta_client(&opts).await.unwrap();
instance.start().await.unwrap();
MockInstance { instance, _guard }
}
pub(crate) fn inner(&self) -> &Instance {
&self.instance
}
}
struct TestGuard {
/// Create a tmp dir(will be deleted once it goes out of scope.) and a default `DatanodeOptions`,
/// Only for test.
pub struct TestGuard {
_wal_tmp_dir: TempDir,
_data_tmp_dir: TempDir,
}
fn create_tmp_dir_and_datanode_opts(name: &str) -> (DatanodeOptions, TestGuard) {
let wal_tmp_dir = TempDir::new(&format!("gt_wal_{name}")).unwrap();
let data_tmp_dir = TempDir::new(&format!("gt_data_{name}")).unwrap();
pub fn create_tmp_dir_and_datanode_opts(name: &str) -> (DatanodeOptions, TestGuard) {
let wal_tmp_dir = TempDir::new(&format!("gt_wal_{}", name)).unwrap();
let data_tmp_dir = TempDir::new(&format!("gt_data_{}", name)).unwrap();
let opts = DatanodeOptions {
wal_dir: wal_tmp_dir.path().to_str().unwrap().to_string(),
storage: ObjectStoreConfig::File {
@@ -77,8 +59,9 @@ fn create_tmp_dir_and_datanode_opts(name: &str) -> (DatanodeOptions, TestGuard)
)
}
pub(crate) async fn create_test_table(
instance: &MockInstance,
pub async fn create_test_table(
catalog_manager: &CatalogManagerRef,
sql_handler: &SqlHandler,
ts_type: ConcreteDataType,
) -> Result<()> {
let column_schemas = vec![
@@ -89,7 +72,7 @@ pub(crate) async fn create_test_table(
];
let table_name = "demo";
let table_engine: TableEngineRef = instance.inner().sql_handler().table_engine();
let table_engine: TableEngineRef = sql_handler.table_engine();
let table = table_engine
.create_table(
&EngineContext::default(),
@@ -114,10 +97,11 @@ pub(crate) async fn create_test_table(
.await
.context(CreateTableSnafu { table_name })?;
let schema_provider = instance
.inner()
.catalog_manager
.schema(DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME)
let schema_provider = catalog_manager
.catalog(DEFAULT_CATALOG_NAME)
.unwrap()
.unwrap()
.schema(DEFAULT_SCHEMA_NAME)
.unwrap()
.unwrap();
schema_provider

Some files were not shown because too many files have changed in this diff Show More