Commit Graph

166 Commits

Author SHA1 Message Date
evenyag
97d2aa4bfd feat: script engine and python impl (#219)
* feat: improve try_into_vector function

* Impl python mod and PyVector to execute script

* add AsSeq(BUT not IMPL)

* add&test pythonic_index, add into_py_obj(UNTEST)

* add into_datatypes_value(UNTEST)

* inplace setitem_by_index unsupport

* still struggle with testing AsSeq

* actually pyimpl AsSeq&AsMap

* add slice for PyVector

* improve visualibility for testing

* adjust for clippy

* add assert for test_execute_script

* add type anno in test

* feat: basic support for PyVector's operator with scalar (#64)

* feat: memory size of vector (#53)

* feat: improve try_into_vector function

* feat: impl memory_size function for vectors

* fix: forgot memory_size assertion in null vector test

* feat: use LargeUtf8 instead of utf8 for string, and rename LargeBianryArray to BinaryArray

* feat: memory_size only calculates heap size

* feat: impl bytes_allocated for memtable (#55)

* add init and constr

* rename type cast and add test

* fix bug in pyobj_to_val

* add default cast when no type specifed

* add basic add/sub/mul for array and scalar(value)

* cargo clippy

* comment out some println

* stricter clippy

* style: cargo fmt

* fix: string&bool support in val2pyobj & back

* style: remove println in test

* style: rm println in test mod in python.rs

* refactor: use wrap_index instead of pythonic_index

* refactor: right op in scalar_arith_op

* fix: stronger type& better test

* style: remove println

* fix: scalar sign/unsigned cast

* feat: improve try_into_vector function

* Impl python mod and PyVector to execute script

* add AsSeq(BUT not IMPL)

* add&test pythonic_index, add into_py_obj(UNTEST)

* add into_datatypes_value(UNTEST)

* inplace setitem_by_index unsupport

* still struggle with testing AsSeq

* actually pyimpl AsSeq&AsMap

* add slice for PyVector

* improve visualibility for testing

* adjust for clippy

* add assert for test_execute_script

* add type anno in test

* add init and constr

* rename type cast and add test

* fix bug in pyobj_to_val

* add default cast when no type specifed

* add basic add/sub/mul for array and scalar(value)

* cargo clippy

* comment out some println

* stricter clippy

* style: cargo fmt

* fix: string&bool support in val2pyobj & back

* style: remove println in test

* style: rm println in test mod in python.rs

* refactor: use wrap_index instead of pythonic_index

* refactor: right op in scalar_arith_op

* fix: stronger type& better test

* style: remove println

* fix: scalar sign/unsigned cast

* style: remove instead of comment out

* style: remove more comment out

* feat: support scalar div vector

* style: cargo fmt

* style: typo

* refactor: rename to correct var name

* refactor: directly use arrow2::array

* refactor: mv rsub&rdiv's op into a function

* test: add python expr test

* test: add test for PyList

* refactor: tweak order of arithmetics in rtruediv

* style: remove some `use`

* refactor: move `is_instance` to mod

* refactor: move fn to mod& move `use` to head

* style: cargo fmt

* fix: correct signed/unsigned cast

* refactor: wrap err msg in another fn

* style: cargo fmt

* style: remove ok_or_else for readability

* feat: add coprocessor fn(not yet impl)

* refactor: change back to wrapped_at

* fix: update Cargo.lock

* fix: update rustc version

* Update Rust Toolchain to nightly-2022-07-14

* feat: derive Eq when possible

* style: use `from` to avoid `needless_borrow` lint

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* feat: python coprocessor with type annotation (#96)

* feat: add coprocessor fn

Signed-off-by: discord9 <zglzy29yzdk@gmail.com>

* feat: cast args into PyVector

* feat: uncomplete coprocessor

* feat: erase decorator in python ast

* feat: strip decorator in ast

* fix: change parse to `Interactive`

* style: format Cargo.toml

* feat: make coprocessor actually work

* feat: move coprocessor fn out of test mod

* feat: add error handling

* style: add some comment

* feat: rm type annotation

* feat: add type annotation support

* style: move compile method to vm closure

* feat: annotation for nullable

* feat: type coercion cast in annotation

* feat: actually cast(NOT TESTED)

* fix: allow single into(type)

* refactor: extract parse_type from parser

* style: cargo fmt

* feat: change to Expr to preserve location info

* feat: add CoprParse to deal parse check error

* style: add type anno doc for coprocessor

* test: add some test

* feat: add underscore as any type in annotation

* test: add parse& runtime testcases

* style: rm dbg! remnant

* style: cargo fmt

* feat: add more error prompt info

* style: cargo fmt

* style: add doc tests' missing `use`

* fix: doc test for coprocessor

* style: cargo fmt

* fix: add missing `use` for `cargo test --doc`

* refactor: according to reviews

* refactor: more tweaks according to reviews

* refactor: merge match arm

* refactor: move into different files(UNCOMPLELTE)

* refactor: split parse_copr into more function

* refactor: split `exec_coprocessor` to more fn

* style: cargo fmt

* feat: print Py Exceptions in String

* feat: error handling conform standards

* test: fix test_coprocessor

* feat: remove `into` in python

* test: remove all `into` in python test

* style: update comment

* refactor: move strip compile fn to impl Copr

* refactor: move `gen_schema` to impl copr

* refactor: move `check_cast_type` to impl copr

* refactor: if let to match

* style: cargo fmt

* refactor: better parse of keyword arg list

* style: cargo fmt

* refactor: some error handling(UNCOMPLETE)

* refactor: error handling to general Error type

* refactor: rm some Vec::new()

* test: modify all tests to ok

* style: reorder item

* refactor: fetch using iter

* style: cargo fmt

* style: fmt macro by hand

* refactor: rename InnerError to Error

* test: use ron to write test

* test: add test for exec_copr

* refactor: add parse_bin_op

* feat: add check_anno

* refactor: add some checker function

* refactor: exec_copr into smaller func

* style: add some comment

* refactor: add check for bin_op

* refactor: rm useless Result

* style: add pretty print for error with location

* feat: more info for pretty print

* refactor: mv pretty print to error.rs

* refactor: rm execute_script

* feat: add pretty print

* feat: add constant column support

* test: add test for constant column

* feat: add pretty print exec fn

* style: cargo fmt

* feat: add macro to chain call `.fail()`

* style: update doc for constant columns

* style: add lint to allow print in test fn

* style: cargo fmt

* docs: update some comment

* fix: ignore doctest for now

* refactor: check_bin_op

* refactor: parse_in_op, check ret anno fn

* refactor: rm check_decorator

* doc: loc add newline explain

* style: cargo fmt

* refactor: use Helper::try_into_vec in try_into_vec

* style: cargo fmt

* test: add ret anno test

* style: cargo fmt

* test: add name for .ron tests for better debug

* test: print emoji in test

* style: rm some comment out line

* style: rename `into` to `try_into` fn

* style: cargo fmt

* refactor: rm unuse serialize derive

* fix: pretty print out of bound fix

* fix: rm some space in pretty print

* style: cargo fmt

* test: not even a python fn def

* style: cargo fmt

* fix: pretty print off by one space

* fix: allow `eprint` in clippy lint

* fix: compile error after rebase develop

* feat: port 35 functions from DataFusion to Python Coprocessor (#137)

* refactor: `cargo clippy`

* feat: create a module

* style: cargo fmt

* feat: bind `pow()` function(UNTEST)

* test: add test for udf mod

* style: allow part eq not eq for gen code

* style: allow print in test lint

* feat: use PyObjectRef to handle more types

* feat: add cargo feature for udf modules

* style: rename feature to udf-builtins

* refactor: move away from mod.rs

* feat: add all_to_f64 cast fn

* feat: add bind_math_fn macro

* feat: add all simple math UDF

* feat: add `random(len)` math fn

* feat: port `avg()` from datafusion

* refactor: add `eval_aggr_fn`

* feat: add bind_aggr_fn macro

* doc: add comment for args of macro

* feat: add all UDAF from datafusion

* refactor: extract test to separate file

* style: cargo fmt

* test: add incomplete test

* test: add .ron test fn

* feat: support scalar::list

* doc: add comments

* style: rename VagueFloat/Int to LenFloat/IntVec

* test: for all fn(expect approx_median)

* test: better print

* doc: add comment for FloatWithError

* refactor: move test.rs out of builtins/

* style: cargo fmt

* doc: add comment for .ron file

* doc: update some comments

* test: EPS=1e-12 for float eq

* test: use f64::EPSILON instead

* test: change to 2*EPS

* test: cache interpreter for fast testing

* doc: remove a TODO which is done

* test: refacto to_py_obj fn

* fix: pow fn

* doc: add a TODO for type_.rs

* test: use new_int/float in test serde

* test: for str case

* style: cargo fmt

* feat: cast PyList to ScalarValue::List

* test: cast scalar to py obj and back

* feat: cast to PyList

* test: cast from PyList

* test: nested PyVector unsupported

* doc: remove unrunable doctest

* test: replace PartialEq with impl just_as_expect

* doc: add name for discord9's TODO

* refactor: cahnge to vm.ctx.new_** instead

* doc: complete a TODO

* refactor: is_instance and other minor problem

* refactor: remove type_::is_instance

* style: cargo fmt

* feat: rename to `greptime_builtin`

* fix: error handling for PyList datatype

* style: fix clippy warning

* test: for PyList

* feat: Python Coprocessor MVP (#180)

* feat: add get_arrow_op

* feat: add comparsion op(UNTESTED)

* doc: explain why no rich compare

* refactor: py_str2str&parse_keywords

* feat: add DecoratorArgs

* refactor: parse_keywords ret Deco Args

* style: remove unused

* doc: add todo

* style: remove some unused fn

* doc: add comment for copr's field

* feat: add copr_engine module

* refactor: move to `script` crate

* style: clean up cargo.toml

* feat: add query engine for copr engine

* refactor: deco args into separate struct

* test: update corrsponding test

* feat: async coprocessor engine

* refactor: add `exec_parsed` fn

* feat: sync version of coprocessor(UNTEST)

* refactor: remove useless lifetime

* feat: new type for async stream record batch

* merge: from PR#137 add py builtins

* toolchain: update rustc to nightly-08-16

* feat: add `exec_with_cached_vm` fn(Can't compile)

* toolchain: revert to 07-14

* fix: `exec_with_cached_vm`

* fix: allow vector[_] in params

* style: cargo fmt

* doc: update comment on `_`&`_|None`

* fix: allow import&ignore type anno is ok

* feat: allow ignore return types

* refsctor: remove unused py files in functions/

* style: fmt&clippy

* refactor: python modules (#186)

* refactor: move common/script to script

* fix: clippy warnings and refactor python modules

* refactor: remove modules mod rename tests mod

* feat: adds Script and ScriptEngine trait, then impl PyScript/PyScriptEngine

* refactor: remove pub use some functions in script

* refactor: python error mod

* refactor: coprocessor and vector

* feat: adds engine test and greptime.vector function to create vector from iterable

* fix: adds a blank line to cargo file end

* fix: compile error after rebase develop

* feat: script endpoint for http server (#206)

* feat: impl /scripts API for http server

* feat: adds http api version

* test: add test for scripts handler and endpoint

* feat: python side mock module and more builtin functions (#209)

* feat: add python side module(for both mock and real upload script)

* style: add *.pyc to gitignore

* feat: move copr decorator(in .py) to greptime.py

* doc: update comment for `datetime`&`mock_tester`&gitignore

* feat: `filter()` a array with bool array(UNTESTED)

* feat: `prev()`ious elem in array ret as new array(UNTEST)

* feat: `datetime()` parse date time string and ret integer(UNTEST)

* fix: add missing return&fmt

* fix: allow f32 cast to PyFloat

* fix: `datetime()`'s last token now parsed

* test: `calc_rvs` now can run with builtin module

* feat: allow rich compare which ret bool array

* feat: logic and(`&`) for bool array

* style: cargo fmt

* feat: index PyVector by bool array

* feat: alias `ln` as `log` in builtin modules

* feat: logic or(`|`)&not( `~`) for bool array

* feat: add `post` for @copr in py side mod

* feat: change datetime return to i64

* feat: py side mod `post` script to given address

* fix: add `engine` field in `post` in py side mod

* refactor: use `ConstantVector` in `pow()` builtin

* fix: prev ret err for zero array

* doc: rm comment out code

* test: incomplete pyside mod test case

* git: ignore all __pycache__

* style: fmt&clippy

* refactor: split py side module into exmaple&gptime

* feat: init_table in py using `v1/sql`  api

* feat: calc_rvs now run both locally and remote

* doc: add doc for how to run it

* fix: comment out start server code in test

* fix: clippy warnings

* fix: http test url

* fix: some CR problems

* fix: some CR problems

* refactor: script executor for instance

* refactor: remove engine param in execute_script

* chore: Remove unnecessary allow attributes

Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Discord9 <discord9@163.com>
Co-authored-by: discord9 <zglzy29yzdk@gmail.com>
Co-authored-by: discord9 <55937128+discord9@users.noreply.github.com>
v0.1-alpha
2022-09-01 20:38:39 +08:00
evenyag
d71ae7934e feat: Upgrade rust to nightly-2022-07-14 (#217)
* feat: upgrade rust to nightly-2022-07-14

* style: Fix some clippy warnings

* style: clippy fix

* style: fix clippy

* style: Fix clippy

Some PartialEq warnings have been work around using cfg_attr test

* feat: Implement Eq and PartialEq for PrimitiveType

* chore: Remove unnecessary allow

* chore: Remove usage of cfg_attr for PartialEq
2022-09-01 17:50:48 +08:00
fys
db55c69117 feat: impl grpc physical plan (#212)
* chore: rename "convert.rs" to "serde.rs"

* proto definition

* impl "projection"

* add mock_input_exec for test

* impl physical plan execution
2022-08-31 21:43:50 +08:00
fys
ba93aa83f2 chore: replace bitvec impl (#214)
* chore: replace bitvec impl

* chore: reduce one copy of nullmask

* chore: move bitvec to common_base
2022-08-31 14:13:36 +08:00
egg
38d5febafe modify docs (#213) 2022-08-30 18:56:21 +08:00
egg
787aab9c00 wal_benchmark (#188) 2022-08-30 14:49:31 +08:00
dennis zhuang
1caa94cd3e feat: save create table schema (#211)
* feat: save create table schema and respect user defined columns order when querying, close #179

* fix: address CR problems

* refactor: use with_context with ProjectedColumnNotFoundSnafu
2022-08-26 19:22:55 +08:00
evenyag
ad1bbc3817 feat: Implement PartialEq for Vector (#207)
* fix: ListVector::get returns Null if index is invalid

* feat: Implement eq for vector

* feat: Derive PartialEq for Batch

Simplify some test codes in schema mod

* refactor: Use macro to simplify vector equality check
2022-08-26 12:13:00 +08:00
Lei, Huang
ce139c8a23 fix: impl scalar value helper and remove range limit (#205)
* fix: impl scalar value helper function for DateTime

* remove range limit for date

* remove range limit for date
2022-08-26 10:59:23 +08:00
egg
99cf553148 add aggregate functions (#147) 2022-08-25 19:44:11 +08:00
evenyag
793caa8d44 refactor: Rename SstSchema to StoreSchema (#204) 2022-08-25 17:43:10 +08:00
evenyag
53637c90fd feat: Support projection (#192)
* feat: Add projected schema

* feat: Use projected schema to read sst

* feat: Use vector of column to implement Batch

* feat: Use projected schema to convert batch to chunk

* feat: Add no_projection() to build ProjectedSchema

* feat: Memtable supports projection

The btree memtable use `is_needed()` to filter unneeded value columns,
then use `ProjectedSchema::batch_from_parts()` to construct
batch, so it don't need to known the layout of internal columns.

* test: Add tests for ProjectedSchema

* test: Add tests for ProjectedSchema

Also returns error if the `projected_columns` used to build the
`ProjectedSchema` is empty.

* test: Add test for memtable projection

* feat: Table pass projection to storage engine

* fix: Use timestamp column name as schema metadata

This fix the issue that the metadata refer to the wrong timestamp column
if datafusion reorder the fields of the arrow schema.

* fix: Fix projected schema not passed to memtable

* feat: Add tests for region projection

* chore: fix clippy

* test: Add test for unordered projection

* chore: Move projected_schema to ReadOptions

Also fix some typo
2022-08-25 15:27:47 +08:00
Lei, Huang
465dcca65e feat: implement DateTime type (#198)
* feat: implement DateTime type

* add some tests

* Update src/common/time/src/datetime.rs

Co-authored-by: Ning Sun <sunng@protonmail.com>

* Update src/common/time/src/datetime.rs

Co-authored-by: Ning Sun <sunng@protonmail.com>
2022-08-24 14:34:42 +08:00
Lei, Huang
2373d676f7 feat: add Date type and value (#189)
* wip: add Date type and value

* fix some cr comments

* impl Date values

* finish date type

* optimize Date value serialization

* add some tests

* fix some cr comments

* add some more test
2022-08-23 18:04:32 +08:00
evenyag
4a117157b9 fix: Fix replay sequence and wal dir (#196)
* fix: Fix replay include flushed data

Replay should starts from flushed_sequence + 1

* fix: Move default wal path to `/tmp/greptimedb`
2022-08-23 17:39:53 +08:00
evenyag
8ea2aa73cf refactor: Use error!(e; xxx) pattern to log error (#195)
Use `error!(e; xxx)` pattern so we could get backtrace in error log.

Also use BoxedError as error source of ExecuteQuery instead of String,
so we could carry backtrace and other info in it.
2022-08-23 17:35:24 +08:00
Ning Sun
ad14c83369 feat: Add row iterator for recordbatch and removed some data copy/allocation from MySQL impl (#193)
* feat: add `BorrowedValue` and DF Array access by index

This `BorrowedValue` can hold from datafusion arrow without copy.

`arrow_array_access` provides an index access to Arrow array and it holds the
result with our `BorrowedValue`. So we don't have to copy string/binary when
converting to `Value`.

* refactor: use borrowed types and iterator for recordbatch access

* fix: return Null with early check

* fix: i64 type error addressed by unit test

* refactor: give arrow_array_access a better name

* refactor: removed borrowed value and use value for now

* refactor: make iterator to return result of vec

* refactor: lift recordbatch iterator into common module

* fix: address clippy warnings
2022-08-23 17:20:50 +08:00
Yong
86dd19dcc8 build: add dockerfile to build greptimedb container image (#194) 2022-08-22 12:20:20 +08:00
evenyag
144ea348c7 ci: Skip some actions if the PR is a draft (#191)
- Don't run github actions on draft pull requests
- Now the title checker won't be affected, seems due to it was triggered by pull_request_target, not pull_request event
2022-08-19 17:54:30 +08:00
Yong
cab20cfd3e docs: add 'SQL Operations' section (#190)
Signed-off-by: zyy17 <zyylsxm@gmail.com>

Signed-off-by: zyy17 <zyylsxm@gmail.com>
2022-08-19 17:26:07 +08:00
LFC
9a68e4ca88 fix: correctly convert Value::Null to ScalarValue (#187)
* fix: correctly convert Value::Null to ScalarValue

* address PR comments

* refactor: make code robust

Co-authored-by: luofucong <luofucong@greptime.com>
2022-08-19 10:37:30 +08:00
evenyag
5c9b46fbf8 refactor: Rename value_type to op_type (#185) 2022-08-18 16:07:45 +08:00
fys
7a0a20e0f4 fix: rename "DataNode" to "Datanode" (#181) 2022-08-18 12:58:23 +08:00
evenyag
9d5be75a9c refactor: Move test_util to datanode/src (#178)
This also fixes the dead code warning of `create_test_table()` as the
files under `datanode/tests` are considered as individual libs. Moves
them to src dir makes sharing codes much easier.
2022-08-17 18:36:05 +08:00
evenyag
7c779a9861 feat: Add region schema for storage engine (#171)
* refactor: Merge RowKeyMetadata into ColumnsMetadata

Now RowKeyMetadata and ColumnsMetadata are almost always being used together, no need
to separate them into two structs. Now they are combined into the single
ColumnsMetadata struct.

chore: Make some fields of metadata private

feat: Replace schema in RegionMetadata by RegionSchema

The internal schema of a region should have the knownledge about all
internal columns that are reserved and used by the storage engine, such as
sequence, value type. So we introduce the `RegionSchema`, and it would
holds a `SchemaRef` that only contains the columns that user could see.

feat: Value derives Serialize and supports converting into json value

feat: Add version to schema

The schema version has an initial value 0 and would bump each time the
schema being altered.

feat: Adds internal columns to region metadata

Introduce the concept of reserved columns and internal columns.
Reserved columns are columns that their names, ids are reserved by the storage
engine, and could not be used by the user. Reserved columns usually have
special usage. Reserved columns expect the version columns are also
called internal columns (though the version could also be thought as a
special kind of internal column), are not visible to user, such as our
internal sequence, value_type columns.

The RegionMetadataBuilder always push internal columns used by the
engine to the columns in metadata. Internal columns are all stored
behind all user columns in the columns vector.

To avoid column id collision, the id reserved for columns has the most
significant bit set to 1. And the RegionMetadataBuilder would check the
uniqueness of the column id.

chore: Rebase develop and fix compile error

feat: add internal schema to region schema

feat: Add SchemaBuilder to build Schema

feat: Store row key end in region schema metadata

Also move the arrow schema construction to region::schema mod

feat: Add SstSchema

refactor: Replace MemtableSchema by RegionSchema

Now when writing sst files, we could use the arrow schema from our sst
schema, which contains the internal columns.

feat: Use SstSchema to read parquet

Adds user_column_end to metadata. When reading parquet file,
converts the arrow schema into SstSchema, then uses the row_key_end
and user_column_end to find out row key parts, value parts and internal
columns, instead of using the timestamp index, which may yields
incorrect index if we don't put the timestamp at the end of row key.

Move conversion from Batch to arrow Chunk to SstSchema, so SST mod doesn't
need to care the order of key, value and internal columns.

test: Add test for Value to serde_json::Value

feat: Add RawRegionMetadata to persist RegionMetadata

test: Add test to RegionSchema

fix: Fix clippy

To fix clippy::enum_clike_unportable_variant lint, define the column id
offset in ReservedColumnType and compute the final column id in
ReservedColumnId's const method

refactor: Move batch/chunk conversion to SstSchema

The parquet ChunkStream now holds the SstSchema and use its method to
convert Chunk into Batch.

chore: Address CR comment

Also add a test for pushing internal column to RegionMetadataBuilder

chore: Address CR comment

chore: Use bitwise or to compute column id

* chore: Address CR comment
2022-08-17 15:28:38 +08:00
LFC
ccda17248e feat: unify servers and mysql server in datanode (#172)
* address PR comments

address PR comments

use 3306 for mysql server's default port

upgrade metric to version 0.20

move crate "servers" out of "common"

make mysql io threads count configurable in config file

add snafu backtrace for errors with source

use common-server error for mysql server

add test for grpc server

refactor testing codes

fix rustfmt check

start mysql server in datanode

move grpc server codes from datanode to common-servers

feat: unify servers

* rebase develop and resolve conflicts

* remove an unnecessary todo

Co-authored-by: luofucong <luofucong@greptime.com>
2022-08-17 14:29:12 +08:00
evenyag
6d23118aa0 chore: Resolves remaining comments in #168 (#175)
* fix: Rename current_timestamp to current_time_millis, fix resolution

Fix current_timestamp returns seconds resolution, also add a test for
this method

* chore: Use slice of array instead of Vec

Save some heap allocations

* test: Compare std and chrono timestamp

The original test always success even the current_time_millis returns in
seconds resolution

* chore: Store current time in gmt_created/gmt_modified
2022-08-17 12:11:08 +08:00
Lei, Huang
a1c4921933 feat: impl create table sql execution (#168)
* catalog manager allocates table id

* rebase develop

* add some tests

* add some more test

* fix some cr comments

* insert into system catalog

* use slice pattern to simplify code

* add optional dependencies

* add sql-to-request test

* successfully recover

* fix unit tests

* rebase develop

* add some tests

* fix some cr comments

* fix some cr comments

* add a lock to CatalogManager

* feat: add gmt_created and gmt_modified columns to system catalog table
2022-08-17 10:53:19 +08:00
fys
34133fae5a feat: impl select (grpc) (#138)
* SelectExpr: change to oneof expr

* Convert between Vec<u8> and SelectResult

* Chore: use encode_to_vec and decode, instead of encode_length_delimited_to_vec and decode_length_delimited

* Chore: move bitset into separate file

* Grpc select impl
2022-08-15 18:31:47 +08:00
dennis zhuang
60dc77d1d9 feat: adds datanode config file supporting, close #156 (#167)
* feat: adds datanode config file supporting, close #156

* doc: update readme

* fix: address CR problems

* fix: remove unused log
2022-08-15 16:17:56 +08:00
Lei, Huang
b695881c6a fix: logstore read supports namespace isolation (#163)
* logstore read supports namespace isolation

* add namespace isolation test

* update

* revert unexpected changes

* Update log.rs

remove unnecessary info log

* reformat code
2022-08-15 11:43:48 +08:00
Lei, Huang
28b7a7cf35 fix develop (#166) 2022-08-12 14:21:52 +08:00
LFC
4098c57446 feat: MySQL protocol server (#158)
* MySQL protocol server

* fix: Rustfmt check

* fix: resolve PR comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-08-12 11:41:45 +08:00
dennis zhuang
41ffbe82f8 feat: impl table manifest (#157)
* feat: impl TableManifest and refactor table engine, object store etc.

* feat: persist table metadata when creating it

* fix: remove unused file src/storage/src/manifest/impl.rs

* feat: impl recover table info from manifest

* test: add open table test and table manifest test

* fix: resolve CR problems

* fix: compile error and remove region id

* doc: describe parent_dir

* fix: address CR problems

* fix: typo

* Revert "fix: compile error and remove region id"

This reverts commit c14c250f8a.

* fix: compile error and generate region id by table_id and region number
2022-08-12 10:47:33 +08:00
Jiachun Feng
ea40616cfe chore: avoid clone column names (#161) 2022-08-12 10:09:23 +08:00
Lei, Huang
1dd780d857 feat: implement catalog manager (#129)
Implement catalog manager that provides a vision of all existing tables while instance start. Current implementation is based on local table engine, all catalog info is stored in an system catalog table.
2022-08-11 15:43:59 +08:00
Lei, Huang
2c7e83c792 ci: ignore error.rs in coverage (#162) 2022-08-11 15:01:45 +08:00
Jiachun Feng
ffd637e5f5 chore: replace bitvec impl (#159)
* chore: replace bitvec impl

* chore: lazy init bitvec
2022-08-11 10:00:20 +08:00
Lei, Huang
d141fbc674 fix: log store write and read (#97)
* add pwrite

* write

* fix write

* error handling in write thread

* wrap some LogFile field to state field

* remove some unwraps

* reStructure some code

* implement file chunk

* composite chunk decode

* add test for chunk stream

* fix buffer test

* remove some useless code

* add test for read_at and file_chunk_stream

* use bounded channel to implement back pressure

* reimplement entry read and decoding

* add some doc

* clean some code

* use Sender::blocking_send to replace manually spawn

* support synchronous file chunk stream

* remove useless clone

* remove set_offset from Entry trait

* cr: fix some comments

* fix: add peek methods for Buffer

* add test for read at the middle of file

* fix some minor issues on comments

* rebase on to develop

* add peek_to_slice and read_to_slice

* initialize file chunk on heap

* fix some comments in CR

* respect entry id set outside LogStore

* fix unit test

* Update src/log-store/src/fs/file.rs

Co-authored-by: evenyag <realevenyag@gmail.com>

* fix some cr comments

Co-authored-by: evenyag <realevenyag@gmail.com>
2022-08-10 11:16:04 +08:00
egg
8d51ad3429 feat: write_batch proto codec (#122)
* feat: protobuf codec

* chore: minor fix

* chore: beatify the macro code

* chore: minor fix

* chore: by cr

* chore: by cr and impl wal with proto

* bugfix: invalid num_rows for multi put_data in mutations

Co-authored-by: jiachun <jiachun_fjc@163.com>
2022-08-09 19:57:51 +08:00
evenyag
567510fa3e ci: Add pr title checker (#155) 2022-08-08 18:27:02 +08:00
Lei, Huang
80372720bb refactor: open_region return None if region does not exist (#145)
* refactor: open_region return None if region does not exist

* fix some unit tests

* fix some CR comments
2022-08-08 16:53:52 +08:00
evenyag
f98d406580 refactor(storage): Add region id and name to metadata (#140)
* refactor(storage): Add region id and name to metadata

Add region id and name to `RegionMetadata`, simplify input arguments of
`RegionImpl::create()` and `RegionImpl::new()` method, since id and name
are already in metadata/version.

To avoid an atomic load of `Version` each time we access the region
id/name, we still store a copy of id/name in `SharedData`.

* chore: Remove todo in OpenOptions

Create region if missing when opening the region would be hard to
implement, since sometimes we may don't known the exact region schema user
would like to have.

* refactor: Make id and name of region readonly

By making `id` and `name` fields of `SharedData` and `RegionMetadata`
private and only exposing a pub getter.
2022-08-08 16:46:51 +08:00
dennis zhuang
e9d6546c12 feat: impl create_table for MitoEngine, #125 (#142)
* feat: impl create_table for MitoEngine, #125

* fix: typo

* fix: address CR problems

* fix: address CR problems

* fix: address CR problems

* fix: format

* refactor: minor change
2022-08-08 15:36:00 +08:00
LFC
e833167ad6 feat: extract MemTable to ease testing (#133)
* feat: memtable backed by DataFusion to ease testing

* move test utility codes out of src folder

* Implement our own MemTable because DataFusion's MemTable does not support limit; and replace the original testing numbers table.

* fix: address PR comments

* fix: "testutil" -> "test-util"

* roll back "NumbersTable"

Co-authored-by: luofucong <luofucong@greptime.com>
2022-08-05 13:58:05 +08:00
Ning Sun
97be052b33 feat: update tonic/prost and simplify build requirements (#130)
* feat: update tonic/prost and simplify build requirements

* doc: update readme for protoc installtion
2022-08-04 23:11:39 +08:00
evenyag
fb4495eb46 feat: Adds TableEngine::open_table() (#132)
* feat: Add `open_table()` method to `TableEngine`

* feat: Implements MitoEngine::open_table()

For simplicity, this implementation just use the table name as region
name, and using that name to open a region for that table. It also
introduce a mutex to avoid opening the same table simultaneously.

* refactor: Shorten generic param name

Use `S` instead of `Store` for `MitoEngine`.

* test: Mock storage engine for table engine test

Add a `MockEngine` to mock the storage engine, so that testing the mito
table engine can sometimes use the mocked storage.

* test: Add open table test

Also remove `storage::gen_region_name` method, and always use table name
as default region name, so the table engine can open the table created
by `create_table()`.

* chore: Add open table log
2022-08-04 17:35:17 +08:00
evenyag
56fae412d2 feat: Implements replay (#135)
* feat: Implements RegionWriter::replay()

Refactors `preprocess_write()`, wraps time ranges calculation and
memtable creation to `prepare_memtables()` so these logic can be reused
by `WriterInner::replay()`. Then implements `WriterInner::replay()`
which reads write batch from wal and inserts it into memtables.

* feat: Use sequence in request as committed sequence

Also checks that sequence should increase monotonically and returns
error if found sequence decreases

* chore: Remove OpenOptions param from RegionWriter::replay

* test: Add region reopen tests

refactor(storage): Rename read_write test mod to basic

refactor(storage): Move common region test logic to TesterBase

Let read/write Tester and flush Tester share the same TesterBase struct,
which implements common operations like put/full_scan.

* feat: Constructs RegionImpl in open()

Constructs RegionImpl after replay in `RegionImpl::open()`

* feat: Adds RegionImpl::create()

Adds `RegionImpl::create()` method to persist region metadata to
manifest, then create the RegionImpl instance, so the storage engine
just invoke `RegionImpl::create()` instead of `RegionImpl::new()` to
create the region instance, and don't need to update manifest after
creating region instance anymore. Now `RegionImpl::new()` need to takes
version instead of metadata as input.

This change is also a necessary part to pass the region open test, since
to open a region,  need to persist something to manifest first.

* feat: Pass region open test

Use LocalFileLogStore for region test since NoopLogStore won't persist
data to the file system.

Create dir in `LocalFileLogStore::open` if it is not exist, so we don't
need to create the dir before using the logstore.

To pass the test, we always recover from flushed_sequence and use
`req_sequence + 1` as last sequence.

* test: Test reopen region multiple times

* chore: Address CR comments

Add more info to replay log and add an assert to check committed
sequence after reopen.

* refactor: Add cfg(test) to Version::new()

Remove `VersionControl::new()`, and add `#[cfg(test)]` to
`Version::new()` as it is only used by tests.
2022-08-04 17:00:01 +08:00
Lei, Huang
7395920bc8 move catalog-related traits and struct to a catalog crate (#134) 2022-08-04 11:05:28 +08:00
dennis zhuang
6db6106829 feat: impl recovering version from manifest for region (#127)
* feat: impl recovering version from manifest for region

* refactor: rename try_apply_edit to replay_edit

* fix: remove println

* fix: address CR problems

* feat: remove Metadata in manifest trait and update region manifest state after recovering
2022-08-03 11:05:52 +08:00