Commit Graph

22 Commits

Author SHA1 Message Date
evenyag
d52d1eb122 fix: Only convert LogicalTypeId to ConcreteDataType in tests (#241)
Converting LogicalTypeId to ConcreteDataType is only allowed in tests, since some
additional info is currently not stored in LogicalTypeId. It is just an id, or
kind, and does not contain full type info.
2022-09-09 17:48:59 +08:00
Lei, Huang
9366e77407 feat: impl timestamp type, value and vectors (#226)
* wip: impl timestamp data type

* add timestamp vectors

* adapt to recent changes to vector module

* fix all unit tests

* rebase develop

* fix slice

* change default time unit to millisecond

* add more tests

* fix some CR comments

* fix some CR comments

* fix clippy

* fix some cr comments

* fix some CR comments

* fix some CR comments

* remove time unit in LogicalTypeId::Timestamp
2022-09-09 11:43:30 +08:00
evenyag
0ae99f7ac3 fix: Fix MitoTable::scan only returns one batch (#227)
* feat: Add tests for different batch

Add region scan test with different batch size

* fix: Fix table scan only returns one batch

* style: Fix clippy

* test: Add tests to scan table with rows more than batch size

* fix: Fix MockChunkReader never stop
2022-09-06 20:36:05 +08:00
LFC
5e67301c00 feat: implement alter table (#218)
* feat: implement alter table

* Currently we have no plans to support altering the primary keys (maybe never), so the related code has been removed.

* make `alter` a trait function in table

* address other CR comments

* cleanup

* rebase develop

* resolve code review comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-06 13:44:34 +08:00
LFC
119ff2fc2e feat: create table through GRPC interface (#224)
* feat: create table through GRPC interface

* move `CreateExpr` `oneof` expr of `AdminExpr` in `admin.proto`, and implement the admin GRPC interface

* add `table_options` and `partition_options` to `CreateExpr`

* resolve code review comments

Co-authored-by: luofucong <luofucong@greptime.com>
2022-09-06 12:51:07 +08:00
evenyag
d71ae7934e feat: Upgrade rust to nightly-2022-07-14 (#217)
* feat: upgrade rust to nightly-2022-07-14

* style: Fix some clippy warnings

* style: clippy fix

* style: fix clippy

* style: Fix clippy

Some PartialEq warnings have been worked around using cfg_attr test

* feat: Implement Eq and PartialEq for PrimitiveType

* chore: Remove unnecessary allow

* chore: Remove usage of cfg_attr for PartialEq
2022-09-01 17:50:48 +08:00
dennis zhuang
1caa94cd3e feat: save create table schema (#211)
* feat: save create table schema and respect user defined columns order when querying, close #179

* fix: address CR problems

* refactor: use with_context with ProjectedColumnNotFoundSnafu
2022-08-26 19:22:55 +08:00
evenyag
53637c90fd feat: Support projection (#192)
* feat: Add projected schema

* feat: Use projected schema to read sst

* feat: Use vector of column to implement Batch

* feat: Use projected schema to convert batch to chunk

* feat: Add no_projection() to build ProjectedSchema

* feat: Memtable supports projection

The btree memtable uses `is_needed()` to filter out unneeded value columns,
then uses `ProjectedSchema::batch_from_parts()` to construct the
batch, so it doesn't need to know the layout of internal columns.
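
The filtering step described above can be sketched as follows. This is only an illustration under assumptions: the `Projection` struct, the index-based signature of `is_needed()` and the `project_values` helper are hypothetical and not taken from the storage engine.

```rust
/// Hypothetical projection: the indices of the value columns a query needs.
struct Projection {
    needed: Vec<usize>,
}

impl Projection {
    /// Returns true if the value column at `idx` must be read.
    /// (Assumed signature; the real `is_needed()` may differ.)
    fn is_needed(&self, idx: usize) -> bool {
        self.needed.contains(&idx)
    }
}

/// Keeps only the needed value columns, preserving their original order.
fn project_values<T: Clone>(projection: &Projection, values: &[T]) -> Vec<T> {
    values
        .iter()
        .enumerate()
        .filter(|(idx, _)| projection.is_needed(*idx))
        .map(|(_, value)| value.clone())
        .collect()
}
```

The caller only asks whether a column is needed, so it never has to know where internal columns sit in the final batch.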

* test: Add tests for ProjectedSchema

* test: Add tests for ProjectedSchema

Also returns error if the `projected_columns` used to build the
`ProjectedSchema` is empty.

* test: Add test for memtable projection

* feat: Table pass projection to storage engine

* fix: Use timestamp column name as schema metadata

This fixes the issue that the metadata refers to the wrong timestamp column
if datafusion reorders the fields of the arrow schema.

* fix: Fix projected schema not passed to memtable

* feat: Add tests for region projection

* chore: fix clippy

* test: Add test for unordered projection

* chore: Move projected_schema to ReadOptions

Also fix some typos
2022-08-25 15:27:47 +08:00
evenyag
7c779a9861 feat: Add region schema for storage engine (#171)
* refactor: Merge RowKeyMetadata into ColumnsMetadata

RowKeyMetadata and ColumnsMetadata are almost always used together, so there is no need
to separate them into two structs. They are now combined into the single
ColumnsMetadata struct.

chore: Make some fields of metadata private

feat: Replace schema in RegionMetadata by RegionSchema

The internal schema of a region should have knowledge of all
internal columns that are reserved and used by the storage engine, such as
sequence and value type. So we introduce the `RegionSchema`, which
holds a `SchemaRef` that only contains the columns the user can see.

feat: Value derives Serialize and supports converting into json value

feat: Add version to schema

The schema version has an initial value of 0 and is bumped each time the
schema is altered.
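
A minimal sketch of this versioning rule, assuming a simplified schema that only tracks column names. `VersionedSchema` and `add_column` are hypothetical names used for illustration, not the engine's actual types.

```rust
/// Hypothetical schema that carries a version counter.
#[derive(Debug, Clone, PartialEq)]
struct VersionedSchema {
    columns: Vec<String>,
    version: u32,
}

impl VersionedSchema {
    /// A newly created schema starts at version 0.
    fn new(columns: Vec<String>) -> Self {
        Self { columns, version: 0 }
    }

    /// Returns an altered copy with one more column and the version bumped by 1.
    fn add_column(&self, name: &str) -> Self {
        let mut columns = self.columns.clone();
        columns.push(name.to_string());
        Self {
            columns,
            version: self.version + 1,
        }
    }
}
```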

feat: Adds internal columns to region metadata

Introduce the concept of reserved columns and internal columns.
Reserved columns are columns whose names and ids are reserved by the storage
engine and cannot be used by the user. Reserved columns usually have
special usage. Reserved columns except the version column are also
called internal columns (though the version could also be thought of as a
special kind of internal column) and are not visible to the user, such as our
internal sequence and value_type columns.

The RegionMetadataBuilder always pushes the internal columns used by the
engine to the columns in metadata. Internal columns are all stored
after all user columns in the columns vector.

To avoid column id collisions, reserved column ids have the most
significant bit set to 1. The RegionMetadataBuilder also checks the
uniqueness of the column ids.
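
The id scheme above can be sketched as follows. This assumes a 32-bit column id; `ColumnId`, `RESERVED_BIT`, `reserved_column_id`, `is_reserved` and `first_duplicate` are hypothetical names chosen for illustration, not the engine's API.

```rust
use std::collections::HashSet;

/// Assumed column id width; the real engine may use a different type.
type ColumnId = u32;

/// Most significant bit of a `ColumnId`, set on every reserved id.
const RESERVED_BIT: ColumnId = 1 << (ColumnId::BITS - 1);

/// Computes a reserved column id from a small offset using bitwise OR.
const fn reserved_column_id(offset: ColumnId) -> ColumnId {
    RESERVED_BIT | offset
}

/// Returns true if the id falls in the reserved range.
const fn is_reserved(id: ColumnId) -> bool {
    id & RESERVED_BIT != 0
}

/// Returns the first duplicated id, or `None` if all ids are unique.
fn first_duplicate(ids: &[ColumnId]) -> Option<ColumnId> {
    let mut seen = HashSet::new();
    // `insert` returns false when the id was already present.
    ids.iter().copied().find(|id| !seen.insert(*id))
}
```

User ids that stay below the most significant bit can never collide with reserved ids, so the uniqueness check only has to catch duplicates within each range.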

chore: Rebase develop and fix compile error

feat: add internal schema to region schema

feat: Add SchemaBuilder to build Schema

feat: Store row key end in region schema metadata

Also move the arrow schema construction to region::schema mod

feat: Add SstSchema

refactor: Replace MemtableSchema by RegionSchema

Now when writing sst files, we could use the arrow schema from our sst
schema, which contains the internal columns.

feat: Use SstSchema to read parquet

Adds user_column_end to metadata. When reading a parquet file,
converts the arrow schema into SstSchema, then uses the row_key_end
and user_column_end to find out row key parts, value parts and internal
columns, instead of using the timestamp index, which may yield an
incorrect index if we don't put the timestamp at the end of the row key.
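
A sketch of how two end offsets can partition a flat column list, assuming the layout `[row key | values | internal]`. `ColumnParts` and `split_columns` are hypothetical helpers, and the `host` and `cpu` column names in the usage below are made up.

```rust
/// Column groups recovered from a flat, ordered list of columns.
#[derive(Debug, PartialEq)]
struct ColumnParts<'a, T> {
    row_key: &'a [T],
    values: &'a [T],
    internal: &'a [T],
}

/// Splits `columns` using the two end offsets.
/// Returns `None` when the offsets are inconsistent with the column count.
fn split_columns<T>(
    columns: &[T],
    row_key_end: usize,
    user_column_end: usize,
) -> Option<ColumnParts<'_, T>> {
    if row_key_end > user_column_end || user_column_end > columns.len() {
        return None;
    }
    // Everything before `user_column_end` is visible to the user.
    let (user, internal) = columns.split_at(user_column_end);
    // Within the user columns, the row key comes first.
    let (row_key, values) = user.split_at(row_key_end);
    Some(ColumnParts {
        row_key,
        values,
        internal,
    })
}
```

For example, with columns `host, ts, cpu, sequence, value_type`, `row_key_end = 2` and `user_column_end = 3`, the row key is `host, ts`, the only value column is `cpu`, and the rest are internal. The position of the timestamp within the row key does not matter.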

Move conversion from Batch to arrow Chunk to SstSchema, so the SST mod doesn't
need to care about the order of key, value and internal columns.

test: Add test for Value to serde_json::Value

feat: Add RawRegionMetadata to persist RegionMetadata

test: Add test to RegionSchema

fix: Fix clippy

To fix clippy::enum_clike_unportable_variant lint, define the column id
offset in ReservedColumnType and compute the final column id in
ReservedColumnId's const method

refactor: Move batch/chunk conversion to SstSchema

The parquet ChunkStream now holds the SstSchema and uses its method to
convert Chunk into Batch.

chore: Address CR comment

Also add a test for pushing internal column to RegionMetadataBuilder

chore: Address CR comment

chore: Use bitwise or to compute column id

* chore: Address CR comment
2022-08-17 15:28:38 +08:00
Lei, Huang
a1c4921933 feat: impl create table sql execution (#168)
* catalog manager allocates table id

* rebase develop

* add some tests

* add some more tests

* fix some cr comments

* insert into system catalog

* use slice pattern to simplify code

* add optional dependencies

* add sql-to-request test

* successfully recover

* fix unit tests

* rebase develop

* add some tests

* fix some cr comments

* fix some cr comments

* add a lock to CatalogManager

* feat: add gmt_created and gmt_modified columns to system catalog table
2022-08-17 10:53:19 +08:00
dennis zhuang
41ffbe82f8 feat: impl table manifest (#157)
* feat: impl TableManifest and refactor table engine, object store etc.

* feat: persist table metadata when creating it

* fix: remove unused file src/storage/src/manifest/impl.rs

* feat: impl recover table info from manifest

* test: add open table test and table manifest test

* fix: resolve CR problems

* fix: compile error and remove region id

* doc: describe parent_dir

* fix: address CR problems

* fix: typo

* Revert "fix: compile error and remove region id"

This reverts commit c14c250f8a.

* fix: compile error and generate region id by table_id and region number
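
One way to derive a region id from a table id and a region number is to pack both into a single integer. The sketch below assumes 32-bit inputs and a 64-bit result. The commit message does not state the actual encoding, so the type aliases, bit layout and function names are assumptions.

```rust
/// Assumed widths: a 32-bit table id and a 32-bit region number.
type TableId = u32;
type RegionNumber = u32;
type RegionId = u64;

/// Packs the table id into the high 32 bits and the region number into the low 32 bits.
fn region_id(table_id: TableId, region_number: RegionNumber) -> RegionId {
    (u64::from(table_id) << 32) | u64::from(region_number)
}

/// Recovers the table id and region number from a packed region id.
fn split_region_id(id: RegionId) -> (TableId, RegionNumber) {
    // The `as` casts intentionally truncate to the low 32 bits.
    ((id >> 32) as TableId, id as RegionNumber)
}
```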
2022-08-12 10:47:33 +08:00
Lei, Huang
1dd780d857 feat: implement catalog manager (#129)
Implement a catalog manager that provides a view of all existing tables when the instance starts. The current implementation is based on the local table engine, and all catalog info is stored in a system catalog table.
2022-08-11 15:43:59 +08:00
Lei, Huang
80372720bb refactor: open_region return None if region does not exist (#145)
* refactor: open_region return None if region does not exist

* fix some unit tests

* fix some CR comments
2022-08-08 16:53:52 +08:00
dennis zhuang
e9d6546c12 feat: impl create_table for MitoEngine, #125 (#142)
* feat: impl create_table for MitoEngine, #125

* fix: typo

* fix: address CR problems

* fix: address CR problems

* fix: address CR problems

* fix: format

* refactor: minor change
2022-08-08 15:36:00 +08:00
evenyag
fb4495eb46 feat: Adds TableEngine::open_table() (#132)
* feat: Add `open_table()` method to `TableEngine`

* feat: Implements MitoEngine::open_table()

For simplicity, this implementation just uses the table name as the region
name, and uses that name to open a region for that table. It also
introduces a mutex to avoid opening the same table simultaneously.
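
A simplified sketch of the locking idea, using a synchronous `std::sync::Mutex` around a table cache. The `Engine` and `Table` types here are hypothetical stand-ins; the real `MitoEngine` opens a storage region and may use a different (for example asynchronous) lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical table handle; only records which region backs it.
#[derive(Debug)]
struct Table {
    region_name: String,
}

/// Hypothetical engine that caches opened tables by name.
#[derive(Default)]
struct Engine {
    tables: Mutex<HashMap<String, Arc<Table>>>,
}

impl Engine {
    /// Opens a table, using the table name as the region name.
    fn open_table(&self, name: &str) -> Arc<Table> {
        // The lock is held for the whole call, so concurrent callers
        // cannot open the same table twice.
        let mut tables = self.tables.lock().unwrap();
        if let Some(table) = tables.get(name) {
            return Arc::clone(table);
        }
        let table = Arc::new(Table {
            region_name: name.to_string(),
        });
        tables.insert(name.to_string(), Arc::clone(&table));
        table
    }
}
```

Holding one lock for all tables is the simplest choice; it also serializes opens of different tables, which a per-table lock would avoid.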

* refactor: Shorten generic param name

Use `S` instead of `Store` for `MitoEngine`.

* test: Mock storage engine for table engine test

Add a `MockEngine` to mock the storage engine, so that testing the mito
table engine can sometimes use the mocked storage.

* test: Add open table test

Also remove `storage::gen_region_name` method, and always use table name
as default region name, so the table engine can open the table created
by `create_table()`.

* chore: Add open table log
2022-08-04 17:35:17 +08:00
Ning Sun
cd42f308a8 refactor: remove constructors from trait (#121)
* refactor: remove constructors from trait

* refactor: move PutOp into its parent type

* refactor: move put constructor to write request

* refactor: change visibility of PutData constructors

call from WriteRequest instead

* refactor: consistent naming for entry constructor

* refactor: fix constructor from Namespace trait

* refactor: remove comment code

* doc: fix doc comments
2022-08-02 16:25:03 +08:00
Lei, Huang
96b4ed01f7 refactor: Make TableEngine object safe (#119)
* refactor: Make TableEngine object safe

* define TableEngineRef

* fix some comments

* replace table::engine::Error with table::error::Error
2022-08-01 15:37:11 +08:00
Ning Sun
62cb649389 refactor: use derive_builder for boilerplate builders (#116)
* refactor: remove boilerplate builder code with derive_builder macro

* refactor: better build creation using Default::default()

* refactor: resolve api change issues in benchmark code

* refactor: address some review issues

* refactor: address clippy issues

* chore: doc and todo update

* refactor: add builder for RegionDescriptor
2022-07-29 14:31:12 +08:00
evenyag
bf5975ca3e feat: Prototype of the storage engine (#107)
* feat: memtable flush (#63)

* wip: memtable flush

* optimize schema conversion

* remove unnecessary import

* add parquet file verification

* add backtrace to error

* chore: upgrade opendal to 0.9 and fixed some problems

* rename error

* fix: error description

Co-authored-by: Dennis Zhuang <killme2008@gmail.com>

* feat: region manifest service (#57)

* feat: adds Manifest API

* feat: impl region manifest service

* refactor: by CR comments

* fix: storage error mod test

* fix: tweak storage cargo

* fix: tweak storage cargo

* refactor: by CR comments

* refactor: rename current_version

* feat: add wal writer (#60)

* feat: add Wal

* upgrade engine for wal

* fix: unit test for wal

* feat: wal into region

* fix: unit test

* fix clippy

* chore: by cr

* chore: by cr

* chore: prevent test data pollution

* chore: by cr

* minor fix

* chore: by cr

* feat: Implement flush (#65)

* feat: Flush framework

- feat: Add id to memtable
- refactor: Rename MemtableSet/MutableMemtables to MemtableVersion/MemtableSet
- feat: Freeze memtable
- feat: Trigger flush
- feat: Background job pool
- feat: flush job
- feat: Sst access layer
- feat: Custom Deserialize for StringBytes
- feat: Use RegionWriter to apply file metas
- feat: Apply version edit
- chore: Remove unused imports

refactor: Use ParquetWriter to replace FlushTask

refactor: FsAccessLayer takes object store as param

chore: Remove todo from doc comments

feat: Move wal to WriterContext

chore: Fix clippy

chore: Add backtrace to WriteWal error

* feat: adds manifest to region and refactor sst/manifest dir config (#72)

* feat: adds manifest to region and refactor sst/manifest dir with EngineConfig

* refactor: ensure path ends with '/' in ManifestLogStorage

* fix: style

* refactor: normalize storage directory path and minor changes by CR

* refactor: doesn't need slash any more

* feat: Implement apply_edit() and add timestamp index to schema (#73)

* feat: Implement VersionControl::apply_edit()

* feat: Add timestamp index to schema

* feat: Implement Schema::timestamp_column()

* feat: persist region metadata to manifest (#74)

* feat: persist metadata when creating region or sst files

* fix: revert FileMeta comment

* feat: resolve todo

* fix: clippy warning

* fix: revert files_to_remove type in RegionEdit

* feat: impl SizeBasedStrategy for flush (#76)

* feat: impl SizeBasedStrategy for flush

* doc: get_mutable_limitation

* fix: code style and comment

* feat: align timestamp (#75)

* feat: align timestamps in write batch

* fix cr comments

* fix timestamp overflow

* simplify overflow check

* fix cr comments

* fix clippy issues

* test: Fix region tests (comment out some unsupported tests) (#82)

* feat: flush job (#80)

* feat: flush job

* fix cr comments

* move file name instead of clone

* comment log file test (#84)

* feat: improve MemtableVersion (#78)

* feat: improve MemtableVersion

* feat: remove flushed immutable memtables and test MemtableVersion

* refactor: by CR comments

* refactor: clone kv in iterator

* fix: clippy warning

* refactor: Make BatchIterator supertrait of Iterator (#85)

* refactor: rename Version to ManifestVersion and move out manifest from ShareData (#83)

* feat: Insert multiple memtables by time range (#77)

* feat: memtable::Inserter supports insert multiple memtables by time range

* chore: Update timestamp comment

* test: Add tests for Inserter

* test: Fix region tests (comment out some unsupported tests)

* refactor: align_timestamp() use TimestampMillis::aligned_by_bucket()

* chore: rename aligned_by_bucket to align_by_bucket

* fix: Fix compile errors

* fix: sst and manifest dir (#86)

* Set RowKeyDescriptor::enable_version_column to false by default

* feat: Implement write stall (#90)

* feat: Implement write stall

* chore: Update comments

* feat: Support reading multiple memtables (#93)

* feat: Support reading multiple memtables

* test: uncomment tests rely on snapshot read

* feat: wal format (#70)

* feat: wal codec

* chore: minor fix

* chore: comment

* chore: by cr

* chore: write_batch_codec mod

* chore: by cr

* chore: upgrade proto

* chore: by cr

* fix failing test

* fix failing test

* feat: manifest to wal (#100)

* feat: write manifest to wal

* chore: sequence into wal

* chore: by cr

* chore: by cr

* refactor: create log store (#104)

Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: fariygirl <clickmetoday@163.com>
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: Fix clippy

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
Co-authored-by: fariygirl <clickmetoday@163.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
2022-07-25 15:26:00 +08:00
evenyag
6ec870625f refactor: Refactor usage of BoxedError (#48)
* feat: Define a general boxed error

* refactor: common_function use Error in common_query

* feat: Add tests to define_opaque_error macro

* refactor: Refactor table and table engine error

* refactor: recordbatch remove arrow dev-dependency

* refactor: datanode crate use common_error::BoxedError

* chore: Fix clippy

* feat: Returning source status code when using BoxedError

* test: Fix opaque error test

* test: Add tests for table::Error & table_engine::Error

* test: Add test for RecordBatch::new()

* test: Remove generated tests from define_opaque_error

* chore: Address cr comment
2022-06-21 15:24:45 +08:00
dennis zhuang
4071b0cff2 feat: impl scanning data from storage engine for table (#47)
* feat: impl scanning data from storage for MitoTable

* adds test mod to setup table engine test

* fix: comment error

* fix: boyan -> dennis in todo comments

* fix: remove unnecessary send in BatchIteratorPtr
2022-06-20 15:42:57 +08:00
dennis zhuang
e78c015fc0 TableEngine and SqlHandler impl (#45)
* Impl TableEngine, bridge to storage

* Impl sql handler to process insert sql

* fix: minor changes and typo

* test: add datanode test

* test: add table-engine test

* fix: code style

* refactor: split out insert mod from sql and minor changes by CR

* refactor: replace with_context with context
2022-06-17 11:36:49 +08:00