Commit Graph

106 Commits

Author SHA1 Message Date
evenyag
03e965954a feat: implement read framework (#108)
* feat: implement read framework

feat: chunk reader builder

refactor: rename BatchIteratorPtr to BoxedBatchIterator

feat: BatchReader to read batch from ssts

feat: Add a ConcatReader to concat sst readers

test: Add tests for concat reader

chore: Fix clippy

* feat: implement SST parquet reader (#109)

* feat: implement parquet sst reader

* chores: fix some CR comments

* gst

* fix sst writer flush issue

* feat: Implement FsAccessLayer::read_sst

* fix: remove lifetime from ChunkStream

* refactor: Store file name in FileMeta

- Store file name instead of path (`region-name/file-name`) in FileMeta.
- `AccessLayer::read()` takes file name instead of path, so the read/write api are consistent

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
2022-07-28 11:46:51 +08:00
fys
3b2716ed70 feat: impl insert via grpc (#102)
* fix: build protobuf

* feat: impl grpc insert

* Add an example of grpc insert

* fix: cargo clippy

* cr
2022-07-28 10:25:22 +08:00
Lei, HUANG
3e42334b92 chores: change readme 2022-07-27 15:14:10 +08:00
Ning Sun
f81dfc9bed feat: add fmt::Debug for RegionImpl 2022-07-27 15:04:51 +08:00
evenyag
c9db093af7 feat: Cherry picks lost commits of flush (#111)
* fix: Fix write stall blocks flush applying version

refactor: Use store config to help constructing Region

chore: Address CR comments

* feat: adds manifest protocol supporting and refactor region metadata protocol

feat: ignore sqlparser log

refactor: PREV_VERSION_KEY constant

refactor: minor change for checking readable/writable

fix: address CR problems

refactor: use binary literal

Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
2022-07-26 15:52:39 +08:00
evenyag
bf5975ca3e feat: Prototype of the storage engine (#107)
* feat: memtable flush (#63)

* wip: memtable flush

* optimize schema conversion

* remove unnecessary import

* add parquet file verification

* add backtrace to error

* chore: upgrade opendal to 0.9 and fixed some problems

* rename error

* fix: error description

Co-authored-by: Dennis Zhuang <killme2008@gmail.com>

* feat: region manifest service (#57)

* feat: adds Manifest API

* feat: impl region manifest service

* refactor: by CR comments

* fix: storage error mod test

* fix: tweak storage cargo

* fix: tweak storage cargo

* refactor: by CR comments

* refactor: rename current_version

* feat: add wal writer (#60)

* feat: add Wal

* upgrade engine for wal

* fix: unit test for wal

* feat: wal into region

* fix: unix test

* fix clippy

* chore: by cr

* chore: by cr

* chore: prevent test data pollution

* chore: by cr

* minor fix

* chore: by cr

* feat: Implement flush (#65)

* feat: Flush framework

- feat: Add id to memtable
- refactor: Rename MemtableSet/MutableMemtables to MemtableVersion/MemtableSet
- feat: Freeze memtable
- feat: Trigger flush
- feat: Background job pool
- feat: flush job
- feat: Sst access layer
- feat: Custom Deserialize for StringBytes
- feat: Use RegionWriter to apply file metas
- feat: Apply version edit
- chore: Remove unused imports

refactor: Use ParquetWriter to replace FlushTask

refactor: FsAccessLayer takes object store as param

chore: Remove todo from doc comments

feat: Move wal to WriterContext

chore: Fix clippy

chore: Add backtrace to WriteWal error

* feat: adds manifest to region and refactor sst/manifest dir config (#72)

* feat: adds manifest to region and refactor sst/manifest dir with EngineConfig

* refactor: ensure path ends with '/' in ManifestLogStorage

* fix: style

* refactor: normalize storage directory path and minor changes by CR

* refactor: doesn't need slash any more

* feat: Implement apply_edit() and add timestamp index to schema (#73)

* feat: Implement VersionControl::apply_edit()

* feat: Add timestamp index to schema

* feat: Implement Schema::timestamp_column()

* feat: persist region metadata to manifest (#74)

* feat: persist metadata when creating region or sst files

* fix: revert FileMeta comment

* feat: resolve todo

* fix: clippy warning

* fix: revert files_to_remove type in RegionEdit

* feat: impl SizeBasedStrategy for flush (#76)

* feat: impl SizeBasedStrategy for flush

* doc: get_mutable_limitation

* fix: code style and comment

* feat: align timestamp (#75)

* feat: align timestamps in write batch

* fix cr comments

* fix timestamp overflow

* simplify overflow check

* fix cr comments

* fix clippy issues

* test: Fix region tests (comment out some unsupported tests) (#82)

* feat: flush job (#80)

* feat: flush job

* fix cr comments

* move file name instead of clone

* comment log file test (#84)

* feat: improve MemtableVersion (#78)

* feat: improve MemtableVersion

* feat: remove flushed immutable memtables and test MemtableVersion

* refactor: by CR comments

* refactor: clone kv in iterator

* fix: clippy warning

* refactor: Make BatchIterator supertrait of Iterator (#85)

* refactor: rename Version to ManifestVersion and move out manifest from ShareData (#83)

* feat: Insert multiple memtables by time range (#77)

* feat: memtable::Inserter supports insert multiple memtables by time range

* chore: Update timestamp comment

* test: Add tests for Inserter

* test: Fix region tests (comment out some unsupported tests)

* refactor: align_timestamp() use TimestampMillis::aligned_by_bucket()

* chore: rename aligned_by_bucket to align_by_bucket

* fix: Fix compile errors

* fix: sst and manifest dir (#86)

* Set RowKeyDescriptor::enable_version_column to false by default

* feat: Implement write stall (#90)

* feat: Implement write stall

* chore: Update comments

* feat: Support reading multiple memtables (#93)

* feat: Support reading multiple memtables

* test: uncomment tests rely on snapshot read

* feat: wal format (#70)

* feat: wal codec

* chore: minor fix

* chore: comment

* chore: by cr

* chore: write_batch_codec mod

* chore: by cr

* chore: upgrade proto

* chore: by cr

* fix failing test

* fix failing test

* feat: manifest to wal (#100)

* feat: write manifest to wal

* chore: sequence into wal

* chore: by cr

* chore: by cr

* refactor: create log store (#104)

Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: fariygirl <clickmetoday@163.com>
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: Fix clippy

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
Co-authored-by: fariygirl <clickmetoday@163.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
2022-07-25 15:26:00 +08:00
LFC
2b064265bf feat: UDAF made generically (#91)
* feat: UDAF implementation backed by DataFusion.

Directly Transplant DataFusion's UDAF related structs, traits and functions, like `AggregateUDF`, `Accumulator` or `create_udaf` etc.

Implement median UDAF on top of it and used in unit testing.

Refs: #61

* feat: UDAF made generically

Refs: #61

* fix: cargo fmt

* fix: use prelude

* fix: uniform the name

* fix: move maybe commonly used functions together

* fix: make comments more clear

* fix: resolve conversations in CR

* fix: store input types in AccumulatorCreator, and use ScalarVector's iterator

* feat: introducing List value and List datatype

* refactor: use ArcSwap instead of Mutex

* refactor: shorten some namings

* refactor: move median UDAF out of tests

* refactor: rename

* feat: aggregate function registry

* fix: make `Value` satisfy ordering again

* fix: clippy warnings

* doc: add "how to write aggregate function"

* fix: address PR comments

* fix: trying to get rid of unwraps

Co-authored-by: luofucong <luofucong@greptime.com>
2022-07-25 10:35:36 +08:00
Lei, Huang
c126b480fd doc: add openssl install instructions to README.md (#99)
* doc: add openssl install instructions to README.md

* remove newline
2022-07-20 14:03:58 +08:00
evenyag
18509bacfa docs: Add prerequisites part to readme (#94) 2022-07-19 19:01:08 +08:00
天空好像下雨~
267a47e9dd move interp from test to numpy (#88)
* move interp from test to numpy

* move interp from test to numpy

* move interp from test to numpy

* move interp from test to numpy

* move interp from test to numpy
2022-07-18 15:38:30 +08:00
天空好像下雨~
403b94c948 feat: add operator interp (#66)
* benchmark

* bench:add read/write for memtable

* numpy-interp

* fix cast

* implement tests

* implement tests

Co-authored-by: 张心怡 <zhangxinyi@zhangxinyideMacBook-Pro.local>
2022-07-15 10:49:44 +08:00
fys
ad020284d3 feat: define proto for InsertExpr (grpc) (#79)
* feat: implement InsertExpr

* 1.InsertExpr reverted to previous version 2.add InsertBatch message

* add two SemanticTypes: TAG, TIMESTAMP

* chore: format proto files

* chore: add some comments about "Column"

* fix: rename "semanticType" -> "semantic_type"

* fix: unique number in InsertBatch

* fix: type of f64_values

* chore: move insertbatch and column to insert.proto

* chore: rename "ExprHeader" to "Header"

* fix: ExprHeader not found in this scope
2022-07-14 16:36:56 +08:00
天空好像下雨~
8852c9bc32 bench: read/write for memtable (#52)
* benchmark

* fix style

Co-authored-by: 张心怡 <zhangxinyi@zhangxinyideMacBook-Pro.local>
2022-07-11 17:44:22 +08:00
Lei, Huang
65890e09f6 doc: contributing.md (#67) 2022-07-07 15:54:34 +08:00
Jiachun Feng
6cf1da35ee feat: add grpc impl (#50)
* feat: add grpc impl

* feat: add grpc server

* some ut

* version format: a.b

* code style

* admin request/response

* by cr

* admin api

* by cr

* chore: by cr

* chore: by cr
2022-07-06 20:56:16 +08:00
Lei, Huang
008f62afc1 feat: buffer abstraction (#51)
* feat: add buffer abstraction and rewrite entry encode/decode process

* add some tests

* remove pad.rst

* fix some comments

* fix comments

* remove mmap mod

* feat: Bytes type implementation switch to bytes::Bytes

* fix: use Bytes::from(String) and Bytes::from(Vec<u8>)

* feat: add new method to Entry trait
2022-07-04 14:08:23 +08:00
evenyag
11bf970efd feat: Implement TimestampMillis and RangeMillis (#56) 2022-06-29 20:55:27 +08:00
Lei, Huang
651bdbaa71 fix: log file test fail (#54)
* fix: log file test fail

* remove some log

* wip

* add log

* wip
2022-06-29 17:01:15 +08:00
dennis zhuang
bac6c720f8 feat: impl bytes_allocated for memtable (#55) 2022-06-28 15:11:04 +08:00
dennis zhuang
b567cfb9bc feat: memory size of vector (#53)
* feat: improve try_into_vector function

* feat: impl memory_size function for vectors

* fix: forgot memory_size assertion in null vector test

* feat: use LargeUtf8 instead of utf8 for string, and rename LargeBinaryArray to BinaryArray

* feat: memory_size only calculates heap size
2022-06-28 11:06:53 +08:00
dennis zhuang
379d2e2f50 feature: runtime crate and global runtimes (#49)
* feat: init common runtime crate

* feat: tokio Runtime wrapper and global runtime functions

* feat: adds block_on_read, block_on_write, block_on_bg functions to runtime

* refactor: panic when configure global runtimes which are already initialized

* refactor: adds read/write/bg thread pool size

* fix: code style

* fix: clippy warning

* fix: test_metric panic

* fix: address CR problems

* log: adds log when creating runtime
2022-06-21 16:09:15 +08:00
evenyag
6ec870625f refactor: Refactor usage of BoxedError (#48)
* feat: Define a general boxed error

* refactor: common_function use Error in common_query

* feat: Add tests to define_opaque_error macro

* refactor: Refactor table and table engine error

* refactor: recordbatch remove arrow dev-dependency

* refactor: datanode crate use common_error::BoxedError

* chore: Fix clippy

* feat: Returning source status code when using BoxedError

* test: Fix opaque error test

* test: Add tests for table::Error & table_engine::Error

* test: Add test for RecordBatch::new()

* test: Remove generated tests from define_opaque_error

* chore: Address cr comment
2022-06-21 15:24:45 +08:00
dennis zhuang
4071b0cff2 feat: impl scanning data from storage engine for table (#47)
* feat: impl scanning data from storage for MitoTable

* adds test mod to setup table engine test

* fix: comment error

* fix: boyan -> dennis in todo comments

* fix: remove unnecessary send in BatchIteratorPtr
2022-06-20 15:42:57 +08:00
evenyag
056185eb24 feat(storage): Implement snapshot scan for region (#46)
* feat: Maintain last sequence in VersionControl

* refactor(recordbatch): Replace `Arc<Schema>` by SchemaRef

* feat: Memtable support filter rows with invisible sequence

* feat: snapshot wip

* feat: Implement scan for SnapshotImpl

* test: Add a test that simply puts and scans a region

* chore: Fix clippy

* fix(memtable): Fix memtable returning duplicate keys

* test(memtable): Add sequence visibility test

* test: Add ValueType test

* chore: Address cr comments

* fix: Fix value is not storing but adding to committed sequence
2022-06-20 14:09:31 +08:00
dennis zhuang
e78c015fc0 TableEngine and SqlHandler impl (#45)
* Impl TableEngine, bridge to storage

* Impl sql handler to process insert sql

* fix: minor changes and typo

* test: add datanode test

* test: add table-engine test

* fix: code style

* refactor: split out insert mod from sql and minor changes by CR

* refactor: replace with_context with context
2022-06-17 11:36:49 +08:00
Lei, Huang
e03ac2fc2b Implement log store append and file set management (#43)
* add log store impl

* add some test

* delete failing test

* fix: concurrent close issue

* feat: use arcswap to replace unsafe AtomicPtr

* fix: use lock to protect rolling procedure.
fix: use try_recv to replace poll_recv on appender task.

* chores: 1. use direct tmp dir instead of creating TempDir instance; 2. inline some short function; 3. rename some structs; 4. optimize namespace to arc wrapper inner struct.
2022-06-16 19:09:09 +08:00
fengjiachun
725a261b55 feat(cmd): command refactor (#44)
* feat(cmd): command refactor
2022-06-15 20:08:00 +08:00
fengjiachun
633524709b Merge pull request #42 from GrepTimeTeam/feat/storage/memtable/iter
feat: Add BatchIterator trait and support iterating btree memtable
2022-06-14 17:21:18 +08:00
evenyag
7700a167f2 chores: Address CR comment 2022-06-14 16:53:12 +08:00
evenyag
268598eb57 test: Fix VectorBuilder test and add Value data type test 2022-06-10 17:31:59 +08:00
evenyag
46c5681cb0 chore: Fix clippy 2022-06-10 16:11:07 +08:00
evenyag
9697fbc5e4 test: Add MemtableTester and batch_size test 2022-06-10 15:37:18 +08:00
evenyag
7a55d988fb test: Add simple write/iter test for memtable 2022-06-10 11:53:27 +08:00
evenyag
727bdb8b86 feat: Add sequences and value_types to Batch 2022-06-09 17:30:17 +08:00
evenyag
69b39e7846 feat: Impl BatchIterator for btree memtable
feat: Impl MapIterWrapper

refactor: Rename RowKey to InnerKey
2022-06-09 17:13:02 +08:00
evenyag
4171173b76 feat: Support creating in memory region and writing to memtable (#40)
* chore(store-api): Fix typo in region comments

* feat(storage): Init storage crate

* feat(store-api): Make some method async

* feat(storage): Blank StorageEngine implementation

* feat(storage): StorageEngine returns owned SchemaRef

* feat: pub use arrow in datatypes

* feat(store-api): Implement RegionMetadata

* feat(storage): Impl create region in memory.

* chore(object-store): Format cargo toml

* chore(storage): Log on region created

* feat: Impl CowCell

* feat: Store id to cf meta mapping

* refactor: Refactor version and rename it to VersionControl

* feat: Impl write batch for put, refactor column family

* feat(storage): Skeleton of writing to memtable

* refactor(storage): MemTable returns MemTableSchema

* feat: Add ColumnSchema and conversion between schema and arrow's schema

* feat: Validate put data

* feat: Valid schema of write batch

* feat: insert memtable WIP

* feat: Impl Inserter for memtable

* feat(datatypes): Implement Eq/Ord for Value

feat: Implement Ord/Eq for Bytes/StringBytes and Deref for Bytes

test: Test Value::from()

* feat: Define BTreeMemTable

* Fix: Rename get/get_unchecked to try_get/get and fix get not consider null.

* feat: Impl BTreeMemTable::write()

* refactor: Remove useless ColumnFamilyHandle now

* chore: Clean comment

* feat(common): Add from `String/&str/Vec<u8>/&[u8]` for Value

* test(storage): Add tests for WriteBatch

* chore: Fix clippy

* feat: Add builder for RowKey/ColumnFamilyDescriptor

* test: Add test for metadata

* chore: Fix clippy

* test: Add test for region and engine

* chore: Fix clippy

* chore: Address CR comment
2022-06-09 16:50:02 +08:00
evenyag
8fe577649f feat: Constructing Bytes/StringBytes from Vec<u8>/&[u8]/String/&str (#41) 2022-06-08 14:25:25 +08:00
dennis zhuang
f7136819fc function crate and scalars function (#39)
* feat: adds scalars mod and enhance vectors

* temp commit

* fix compile error

* Impl pow function with new udf framework

* Adds common-function crate and impl scalars function

* fix: remove unused code

* test: adds test for function crate and refactor vectors

* fix: fmt style

* feat: impl numpy.clip function

* feat: improve clip function returning int64 type when arguments do not have float type

* feat: adds more test for vectors

* feat: adds replicate method test for primitive vector

* fix: by code review

* feat: clip function returns uint64 when all arguments type are unsigned

* refactor: improve vectors#only_null

* fix: clippy warning

* fix: clippy warning

* fix: clip should return float64 when arguments have both signed and unsigned types
2022-06-08 13:15:22 +08:00
evenyag
23f235524d feat: Implements validity() and null_count() for Vector (#38)
* feat: Add validity() to Vector

* test(datatypes): Add more tests and fix get_data() not returns None for null
2022-06-01 20:55:58 +08:00
Lei, Huang
fb0585229e refactor: Entry should be a trait (#37) 2022-05-26 11:30:50 +08:00
evenyag
383c55d39c ci: Only trigger ci on pull request (#36) 2022-05-25 10:54:43 +08:00
dennis zhuang
a2331366f6 feat: adds register_udf api to query engine and refactor datatypes (#34)
* feat: adds ColumnarValue and refactor vectors

* fix: ConcreteDataType compile error

* feat:adds udf/function mods

* feat: adds test for common_query crate

* feat: adds register_udf api to query engine

* feat: adds common_query::error test

* refactor: by CR comments

* refactor: adds impl_new_concret_type_functions! macro to reduce boilerplate codes

* fix: typo
2022-05-24 16:50:56 +08:00
Lei, Huang
06b592f00f feat: add WAL definitions (#35)
* feat: add WAL definitions

* rename and add some tests
2022-05-24 16:12:23 +08:00
evenyag
1594da337f feat(store-api): Prototype of storage engine api (#33) 2022-05-20 18:51:51 +08:00
Lei, Huang
e75a54b766 feat: impl From arrow array for existing vectors (#32)
* feat: impl From arrow array for existing vectors

* fix: review comments

* feat: clippy forbid prints
2022-05-19 16:10:00 +08:00
dennis zhuang
b0d2e2e91b feat: adds ConcretDataType and more datatypes impl (#31)
* feat: adds ConcretDataType and impl binary/boolean/null types and vectors

* feat: adds String to ConcretDataType

* docs:  ConcretDataType::from_arrow_type may panic
2022-05-19 11:21:11 +08:00
evenyag
5777732fde feat(store-api): Init store-api crate (#30) 2022-05-18 17:19:57 +08:00
Lei, Huang
519cbc832a feat: add StringVector datatype (#28) 2022-05-18 14:49:36 +08:00
Lei, Huang
bd4fe1f5bc feat: RecordBatch serialization (#26) 2022-05-17 17:01:00 +08:00
evenyag
3d374cce68 feat: implement log related macros (#29) 2022-05-17 16:00:17 +08:00