Commit Graph

21 Commits

Author SHA1 Message Date
Lei, HUANG
2df8143ad5 feat: support table ttl (#1052)
* feat: purge expired sst on compaction

* chore: add more log

* fix: clippy

* fix: mark expired ssts as compacting before picking candidates

* fix: some CR comments

* fix: remove useless result

* fix: cr comments
2023-02-22 16:56:20 +08:00
Lei, HUANG
e17d5a1c41 feat: support table options (#1044)
* feat: change table options from string map to a struct, add ttl and write_buffer_size

* fix: also pass table options to table meta

* feat: pass table options when opening/creating regions

* fix: CR comments
2023-02-21 08:10:23 +00:00
Lei, HUANG
af1f8d6101 feat: file purger (#1030)
* wip

* wip

* feat: file purger

* chore: add tests

* feat: delete removed file on sst merge

* chore: move MockAccessLayer to test_util

* fix: some cr comments

* feat: add await termination for scheduler

* fix: some cr comments

* chore: rename max_file_in_level0 to max_files_in_level0
2023-02-19 14:56:41 +08:00
Weny Xu
2f39a77137 feat: add close method for the region trait (#970)
feat: add close for region trait
2023-02-17 11:32:55 +08:00
Lei, HUANG
75b8afe043 feat: compaction integration (#997)
* feat: trigger compaction on flush

* chore: rebase develop

* feat: add config item max_file_in_level0 and remove compaction_after_flush

* fix: cr comments

* chore: add unit test to cover Timestamp::new_inclusive

* fix: workaround to fix future is not Sync

* fix: future is not sync

* fix: some cr comments
2023-02-15 14:14:07 +08:00
Lei, HUANG
8ffc078f88 fix: license header (#815) 2023-01-03 15:09:49 +08:00
LFC
ea9af42091 chore: upgrade Rust to nightly 2022-12-20 (#772)
* chore: upgrade Rust to nightly 2022-12-20

* chore: upgrade Rust to nightly 2022-12-20

Co-authored-by: luofucong <luofucong@greptime.com>
2022-12-21 19:32:30 +08:00
Ruihang Xia
7ba512980a chore: add APACHE-2.0 license header (#518)
* feat: add license checker workflow

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix existing header

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* specify license for internal sub-crate

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix rustfmt

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 18:05:46 +08:00
Ruihang Xia
1565c8d236 chore: specify import style in rustfmt (#460)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-15 15:58:54 +08:00
Ruihang Xia
e30879f638 feat: Remove memtable's time bucket (#442)
* refactor: partially replace MemtableSet with Memtable

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove MemtableWithMeta and MemtableSet in non-test mod

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove dead code

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* make test compile 🤣

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix broken tests

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* make all tests pass

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippys

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove redundant clone

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update comment

Co-authored-by: Yingwen <realevenyag@gmail.com>

* resolve review comment

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-11-11 18:02:34 +08:00
Ruihang Xia
ae147c2a74 chore: refine some unnecessary log (#410)
* remove some unnecessary informations in log

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* further cleaning

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2022-11-07 16:36:27 +08:00
Ning Sun
f243649971 refactor: Removed openssl from build requirement (#308)
* refactor:replace another axum-test-helper branch

* refactor: upgrade opendal  version

* refactor: use cursor for file buffer

* refactor:remove native-tls in mysql_async

* refactor: use async block and pipeline for newer opendal api

* chore: update Cargo.lock

* chore: update dependencies

* docs: removed openssl from build requirement

* fix: call close on pipe writer to flush reader for parquet streamer

* refactor: remove redundant return

* chore: use pinned revision for our forked mysql_async

* style: avoid wild-card import in test code

* Apply suggestions from code review

Co-authored-by: Yingwen <realevenyag@gmail.com>

* style: use chained call for builder

Co-authored-by: liangxingjian <965662709@qq.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2022-10-17 19:29:17 +08:00
dennis zhuang
41ffbe82f8 feat: impl table manifest (#157)
* feat: impl TableManifest and refactor table engine, object store etc.

* feat: persist table metadata when creating it

* fix: remove unused file src/storage/src/manifest/impl.rs

* feat: impl recover table info from manifest

* test: add open table test and table manifest test

* fix: resolve CR problems

* fix: compile error and remove region id

* doc: describe parent_dir

* fix: address CR problems

* fix: typo

* Revert "fix: compile error and remove region id"

This reverts commit c14c250f8a.

* fix: compile error and generate region id by table_id and region number
2022-08-12 10:47:33 +08:00
Lei, Huang
80372720bb refactor: open_region return None if region does not exist (#145)
* refactor: open_region return None if region does not exist

* fix some unit tests

* fix some CR comments
2022-08-08 16:53:52 +08:00
evenyag
f98d406580 refactor(storage): Add region id and name to metadata (#140)
* refactor(storage): Add region id and name to metadata

Add region id and name to `RegionMetadata`, simplify input arguments of
`RegionImpl::create()` and `RegionImpl::new()` method, since id and name
are already in metadata/version.

To avoid an atomic load of `Version` each time we access the region
id/name, we still store a copy of id/name in `SharedData`.

* chore: Remove todo in OpenOptions

Create region if missing when opening the region would be hard to
implement, since sometimes we may don't known the exact region schema user
would like to have.

* refactor: Make id and name of region readonly

By making `id` and `name` fields of `SharedData` and `RegionMetadata`
private and only exposing a pub getter.
2022-08-08 16:46:51 +08:00
evenyag
56fae412d2 feat: Implements replay (#135)
* feat: Implements RegionWriter::replay()

Refactors `preprocess_write()`, wraps time ranges calculation and
memtable creation to `prepare_memtables()` so these logic can be reused
by `WriterInner::replay()`. Then implements `WriterInner::replay()`
which reads write batch from wal and inserts it into memtables.

* feat: Use sequence in request as committed sequence

Also checks that sequence should increase monotonically and returns
error if found sequence decreases

* chore: Remove OpenOptions param from RegionWriter::replay

* test: Add region reopen tests

refactor(storage): Rename read_write test mod to basic

refactor(storage): Move common region test logic to TesterBase

Let read/write Tester and flush Tester share the same TesterBase struct,
which implements common operations like put/full_scan.

* feat: Constructs RegionImpl in open()

Constructs RegionImpl after replay in `RegionImpl::open()`

* feat: Adds RegionImpl::create()

Adds `RegionImpl::create()` method to persist region metadata to
manifest, then create the RegionImpl instance, so the storage engine
just invoke `RegionImpl::create()` instead of `RegionImpl::new()` to
create the region instance, and don't need to update manifest after
creating region instance anymore. Now `RegionImpl::new()` need to takes
version instead of metadata as input.

This change is also a necessary part to pass the region open test, since
to open a region,  need to persist something to manifest first.

* feat: Pass region open test

Use LocalFileLogStore for region test since NoopLogStore won't persist
data to the file system.

Create dir in `LocalFileLogStore::open` if it is not exist, so we don't
need to create the dir before using the logstore.

To pass the test, we always recover from flushed_sequence and use
`req_sequence + 1` as last sequence.

* test: Test reopen region multiple times

* chore: Address CR comments

Add more info to replay log and add an assert to check committed
sequence after reopen.

* refactor: Add cfg(test) to Version::new()

Remove `VersionControl::new()`, and add `#[cfg(test)]` to
`Version::new()` as it is only used by tests.
2022-08-04 17:00:01 +08:00
evenyag
f06968f4f5 feat: Engine::open_region code skeleton (#120)
* refactor: Move fields in SharedData to EngineInner

Since `SharedData` isn't shared now, we move all its fields to
EngineInner, and remove the `SharedData` struct, also remove the
unused config field.

* feat: Store RegionSlot in engine's region map

A `RegionSlot` has three possible state:
- Opening
- Creating
- Ready (Holds the `RegionImpl`)

Also use the `RegionSlot` as a placeholder in the region map to indicate
the region is opening/creating, so another open/create request will
fail immediately. The `SlotGuard` is used to clean the slot if we failed
to create/open the region.

* feat: Add a blank method `RegionImpl::open`

* feat: Remove MetadataId from Manifest

Now metadata id of manifest is unused, also unnecessary as we have
manifest dir to build the manifest, but constructing the manifest
still needs a passing region id as argument, which is unavailable
during opening region. So we remove the metadata id from manifest so
`region_store_config()` don't need region id as input anymore

* feat: Remove region id from logstore::Namespace and Wal

This is necessary for implementing open, since we don't have region
id this time, but we need to build Wal and its logstore namespace. Now
this is ok as id is not actually used by logstore.

* feat: Setup `open_region` code skeleton
2022-07-29 17:52:33 +08:00
Ning Sun
f81dfc9bed feat: add fmt::Debug for RegionImpl 2022-07-27 15:04:51 +08:00
evenyag
c9db093af7 feat: Cherry picks lost commits of flush (#111)
* fix: Fix write stall blocks flush applying version

refactor: Use store config to help constructing Region

chore: Address CR comments

* feat: adds manifest protocol supporting and refactor region metadata protocol

feat: ignore sqlparser log

refactor: PREV_VERSION_KEY constant

refactor: minor change for checking readable/writable

fix: address CR problems

refactor: use binary literal

Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
2022-07-26 15:52:39 +08:00
evenyag
bf5975ca3e feat: Prototype of the storage engine (#107)
* feat: memtable flush (#63)

* wip: memtable flush

* optimize schema conversion

* remove unnecessary import

* add parquet file verfication

* add backtrace to error

* chore: upgrade opendal to 0.9 and fixed some problems

* rename error

* fix: error description

Co-authored-by: Dennis Zhuang <killme2008@gmail.com>

* feat: region manifest service (#57)

* feat: adds Manifest API

* feat: impl region manifest service

* refactor: by CR comments

* fix: storage error mod test

* fix: tweak storage cargo

* fix: tweak storage cargo

* refactor: by CR comments

* refactor: rename current_version

* feat: add wal writer (#60)

* feat: add Wal

* upgrade engine for wal

* fix: unit test for wal

* feat: wal into region

* fix: unix test

* fix clippy

* chore: by cr

* chore: by cr

* chore: prevent test data polution

* chore: by cr

* minor fix

* chore: by cr

* feat: Implement flush (#65)

* feat: Flush framework

- feat: Add id to memtable
- refactor: Rename MemtableSet/MutableMemtables to MemtableVersion/MemtableSet
- feat: Freeze memtable
- feat: Trigger flush
- feat: Background job pool
- feat: flush job
- feat: Sst access layer
- feat: Custom Deserialize for StringBytes
- feat: Use RegionWriter to apply file metas
- feat: Apply version edit
- chore: Remove unused imports

refactor: Use ParquetWriter to replace FlushTask

refactor: FsAccessLayer takes object store as param

chore: Remove todo from doc comments

feat: Move wal to WriterContext

chore: Fix clippy

chore: Add backtrace to WriteWal error

* feat: adds manifest to region and refactor sst/manifest dir config (#72)

* feat: adds manifest to region and refactor sst/manifest dir with EngineConfig

* refactor: ensure path ends with '/' in ManifestLogStorage

* fix: style

* refactor: normalize storage directory path and minor changes by CR

* refactor: doesn't need slash any more

* feat: Implement apply_edit() and add timestamp index to schema (#73)

* feat: Implement VersionControl::apply_edit()

* feat: Add timestamp index to schema

* feat: Implement Schema::timestamp_column()

* feat: persist region metadata to manifest (#74)

* feat: persist metadata when creating region or sst files

* fix: revert FileMeta comment

* feat: resolve todo

* fix: clippy warning

* fix: revert files_to_remove type in RegionEdit

* feat: impl SizeBasedStrategy for flush (#76)

* feat: impl SizeBasedStrategy for flush

* doc: get_mutable_limitation

* fix: code style and comment

* feat: align timestamp (#75)

* feat: align timestamps in write batch

* fix cr comments

* fix timestamp overflow

* simplify overflow check

* fix cr comments

* fix clippy issues

* test: Fix region tests (comment out some unsupported tests) (#82)

* feat: flush job (#80)

* feat: flush job

* fix cr comments

* move file name instead of clone

* comment log file test (#84)

* feat: improve MemtableVersion (#78)

* feat: improve MemtableVersion

* feat: remove flushed immutable memtables and test MemtableVersion

* refactor: by CR comments

* refactor: clone kv in iterator

* fix: clippy warning

* refactor: Make BatchIterator supertrait of Iterator (#85)

* refactor: rename Version to ManifestVersion and move out manifest from ShareData (#83)

* feat: Insert multiple memtables by time range (#77)

* feat: memtable::Inserter supports insert multiple memtables by time range

* chore: Update timestamp comment

* test: Add tests for Inserter

* test: Fix region tests (comment out some unsupported tests)

* refactor: align_timestamp() use TimestampMillis::aligned_by_bucket()

* chore: rename aligned_by_bucket to align_by_bucket

* fix: Fix compile errors

* fix: sst and manifest dir (#86)

* Set RowKeyDescriptor::enable_version_column to false by default

* feat: Implement write stall (#90)

* feat: Implement write stall

* chore: Update comments

* feat: Support reading multiple memtables (#93)

* feat: Support reading multiple memtables

* test: uncomment tests rely on snapshot read

* feat: wal format (#70)

* feat: wal codec

* chore: minor fix

* chore: comment

* chore: by cr

* chore: write_batch_codec mod

* chore: by cr

* chore: upgrade proto

* chore: by cr

* fix failing test

* fix failing test

* feat: manifest to wal (#100)

* feat: write manifest to wal

* chore: sequence into wal

* chore: by cr

* chore: by cr

* refactor: create log store (#104)

Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: fariygirl <clickmetoday@163.com>
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>

* chore: Fix clippy

Co-authored-by: Lei, Huang <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
Co-authored-by: fariygirl <clickmetoday@163.com>
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
2022-07-25 15:26:00 +08:00
evenyag
4171173b76 feat: Support creating in memory region and writing to memtable (#40)
* chore(store-api): Fix typo in region comments

* feat(storage): Init storage crate

* feat(store-api): Make some method async

* feat(storage): Blank StorageEngine implementation

* feat(storage): StorageEngine returns owned SchemaRef

* feat: pub use arrow in datatypes

* feat(store-api): Implement RegionMetadata

* feat(storage): Impl create region in memory.

* chore(object-store): Format cargo toml

* chore(storage): Log on region created

* feat: Impl CowCell

* feat: Store id to cf meta mapping

* refactor: Refactor version and rename it to VersionControl

* feat: Impl write batch for put, refactor column family

* feat(storage): Skeleton of writing to memtable

* refactor(storage): MemTable returns MemTableSchema

* feat: Add ColumnSchema and conversion between schema and arrow's schema

* feat: Validate put data

* feat: Valid schema of write batch

* feat: insert memtable WIP

* feat: Impl Inserter for memtable

* feat(datatypes): Implement Eq/Ord for Value

feat: Implement Ord/Eq for Bytes/StringBytes and Deref for Bytes

test: Test Value::from()

* feat: Define BTreeMemTable

* Fix: Rename get/get_unchecked to try_get/get and fix get not consider null.

* feat: Impl BTreeMemTable::write()

* refactor: Remove useless ColumnFamilyHandle now

* chore: Clean comment

* feat(common): Add from `String/&str/Vec<u8>/&[u8]` for Value

* test(storage): Add tests for WriteBatch

* chore: Fix clippy

* feat: Add builder for RowKey/ColumnFamilyDescriptor

* test: Add test for metadata

* chore: Fix clippy

* test: Add test for region and engine

* chore: Fix clippy

* chore: Address CR comment
2022-06-09 16:50:02 +08:00