244 Commits

Author SHA1 Message Date
Yingwen
74862f8c3f feat(mito): Checks whether a region should flush periodically (#3459)
* feat: handle flush periodically

* chore: call periodical method in loop

* feat: check periodical tasks on channel timeout

* refactor: use time provider to get time

Mock a time provider to test auto flush

* chore: fix typos

* refactor: rename mock time provider

* style: fix clippy

* chore: address comment
2024-03-15 06:41:28 +00:00
Yingwen
8ca9e01455 feat: Partition memtables by time if compaction window is provided (#3501)
* feat: define time partitions

* feat: adapt time partitions to version

* feat: implement non write methods

* feat: add write one to memtable

* feat: implement write

* chore: fix warning

* fix: inner not set

* refactor: add collect_iter_timestamps

* test: test partitions

* chore: debug log

* chore: fix typos

* chore: log memtable id

* fix: empty check

* chore: log total parts

* chore: update comments
2024-03-14 11:13:01 +00:00
Yingwen
7c895e2605 perf: more benchmarks for memtables (#3491)
* chore: remove duplicate bench

* refactor: rename bench

* perf: add full scan bench for memtable

* feat: filter bench and add time series to bench group

* chore: comment

* refactor: rename

* style: fix clippy
2024-03-12 12:02:58 +00:00
Yingwen
9aa8f756ab fix: allow passing extra table options (#3484)
* fix: do not check options in parser

* test: fix tests

* test: fix sqlness

* test: add sqlness test

* chore: log options

* chore: must specify compaction type

* feat: validate option key

* feat: add option key validation back
2024-03-12 07:03:52 +00:00
Yingwen
06dcd0f6ed fix: freeze data buffer in shard (#3468)
* feat: call freeze if the active data buffer in a shard is full

* chore: more metrics

* chore: print metrics

* chore: enlarge freeze threshold

* test: test freeze

* test: fix config test
2024-03-11 14:51:06 +00:00
gcmutator
21ff3620be chore: remove repetitive words (#3469)
remove repetitive words

Signed-off-by: gcmutator <329964069@qq.com>
2024-03-09 04:18:47 +00:00
Yingwen
3ee53360ee perf: Reduce decode overhead during pruning keys in the memtable (#3415)
* feat: reuse value buf

* feat: skip values to decode

* feat: prune shard

chore: fix compiler errors

refactor: shard prune metrics

* fix: panic on DedupReader::try_new

* fix: prune after next

* chore: num parts metrics

* feat: metrics and logs

* chore: data build cost

* chore: more logs

* feat: cache skip result

* chore: todo

* fix: index out of bound

* test: test codec

* fix: invalid offsets

* fix: skip binary

* fix: offset buffer reuse

* chore: comment

* test: test memtable filter

* style: fix clippy

* chore: fix compiler error
2024-03-08 02:54:00 +00:00
Lei, HUANG
7183fa198c refactor: make MergeTreeMemtable the default choice (#3430)
* refactor: make MergeTreeMemtable the default choice

* refactor: reformat

* chore: add doc to config
2024-03-05 10:00:08 +00:00
Yingwen
49157868f9 feat: Correct server metrics and add more metrics for scan (#3426)
* feat: drop timer on stream terminated

* refactor: combine metrics into a histogram vec

* refactor: frontend grpc metrics

* feat: add metrics middleware layer to grpc server

* refactor: move http metrics layer to metrics mod

* feat: bucket for grpc/http elapsed

* feat: remove duplicate metrics

* style: fix clippy

* fix: incorrect bucket of promql series

* feat: more metrics for mito

* feat: convert cost

* test: fix metrics test
2024-03-04 10:15:10 +00:00
niebayes
7d30c2484b fix: mitigate memory spike during startup (#3418)
* fix: fix memory spike during startup

* fix: allocate a region write ctx for each wal entry
2024-03-01 07:46:05 +00:00
Lei, HUANG
376409b857 feat: employ sparse key encoding for shard lookup (#3410)
* feat: employ short key encoding for shard lookup

* fix: license

* chore: simplify code

* refactor: only enable sparse encoding to speed lookup on metric engine

* fix: names
2024-03-01 06:22:15 +00:00
Lei, HUANG
3413fc0781 refactor: move some costly methods in DataBuffer::read out of read lock (#3406)
* refactor: move some costly methods in DataBuffer::read out of read lock

* refactor: also replace ShardReader with ShardReaderBuilder
2024-02-28 12:22:44 +00:00
Lei, HUANG
a0a8e8c587 fix: some read metrics (#3404)
* fix: some read metrics

* chore: fix some metrics

* fix
2024-02-28 08:47:49 +00:00
Zhenchi
c3c80b92c8 feat(index): measure memory usage in global instead of single-column and add metrics (#3383)
* feat(index): measure memory usage in global instead of single-column and add metrics

* feat: add leading zeros to streamline memory usage

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: fmt

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: remove println

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-28 06:49:24 +00:00
Lei, HUANG
7942b8fae9 chore: add metrics for memtable read path (#3397)
* chore: add metrics for read path

* chore: add more metrics
2024-02-28 03:37:19 +00:00
Yingwen
b97f957489 feat: Use a partition level map to look up pk index (#3400)
* feat: partition level map

* test: test shard and builder

* fix: do not use pk index from shard builder

* feat: add multi key test

* fix: freeze shard before finding pk in shards
2024-02-28 03:17:09 +00:00
Lei, HUANG
492a00969d feat: enable zstd compression and encodings in merge tree data part (#3380)
* feat: enable zstd compression in merge tree data part to save memory

* feat: also enable customized column encoding in DataPartEncoder
2024-02-27 06:54:56 +00:00
Yingwen
206666bff6 feat: Implement partition eviction and only add value size to write buffer size (#3393)
* feat: track key bytes in dict

* chore: done allocating on finish

* feat: evict keys

* chore: do not add to write buffer

* chore: only count value bytes

* fix: reset key bytes

* feat: remove write buffer manager from shards

* feat: change dict size compute method

* chore: adjust dictionary size by os memory
2024-02-27 06:28:57 +00:00
Ruihang Xia
ce397ebcc6 feat: change how region id maps to region worker (#3384)
* feat: change how region id maps to region worker

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add overflow test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-02-26 08:42:29 +00:00
Yingwen
26011ed0b6 fix: resets dict builder keys counter and avoid unnecessary pruning (#3386)
* fix: dict builder resets num_keys on finish

* feat: skip empty shard and builder

* feat: avoid pruning if possible

Implementations:
- Apply all filters on the partition column
- If no filter to prune, skip decoding keys
2024-02-26 08:24:46 +00:00
Lei, HUANG
8087822ab2 refactor: change the receivers of merge tree components (#3378)
* refactor: change the receivers of Shard::read/DataBuffer::read/DataParts::read to &self instead of &mut self

* refactor: remove allow(dead_code) in merge tree
2024-02-26 06:50:55 +00:00
Yingwen
e481f073f5 feat: Implement dedup for the new memtable and expose the config (#3377)
* fix: KeyValues num_fields() is incorrect

* chore: fix warnings

* feat: support dedup

* feat: allow using the new memtable

* feat: serde default for config

* fix: resets pk index after finishing a dict
2024-02-25 13:06:01 +00:00
Lei, HUANG
606309f49a fix: remove unused imports in memtable_util.rs (#3376)
2024-02-25 09:23:28 +00:00
Yingwen
8059b95e37 feat: Implement iter for the new memtable (#3373)
* chore: read shard builder

* chore: reuse pk weights

* chore: prune key

* chore: shard reader wip

* refactor: shard builder DataBatch

* feat: merge shard readers

* feat: return shard id in shard readers

* feat: impl partition reader

* chore: impl partition read

* feat: impl iter tree

* chore: save last yield pk id

* style: fix clippy

* refactor: rename ShardReaderImpl to ShardReader

* chore: address CR comment
2024-02-25 07:42:16 +00:00
Lei, HUANG
afe4633320 feat: merge tree dedup reader (#3375)
* feat: add dedup option to merge tree component

* feat: impl dedup reader for shard reader

* refactor: DedupReader::new to DedupReader::try_new

* refactor: remove DedupReader::current_key field

* fix: some cr comments

* fix: fmt

* fix: remove shard_id method from DedupSource
2024-02-24 13:50:49 +00:00
Yingwen
abbfd23d4b feat: Add freeze and fork method to the memtable (#3374)
* feat: add fork method to the memtable

* feat: allow mark immutable returns result

* feat: use fork to create the mutable memtable

* feat: remove memtable builder from freeze

* chore: warnings

* fix: inspect error

* feat: iter returns result

* chore: maintains memtable id in region

* chore: update comment

* fix: remove region status if failed to freeze a memtable

* chore: update comment

* chore: iter should not require sync

* chore: implement freeze and fork for the new memtable
2024-02-24 12:11:16 +00:00
Yingwen
1df64f294b refactor: Remove Item from merger's Node trait (#3371)
* refactor: data reader returns reference to data batch

* refactor: use range to create merger

* chore: Reference RecordBatch in DataBatch

* fix: top node not read if no next node

* refactor: move timestamp_array_to_i64_slice to data mod

* style: fix clippy

* chore: derive copy for DataBatch

* chore: address CR comments
2024-02-24 07:19:48 +00:00
Lei, HUANG
1f1d1b4f57 feat: distinguish between different read paths (#3369)
* feat: distinguish between different read paths

* fix: reformat code
2024-02-23 12:40:39 +00:00
Yingwen
b144836935 feat: Implement write and fork for the new memtable (#3357)
* feat: write to a shard or a shard builder

* feat: freeze and fork for partition and shards

* chore: shard builder

* chore: change dict reader to support random access

* test: test write shard

* test: test write

* test: test memtable

* feat: add new and write_row to DataParts

* refactor: partition freeze shards

* refactor: write_with_pk_id

* style: fix clippy

* chore: add methods to get pk weights

* chore: fix compiler errors
2024-02-23 07:20:55 +00:00
Lei, HUANG
90e9b69035 feat: impl merge reader for DataParts (#3361)
* feat: impl merge reader for DataParts

* fix: fmt

* fix: sort rows with pk and ts according to sequence desc

* fix: remove pk weight as pk indices are already replaced by weights

* fix: format

* fix: some cr comments

* fix: some cr comments

* refactor: simplify trait's associated types

* fix: some cr comments
2024-02-23 06:07:55 +00:00
Ruihang Xia
7341f23019 feat: skip filling NULL for put and delete requests (#3364)
* feat: optimize for sparse data

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove old structures

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-02-22 14:30:43 +00:00
SteveLauC
e9a2b0a9ee chore: use workspace-wide lints (#3352)
* chore: use workspace-wide lints

* respond to review
2024-02-22 01:01:10 +00:00
Yingwen
7c88d721c2 Merge pull request #3348
* feat: define functions for partitions

* feat: write partitions

* feat: fork and freeze partition

* feat: create iter by partition

* style: fix clippy

* chore: typos

* feat: add scan method to builder

* feat: check whether the builder should freeze first
2024-02-21 20:50:34 +08:00
Lei, HUANG
90169c868d feat: merge tree data parts (#3346)
* feat: add iter method for DataPart

* chore: rename iter to reader

* chore: some doc

* fix: resolve some comments

* fix: remove metadata in DataPart
2024-02-21 11:37:29 +00:00
Lei, HUANG
86a98c80f5 feat: replace pk index with pk_weight during freeze (#3343)
* feat: replace pk index with pk_weight during freeze

* chore: add parameter to control pk_index replacement

* fix: dedup pk weights also

* fix: generate pk array before dedup
2024-02-21 08:05:25 +00:00
Yingwen
f087a843bb feat: Implement KeyDictBuilder for the merge tree memtable (#3334)
* feat: dict builder

* feat: write and scan dict builder

* chore: address CR comments
2024-02-20 15:39:17 +00:00
Lei, HUANG
450dfe324d feat: data buffer and related structs (#3329)
* feat: data buffer and related structs

* fix: some cr comments

* chore: remove freeze_threshold in DataBuffer

* fix: use LazyMutableVectorBuilder instead of two vector; add option to control dedup

* fix: dedup rows according to both pk weights and timestamps

* fix: assemble DataBatch on demand
2024-02-20 09:22:45 +00:00
Yingwen
43fd87e051 feat: Defines structs in the merge tree memtable (#3326)
* chore: define mods

* feat: memtable struct

* feat: define structs inside the tree
2024-02-19 11:43:19 +00:00
Zhenchi
40f43de27d fix(index): encode string type to original data to enable fst regex to work (#3324)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-19 10:52:19 +00:00
Zhenchi
4810c91a64 refactor(index): move option segment_row_count from WriteOptions to IndexOptions (#3307)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-19 08:03:41 +00:00
Ruihang Xia
72cd443ba3 feat: organize tracing on query path (#3310)
* feat: organize tracing on query path

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* wrap JSON conversion into TracingContext's methods

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unnecessary .trace()

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Update src/query/src/dist_plan/merge_scan.rs

Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-18 15:04:57 +00:00
tison
4e04a4e48f build: support build without git (#3309)
* build: support build without git

Signed-off-by: tison <wander4096@gmail.com>

* chore

Signed-off-by: tison <wander4096@gmail.com>

* address comment

Signed-off-by: tison <wander4096@gmail.com>

* fix syntax

Signed-off-by: tison <wander4096@gmail.com>

---------

Signed-off-by: tison <wander4096@gmail.com>
2024-02-18 10:30:01 +00:00
Zhenchi
f9ce2708d3 feat(mito): add options to ignore building index for specific column ids (#3295)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-16 08:50:41 +00:00
Zhenchi
34050ea8b5 fix(index): sanitize S3 upload buffer size (#3300)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-16 06:45:31 +00:00
Zhenchi
141ed51dcc feat(mito): adjust seg size of inverted index to finer granularity instead of row group level (#3289)
* feat(mito): adjust seg size of inverted index to finer granularity instead of row group level

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: wrong metric

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: more suitable name

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: BitVec instead

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-07 08:20:00 +00:00
Zhenchi
dbf62f3273 chore(index): add BiError to fulfil the requirement of returning two errors (#3291)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-06 16:03:03 +00:00
Ruihang Xia
51feec2579 feat: use simple filter to prune memtable (#3269)
* switch on clippy warnings

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* feat: use simple filter to prune memtable

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove dead code

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* refine util function

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-02-04 11:35:55 +00:00
LFC
e375060b73 refactor: add same SST files (#3270)
* Make it possible to add the same SST file multiple times instead of panicking.

* Update src/mito2/src/sst/version.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
2024-01-31 07:21:30 +00:00
Ruihang Xia
a079955d38 chore: adjust storage engine related metrics (#3261)
* chore: adjust metrics to metric engine and mito engine

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* adjust more mito bucket

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix compile

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-01-30 06:43:03 +00:00
Ruihang Xia
9a28a1eb5e fix: decouple columns in projection and prune (#3253)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-01-29 08:29:21 +00:00