Yingwen
74862f8c3f
feat(mito): Checks whether a region should flush periodically ( #3459 )
...
* feat: handle flush periodically
* chore: call periodical method in loop
* feat: check periodical tasks on channel timeout
* refactor: use time provider to get time
Mock a time provider to test auto flush
* chore: fix typos
* refactor: rename mock time provider
* style: fix clippy
* chore: address comment
2024-03-15 06:41:28 +00:00
Yingwen
8ca9e01455
feat: Partition memtables by time if compaction window is provided ( #3501 )
...
* feat: define time partitions
* feat: adapt time partitions to version
* feat: implement non write methods
* feat: add write one to memtable
* feat: implement write
* chore: fix warning
* fix: inner not set
* refactor: add collect_iter_timestamps
* test: test partitions
* chore: debug log
* chore: fix typos
* chore: log memtable id
* fix: empty check
* chore: log total parts
* chore: update comments
2024-03-14 11:13:01 +00:00
Yingwen
7c895e2605
perf: more benchmarks for memtables ( #3491 )
...
* chore: remove duplicate bench
* refactor: rename bench
* perf: add full scan bench for memtable
* feat: filter bench and add time series to bench group
* chore: comment
* refactor: rename
* style: fix clippy
2024-03-12 12:02:58 +00:00
Yingwen
9aa8f756ab
fix: allow passing extra table options ( #3484 )
...
* fix: do not check options in parser
* test: fix tests
* test: fix sqlness
* test: add sqlness test
* chore: log options
* chore: must specify compaction type
* feat: validate option key
* feat: add option key validation back
2024-03-12 07:03:52 +00:00
Yingwen
06dcd0f6ed
fix: freeze data buffer in shard ( #3468 )
...
* feat: call freeze if the active data buffer in a shard is full
* chore: more metrics
* chore: print metrics
* chore: enlarge freeze threshold
* test: test freeze
* test: fix config test
2024-03-11 14:51:06 +00:00
gcmutator
21ff3620be
chore: remove repetitive words ( #3469 )
...
remove repetitive words
Signed-off-by: gcmutator <329964069@qq.com >
2024-03-09 04:18:47 +00:00
Yingwen
3ee53360ee
perf: Reduce decode overhead during pruning keys in the memtable ( #3415 )
...
* feat: reuse value buf
* feat: skip values to decode
* feat: prune shard
chore: fix compiler errors
refactor: shard prune metrics
* fix: panic on DedupReader::try_new
* fix: prune after next
* chore: num parts metrics
* feat: metrics and logs
* chore: data build cost
* chore: more logs
* feat: cache skip result
* chore: todo
* fix: index out of bound
* test: test codec
* fix: invalid offsets
* fix: skip binary
* fix: offset buffer reuse
* chore: comment
* test: test memtable filter
* style: fix clippy
* chore: fix compiler error
2024-03-08 02:54:00 +00:00
Lei, HUANG
7183fa198c
refactor: make MergeTreeMemtable the default choice ( #3430 )
...
* refactor: make MergeTreeMemtable the default choice
* refactor: reformat
* chore: add doc to config
2024-03-05 10:00:08 +00:00
Yingwen
49157868f9
feat: Correct server metrics and add more metrics for scan ( #3426 )
...
* feat: drop timer on stream terminated
* refactor: combine metrics into a histogram vec
* refactor: frontend grpc metrics
* feat: add metrics middleware layer to grpc server
* refactor: move http metrics layer to metrics mod
* feat: bucket for grpc/http elapsed
* feat: remove duplicate metrics
* style: fix clippy
* fix: incorrect bucket of promql series
* feat: more metrics for mito
* feat: convert cost
* test: fix metrics test
2024-03-04 10:15:10 +00:00
niebayes
7d30c2484b
fix: mitigate memory spike during startup ( #3418 )
...
* fix: fix memory spike during startup
* fix: allocate a region write ctx for each wal entry
2024-03-01 07:46:05 +00:00
Lei, HUANG
376409b857
feat: employ sparse key encoding for shard lookup ( #3410 )
...
* feat: employ short key encoding for shard lookup
* fix: license
* chore: simplify code
* refactor: only enable sparse encoding to speed lookup on metric engine
* fix: names
2024-03-01 06:22:15 +00:00
Lei, HUANG
3413fc0781
refactor: move some costly methods in DataBuffer::read out of read lock ( #3406 )
...
* refactor: move some costly methods in DataBuffer::read out of read lock
* refactor: also replace ShardReader with ShardReaderBuilder
2024-02-28 12:22:44 +00:00
Lei, HUANG
a0a8e8c587
fix: some read metrics ( #3404 )
...
* fix: some read metrics
* chore: fix some metrics
* fix
2024-02-28 08:47:49 +00:00
Zhenchi
c3c80b92c8
feat(index): measure memory usage in global instead of single-column and add metrics ( #3383 )
...
* feat(index): measure memory usage in global instead of single-column and add metrics
* feat: add leading zeros to streamline memory usage
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: fmt
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: remove println
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-28 06:49:24 +00:00
Lei, HUANG
7942b8fae9
chore: add metrics for memtable read path ( #3397 )
...
* chore: add metrics for read path
* chore: add more metrics
2024-02-28 03:37:19 +00:00
Yingwen
b97f957489
feat: Use a partition level map to look up pk index ( #3400 )
...
* feat: partition level map
* test: test shard and builder
* fix: do not use pk index from shard builder
* feat: add multi key test
* fix: freeze shard before finding pk in shards
2024-02-28 03:17:09 +00:00
Lei, HUANG
492a00969d
feat: enable zstd compression and encodings in merge tree data part ( #3380 )
...
* feat: enable zstd compression in merge tree data part to save memory
* feat: also enable customized column encoding in DataPartEncoder
2024-02-27 06:54:56 +00:00
Yingwen
206666bff6
feat: Implement partition eviction and only add value size to write buffer size ( #3393 )
...
* feat: track key bytes in dict
* chore: done allocating on finish
* feat: evict keys
* chore: do not add to write buffer
* chore: only count value bytes
* fix: reset key bytes
* feat: remove write buffer manager from shards
* feat: change dict size compute method
* chore: adjust dictionary size by os memory
2024-02-27 06:28:57 +00:00
Ruihang Xia
ce397ebcc6
feat: change how region id maps to region worker ( #3384 )
...
* feat: change how region id maps to region worker
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add overflow test
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-02-26 08:42:29 +00:00
Yingwen
26011ed0b6
fix: resets dict builder keys counter and avoid unnecessary pruning ( #3386 )
...
* fix: dict builder resets num_keys on finish
* feat: skip empty shard and builder
* feat: avoid pruning if possible
Implementations:
- Apply all filters on the partition column
- If no filter to prune, skip decoding keys
2024-02-26 08:24:46 +00:00
Lei, HUANG
8087822ab2
refactor: change the receivers of merge tree components ( #3378 )
...
* refactor: change the receivers of Shard::read/DataBuffer::read/DataParts::read to &self instead of &mut self
* refactor: remove allow(dead_code) in merge tree
2024-02-26 06:50:55 +00:00
Yingwen
e481f073f5
feat: Implement dedup for the new memtable and expose the config ( #3377 )
...
* fix: KeyValues num_fields() is incorrect
* chore: fix warnings
* feat: support dedup
* feat: allow using the new memtable
* feat: serde default for config
* fix: resets pk index after finishing a dict
2024-02-25 13:06:01 +00:00
Lei, HUANG
606309f49a
fix: remove unused imports in memtable_util.rs ( #3376 )
2024-02-25 09:23:28 +00:00
Yingwen
8059b95e37
feat: Implement iter for the new memtable ( #3373 )
...
* chore: read shard builder
* chore: reuse pk weights
* chore: prune key
* chore: shard reader wip
* refactor: shard builder DataBatch
* feat: merge shard readers
* feat: return shard id in shard readers
* feat: impl partition reader
* chore: impl partition read
* feat: impl iter tree
* chore: save last yield pk id
* style: fix clippy
* refactor: rename ShardReaderImpl to ShardReader
* chore: address CR comment
2024-02-25 07:42:16 +00:00
Lei, HUANG
afe4633320
feat: merge tree dedup reader ( #3375 )
...
* feat: add dedup option to merge tree component
* feat: impl dedup reader for shard reader
* refactor: DedupReader::new to DedupReader::try_new
* refactor: remove DedupReader::current_key field
* fix: some cr comments
* fix: fmt
* fix: remove shard_id method from DedupSource
2024-02-24 13:50:49 +00:00
Yingwen
abbfd23d4b
feat: Add freeze and fork method to the memtable ( #3374 )
...
* feat: add fork method to the memtable
* feat: allow mark immutable returns result
* feat: use fork to create the mutable memtable
* feat: remove memtable builder from freeze
* chore: warnings
* fix: inspect error
* feat: iter returns result
* chore: maintains memtable id in region
* chore: update comment
* fix: remove region status if failed to freeze a memtable
* chore: update comment
* chore: iter should not require sync
* chore: implement freeze and fork for the new memtable
2024-02-24 12:11:16 +00:00
Yingwen
1df64f294b
refactor: Remove Item from merger's Node trait ( #3371 )
...
* refactor: data reader returns reference to data batch
* refactor: use range to create merger
* chore: Reference RecordBatch in DataBatch
* fix: top node not read if no next node
* refactor: move timestamp_array_to_i64_slice to data mod
* style: fix clippy
* chore: derive copy for DataBatch
* chore: address CR comments
2024-02-24 07:19:48 +00:00
Lei, HUANG
1f1d1b4f57
feat: distinguish between different read paths ( #3369 )
...
* feat: distinguish between different read paths
* fix: reformat code
2024-02-23 12:40:39 +00:00
Yingwen
b144836935
feat: Implement write and fork for the new memtable ( #3357 )
...
* feat: write to a shard or a shard builder
* feat: freeze and fork for partition and shards
* chore: shard builder
* chore: change dict reader to support random access
* test: test write shard
* test: test write
* test: test memtable
* feat: add new and write_row to DataParts
* refactor: partition freeze shards
* refactor: write_with_pk_id
* style: fix clippy
* chore: add methods to get pk weights
* chore: fix compiler errors
2024-02-23 07:20:55 +00:00
Lei, HUANG
90e9b69035
feat: impl merge reader for DataParts ( #3361 )
...
* feat: impl merge reader for DataParts
* fix: fmt
* fix: sort rows with pk and ts according to sequence desc
* fix: remove pk weight as pk indices are already replaced by weights
* fix: format
* fix: some cr comments
* fix: some cr comments
* refactor: simplify trait's associated types
* fix: some cr comments
2024-02-23 06:07:55 +00:00
Ruihang Xia
7341f23019
feat: skip filling NULL for put and delete requests ( #3364 )
...
* feat: optimize for sparse data
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* remove old structures
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-02-22 14:30:43 +00:00
SteveLauC
e9a2b0a9ee
chore: use workspace-wide lints ( #3352 )
...
* chore: use workspace-wide lints
* respond to review
2024-02-22 01:01:10 +00:00
Yingwen
7c88d721c2
Merge pull request #3348
...
* feat: define functions for partitions
* feat: write partitions
* feat: fork and freeze partition
* feat: create iter by partition
* style: fix clippy
* chore: typos
* feat: add scan method to builder
* feat: check whether the builder should freeze first
2024-02-21 20:50:34 +08:00
Lei, HUANG
90169c868d
feat: merge tree data parts ( #3346 )
...
* feat: add iter method for DataPart
* chore: rename iter to reader
* chore: some doc
* fix: resolve some comments
* fix: remove metadata in DataPart
2024-02-21 11:37:29 +00:00
Lei, HUANG
86a98c80f5
feat: replace pk index with pk_weight during freeze ( #3343 )
...
* feat: replace pk index with pk_weight during freeze
* chore: add parameter to control pk_index replacement
* fix: dedup pk weights also
* fix: generate pk array before dedup
2024-02-21 08:05:25 +00:00
Yingwen
f087a843bb
feat: Implement KeyDictBuilder for the merge tree memtable ( #3334 )
...
* feat: dict builder
* feat: write and scan dict builder
* chore: address CR comments
2024-02-20 15:39:17 +00:00
Lei, HUANG
450dfe324d
feat: data buffer and related structs ( #3329 )
...
* feat: data buffer and related structs
* fix: some cr comments
* chore: remove freeze_threshold in DataBuffer
* fix: use LazyMutableVectorBuilder instead of two vector; add option to control dedup
* fix: dedup rows according to both pk weights and timestamps
* fix: assemble DataBatch on demand
2024-02-20 09:22:45 +00:00
Yingwen
43fd87e051
feat: Defines structs in the merge tree memtable ( #3326 )
...
* chore: define mods
* feat: memtable struct
* feat: define structs inside the tree
2024-02-19 11:43:19 +00:00
Zhenchi
40f43de27d
fix(index): encode string type to original data to enable fst regex to work ( #3324 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-19 10:52:19 +00:00
Zhenchi
4810c91a64
refactor(index): move option segment_row_count from WriteOptions to IndexOptions ( #3307 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-19 08:03:41 +00:00
Ruihang Xia
72cd443ba3
feat: organize tracing on query path ( #3310 )
...
* feat: organize tracing on query path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* wrap json conversion to TracingContext's methods
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* remove unnecessary .trace()
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* Update src/query/src/dist_plan/merge_scan.rs
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-18 15:04:57 +00:00
tison
4e04a4e48f
build: support build without git ( #3309 )
...
* build: support build without git
Signed-off-by: tison <wander4096@gmail.com >
* chore
Signed-off-by: tison <wander4096@gmail.com >
* address comment
Signed-off-by: tison <wander4096@gmail.com >
* fix syntax
Signed-off-by: tison <wander4096@gmail.com >
---------
Signed-off-by: tison <wander4096@gmail.com >
2024-02-18 10:30:01 +00:00
Zhenchi
f9ce2708d3
feat(mito): add options to ignore building index for specific column ids ( #3295 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-16 08:50:41 +00:00
Zhenchi
34050ea8b5
fix(index): sanitize S3 upload buffer size ( #3300 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-16 06:45:31 +00:00
Zhenchi
141ed51dcc
feat(mito): adjust seg size of inverted index to finer granularity instead of row group level ( #3289 )
...
* feat(mito): adjust seg size of inverted index to finer granularity instead of row group level
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: wrong metric
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: more suitable name
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat: BitVec instead
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-07 08:20:00 +00:00
Zhenchi
dbf62f3273
chore(index): add BiError to fulfil the requirement of returning two errors ( #3291 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-02-06 16:03:03 +00:00
Ruihang Xia
51feec2579
feat: use simple filter to prune memtable ( #3269 )
...
* switch on clippy warnings
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* feat: use simple filter to prune memtable
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* remove dead code
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* refine util function
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-02-04 11:35:55 +00:00
LFC
e375060b73
refactor: add same SST files ( #3270 )
...
* Make it possible to add the same SST file multiple times, instead of panicking.
* Update src/mito2/src/sst/version.rs
Co-authored-by: Yingwen <realevenyag@gmail.com >
---------
Co-authored-by: Yingwen <realevenyag@gmail.com >
2024-01-31 07:21:30 +00:00
Ruihang Xia
a079955d38
chore: adjust storage engine related metrics ( #3261 )
...
* chore: adjust metrics to metric engine and mito engine
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* adjust more mito bucket
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix compile
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-01-30 06:43:03 +00:00
Ruihang Xia
9a28a1eb5e
fix: decouple columns in projection and prune ( #3253 )
...
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-01-29 08:29:21 +00:00