Zhenchi
9c79baca4b
feat(index): support building inverted index for the field column on Mito ( #4887 )
...
feat(index): support building inverted index for the field column
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-10-29 07:57:17 +00:00
Ruihang Xia
03f2fa219d
feat: optimizer rule for windowed sort ( #4874 )
...
* basic impl
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* implement physical rule
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* feat: install windowed sort physical rule and optimize partition ranges
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add logs and sqlness test
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* feat: introduce PartSortExec for partitioned sorting
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* tune exec nodes' properties and metrics
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clean up
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix typo
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* debug: add more info on very wrong
* debug: also print overlap ranges
* feat: add check when emit PartSort Stream
* dbg: info on overlap working range
* feat: check batch range is inside part range
* set distinguish partition range param
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: more logs
* update sqlness
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* tune optimizer
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clean up
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix lints
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix windowed sort rule
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix: early terminate sort stream
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: remove min/max check
* chore: remove unused windowed_sort module, uuid feature and refactor region_scanner to synchronous
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: print more fuzz log
* chore: more log
* fix: part sort should skip empty part
* chore: remove insert logs
* tests: empty PartitionRange
* refactor: testcase
* docs: update comment&tests: all empty
* ci: enlarge etcd cpu limit
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: discord9 <discord9@163.com >
Co-authored-by: evenyag <realevenyag@gmail.com >
2024-10-29 07:46:05 +00:00
Lei, HUANG
eab9e3a48d
chore: remove struct size assertion ( #4885 )
...
chore/remove-struct-size-assertion: Remove unit tests for parquet_meta_size function in cache_size.rs
2024-10-28 08:50:10 +00:00
Yingwen
1008af5324
feat!: Divide flush and compaction job pool ( #4871 )
...
* feat: divide flush/compact job pool
* feat!: divide bg jobs config
* docs: update config examples
* test: fix tests
2024-10-25 23:36:16 +00:00
Lei, HUANG
e328c7067c
chore: update Rust toolchain to 2024-10-19 ( #4857 )
...
* update rust toolchain
* change toolchain to 2024-10-17
* fix: clippy
* fix: ut
* bump shadow-rs
* fix: use nightly-2024-10-19
* fix: clippy
* chore/udapte-toolchain-2024-10-17: Update DEV_BUILDER_IMAGE_TAG to 2024-10-19-a5c00e85-20241024184445 in Makefile
2024-10-25 00:23:32 +00:00
Yingwen
5d28f7a912
feat: yields empty batch after reading a range ( #4845 )
...
* feat: add empty batch to end of range stream
* feat: add batch validation
* fix: validate batch order
* fix: not yield empty batch in compaction
* fix: empty record batch
* feat: add a flag to enable empty batch
2024-10-21 13:52:47 +00:00
Yingwen
e0c4157ad8
feat: Seq scanner scans data by time range ( #4809 )
...
* feat: seq scan by partition
* feat: part metrics
* chore: remove unused codes
* chore: fmt stream
* feat: build ranges returns smallvec
* feat: move scan mem/file ranges to util and reuse
* feat: log metrics
* chore: correct some metrics
* feat: get explain info from ranges
* test: group test and remove unused codes
* chore: fix clippy
* feat: change PartitionRange end to exclusive
* test: add tests
2024-10-17 11:05:12 +00:00
Yingwen
2f2b4b306c
feat!: implement interval type by multiple structs ( #4772 )
...
* define structs and methods
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* feat: re-implement interval types in time crate
* feat: use new
* feat: interval value
* feat: query crate interval
* feat: pg and mysql interval
* chore: remove unused imports
* chore: remove commented codes
* feat: make flow compile but may not work
* feat: flow datetime
* test: fix some tests
* test: fix some flow tests(WIP)
* chore: some fix test&docs
* fix: change interval order
* chore: remove unused codes
* chore: fix clippy
* chore: now signature change
* chore: remove todo
* feat: update error message
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: discord9 <discord9@163.com >
2024-10-14 03:09:03 +00:00
Weny Xu
6e776d5f98
feat: support to reject write after flushing ( #4759 )
...
* refactor: use `RegionRoleState` instead of `RegionState`
* feat: introducing `RegionLeaderState::Downgrading`
* refactor: introduce `set_region_role_state_gracefully`
* refactor: use `set_region_role` instead of `set_writable`
* feat: support to reject write after flushing
* fix: fix unit tests
* test: add unit test for `should_reject_write`
* chore: add comments
* chore: refine comments
* fix: fix unit test
* test: enable fuzz tests for Local WAL
* chore: add logs
* chore: rename `RegionStatus` to `RegionState`
* feat: introduce `DowngradingLeader`
* chore: rename
* refactor: refactor `set_role_state` tests
* test: ensure downgrading region will reject write
* chore: enhance logs
* chore: refine name
* chore: refine comment
* test: add tests for `set_role_role_state`
* fix: fix unit tests
* chore: apply suggestions from CR
* chore: apply suggestions from CR
2024-09-30 08:28:51 +00:00
Yingwen
cd55202136
feat: unordered scanner scans data by time ranges ( #4757 )
...
* feat: define range meta
* feat: group ranges
* feat: split range
* feat: build ranges from the scan input
* feat: get partition range from range meta
* feat: build file range
* feat: unordered scan read by ranges
* feat: wip for mem ranges
* feat: build ranges
* feat: remove unused codes
* chore: update comments
* feat: update metrics
* chore: address review comments
* chore: debug assertion
2024-09-29 07:14:48 +00:00
Lei, HUANG
934bc13967
feat(mito): limit compaction output file size ( #4754 )
...
* Clarify documentation for CompactionOutput struct
Updated the documentation for the `CompactionOutput` struct to specify that the output time range is only relevant for windowed compaction.
* Add max_output_file_size to TwcsPicker and TwcsOptions
- Introduced `max_output_file_size` to `TwcsPicker` struct and its logic to enforce output file size limits during compaction.
- Updated `TwcsOptions` to include `max_output_file_size` and adjusted related tests.
- Modified `new_picker` function to initialize `TwcsPicker` with the new `max_output_file_size` field.
* feat/limit-compaction-output-size:
Refactor compaction picker and TWCS to support append mode and improve options handling
- Update compaction picker to accept a reference to options and append mode flag
- Modify TWCS picker logic to consider append mode when filtering deleted rows
- Remove VersionControl usage in compactor and simplify return type
- Adjust enforce_max_output_size logic in TWCS picker to handle max output file size
- Add append mode flag to TwcsPicker struct
- Fix incorrect condition in TWCS picker for enforcing max output size
- Update region options tests to reflect new max output file size format (1GB and 7MB)
- Simplify InvalidTableOptionSnafu error handling in create_parser
- Add `compaction.twcs.max_output_file_size` to mito engine option keys
* resolve some comments
2024-09-27 11:17:36 +00:00
Weny Xu
4045298cb2
feat: add region_statistics table ( #4771 )
...
* refactor: introduce `region_statistic`
* refactor: move DatanodeStat related structs to common_meta
* chore: add comments
* feat: implement `list_region_stats` for `ClusterInfo` trait
* feat: add `region_statistics` table
* feat: add table_id and region_number fields
* chore: rename unused snafu
* chore: update sqlness results
* chore: avoid to print source in error msg
* chore: move `procedure_info` under `greptime` catalog
* chore: apply suggestions from CR
* Update src/common/meta/src/datanode.rs
Co-authored-by: jeremyhi <jiachun_feng@proton.me >
---------
Co-authored-by: jeremyhi <jiachun_feng@proton.me >
2024-09-27 09:54:52 +00:00
Weny Xu
163cea81c2
feat: migrate local WAL regions ( #4715 )
...
* feat: allow to flush region before migrating
* fix: fix unit tests
* feat: allow to set `flush_timeout`
* feat: skip to replay memtable
* fix: fix unit tests
* test: add more tests
* refactor: simplify timeout logic
* test: add unit tests
* test: add unit tests
* chore: update comments
* fix: fix unit tests
* fix: fmt and clippy
* feat: change default timeout to 30s
* fix: throw `ExceededDeadline` error
* test: add tests for `downgrade_region_with_retry`
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: update proto to `3633474`
* refactor: refactor `upgrade_region_with_retry`
* chore: apply suggestions from CR
2024-09-20 08:27:20 +00:00
Yingwen
e12ffbeb2f
feat: flush other workers if still need flush ( #4746 )
2024-09-20 02:55:31 +00:00
Yingwen
f02410c39b
fix: disable field pruning in last non null mode ( #4740 )
...
* fix: don't prune fields in last non null mode
* test: add sqlness test for field pruning
* test: add flush
* refine implementation
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Ruihang Xia <waynestxia@gmail.com >
2024-09-20 00:35:37 +00:00
Weny Xu
befb6d85f0
fix: determine region role by using is_readonly ( #4725 )
...
fix: correct `is_writable` behavior
2024-09-18 22:17:39 +00:00
Zhenchi
3b5b906543
feat(index): add explicit adapter between RangeReader and AsyncRead ( #4724 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-09-18 03:33:55 +00:00
Yingwen
3e17c09e45
feat: skip caching uncompressed pages if they are large ( #4705 )
...
* feat: cache each uncompressed page
* chore: remove unused function
* chore: log
* chore: log
* chore: row group pages cache kv
* feat: also support row group level cache
* chore: fix range count
* feat: don't cache compressed page for row group cache
* feat: use function to get part
* chore: log whether scan is from compaction
* chore: avoid get column
* feat: add timer metrics
* chore: Revert "feat: add timer metrics"
This reverts commit 4618f57fa2ba13b1e1a8dec83afd01c00ae4c867.
* feat: don't cache individual uncompressed page
* feat: append in row group level under append mode
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: fetch pages cost
* perf: yield
* Update src/mito2/src/sst/parquet/row_group.rs
* refactor: cache key
* feat: print file num and row groups num in explain
* test: update sqlness test
* chore: Update src/mito2/src/sst/parquet/page_reader.rs
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Ruihang Xia <waynestxia@gmail.com >
2024-09-10 11:52:16 +00:00
Ruihang Xia
29f215531a
feat: parallel in row group level under append mode ( #4704 )
...
feat: append in row group level under append mode
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-09-10 07:12:23 +00:00
Yohan Wal
04e7dd6fd5
feat: add json data type ( #4619 )
...
* feat: add json type and vector
* fix: allow to create and insert json data
* feat: udf to query json as string
* refactor: remove JsonbValue and JsonVector
* feat: show json value as strings
* chore: make ci happy
* test: add unit test and sqlness test
* refactor: use binary as grpc value of json
* fix: use non-preserve-order jsonb
* test: revert changed test
* refactor: change udf get_by_path to jq
* chore: make ci happy
* fix: distinguish binary and json in proto
* chore: delete udf for future pr
* refactor: remove Value(Json)
* chore: follow review comments
* test: some tests and checks
* test: fix unit tests
* chore: follow review comments
* chore: corresponding changes to proto
* fix: change grpc and pgsql server behavior alongside with sqlness/crud tests
* chore: follow review comments
* feat: udf of conversions between json and strings, used for grpc server
* refactor: rename to_string to json_to_string
* test: add more sqlness test for json
* chore: thanks for review :)
* Apply suggestions from code review
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com >
2024-09-09 11:41:36 +00:00
Ruihang Xia
d2d62e0c6f
fix: unconditional statistics ( #4694 )
...
* fix: unconditional statistics
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add more sqlness case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-09-07 04:28:11 +00:00
Yingwen
506dc20765
fix: last non null iter not init ( #4687 )
2024-09-06 04:13:23 +00:00
Lanqing Yang
86cef648cd
feat: add more spans to mito engine ( #4643 )
...
feat: add more span on mito engine
2024-09-05 06:13:22 +00:00
LFC
d43e31c7ed
feat: schedule compaction when adding sst files by editing region ( #4648 )
...
* feat: schedule compaction when adding sst files by editing region
* add minimum time interval for two successive compactions
* resolve PR comments
2024-09-04 10:10:07 +00:00
Ruihang Xia
8ca35a4a1a
fix: use number of partitions as parallelism in region scanner ( #4669 )
...
* fix: use number of partitions as parallelism in region scanner
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add sqlness
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Lei HUANG <mrsatangel@gmail.com >
* order by ts
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* debug print time range
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Lei HUANG <mrsatangel@gmail.com >
2024-09-03 13:42:38 +00:00
Ruihang Xia
93f202694c
refactor: remove unused error variants ( #4666 )
...
* add python script
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* remove unused errors
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix all negative cases
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* setup CI
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add license header
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-09-03 13:19:38 +00:00
Lei, HUANG
37dcf34bb9
fix(mito): avoid caching empty batches in row group ( #4652 )
...
* fix: avoid caching empty batches in row group
* fix: clippy
* Update tests/cases/standalone/common/select/last_value.sql
* fix: sqlness
2024-09-02 02:43:00 +00:00
Yingwen
8eda36bfe3
feat: remove files from the write cache in purger ( #4655 )
...
* feat: remove files from the write cache in purger
* chore: fix typo
2024-08-31 04:19:52 +00:00
Ruihang Xia
a37aeb2814
feat: initialize partition range from ScanInput ( #4635 )
...
* feat: initialize partition range from ScanInput
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* use num_rows instead
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add todo
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* setup unordered scan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* Update src/mito2/src/read/scan_region.rs
Co-authored-by: jeremyhi <jiachun_feng@proton.me >
* leave unordered scan unchanged
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: jeremyhi <jiachun_feng@proton.me >
2024-08-30 07:30:37 +00:00
LFC
8ea4f67e4b
refactor: reduce an object store "stat" call ( #4645 )
2024-08-30 03:31:19 +00:00
LFC
d45b04180c
feat: pre-download the ingested sst ( #4636 )
...
* refactor: pre-read the ingested sst file in object store to fill the local cache to accelerate first query
* feat: pre-download the ingested SST from remote to accelerate following reads
* resolve PR comments
* resolve PR comments
2024-08-29 08:36:41 +00:00
Weny Xu
47657ebbc8
feat: replay WAL entries respect index ( #4565 )
...
* feat(log_store): use new `Consumer`
* feat: add `from_peer_id`
* feat: read WAL entries respect index
* test: add test for `build_region_wal_index_iterator`
* fix: keep the handle
* fix: incorrect last index
* fix: replay last entry id may be greater than expected
* chore: remove unused code
* chore: apply suggestions from CR
* chore: rename `datanode_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: apply suggestions from CR
2024-08-28 11:37:18 +00:00
Yingwen
28bf549907
fix: fallback to window size in manifest ( #4629 )
2024-08-28 06:43:56 +00:00
zyy17
5177717f71
refactor: add fallback_to_local region option ( #4578 )
...
* refactor: add 'fallback_to_local_compaction' region option
* refactor: use 'fallback_to_local'
2024-08-23 03:09:43 +00:00
Weny Xu
25cd61b310
chore: upgrade toolchain to nightly-2024-08-07 ( #4549 )
...
* chore: upgrade toolchain to `nightly-2024-08-07`
* chore(ci): upgrade toolchain
* fix: fix unit test
2024-08-22 11:02:18 +00:00
LFC
883c5bc5b0
refactor: skip checking the existence of the SST files ( #4602 )
...
refactor: skip checking the existence of the SST files when region is directly edited
2024-08-22 08:32:27 +00:00
Yingwen
d628079f4c
feat: collect filters metrics for scanners ( #4591 )
...
* feat: collect filter metrics
* refactor: reuse ReaderFilterMetrics
* feat: record read rows from parquet by type
* feat: unordered scan observe rows
also fix read type
* chore: rename label
2024-08-22 03:22:05 +00:00
Yingwen
a12a905578
chore: disable ttl for write cache by default ( #4595 )
...
* chore: remove default write cache ttl
* docs: update example config
* chore: fix ci
2024-08-21 08:38:38 +00:00
Ran Joe
9db08dbbe0
refactor(mito2): reduce duplicate IndexOutput struct ( #4592 )
...
* refactor(mito2): reduce duplicate IndexOutput struct
* docs(mito2): add index output note
2024-08-20 12:30:17 +00:00
ozewr
30af78700f
feat: Implement the Buf to avoid extra memory allocation ( #4585 )
...
* feat: Implement the Buf to avoid extra memory allocation
* fmt toml
* fmt code
* mv entry.into_buffer to raw_entry_buffer
* less reuse opendal
* remove todo #4065
* Update src/mito2/src/wal/entry_reader.rs
Co-authored-by: Weny Xu <wenymedia@gmail.com >
* fmt code
---------
Co-authored-by: ozewr <l19ht@google.com >
Co-authored-by: Weny Xu <wenymedia@gmail.com >
2024-08-19 12:11:08 +00:00
Weny Xu
76dc906574
feat(log_store): introduce the CollectionTask ( #4530 )
...
* feat: introduce the `CollectionTask`
* feat: add config of index collector
* chore: remove unused code
* feat: truncate indexes
* chore: apply suggestions from CR
* chore: update config examples
* refactor: retrieve latest offset while dumping indexes
* chore: print warn
2024-08-19 03:48:35 +00:00
LFC
ec59ce5c9a
feat: able to handle concurrent region edit requests ( #4569 )
...
* feat: able to handle concurrent region edit requests
* resolve PR comments
2024-08-16 03:29:03 +00:00
Lei, HUANG
216bce6973
perf: count(*) for append-only tables ( #4545 )
...
* feat: support fast count(*) for append-only tables
* fix: total_rows stats in time series memtable
* fix: sqlness result changes for SinglePartitionScanner -> StreamScanAdapter
* fix: some cr comments
2024-08-13 09:27:50 +00:00
Weny Xu
665b7e5c6e
perf: merge small byte ranges for optimized fetching ( #4520 )
2024-08-09 08:17:54 +00:00
Weny Xu
cb4cffe636
chore: bump opendal version to 0.48 ( #4499 )
2024-08-04 00:46:04 +00:00
Yingwen
ded874da04
feat: enlarge default page cache size ( #4490 )
2024-08-02 07:24:20 +00:00
Yingwen
b388829a96
fix: avoid total size overflow ( #4487 )
...
feat: avoid total size overflow
2024-08-02 06:16:37 +00:00
Lei, HUANG
9d5d7c1f9a
feat(compaction): add file number limits to TWCS compaction ( #4481 )
...
* Add file number limits to TWCS compaction
- Introduce `max_active_window_files` and `max_inactive_window_files` to `TwcsOptions`.
* feat/limit-files-in-windows: Add max active/inactive window files options to mito engine config
* feat/limit-files-in-windows: Add Debug derive to TwcsPicker and implement max file enforcement logging in TWCS compaction
* fix: clippy
2024-08-01 12:42:09 +00:00
Yingwen
6c4b8b63a5
fix: notify flush receiver after write buffer is released ( #4476 )
...
* fix: notify the worker after write buffer is released
* feat: worker level region count
2024-08-01 07:15:36 +00:00
Yingwen
f382a7695f
perf: reduce lock scope and improve log ( #4453 )
...
* feat: refine logs for scan
* feat: improve build parts and unordered scan metrics
* feat: change to debug log
* fix: release lock before reading part
* test: replace region id
* test: fix sqlness
* chore: add todo
Co-authored-by: dennis zhuang <killme2008@gmail.com >
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com >
2024-07-31 04:07:34 +00:00