greptimedb

mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-05-21 23:40:38 +00:00

Author	SHA1	Message	Date
Zhenchi	5cf9d7b6ca	fix(bloom-filter): filter rows with segment precision (#5286 ) * fix(bloom-filter): filter rows with segment precision Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * add case Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address TODO Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2025-01-06 11:45:15 +00:00
Weny Xu	96b2a5fb28	feat: introduce `ParallelFstValuesMapper` (#5276 ) * refactor: `RangeReader` to use `&self` * refactor: `InvertedIndexReader` to use `&self` * refactor: refactor: `BloomFilterReader` to use `&self` * feat: introduce `ParallelFstValuesMapper` * chore: change prefetch size to 8KiB * chore: add `file_size_hint` for cached blob reader * chore: fix clippy * refactor: remove `FstValuesMapper` * chore: apply suggestions from CR	2025-01-06 07:33:35 +00:00
Kould	1067357b72	chore(config)!: refactor configs of write cache (#5259 ) * chore: refactor configs of write cache * chore: write_cache_size `10GiB` -> `5GiB`	2025-01-04 07:14:38 +00:00
Lei, HUANG	577d81f14c	chore: suppress list warning (#5280 ) chore/suppress-list-warning: ### Update logging level in `intermediate.rs` - Changed logging level from `warn` to `debug` for unexpected directory entries in index creation. - Added `debug` to the `common_telemetry` import to support the logging level change.	2025-01-03 09:05:03 +00:00
Yingwen	89399131dd	feat: support add if not exists in the gRPC alter kind (#5273 ) * test: test adding existing columns * chore: add more checks to AlterKind * chore: update logs * fix: check and build table info first * feat: Add add_if_not_exists flag to alter expr * feat: skip existing columns when building alter kind * checks in make_region_alter_kind() * reuse the alter kind * test: fix tests in common-meta * chore: fix typos * chore: update comments	2025-01-03 07:23:17 +00:00
Yohan Wal	bcb0f14227	refactor: adjust index cache page size (#5267 ) * refactor: adjust index cache page size * fix: wrong docs * Update config/datanode.example.toml * Update config/config.md * Update config/config.md * chore: adjust to 64KiB * Apply suggestions from code review	2025-01-03 03:26:17 +00:00
Ning Sun	53d006292d	fix: correct invalid testing feature gate usage (#5258 ) * fix: correct invalid testing feature gate usage * test: refactor tests to avoid test code leak * fix: sync main	2025-01-02 03:22:54 +00:00
Ruihang Xia	55b7656956	feat: override `__sequence` on creating SST to save space and CPU (#5252 ) * override memtable sequence Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * override sst sequence Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * chore changes per to CR comments Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * use correct sequence number Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * wrap a method to get max sequence Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix typo Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2024-12-31 03:28:02 +00:00
Yingwen	75e4f307c9	feat: update partition duration of memtable using compaction window (#5197 ) * feat: update partition duration of memtable using compaction window * chore: only use provided duration if it is not None * test: more tests * test: test compaction apply window * style: fix clippy	2024-12-30 13:06:25 +00:00
Zhenchi	109fe04d17	fix(bloom-filter): skip applying for non-indexed columns (#5246 ) Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2024-12-30 06:56:58 +00:00
Yingwen	f1eb76f489	fix: implement a CacheStrategy to ensure compaction use cache correctly (#5254 ) * feat: impl CacheStrategy * refactor: replace Option<CacheManagerRef> with CacheStrategy * feat: add disabled strategy * ci: force update taplo * refactor: rename CacheStrategy::Normal to CacheStrategy::EnableAll * ci: force install cargo-gc-bin * ci: force install * chore: use CacheStrategy::Disabled as ScanInput default * chore: fix compiler errors	2024-12-30 06:24:53 +00:00
Zhenchi	7471f55c2e	feat(mito): add bloom filter read metrics (#5239 ) Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2024-12-26 04:44:03 +00:00
Zhenchi	f4b2d393be	feat(config): add bloom filter config (#5237 ) * feat(bloom-filter): integrate indexer with mito2 Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * feat(config) add bloom filter config Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix docs Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix docs Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * merge Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * remove cache config Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2024-12-26 04:38:45 +00:00
Ruihang Xia	00ad27dd2e	feat(bloom-filter): bloom filter applier (#5220 ) * wip Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * draft search logic Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * use defined BloomFilterReader Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix clippy Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * round the range end Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * finish index applier Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * integrate applier into mito2 with cache layer Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix cache key and add unit test Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * provide bloom filter index size hint Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * revert BloomFilterReaderImpl::read_vec Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove dead code Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * ignore null on eq Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * add more tests and fix bloom filter logic Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2024-12-26 02:51:18 +00:00
Zhenchi	a9f21915ef	feat(bloom-filter): integrate indexer with mito2 (#5236 ) * feat(bloom-filter): integrate indexer with mito2 Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * rename skippingindextype Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * address comments Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2024-12-25 14:30:07 +00:00
Ruihang Xia	a23f269bb1	fix: correct write cache's metric labels (#5227 ) * refactor: remove unused field in WriteCache Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * refactor: unify read and write cache path Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update config and fix clippy Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove unnecessary methods and adapt test Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * change the default path Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove remote-home Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2024-12-25 07:26:21 +00:00
Weny Xu	f33b378e45	chore: add log for converting region to follower (#5222 ) * chore: add log for converting region to follower * chore: apply suggestions from CR	2024-12-25 02:38:47 +00:00
Lei, HUANG	074846bbc2	feat(mito): parquet memtable reader (#4967 ) * wip: row group reader base * wip: memtable row group reader * Refactor MemtableRowGroupReader to streamline data fetching - Added early return when fetch_ranges is empty to optimize performance. - Replaced inline chunk data assignment with a call to `assign_dense_chunk` for cleaner code. * wip: row group reader * wip: reuse RowGroupReader * wip: bulk part reader * Enhance BulkPart Iteration with Filtering - Introduced `RangeBase` to `BulkIterContext` for improved filter handling. - Implemented filter application in `BulkPartIter` to prune batches based on predicates. - Updated `SimpleFilterContext::new_opt` to be public for broader access. * chore: add prune test * fix: clippy * fix: introduce prune reader for memtable and add more prune test * Enhance BulkPart read method to return Option<BoxedBatchIterator> - Modified `BulkPart::read` to return `Option<BoxedBatchIterator>` to handle cases where no row groups are selected. - Added logic to return `None` when all row groups are filtered out. - Updated tests to handle the new return type and added a test case to verify behavior when no row groups match the pr * refactor/separate-paraquet-reader: Add helper function to parse parquet metadata and integrate it into BulkPartEncoder * refactor/separate-paraquet-reader: Change BulkPartEncoder row_group_size from Option to usize and update tests * refactor/separate-paraquet-reader: Add context module for bulk memtable iteration and refactor part reading • Introduce context module to encapsulate context for bulk memtable iteration. • Refactor BulkPart to use BulkIterContextRef for reading operations. • Remove redundant code in BulkPart by centralizing context creation and row group pruning logic in the new context module. • Create new file context.rs with structures and logic for handling iteration context. • Adjust part_reader.rs and row_group_reader.rs to reference the new BulkIterContextRef. 
* refactor/separate-paraquet-reader: Refactor RowGroupReader traits and implementations in memtable and parquet reader modules • Rename RowGroupReaderVirtual to RowGroupReaderContext for clarity. • Replace BulkPartVirt with direct usage of BulkIterContextRef in MemtableRowGroupReader. • Simplify MemtableRowGroupReaderBuilder by directly passing context instead of creating a BulkPartVirt instance. • Update RowGroupReaderBase to use context field instead of virt, reflecting the trait renaming and usage. • Modify FileRangeVirt to FileRangeContextRef and adjust implementations accordingly. * refactor/separate-paraquet-reader: Refactor column page reader creation and remove unused code • Centralize creation of SerializedPageReader in RowGroupBase::column_reader method. • Remove unused RowGroupCachedReader and related code from MemtableRowGroupPageFetcher. • Eliminate redundant error handling for invalid column index in multiple places. * chore: rebase main and resolve conflicts * fix: some comments * chore: resolve conflicts * chore: resolve conflicts	2024-12-24 09:59:26 +00:00
Zhenchi	d51b65a8bf	feat(index-cache): abstract `IndexCache` to be shared by multi types of indexes (#5219 ) * feat(index-cache): abstract `IndexCache` to be shared by multi types of indexes Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix typo Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * fix: remove added label Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * refactor: simplify cached reader impl Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * rename func Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2024-12-24 05:10:30 +00:00
Zhenchi	3d4121aefb	feat(bloom-filter): add memory control for creator (#5185 ) * feat(bloom-filter): add memory control for creator Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * refactor: remove meaningless buf Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> * feat: add codec for intermediate Signed-off-by: Zhenchi <zhongzc_arch@outlook.com> --------- Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>	2024-12-20 06:59:44 +00:00
Ruihang Xia	422d18da8b	feat: bump opendal and switch prometheus layer to the upstream impl (#5179 ) * feat: bump opendal and switch prometheus layer to the upstream impl Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove unused files Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix tests Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove unused things Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove root dir on recovering cache Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * filter out non-files entry in test Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2024-12-19 03:42:05 +00:00
Yingwen	c6b7caa2ec	feat: do not remove time filters in ScanRegion (#5180 ) * feat: do not remove time filters * chore: remove `time_range` from parquet reader * chore: print more message in the check script * chore: fix unused error	2024-12-18 06:39:49 +00:00
Yingwen	58d6982c93	feat: do not keep MemtableRefs in ScanInput (#5184 )	2024-12-18 06:37:22 +00:00
Yohan Wal	7d1bcc9d49	feat: introduce Buffer for non-continuous bytes (#5164 ) * feat: introduce Buffer for non-continuous bytes * Update src/mito2/src/cache/index.rs Co-authored-by: Weny Xu <wenymedia@gmail.com> * chore: apply review comments * refactor: use opendal::Buffer --------- Co-authored-by: Weny Xu <wenymedia@gmail.com>	2024-12-18 03:45:38 +00:00
LFC	18e8c45384	refactor: produce BatchBuilder from a Batch to modify it again (#5186 ) chore: pub some mods	2024-12-18 02:42:33 +00:00
Lei, HUANG	c33cf59398	perf: avoid holding memtable during compaction (#5157 ) * perf/avoid-holding-memtable-during-compaction: Refactor Compaction Version Handling • Introduced CompactionVersion struct to encapsulate region version details for compaction, removing dependency on VersionRef. • Updated CompactionRequest and CompactionRegion to use CompactionVersion. • Modified open_compaction_region to construct CompactionVersion without memtables. • Adjusted WindowedCompactionPicker to work with CompactionVersion. • Enhanced flush logic in WriteBufferManager to improve memory usage checks and logging. * reformat code * chore: change log level * reformat code --------- Co-authored-by: Yingwen <realevenyag@gmail.com>	2024-12-17 07:06:07 +00:00
Yingwen	bfc777e6ac	fix: deletion between two put may not work in `last_non_null` mode (#5168 ) * fix: deletion between rows with the same key may not work * test: add sqlness test case * chore: comments	2024-12-17 04:01:32 +00:00
Yingwen	8a5384697b	chore: add aquamarine to dep lists (#5181 )	2024-12-17 01:45:50 +00:00
Weny Xu	d0245473a9	fix: correct `set_region_role_state_gracefully` behaviors (#5171 ) * fix: reduce default max rows for fuzz testing * chore: remove Postgres setup from fuzz test workflow * chore(fuzz): increase resource limits for GreptimeDB cluster * chore(fuzz): increase resource limits for kafka * fix: correct `set_region_role_state_gracefully` behaviors * chore: remove Postgres setup from fuzz test workflow * chore(fuzz): reduce resource limits for GreptimeDB & kafka	2024-12-16 14:01:40 +00:00
Lei, HUANG	5ffda7e971	chore: gauge for flush compaction (#5156 ) * add metrics * chore/bench-metrics: Add INFLIGHT_FLUSH_COUNT Metric to Flush Process • Introduced INFLIGHT_FLUSH_COUNT metric to track the number of ongoing flush operations. • Incremented INFLIGHT_FLUSH_COUNT in FlushScheduler to monitor active flushes. • Removed redundant increment of INFLIGHT_FLUSH_COUNT in RegionWorkerLoop to prevent double counting. * chore/bench-metrics: Add Metrics for Compaction and Flush Operations • Introduced INFLIGHT_COMPACTION_COUNT and INFLIGHT_FLUSH_COUNT metrics to track the number of ongoing compaction and flush operations. • Incremented INFLIGHT_COMPACTION_COUNT when scheduling remote and local compaction jobs, and decremented it upon completion. • Added INFLIGHT_FLUSH_COUNT increment and decrement logic around flush tasks to monitor active flush operations. • Removed redundant metric updates in worker.rs and handle_compaction.rs to streamline metric handling. * chore: add metrics for remote compaction jobs * chore: format * chore: also add dashboard	2024-12-16 07:08:07 +00:00
shuiyisong	9d7fea902e	chore: remove unused dep (#5163 ) * chore: remove unused dep * chore: remove more unused dep	2024-12-16 06:17:27 +00:00
Yohan Wal	4b4c6dbb66	refactor: cache inverted index with fixed-size page (#5114 ) * feat: cache inverted index by page instead of file * fix: add unit test and fix bugs * chore: typo * chore: ci * fix: math * chore: apply review comments * chore: renames * test: add unit test for index key calculation * refactor: use ReadableSize * feat: add config for inverted index page size * chore: update config file * refactor: handle multiple range read and fix some related bugs * fix: add config * test: turn to a fs reader to match behaviors of object store	2024-12-13 07:34:24 +00:00
Yingwen	fee75a1fad	feat: collect reader metrics from prune reader (#5152 )	2024-12-12 11:27:22 +00:00
Weny Xu	2137c53274	feat(index): add `file_size_hint` for remote blob reader (#5147 ) feat(index): add file_size_hint for remote blob reader	2024-12-12 04:45:40 +00:00
Weny Xu	d53fbcb936	feat: introduce `PuffinMetadataCache` (#5148 ) * feat: introduce `PuffinMetadataCache` * refactor: remove too_many_arguments * chore: fmt toml	2024-12-12 04:09:36 +00:00
Lei, HUANG	a30d918df2	perf: avoid cache during compaction (#5135 ) * Revert "refactor: Avoid wrapping Option for CacheManagerRef (#4996)" This reverts commit `42bf7e9965`. * fix: memory usage during log ingestion * fix: fmt	2024-12-11 08:24:41 +00:00
dennis zhuang	03a28320d6	feat!: enable read cache and write cache when using remote object stores (#5093 ) * feat: enable read cache and write cache when using remote object stores * feat: make read cache be aware of remote store names * chore: docs * chore: apply review suggestions * chore: trim write cache path --------- Co-authored-by: Yingwen <realevenyag@gmail.com>	2024-12-10 04:03:44 +00:00
Lei, HUANG	ce86ba3425	chore: Reduce FETCH_OPTION_TIMEOUT from 10 to 3 seconds in config.rs (#5117 ) Reduce FETCH_OPTION_TIMEOUT from 10 to 3 seconds in config.rs	2024-12-09 13:39:18 +00:00
Yingwen	2fcb95f50a	fix!: fix regression caused by unbalanced partitions and splitting ranges (#5090 ) * feat: assign partition ranges by rows * feat: balance partition rows * feat: get upper bound for part nums * feat: only split in non-compaction seq scan * fix: parallel scan on multiple sources * fix: can split check * feat: scanner prepare by request * feat: remove scan_parallelism * docs: update docs * chore: update comment * style: fix clippy * feat: skip merge and dedup if there is only one source * chore: Revert "feat: skip merge and dedup if there is only one source" Since memtable won't do dedup jobs This reverts commit `2fc7a54b11`. * test: avoid compaction in sqlness window sort test * chore: do not create semaphore if num partitions is enough * chore: more assertions * chore: fix typo * fix: compaction flag not set * chore: address review comments	2024-12-09 12:50:57 +00:00
Lin Yihai	19373d806d	chore: Add timeout setting for `find_ttl`. (#5088 )	2024-12-06 15:02:15 +00:00
discord9	8b944268da	feat: ttl=0/instant/forever/humantime&ttl refactor (#5089 ) * feat: ttl zero filter * refactor: use TimeToLive enum * fix: unit test * tests: sqlness * refactor: Option<TTL> None means UNSET * tests: sqlness * fix: 10000 years --> forever * chore: minor refactor from reviews * chore: rename back TimeToLive * refactor: split imme request from normal requests * fix: use correct lifetime * refactor: rename immediate to instant * tests: flow sink table default ttl * refactor: per review * tests: sqlness * fix: ttl alter to instant * tests: sqlness * refactor: per review * chore: per review * feat: add db ttl type&forbid instant for db * tests: more unit test	2024-12-06 09:20:42 +00:00
Yingwen	66c0445974	perf: take a new batch to reduce last row cache usage (#5095 ) * feat: take and cache last row to save memory * style: fix clippy	2024-12-05 03:59:28 +00:00
Lei, HUANG	a51853846a	fix: schema cache invalidation (#5067 ) * fix: use SchemaCache to locate database metadata * main: Refactor SchemaMetadataManager to use TableInfoCacheRef - Replace TableInfoManagerRef with TableInfoCacheRef in SchemaMetadataManager - Update DatanodeBuilder to pass TableInfoCacheRef to SchemaMetadataManager - Rename error MissingCacheRegistrySnafu to MissingCacheSnafu in datanode module - Adjust tests to use new mock_schema_metadata_manager with TableInfoCacheRef * fix/schema-cache-invalidation: Add cache module and integrate cache registry into datanode • Implement build_datanode_cache_registry function to create cache registry for datanode • Integrate cache registry into datanode by modifying DatanodeBuilder and HeartbeatTask • Refactor InvalidateTableCacheHandler to InvalidateCacheHandler and move to common-meta crate • Update Cargo.toml to include cache as a dev-dependency for datanode • Adjust related modules (flownode, frontend, tests-integration, standalone) to use new cache handler and registry • Remove obsolete handler module from frontend crate * fix: fuzz imports * chore: add some doc for cache builder functions * refactor: change table info cache to table schema cache * fix: remove unused variants * fix fuzz * chore: apply suggestion Co-authored-by: Weny Xu <wenymedia@gmail.com> * chore: apply suggestion Co-authored-by: Weny Xu <wenymedia@gmail.com> * fix: compile --------- Co-authored-by: dennis zhuang <killme2008@gmail.com> Co-authored-by: Weny Xu <wenymedia@gmail.com>	2024-12-03 10:44:29 +00:00
Weny Xu	51c6eafb16	feat: recover file cache index asynchronously (#5087 ) * feat: recover file cache index asynchronously * chore: apply suggestions from CR	2024-12-03 09:33:52 +00:00
Lanqing Yang	8bdef776b3	fix: allow physical region alter region options (#5046 ) allow physical region alter region options	2024-11-27 08:24:34 +00:00
Yingwen	6130c70b63	fix: pass series row selector to file range reader (#5054 )	2024-11-26 11:13:00 +00:00
Lei, HUANG	3029b47a89	fix: find latest window (#5037 ) * fix: find latest window * more test files	2024-11-21 04:56:03 +00:00
Weny Xu	14d997e2d1	feat: add unset table options support (#5034 ) * feat: add unset table options support * test: add tests * chore: update greptime-proto * chore: add comments	2024-11-21 03:52:56 +00:00
Yohan Wal	55ced9aa71	refactor: split up different stmts (#4997 ) * refactor: set and unset * chore: error message * fix: reset Cargo.lock * Apply suggestions from code review Co-authored-by: jeremyhi <jiachun_feng@proton.me> * Apply suggestions from code review Co-authored-by: Weny Xu <wenymedia@gmail.com> --------- Co-authored-by: jeremyhi <jiachun_feng@proton.me> Co-authored-by: Weny Xu <wenymedia@gmail.com>	2024-11-20 06:02:51 +00:00
Yingwen	63bbfd04c7	fix: prune memtable/files range independently in each partition (#4998 ) * feat: prune in each partition * chore: change pick log to trace * chore: add in progress partition scan to metrics * feat: seqscan support pruning in partition * chore: remove commented codes	2024-11-19 12:43:30 +00:00

477 Commits