LFC
5eac4f10aa
chore: remove dependency on "atty" ( #7725 )
...
Signed-off-by: luofucong <luofc@foxmail.com >
2026-02-26 09:58:01 +00:00
dennis zhuang
8883022742
refactor(vector-index): use protobuf for metadata and align code ( #7648 )
...
* refactor(vector-index): use protobuf for metadata and introduce lifecycle traits
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: minor change
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* refactor: by suggestions
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: format
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: style
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: remove usearch from mito2
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: tweak errors
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* test: update index size in result
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: clippy
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: update proto deps
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
2026-02-05 02:41:48 +00:00
discord9
1afcddd5a9
chore: feature gate vector_index ( #7428 )
...
Signed-off-by: discord9 <discord9@163.com >
2025-12-17 07:14:25 +00:00
dennis zhuang
a35a39f726
feat(vector_index): adds the foundational types and SQL parsing support for vector index ( #7366 )
...
* feat: adds the foundational types and SQL parsing support for vector index
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* refactor: by suggestions
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: ensure index option values must be greater than zero
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* chore: validate connectivity strictly
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* fix: compile error
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
* feat: disable SIMD for ci
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com >
2025-12-16 22:45:36 +00:00
Yingwen
84e4e42ee7
feat: add more verbose metrics to scanners ( #7336 )
...
* feat: add inverted applier metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add metrics to bloom applier
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add metrics to fulltext index applier
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement BloomFilterReadMetrics for BloomFilterReader
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect read metrics for inverted index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add metrics for range_read and metadata
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename elapsed to fetch_elapsed
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect metadata fetch metrics for inverted index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect cache metrics for inverted and bloom index
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect read metrics in appliers
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect fulltext dir metrics for applier
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect parquet row group metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add parquet metadata metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add apply metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect more metrics for memory row group
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add fetch metrics to ReaderMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: init verbose metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: debug print metrics in ScanMetricsSet
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: implement debug for new metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: fix compiler errors
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: update parquet fetch metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: collect the whole fetch time
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: add file_scan_cost
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: parquet fetch add cache_miss counter
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: print index read metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: use actual bytes to increase counter
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove provided implementations for index reader traits
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: change get_parquet_meta_data() method to receive metrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: rename file_scan_cost to sst_scan_cost
Signed-off-by: evenyag <realevenyag@gmail.com >
* chore: refine ParquetFetchMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: remove useless inner method
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: collect page size actual needed
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify InvertedIndexReadMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify InvertedIndexApplyMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify BloomFilterReadMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify BloomFilterIndexApplyMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify FulltextIndexApplyMetrics implementation
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify ParquetFetchMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: simplify MetadataCacheMetrics Debug
Signed-off-by: evenyag <realevenyag@gmail.com >
* feat: only print verbose metrics when they are not empty.
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: use mutex to protect ParquetFetchMetrics
Signed-off-by: evenyag <realevenyag@gmail.com >
* style: fmt code
Signed-off-by: evenyag <realevenyag@gmail.com >
* refactor: use duration for elapsed in ParquetFetchMetricsData
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-12-04 13:40:18 +00:00
Zhenchi
7b396bb290
feat(mito2): expose puffin index metadata ( #7042 )
...
* Add encode/decode helpers for IndexTarget
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* Use IndexTarget encode for puffin index blob keys
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* Normalize puffin index blobs to use IndexTarget keys
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat(mito2): expose puffin index metadata
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* target json polish
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix header
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* add index path
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address copilot comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* reuse cached index metadata
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* parallelism for reading index meta
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-10-17 06:22:07 +00:00
LFC
8fe17d43d5
chore: update rust to nightly 2025-10-01 ( #7069 )
...
* chore: update rust to nightly 2025-10-01
Signed-off-by: luofucong <luofc@foxmail.com >
* chore: nix update
---------
Signed-off-by: luofucong <luofc@foxmail.com >
Co-authored-by: Ning Sun <sunning@greptime.com >
2025-10-11 07:30:52 +00:00
Ruihang Xia
c9377e7c5a
build: bump rust edition to 2024 ( #6920 )
...
* bump edition
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* format
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* gen keyword
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* lifetime and env var
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* one more gen fix
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* lifetime of temporaries in tail expressions
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* format again
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clippy nested if
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clippy let and return
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-09-08 02:37:18 +00:00
Ruihang Xia
e495c614f7
perf: improve bloom filter reader's byte reading logic ( #6658 )
...
* perf: improve bloom filter reader's byte reading logic
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* revert toml change
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* clarify comment
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* benchmark
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* update lock file
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* pub util fn
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* note endian
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-08-12 11:37:25 +00:00
Yingwen
bbab35f285
perf: Reduce fulltext bloom load time ( #6651 )
...
* perf: cached reader do not get page concurrently
Otherwise they will all fetch the same pages in parallel
Signed-off-by: evenyag <realevenyag@gmail.com >
* perf: always disable zstd for bloom
Signed-off-by: evenyag <realevenyag@gmail.com >
---------
Signed-off-by: evenyag <realevenyag@gmail.com >
2025-08-06 08:25:31 +00:00
Ruihang Xia
757694ae38
feat: count underscore in English tokenizer and improve performance ( #6660 )
...
* feat: count underscore in English tokenizer and improve performance
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* update lock file
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* update test results
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* assert lookup table
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* handle utf8 alphanumeric
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* finalize
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-08-06 07:23:18 +00:00
yihong
e19493db4a
chore: update jieba tantivy-jieba and tantivy version ( #6637 )
...
* chore: update jieba tantivy-jieba and tantivy version
Signed-off-by: yihong0618 <zouzou0208@gmail.com >
* fix: address comments
Signed-off-by: yihong0618 <zouzou0208@gmail.com >
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com >
2025-08-03 19:08:36 +00:00
Zhenchi
599f289f59
feat: add granularity and false_positive_rate options for indexes ( #6416 )
...
* feat: add `granularity` and `false_positive_rate` options for indexes
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* upgrade proto
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-07-02 07:33:39 +00:00
Zhenchi
400229c384
feat: introduce index result cache ( #6110 )
...
* feat: introduce index result cache
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* Update src/mito2/src/sst/index/inverted_index/applier/builder.rs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* optimize selector_len
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-05-20 01:45:42 +00:00
shuiyisong
3c943be189
chore: update rust toolchain ( #5818 )
...
* chore: update nightly version
* chore: sort lint lines
* chore: minor fix
* chore: update nix
* chore: update toolchain to 2024-04-14
* chore: update toolchain to 2024-04-15
* chore: remove unnecessary test
* chore: do not assert oid in sqlness test
* chore: fix margin issue
* chore: fix cr issues
* chore: fix cr issues
---------
Co-authored-by: Ning Sun <sunning@greptime.com >
2025-04-27 09:02:36 +00:00
Zhenchi
d5026f3491
perf: optimize fulltext zh tokenizer for ascii-only text ( #5975 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-04-24 23:31:26 +00:00
Zhenchi
e3675494b4
feat: apply terms with fulltext bloom backend ( #5884 )
...
* feat: apply terms with fulltext bloom backend
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* perf: preload jieba
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish doc
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-04-14 07:08:59 +00:00
Zhenchi
dce5e35d7c
feat: apply terms with fulltext tantivy backend ( #5869 )
...
* feat: apply terms with fulltext tantivy backend
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix test
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-04-10 07:32:15 +00:00
Ruihang Xia
c26e165887
refactor: check and fix super import ( #5846 )
...
* refactor: check and fix super import
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add to makefile
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* change dir
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2025-04-08 11:48:52 +00:00
Zhenchi
f797de3497
feat: add backend field to fulltext options ( #5806 )
...
* feat: add backend field to fulltext options
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* update proto
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix option conv
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix display
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-04-02 09:15:54 +00:00
Zhenchi
aa486db8b7
refactor: allow bloom filter search to apply and conjunction ( #5770 )
...
* refactor: change bloom filter search from any to all match
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* place back in list
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* nit
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-04-01 12:50:34 +00:00
fys
2b2ea5bf72
chore: upgrade some dependencies ( #5777 )
...
* chore: upgrade some dependencies
* chore: upgrade some dependencies
* fix: cr
* fix: ci
* fix: test
* fix: cargo fmt
2025-03-27 02:48:44 +00:00
Zhenchi
7bcb01d269
feat: utilize blob metadata properties ( #5767 )
...
* feat: utilize blob metadata properties
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* Update src/puffin/src/puffin_manager/fs_puffin_manager/reader.rs
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-03-26 02:47:20 +00:00
Zhenchi
face361fcb
feat: introduce roaring bitmap to optimize sparse value scenarios ( #5603 )
...
* feat: introduce roaring bitmap to optimize sparse value scenarios
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix taplo
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-03-10 04:24:08 +00:00
Zhenchi
e714f7df6c
fix: out of bound during bloom search ( #5625 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-03-03 09:53:14 +00:00
Zhenchi
8d05fb3503
feat: unify puffin name passed to stager ( #5564 )
...
* feat: purge a given puffin file in staging area
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish log
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* ttl set to 2d
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat: expose staging_ttl to index config
* feat: unify puffin name passed to stager
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix test
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fallback to remote index
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* refactor
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
Co-authored-by: evenyag <realevenyag@gmail.com >
2025-02-21 09:27:03 +00:00
Zhenchi
421e38c481
feat: allow purging a given puffin file in staging area ( #5558 )
...
* feat: purge a given puffin file in staging area
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* polish log
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* ttl set to 2d
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat: expose staging_ttl to index config
* fix test
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* use `invalidate_entries_if` instead of maintaining map
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* run_pending_tasks after purging
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
Co-authored-by: evenyag <realevenyag@gmail.com >
2025-02-19 08:58:30 +00:00
Zhenchi
858dae7b23
feat: add stager notifier to collect metrics ( #5530 )
...
* feat: add stager notifier to collect metrics
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* apply prev commit
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* remove dup size
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* add load cost
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-02-14 07:49:26 +00:00
Yingwen
35b635f639
feat!: Bump datafusion, prost, hyper, tonic, tower, axum ( #5417 )
...
* change dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* feat: adapt to arrow's interval array
* chore: fix compile errors in datatypes crate
* chore: fix api crate compiler errors
* chore: fix compiler errors in common-grpc
* chore: fix common-datasource errors
* chore: fix deprecated code in common-datasource
* fix promql and physical plan related
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* wip: upgrading network deps
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* block on updating `sqlparser`
* upgrade sqlparser
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* adapt new df's trait requirements
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: fix compiler errors in mito2
* chore: fix common-function crate errors
* chore: fix catalog errors
* change import path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: fix some errors in query crate
* chore: fix some errors in query crate
* aggr expr and some other tiny fixes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: fix expr related errors in query crate
* chore: fix query serializer and admin command
* chore: fix grpc services
* feat: axum serve
* chore: fix http server
* remove handle_error handler
* refactor timeout layer
* serve axum
* chore: fix flow aggr functions
* chore: fix flow
* feat: fix errors in meta-srv
* boxed()
* use TokioIo
* feat!: Remove script crate and python feature (#5321 )
* feat: exclude script crate
* chore: simplify feature
* feat: remove the script crate
* chore: remove python feature and some comments
* chore: fix warning
* chore: fix servers tests compiler errors
* feat: fix tests-integration errors
* chore: fix unused
* test: fix catalog test
* chore: fix compiler errors for crates using common-meta
testing feature is enabled when check with --workspace
* test: use display for logical plan test
* test: implement rewrite for ScanHintRule
* fix: http server build panic
* test: fix mito test
* fix: sql parser type alias error
* test: fix TestClient not listen
* test: some flow tests
* test(flow): more fix
* fix: test_otlp_logs
* test: fix promql test that using deprecated method fun()
* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64
* test: fix infer schema test case
* test: fix tests related to plan display
* chore: fix last flow test
* test: fix function format related assertion
* test: use larger port range for tests
* fix: test_otlp_traces
* fix: test_otlp_metrics
* fix range query and dist plan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix: flow handle distinct use deprecated field
* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs
* test: fix deserialize test
* test: reduce split key case num
* tests: lower case aggr func name
* test: fix some sqlness tests
* tests: more sqlness fix
* tests: fixed sqlness test
* commit non-bug changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix: make our udf correct
* fix: implement empty methods of ContextProvider for DfContextProviderAdapter
* test: update sqlness test result
* chore: remove unused
* fix: provide alias name for AggregateExprBuilder in range plan
* test: update range query result
* fix: implement missing ContextProvider methods for DfContextProviderAdapter
* test: update timestamps, cte result
* fix: supports empty projection in mito
* test: update comment for cte test
* fix: support projection for numbers
* test: update test cases after projection fix
* fix: fix range select first_value/last_value
* fix: handle CAST and time index conflict
* fix: handle order by correctly in range first_value/last_value
* test: update sqlness result
* test: update view test result
* test: update decimal test
wait for https://github.com/apache/datafusion/pull/14126 to fix this
* feat: remove redundant physical optimization
todo(ruihang): Check if we can remove this.
* test: update sqlness test result
* chore: range select default sort use nulls_first = false
* test: update filter push down test result
* test: comment decimal test to avoid different panic message
* test: update some distributed test result
* test: update test for distributed count and filter push down
* test: update subqueries test
* fix: SessionState may overwrite our UDFs
* chore: fix compiler errors after merging main
* fix: fix elasticsearch and dashboard router panic
* chore: fix common-functions tests
* chore: update sqlness result
* test: fix id keyword and update sqlness result
* test: fix flow_null test
* fix: enlarge thread size in debug mode to avoid overflow
* chore: fix warnings in common-function
* chore: fix warning in flow
* chore: fix warnings in query crate
* chore: remove unused warnings
* chore: fix deprecated warnings for parquet
* chore: fix deprecated warning in servers crate
* style: fix clippy
* test: enlarge mito cache ttl test ttl time
* chore: fix typo
* style: fmt toml
* refactor: reimplement PartialOrd for RangeSelect
* chore: remove script crate files introduced by merge
* fix: return error if sql option is not kv
* chore: do not use ..default::default()
* chore: per review
* chore: update error message in BuildAdminFunctionArgsSnafu
Co-authored-by: jeremyhi <jiachun_feng@proton.me >
* refactor: typed precision
* update sqlness view case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* chore: flow per review
* chore: add example in comment
* chore: warn if parquet stats of timestamp is not INT64
* style: add a newline before derive to make the comment more clear
* test: update sqlness result
* fix: flow from substrait
* chore: change update_range_context log to debug level
* chore: move axum-extra axum-macros to workspace
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: Ruihang Xia <waynestxia@gmail.com >
Co-authored-by: luofucong <luofc@foxmail.com >
Co-authored-by: discord9 <discord9@163.com >
Co-authored-by: shuiyisong <xixing.sys@gmail.com >
Co-authored-by: jeremyhi <jiachun_feng@proton.me >
2025-01-23 06:15:40 +00:00
Zhenchi
f74a955504
feat: bloom filter as fulltext index v2 (Part 1) ( #5406 )
...
* feat: bloom filter as fulltext index v2
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* add unit tests for tokenizer
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* refactor dup vars
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-01-21 23:33:11 +00:00
Zhenchi
1acfb6ed1c
feat!: use indirect indices for bloom filter to reduce size ( #5377 )
...
* feat!(bloom-filter): use indirect indices to reduce size
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix format
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* update proto
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* nit
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* upgrade proto
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-01-16 13:18:29 +00:00
Zhenchi
5cf9d7b6ca
fix(bloom-filter): filter rows with segment precision ( #5286 )
...
* fix(bloom-filter): filter rows with segment precision
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* add case
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address TODO
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2025-01-06 11:45:15 +00:00
Weny Xu
96b2a5fb28
feat: introduce ParallelFstValuesMapper ( #5276 )
...
* refactor: `RangeReader` to use `&self`
* refactor: `InvertedIndexReader` to use `&self`
* refactor: `BloomFilterReader` to use `&self`
* feat: introduce `ParallelFstValuesMapper`
* chore: change prefetch size to 8KiB
* chore: add `file_size_hint` for cached blob reader
* chore: fix clippy
* refactor: remove `FstValuesMapper`
* chore: apply suggestions from CR
2025-01-06 07:33:35 +00:00
Zhenchi
f4b2d393be
feat(config): add bloom filter config ( #5237 )
...
* feat(bloom-filter): integrate indexer with mito2
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat(config) add bloom filter config
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix docs
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix docs
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* merge
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* remove cache config
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-26 04:38:45 +00:00
Ruihang Xia
00ad27dd2e
feat(bloom-filter): bloom filter applier ( #5220 )
...
* wip
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* draft search logic
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* use defined BloomFilterReader
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix clippy
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* round the range end
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* finish index applier
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* integrate applier into mito2 with cache layer
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* fix cache key and add unit test
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* provide bloom filter index size hint
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* revert BloomFilterReaderImpl::read_vec
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* remove dead code
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* ignore null on eq
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
* add more tests and fix bloom filter logic
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com >
2024-12-26 02:51:18 +00:00
Zhenchi
a9f21915ef
feat(bloom-filter): integrate indexer with mito2 ( #5236 )
...
* feat(bloom-filter): integrate indexer with mito2
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* rename skippingindextype
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-25 14:30:07 +00:00
Zhenchi
c96903e60c
feat(bloom-filter): impl batch push to creator ( #5225 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-25 07:53:53 +00:00
Zhenchi
d51b65a8bf
feat(index-cache): abstract IndexCache to be shared by multi types of indexes ( #5219 )
...
* feat(index-cache): abstract `IndexCache` to be shared by multi types of indexes
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix typo
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: remove added label
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* refactor: simplify cached reader impl
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* rename func
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-24 05:10:30 +00:00
Zhenchi
4245bff8f2
feat(bloom-filter): add bloom filter reader ( #5204 )
...
* feat(bloom-filter): add bloom filter reader
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* chore: remove unused dep
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix conflict
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* address comments
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-20 08:29:18 +00:00
Zhenchi
3d4121aefb
feat(bloom-filter): add memory control for creator ( #5185 )
...
* feat(bloom-filter): add memory control for creator
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* refactor: remove meaningless buf
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat: add codec for intermediate
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-20 06:59:44 +00:00
Yohan Wal
7d1bcc9d49
feat: introduce Buffer for non-continuous bytes ( #5164 )
...
* feat: introduce Buffer for non-continuous bytes
* Update src/mito2/src/cache/index.rs
Co-authored-by: Weny Xu <wenymedia@gmail.com >
* chore: apply review comments
* refactor: use opendal::Buffer
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com >
2024-12-18 03:45:38 +00:00
Zhenchi
d821dc5a3e
feat(bloom-filter): add basic bloom filter creator (Part 1) ( #5177 )
...
* feat(bloom-filter): add a simple bloom filter creator (Part 1)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: clippy
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: header
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* docs: add format comment
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-12-17 06:55:42 +00:00
Yohan Wal
4b4c6dbb66
refactor: cache inverted index with fixed-size page ( #5114 )
...
* feat: cache inverted index by page instead of file
* fix: add unit test and fix bugs
* chore: typo
* chore: ci
* fix: math
* chore: apply review comments
* chore: renames
* test: add unit test for index key calculation
* refactor: use ReadableSize
* feat: add config for inverted index page size
* chore: update config file
* refactor: handle multiple range read and fix some related bugs
* fix: add config
* test: turn to a fs reader to match behaviors of object store
2024-12-13 07:34:24 +00:00
Weny Xu
8c1959c580
feat: add prefetch support to InvertedIndexFooterReader for reduced I/O time ( #5146 )
...
* feat: add prefetch support to `InvertedIndeFooterReader`
* chore: correct struct name
* chore: apply suggestions from CR
2024-12-12 03:49:54 +00:00
Weny Xu
e2a41ccaec
feat: add prefetch support to PuffinFileFooterReader for reduced I/O time ( #5145 )
...
* feat: introduce `PuffinFileFooterReader`
* refactor: remove `SyncReader` trait and impl
* refactor: replace `FooterParser` with `PuffinFileFooterReader`
* chore: remove unused errors
2024-12-12 03:13:36 +00:00
Zhenchi
cbf21e53a9
feat(puffin): apply range reader ( #4928 )
...
* wip
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* feat(puffin): apply range reader
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* refactor: read_vec reduce iteration
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* refactor: simplify rangereader for vec<u8>
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* test: add unit test
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: toml format
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-11-12 02:36:38 +00:00
Zhenchi
9c79baca4b
feat(index): support building inverted index for the field column on Mito ( #4887 )
...
feat(index): support building inverted index for the field column
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-10-29 07:57:17 +00:00
Lei, HUANG
e328c7067c
chore: update Rust toolchain to 2024-10-19 ( #4857 )
...
* update rust toolchain
* change toolchain to 2024-10-17
* fix: clippy
* fix: ut
* bump shadow-rs
* fix: use nightly-2024-10-19
* fix: clippy
* chore/udapte-toolchain-2024-10-17: Update DEV_BUILDER_IMAGE_TAG to 2024-10-19-a5c00e85-20241024184445 in Makefile
2024-10-25 00:23:32 +00:00
Zhenchi
3b5b906543
feat(index): add explicit adapter between RangeReader and AsyncRead ( #4724 )
...
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-09-18 03:33:55 +00:00
Zhenchi
f252599ac6
feat(index): add RangeReader trait ( #4718 )
...
* feat(index): add `RangeReader` trait
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* fix: return content_length as read bytes
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
* chore: remove buffer & use `BufMut`
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
---------
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com >
2024-09-10 15:24:06 +00:00