Commit Graph

34 Commits

Author SHA1 Message Date
Zhenchi
3b5b906543 feat(index): add explicit adapter between RangeReader and AsyncRead (#4724)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-09-18 03:33:55 +00:00
Zhenchi
f252599ac6 feat(index): add RangeReader trait (#4718)
* feat(index): add `RangeReader` trait`

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: return content_length as read bytes

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: remove buffer & use `BufMut`

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-09-10 15:24:06 +00:00
Ruihang Xia
93f202694c refactor: remove unused error variants (#4666)
* add python script

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* remove unused errors

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix all negative cases

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* setup CI

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add license header

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-09-03 13:19:38 +00:00
Zhenchi
c8de8b80f4 fix(fulltext-index): single segment is not sufficient for >50M rows SST (#4552)
* fix(fulltext-index): single segment is not sufficient for a >50M rows SST

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: update doc comment

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-08-16 09:14:33 +00:00
LFC
a75cfaa516 chore: update snafu to make clippy happy (#4507)
* chore: update snafu to make clippy happy

* fix ci
2024-08-07 16:12:00 +00:00
Zhenchi
04ac0c8da0 feat(fulltext_index): integrate full-text indexer with parquet reader (#4348)
* feat(fulltext_index): integrate full-text indexer with parquet reader

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* disable reload

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: range allow exceeding total row

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* test: unit tests in index

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* test: prune row groups

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: rename creator

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* test: sst fulltext index

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: address comment

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-07-15 08:14:44 +00:00
Lei, HUANG
aa4d10eef7 feat(inverted_index): inverted index cache (#4309)
* feat/inverted-index-cache:
 Update dependencies and add caching for inverted index reader

 - Updated `atomic` to 0.6.0 and `uuid` to 1.9.1 in `Cargo.lock`.
 - Added `moka` and `uuid` dependencies in `Cargo.toml`.
 - Introduced `seek_read` method in `InvertedIndexBlobReader` for common seek and read operations.
 - Added `cache.rs` module to implement caching for inverted index reader using `moka`.
 - Updated `async-compression` to 0.4.11 in `puffin/Cargo.toml`.

* feat/inverted-index-cache:
 Refactor InvertedIndexReader and Add Index Cache Support

 - Refactored `InvertedIndexReader` to include `seek_read` method and default implementations for `fst` and `bitmap`.
 - Implemented `seek_read` in `InvertedIndexBlobReader` and `CachedInvertedIndexBlobReader`.
 - Introduced `InvertedIndexCache` in `CacheManager` and `SstIndexApplier`.
 - Updated `SstIndexApplierBuilder` to accept and utilize `InvertedIndexCache`.
 - Added `From<FileId> for Uuid` implementation.

* feat/inverted-index-cache:
 Update Cargo.toml and refactor SstIndexApplier

 - Moved `uuid.workspace` entry in Cargo.toml for better organization.

* feat/inverted-index-cache:
 Refactor InvertedIndexCache to use type alias for Arc

 - Replaced `Arc<InvertedIndexCache>` with `InvertedIndexCacheRef` type alias.

* feat/inverted-index-cache:
 Add Prometheus metrics and caching improvements for inverted index

 - Introduced `prometheus` and `puffin` dependencies for metrics.

* feat/inverted-index-cache:
 Refactor InvertedIndexReader and Cache handling

 - Simplified `InvertedIndexReader` trait by removing seek-related comments.

* feat/inverted-index-cache:
 Add configurable cache sizes for inverted index metadata and content
 - Introduced `index_metadata_size` and `index_content_size` in `CacheManagerBuilder`.

* feat/inverted-index-cache:
 Refactor and optimize inverted index caching

 - Removed `metrics.rs` and integrated cache metrics into `index.rs`.

* feat/inverted-index-cache:
 Remove unused dependencies from Cargo.lock and Cargo.toml

 - Removed `moka`, `prometheus`, and `puffin` dependencies from both Cargo.lock and Cargo.toml.

* feat/inverted-index-cache:
 Replace Uuid with FileId in CachedInvertedIndexBlobReader

 - Updated `file_id` type from `Uuid` to `FileId` in `CachedInvertedIndexBlobReader` and related methods.

* feat/inverted-index-cache:
 Refactor cache configuration for inverted index

 - Moved `inverted_index_metadata_cache_size` and `inverted_index_cache_size` from `MitoConfig` to `InvertedIndexConfig`.

* feat/inverted-index-cache:
 Remove unnecessary conversion of `file_id` in `SstIndexApplier`

 - Simplified the initialization of `CachedInvertedIndexBlobReader` by removing the redundant `into()` conversion for `file_id`.
2024-07-08 12:36:59 +00:00
Lei, HUANG
226136011e refactor: change InvertedIndexWriter method signature to offsets to f… (#4250)
refactor: change InvertedIndexWriter method signature to offsets to facilliate caching
2024-07-02 12:49:18 +00:00
Zhenchi
e64379d4f7 feat(fulltext_index): introduce creator (#4249)
* feat(fulltext_index): introduce creator

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typo

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typo

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: return error if writer not found

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: helper function for tests

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-07-02 09:06:14 +00:00
Ruihang Xia
45fee948e9 fix: display error in correct format (#4082)
* fix: display error in correct format

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add address to RegionServer error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-05-31 09:25:14 +00:00
Ruihang Xia
115c74791d build(deps): bump snafu to 0.8 (#3911)
* change Cargo.toml

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* global replace

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* handle alias in script engine

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-05-10 13:36:25 +00:00
Ruihang Xia
530353785c refactor: remove re-export from logging (#3865)
* refactor: remove re-export from logging

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix merge problem

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* run formatter

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-05-06 13:26:01 +00:00
Ruihang Xia
0c5f4801b7 build: update toolchain to nightly-2024-04-18 (#3740)
* chore: update toolchain to nightly-2024-04-17

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix test clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix ut

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update fuzz test

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update to nightly-2024-04-18

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* add document

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* update CI

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* avoid unnecessary allow clippy attrs

Signed-off-by: tison <wander4096@gmail.com>

* help the compiler find the clone is unnecessary and make clippy happy

Signed-off-by: tison <wander4096@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
2024-04-19 05:42:34 +00:00
tison
7c1c6e8b8c refactor: try upgrade regex-automata (#3575)
* refactor: try upgrade regex-automata

Signed-off-by: tison <wander4096@gmail.com>

* try fix

Signed-off-by: tison <wander4096@gmail.com>

* always check match with next_eoi_state

Signed-off-by: tison <wander4096@gmail.com>

* add a guard to prevent over moving the state

Signed-off-by: tison <wander4096@gmail.com>

* tidy

Signed-off-by: tison <wander4096@gmail.com>

---------

Signed-off-by: tison <wander4096@gmail.com>
2024-03-26 04:28:14 +00:00
Zhenchi
c3c80b92c8 feat(index): measure memory usage in global instead of single-column and add metrics (#3383)
* feat(index): measure memory usage in global instead of single-column and add metrics

* feat: add leading zeros to streamline memory usage

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: fmt

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: remove println

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-28 06:49:24 +00:00
SteveLauC
e9a2b0a9ee chore: use workspace-wide lints (#3352)
* chore: use workspace-wide lints

* respond to review
2024-02-22 01:01:10 +00:00
tison
4e04a4e48f build: support build without git (#3309)
* build: support build without git

Signed-off-by: tison <wander4096@gmail.com>

* chore

Signed-off-by: tison <wander4096@gmail.com>

* address comment

Signed-off-by: tison <wander4096@gmail.com>

* fix syntax

Signed-off-by: tison <wander4096@gmail.com>

---------

Signed-off-by: tison <wander4096@gmail.com>
2024-02-18 10:30:01 +00:00
Zhenchi
141ed51dcc feat(mito): adjust seg size of inverted index to finer granularity instead of row group level (#3289)
* feat(mito): adjust seg size of inverted index to finer granularity instead of row group level

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: wrong metric

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: more suitable name

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: BitVec instead

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-02-07 08:20:00 +00:00
Ruihang Xia
7da8f22cda fix: IntermediateWriter closes underlying writer twice (#3248)
* fix: IntermediateWriter closes underlying writer twice

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* close writer manually on error

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2024-01-26 10:03:50 +00:00
Zhenchi
6f07d69155 feat(mito): enable inverted index (#3158)
* feat(mito): enable inverted index

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* accidentally resolved the incorrect filtering issue within the Metric Engine

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/mito2/src/access_layer.rs

* Update src/mito2/src/test_util/scheduler_util.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* fix: format -> join_dir

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: move intermediate_manager from arg of write_and_upload_sst to field of WriteCache

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: add IndexerBuidler

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix clippy

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2024-01-15 09:08:07 +00:00
Zhenchi
fd8fb641fd feat(parquet): introduce inverted index applier to reader (#3130)
* feat(parquet): introduce inverted index applier to reader

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: purger removes index file

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add TODO for escape route

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add TODO for escape route

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/mito2/src/access_layer.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* Update src/mito2/src/sst/parquet/reader.rs

Co-authored-by: dennis zhuang <killme2008@gmail.com>

* feat: min-max index to prune row groups filtered by inverted index

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: file_meta.inverted_index_available -> file_meta.available_indexes

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add TODO for leveraging WriteCache

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix fmt

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: misset available indexes

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add index file size

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: use smallvec to reduce heap allocation

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: add index size to disk usage

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: dennis zhuang <killme2008@gmail.com>
2024-01-11 08:04:59 +00:00
Zhenchi
db98484796 feat(inverted_index): introduce SstIndexCreator (#3107)
* feat(inverted_index): introduce SstIndexCreator

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: tiny polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: distinguish intermediate store and index store

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: move comment as doc comment

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: column id as index name

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-01-09 09:24:16 +00:00
Zhenchi
d973cf81f0 feat(inverted_index): implement apply for SstIndexApplier (#3088)
* feat(inverted_index): implement apply for SstIndexApplier

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: rename metrics

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-01-04 07:33:03 +00:00
Zhenchi
e4c71843e6 feat(inverted_index): get memory usage of appliers (#3081)
Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2024-01-03 06:56:56 +00:00
Zhenchi
69a53130c2 feat(inverted_index): Add applier builder to convert Expr to Predicates (Part 1) (#3034)
* feat(inverted_index.integration): Add applier builder to convert Expr to Predicates (Part 1)

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add docs

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/mito2/src/sst/index/applier/builder.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* fix: remove unwrap

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: error source

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2023-12-30 07:32:32 +00:00
Ruihang Xia
286b9af661 chore: change all reference from develop to main (#3026)
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2023-12-28 04:11:00 +00:00
Zhenchi
7d1724f832 feat(inverted_index.create): add index creator (#2960)
* feat(inverted_index.create): add read/write for external intermediate files

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: MAGIC_CODEC_V1 -> CODEC_V1_MAGIC

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: fix typos intermedia -> intermediate

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(inverted_index.create): add external sorter

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: fix typos intermedia -> intermediate

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: drop the stream as early as possible to avoid recursive calls to poll

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: project merge sorted stream

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add total_row_count to SortOutput

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: remove change of format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(inverted_index.create): add index creator

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add check for total_row_count

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: lazy set meta of writer

This reverts commit 63cb5bdb5c3a08406d978357d8167ca18ed1b83b.

* feat: lazyily provide inverted index writer

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish readability

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add push_with_name_n

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-12-20 07:02:13 +00:00
Zhenchi
83de399bef feat(inverted_index.create): add external sorter (#2950)
* feat(inverted_index.create): add read/write for external intermediate files

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: MAGIC_CODEC_V1 -> CODEC_V1_MAGIC

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: fix typos intermedia -> intermediate

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(inverted_index.create): add external sorter

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: fix typos intermedia -> intermediate

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: drop the stream as early as possible to avoid recursive calls to poll

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: project merge sorted stream

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add total_row_count to SortOutput

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: remove change of format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: rename segment null bitmap

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: test type alias

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: allow `memory_usage_threshold` to be None to turn off dumping

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: change segment_row_count type to NonZeroUsize

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: accept BytesRef instead

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add `push_n` to adapt mito2

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add k-way merge TODO

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: more sorter cases

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: make the merge tree balance

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/index/src/inverted_index/create/sort/external_sort.rs

Co-authored-by: Yingwen <realevenyag@gmail.com>

* chore: address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: stable feature

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
2023-12-19 08:14:37 +00:00
Zhenchi
029ff2f1e3 feat(inverted_index.create): add read/write for external intermediate files (#2942)
* feat(inverted_index.create): add read/write for external intermediate files

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: MAGIC_CODEC_V1 -> CODEC_V1_MAGIC

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: polish comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: fix typos intermedia -> intermediate

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: futures_code -> asynchronous_codec

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: bump bytes to 1.5

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-12-18 09:44:48 +00:00
Zhenchi
1e22f1cb4f feat(inverted_index.format): add writer (#2900)
* feat(inverted_index.format): add writer

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: remove clippy allow

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* Update src/index/src/inverted_index/error.rs

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
2023-12-11 09:55:25 +00:00
Zhenchi
0b421b5177 feat(inverted_index.search): add index applier (#2868)
* feat(inverted_index.search): add fst applier

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(inverted_index.search): add fst values mapper

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: remove meta check

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: fmt & clippy

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: one expect for test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(inverted_index.search): add index applier

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: bitmap_full -> bitmap_full_range

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add check for segment_row_count

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: remove redundant code

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: reader test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: match error in test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: fmt

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: add helper function to construct fst value

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: polish unit tests

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: bytemuck to extract offset and size

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: toml format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: use bytemuck

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: reorg value in unit tests

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: update proto

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add a TODO reminder to consider optimizing the order of apply

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: InList predicates are applied first to benefit from higher selectivity

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: update proto

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add read options to control the behavior of index not found

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: polish

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: move read options to implementation instead of trait

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: add SearchContext, refine doc comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat: move index_not_found_strategy as a field of SearchContext

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: rename varient

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-12-05 08:24:24 +00:00
Zhenchi
a9db80ab1a feat(inverted_index.search): add fst values mapper (#2862)
* feat(inverted_index.search): add fst applier

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* feat(inverted_index.search): add fst values mapper

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: remove meta check

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: fmt & clippy

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: one expect for test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: match error in test

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: fmt

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: add helper function to construct fst value

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* refactor: bytemuck to extract offset and size

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: toml format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-12-04 13:29:02 +00:00
Zhenchi
58c13739f0 feat(inverted_index.search): add fst applier (#2851)
* feat(inverted_index.search): add fst applier

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: typos

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-12-04 09:21:09 +00:00
Zhenchi
b3edbef1f3 feat(inverted_index): add index reader (#2803)
* feat(inverted_index): add reader

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: toml format

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: add prefix relative_ to the offset parameter

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* docs: add doc comment

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* chore: update proto

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* fix: outdated docs

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2023-11-27 03:31:44 +00:00