dependabot[bot]
c66af2c0a9
Update binggan requirement from 0.12.0 to 0.14.0 (#2530)
...
* Update binggan requirement from 0.12.0 to 0.14.0
---
updated-dependencies:
- dependency-name: binggan
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* fix build
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-10-24 09:41:35 +08:00
dependabot[bot]
56fc56c5b9
Update binggan requirement from 0.8.0 to 0.10.0 (#2493)
...
* Update binggan requirement from 0.8.0 to 0.10.0
---
updated-dependencies:
- dependency-name: binggan
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* update PR
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-09-10 14:26:06 +08:00
PSeitz
59084143ef
use optional index in multivalued index (#2439)
...
* use optional index in multivalued index
For mostly empty multivalued indices, there was a large overhead during
creation when iterating over all docids. This is alleviated by placing an
optional index in the multivalued index to mark the documents that have values.
This adds some performance overhead when accessing values in a multivalued
index: the access cost is now optional index + multivalue index. The
sparse codec performs relatively badly with its binary_search when accessing
data. This is reflected in the benchmarks below.
This changes the columnar format to v2, but code is added to still handle the
v1 format.
```
Running benches/bench_access.rs (/home/pascal/Development/tantivy/optional_multivalues/target/release/deps/bench_access-ea323c028db88db4)
multi sparse 1/13
access_values_for_doc Avg: 42.8946ms (+241.80%) Median: 42.8869ms (+244.10%) [42.7484ms .. 43.1074ms]
access_first_vals Avg: 42.8022ms (+421.93%) Median: 42.7553ms (+439.84%) [42.6794ms .. 43.7404ms]
multi 2x
access_values_for_doc Avg: 31.1244ms (+24.17%) Median: 30.8339ms (+23.46%) [30.7192ms .. 33.6059ms]
access_first_vals Avg: 24.3070ms (+70.92%) Median: 24.0966ms (+70.18%) [23.9328ms .. 26.4851ms]
sparse 1/13
access_values_for_doc Avg: 42.2490ms (+0.61%) Median: 42.2346ms (+2.28%) [41.8988ms .. 43.7821ms]
access_first_vals Avg: 43.6272ms (+0.23%) Median: 43.6197ms (+1.78%) [43.4920ms .. 43.9009ms]
dense 1/12
access_values_for_doc Avg: 8.6184ms (+23.18%) Median: 8.6126ms (+23.78%) [8.5843ms .. 8.7527ms]
access_first_vals Avg: 6.8112ms (+4.47%) Median: 6.8002ms (+4.55%) [6.7887ms .. 6.8991ms]
full
access_values_for_doc Avg: 9.4073ms (-5.09%) Median: 9.4023ms (-2.23%) [9.3694ms .. 9.4568ms]
access_first_vals Avg: 4.9531ms (+6.24%) Median: 4.9502ms (+7.85%) [4.9423ms .. 4.9718ms]
```
```
Running benches/bench_merge.rs (/home/pascal/Development/tantivy/optional_multivalues/target/release/deps/bench_merge-475697dfceb3639f)
merge_multi 2x_and_multi 2x Avg: 20.2280ms (+34.33%) Median: 20.1829ms (+35.33%) [19.9933ms .. 20.8806ms]
merge_multi sparse 1/13_and_multi sparse 1/13 Avg: 0.8961ms (-78.04%) Median: 0.8943ms (-77.61%) [0.8899ms .. 0.9272ms]
merge_dense 1/12_and_dense 1/12 Avg: 0.6619ms (-1.26%) Median: 0.6616ms (+2.20%) [0.6473ms .. 0.6837ms]
merge_sparse 1/13_and_sparse 1/13 Avg: 0.5508ms (-0.85%) Median: 0.5508ms (+2.80%) [0.5420ms .. 0.5634ms]
merge_sparse 1/13_and_dense 1/12 Avg: 0.6046ms (-4.64%) Median: 0.6038ms (+2.80%) [0.5939ms .. 0.6296ms]
merge_multi sparse 1/13_and_dense 1/12 Avg: 0.9111ms (-83.48%) Median: 0.9063ms (-83.50%) [0.9047ms .. 0.9663ms]
merge_multi sparse 1/13_and_sparse 1/13 Avg: 0.8451ms (-89.49%) Median: 0.8428ms (-89.43%) [0.8411ms .. 0.8563ms]
merge_multi 2x_and_dense 1/12 Avg: 10.6624ms (-4.82%) Median: 10.6568ms (-4.49%) [10.5738ms .. 10.8353ms]
merge_multi 2x_and_sparse 1/13 Avg: 10.6336ms (-22.95%) Median: 10.5925ms (-22.33%) [10.5149ms .. 11.5657ms]
```
* Update columnar/src/columnar/format_version.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
* Update columnar/src/column_index/mod.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
---------
Co-authored-by: Paul Masurel <paul@quickwit.io>
2024-06-19 14:54:12 +08:00
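The layout change described in #2439 can be illustrated with a minimal sketch. All names below (`DenseMultiValueIndex`, `SparseMultiValueIndex`, `from_counts`, `from_non_empty`, `range`) are hypothetical and are not tantivy's API; a sorted `Vec<u32>` with `binary_search` stands in for the optional index. It only shows why creation no longer has to touch every docid, and why access now pays for two lookups.

```rust
use std::ops::Range;

/// v1-style layout (hypothetical): one start offset per document,
/// plus a final end offset, so creation visits every docid.
pub struct DenseMultiValueIndex {
    start_offsets: Vec<usize>, // len == num_docs + 1
}

impl DenseMultiValueIndex {
    /// `counts[doc]` is the number of values of `doc` (0 for empty docs).
    pub fn from_counts(counts: &[usize]) -> Self {
        let mut start_offsets = Vec::with_capacity(counts.len() + 1);
        let mut next = 0;
        start_offsets.push(next);
        for &count in counts {
            next += count;
            start_offsets.push(next);
        }
        Self { start_offsets }
    }

    /// Range into the values array for `doc`.
    pub fn range(&self, doc: u32) -> Range<usize> {
        let d = doc as usize;
        self.start_offsets[d]..self.start_offsets[d + 1]
    }
}

/// v2-style layout (hypothetical): an "optional index" marks the docs
/// that have values, and offsets are stored only for those docs.
pub struct SparseMultiValueIndex {
    docs_with_values: Vec<u32>, // sorted; stand-in for the optional index
    start_offsets: Vec<usize>,  // len == docs_with_values.len() + 1
}

impl SparseMultiValueIndex {
    /// `entries` holds `(doc, count)` pairs, sorted by doc, with count > 0.
    /// Only non-empty docs are visited.
    pub fn from_non_empty(entries: &[(u32, usize)]) -> Self {
        let mut docs_with_values = Vec::with_capacity(entries.len());
        let mut start_offsets = Vec::with_capacity(entries.len() + 1);
        let mut next = 0;
        start_offsets.push(next);
        for &(doc, count) in entries {
            docs_with_values.push(doc);
            next += count;
            start_offsets.push(next);
        }
        Self { docs_with_values, start_offsets }
    }

    /// Two-step access: optional-index lookup, then offset lookup.
    pub fn range(&self, doc: u32) -> Range<usize> {
        match self.docs_with_values.binary_search(&doc) {
            Ok(rank) => self.start_offsets[rank]..self.start_offsets[rank + 1],
            Err(_) => 0..0, // doc has no values
        }
    }

    /// Number of stored offsets, to compare against the dense layout.
    pub fn num_offsets(&self) -> usize {
        self.start_offsets.len()
    }
}
```

In this sketch, a column where only 2 of 5 documents have values stores 3 offsets instead of 6, at the price of an extra lookup on every access, which matches the trade-off the benchmarks above report.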
PSeitz
511b027350
update columnar bench (#2438)
...
* update columnar bench
* fix compile
2024-06-14 10:42:35 +08:00
PSeitz
72f61ff89c
remove index sorting (#2434)
...
closes https://github.com/quickwit-oss/tantivy/issues/2352
2024-06-13 15:51:53 +08:00
PSeitz
e90e7a25ae
add access benchmark for columnar (#2432)
2024-06-12 14:29:15 +08:00
PSeitz
714f363d43
add bench & test for columnar merging (#2428)
...
* add merge columnar proptest
* add columnar merge benchmark
2024-06-10 16:26:16 +08:00
PSeitz
7ce950f141
add method to fetch block of first vals in columnar (#2330)
...
* add method to fetch block of first vals in columnar
Fetching a block of first vals is much faster than single calls
for full columns.
add benchmark
fix import warnings
```
test bench_get_block_first_on_full_column ... bench: 56 ns/iter (+/- 26)
test bench_get_block_first_on_full_column_single_calls ... bench: 311 ns/iter (+/- 6)
test bench_get_block_first_on_multi_column ... bench: 378 ns/iter (+/- 15)
test bench_get_block_first_on_multi_column_single_calls ... bench: 546 ns/iter (+/- 13)
test bench_get_block_first_on_optional_column ... bench: 291 ns/iter (+/- 6)
test bench_get_block_first_on_optional_column_single_calls ... bench: 362 ns/iter (+/- 8)
```
* use remainder
2024-03-15 08:01:47 +01:00
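The block fetch described in #2330 can be sketched as follows. The `Column` enum and the `first` / `first_vals` methods are hypothetical stand-ins, not tantivy's types or signatures. The sketch assumes that one source of the gain is that the column layout is matched once per block rather than once per document; the commit itself does not state the reason.

```rust
/// Hypothetical column with two layouts.
pub enum Column {
    /// Exactly one value per document.
    Full(Vec<u64>),
    /// Only some documents have a value; `docs` is sorted.
    Optional { docs: Vec<u32>, values: Vec<u64> },
}

impl Column {
    /// Single-call access: the layout is matched on every call.
    pub fn first(&self, doc: u32) -> Option<u64> {
        match self {
            Column::Full(values) => values.get(doc as usize).copied(),
            Column::Optional { docs, values } => {
                docs.binary_search(&doc).ok().map(|rank| values[rank])
            }
        }
    }

    /// Block access: the layout is matched once, then a tight loop
    /// fills `output[i]` with the first value of `docs[i]`.
    pub fn first_vals(&self, docs: &[u32], output: &mut [Option<u64>]) {
        assert_eq!(docs.len(), output.len());
        match self {
            Column::Full(values) => {
                for (out, &doc) in output.iter_mut().zip(docs) {
                    *out = values.get(doc as usize).copied();
                }
            }
            Column::Optional { .. } => {
                // Falls back to per-document lookups in this sketch.
                for (out, &doc) in output.iter_mut().zip(docs) {
                    *out = self.first(doc);
                }
            }
        }
    }
}
```

A caller would reuse one output buffer across blocks of docids, so the per-block cost is the loop itself plus a single dispatch on the layout.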
PSeitz
1cfb9ce59a
improve range query performance (#1864)
...
fix RowId vs DocId naming
fixes #1863
2023-02-14 13:25:39 +09:00
Paul Masurel
bd5eea9852
Integrated columnar work.
2023-02-09 13:14:31 +01:00