dependabot[bot]
e197b59258
Update itertools requirement from 0.12.0 to 0.13.0 ( #2400 )
...
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools ) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md )
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.12.0...v0.13.0 )
---
updated-dependencies:
- dependency-name: itertools
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-17 02:53:02 +02:00
PSeitz
5b7cca13e5
lower contention on AggregationLimits ( #2394 )
...
PR https://github.com/quickwit-oss/quickwit/pull/4962 fixes an issue
where the AggregationLimits are not passed correctly. Now that the
AggregationLimits are shared properly, we run into contention issues.
This PR includes some straightforward improvements to reduce contention,
by only calling into the shared limits if the memory changed and by avoiding the second read.
We probably need some sharding with multiple counters, or local caching that only updates the
global counter after some threshold.
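A minimal sketch of the "local caching before updating the global after some threshold" idea mentioned above. This is not the code from this PR: `GlobalLimit`, `LocalTracker`, and the `threshold` parameter are hypothetical names chosen for illustration. It skips the shared counter when the usage is unchanged, and it reuses the value returned by the atomic update instead of issuing a second read.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shared memory counter (stand-in for the shared limits).
struct GlobalLimit {
    used: AtomicU64,
    limit: u64,
}

/// Hypothetical per-thread tracker that batches updates to the global counter.
struct LocalTracker {
    global: Arc<GlobalLimit>,
    reported: u64,  // usage already published to the global counter
    current: u64,   // latest local measurement
    threshold: u64, // minimum drift before touching the global counter
}

impl LocalTracker {
    fn new(global: Arc<GlobalLimit>, threshold: u64) -> Self {
        LocalTracker { global, reported: 0, current: 0, threshold }
    }

    /// Records the new local usage. Returns an error if the global limit is exceeded.
    fn update(&mut self, new_usage: u64) -> Result<(), String> {
        // Nothing changed: do not touch shared state at all.
        if new_usage == self.current {
            return Ok(());
        }
        self.current = new_usage;
        // Small drift: keep it local to avoid contention.
        if new_usage.abs_diff(self.reported) < self.threshold {
            return Ok(());
        }
        // Publish the difference. fetch_add/fetch_sub return the previous value,
        // so the new total is computed without a second atomic read.
        let total = if new_usage >= self.reported {
            let delta = new_usage - self.reported;
            self.global.used.fetch_add(delta, Ordering::Relaxed) + delta
        } else {
            let delta = self.reported - new_usage;
            self.global.used.fetch_sub(delta, Ordering::Relaxed) - delta
        };
        self.reported = new_usage;
        if total > self.global.limit {
            Err(format!("memory limit exceeded: {} > {}", total, self.global.limit))
        } else {
            Ok(())
        }
    }
}
```

The trade-off in this sketch is that the global counter may lag behind the true total by up to `threshold` bytes per tracker, so the limit is enforced only approximately.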
2024-05-15 12:25:40 +02:00
dependabot[bot]
a79590477e
Update binggan requirement from 0.5.2 to 0.6.2 ( #2399 )
...
---
updated-dependencies:
- dependency-name: binggan
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-15 05:40:37 +02:00
Paul Masurel
6181c1eb5e
Small changes in the Executor API. ( #2391 )
...
Warning: this change is mildly backward-incompatible,
so I bumped tantivy's version.
2024-05-10 17:19:12 +09:00
Adam Reichold
1ee5f90761
Give allocation control to the caller instead of forcing a clone ( #2389 )
...
Achieved by moving the boxes out of the temporary reference wrappers which are
cloneable themselves, i.e. if required the caller can clone them already or
consume them to reuse existing allocations.
2024-05-09 16:01:13 +09:00
PSeitz
71f3b4e4e3
fix ReferenceValue API flaw ( #2372 )
...
* fix ReferenceValue API flaw
Remove `Facet` and `TokenizedString` values from the `ReferenceValue` API,
as this requires the trait value to have them stored somewhere.
Since `TokenizedString` is quite niche, I just copy it into a Box,
instead of designing a reference API around it.
* fix comment link
2024-05-09 06:14:42 +02:00
trinity-1686a
8cd7ddc535
run block decompression from executor ( #2386 )
...
* run block decompression from executor
* add a wrapper with is_closed to oneshot channel
* add cancelation test to Executor::spawn_blocking
2024-05-08 12:22:44 +02:00
Paul Masurel
2b76335a95
Removed usage of num_cpus ( #2387 )
...
* Removed usage of num_cpus
* handling error
2024-05-08 13:32:52 +09:00
PSeitz
c6b213d8f0
use binggan for agg benchmark ( #2378 )
...
* use binggan for agg benchmark
use binggan for agg benchmark, which includes memory consumption
Output:
```
full
histogram Memory: 15.8 KB Avg: 10.9322ms (+5.44%) Median: 10.8790ms (+9.28%) Min: 10.7470ms Max: 11.3263ms
histogram_hard_bounds Memory: 15.5 KB Avg: 5.1939ms (+6.61%) Median: 5.1722ms (+10.98%) Min: 5.0432ms Max: 5.3910ms
histogram_with_avg_sub_agg Memory: 48.7 KB Avg: 23.8165ms (+4.57%) Median: 23.7264ms (+10.06%) Min: 23.4995ms Max: 24.8107ms
dense
histogram Memory: 17.3 KB Avg: 15.6810ms (-8.54%) Median: 15.6174ms (-8.89%) Min: 15.4953ms Max: 16.0702ms
histogram_hard_bounds Memory: 15.4 KB Avg: 10.0720ms (-7.33%) Median: 10.0572ms (-7.06%) Min: 9.8500ms Max: 10.4819ms
histogram_with_avg_sub_agg Memory: 50.1 KB Avg: 33.0993ms (-7.04%) Median: 32.9499ms (-6.86%) Min: 32.8284ms Max: 34.0529ms
sparse
histogram Memory: 16.3 KB Avg: 19.2325ms (-0.44%) Median: 19.1211ms (-1.26%) Min: 19.0348ms Max: 19.7902ms
histogram_hard_bounds Memory: 16.1 KB Avg: 18.5179ms (-0.61%) Median: 18.4552ms (-0.90%) Min: 18.3799ms Max: 19.0535ms
histogram_with_avg_sub_agg Memory: 34.7 KB Avg: 21.2589ms (-0.69%) Median: 21.1867ms (-1.05%) Min: 21.0342ms Max: 21.9900ms
```
* add more bench with term as sub agg
2024-05-07 11:29:49 +02:00
PSeitz
eea70030bf
cleanup top level exports ( #2382 )
...
remove some top level exports
2024-05-07 09:59:41 +02:00
PSeitz
92b5526310
allow more JSON values, fix i64 special case ( #2383 )
...
This changes three things:
- Reuse positions_per_path hashmap instead of allocating one per
indexed JSON value
- Try to cast u64 values to i64 to align with search behaviour
- Allow top level json values to be of any type, instead of limiting it
to JSON objects. Remove special JSON object handling method.
TODO: We should probably also try to convert f64 to i64 and u64 when
indexing, as values may get converted to f64 by the JSON parser
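A minimal sketch of the numeric normalization described above, under stated assumptions: `Num`, `normalize_u64`, and `normalize_f64` are hypothetical names, not tantivy's API. `normalize_u64` shows the u64 to i64 cast; `normalize_f64` illustrates what the TODO might look like and is not something this commit implements.

```rust
/// Hypothetical numeric value after normalization.
#[derive(Debug, PartialEq)]
enum Num {
    I64(i64),
    U64(u64),
    F64(f64),
}

/// Prefer i64 whenever the u64 value fits, so indexing and search agree on the type.
fn normalize_u64(v: u64) -> Num {
    match i64::try_from(v) {
        Ok(i) => Num::I64(i),
        Err(_) => Num::U64(v),
    }
}

/// Sketch of the TODO: convert an f64 back to an integer when it is a whole number in range.
fn normalize_f64(v: f64) -> Num {
    // NaN and infinities have a NaN fractional part, so they stay f64.
    if v.fract() != 0.0 {
        return Num::F64(v);
    }
    // `i64::MAX as f64` rounds up to 2^63, hence the strict upper bound.
    if v >= i64::MIN as f64 && v < i64::MAX as f64 {
        Num::I64(v as i64)
    // `u64::MAX as f64` rounds up to 2^64, hence the strict upper bound.
    } else if v >= 0.0 && v < u64::MAX as f64 {
        Num::U64(v as u64)
    } else {
        Num::F64(v)
    }
}
```

With this sketch, `normalize_u64(5)` yields `Num::I64(5)`, while `normalize_u64(u64::MAX)` stays `Num::U64` because it does not fit into an i64.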
2024-05-01 12:08:12 +02:00
PSeitz
99a59ad37e
remove zero byte check ( #2379 )
...
remove zero byte checks in columnar. zero bytes are converted during serialization now.
unify code paths
extend test for expected column names
2024-04-26 06:03:28 +02:00
trinity-1686a
6a66a71cbb
modify fastfield range query heuristic ( #2375 )
2024-04-25 10:06:11 +02:00
PSeitz
ff40764204
make convert_to_fast_value_and_append_to_json_term pub ( #2370 )
...
* make convert_to_fast_value_and_append_to_json_term pub
* clippy
2024-04-23 04:05:41 +02:00
PSeitz
047da20b5b
add json path constructor to term ( #2367 )
2024-04-22 12:23:35 +02:00
PSeitz
1417eaf3a7
fix coverage ( #2368 )
2024-04-22 12:23:15 +02:00
PSeitz
4f8493d2de
improve document docs ( #2359 )
2024-04-22 12:05:16 +02:00
Paul Masurel
8861366137
Owned value relying on Vec instead of BTreeMap ( #2364 )
...
* Owned value relying on Vec instead of BTreeMap
* fmt
* fix build
* fix serialization
---------
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com >
2024-04-22 09:38:05 +02:00
PSeitz
0e9fced336
remove JsonTermWriter ( #2238 )
...
* remove JsonTermWriter
remove path truncation logic, add assertion
* fix json_path_writer add sep logic
2024-04-18 16:28:05 +02:00
PSeitz
b257b960b3
validate sort by field type ( #2336 )
...
* validate sort by field type
* Update src/index/index.rs
Co-authored-by: Adam Reichold <adamreichold@users.noreply.github.com >
---------
Co-authored-by: Adam Reichold <adamreichold@users.noreply.github.com >
2024-04-16 04:42:24 +02:00
Adam Reichold
4708171a32
Fix some of the things current Clippy complains about ( #2363 )
2024-04-16 04:27:06 +02:00
Adam Reichold
b493743f8d
Fix trait bound of StoreReader::iter ( #2360 )
...
* Fix trait bound of StoreReader::iter
Similar to `StoreReader::get`, `StoreReader::iter` should only require
`DocumentDeserialize` and not `Document`.
* Mark the iterator returned by SegmentReader::doc_ids_alive as Send so it can be used in impls of Stream/AsyncIterator.
2024-04-15 15:50:02 +02:00
trinity-1686a
d2955a3fd2
extend field grouping ( #2333 )
...
* extend field grouping
2024-04-15 10:36:32 +02:00
PSeitz
17d5869ad6
update CHANGELOG, use github API in cliff ( #2354 )
...
* update CHANGELOG, use github API in cliff
* reset version to 0.21.1, before release
* chore: Release
* remove unreleased from CHANGELOG
0.22.0
2024-04-15 10:07:20 +02:00
PSeitz
dfa3aed32d
check unsupported parameters top_hits ( #2351 )
...
* check unsupported parameters top_hits
* move to function
2024-04-10 08:20:52 +02:00
PSeitz
398817ce7b
add index sorting deprecation warning ( #2353 )
...
* add index sorting deprecation warning
* remove deprecated IntOptions and DatePrecision
2024-04-10 08:09:09 +02:00
PSeitz
74940e9345
clippy ( #2349 )
...
* fix clippy
* fix clippy
* fix duplicate imports
2024-04-09 07:54:44 +02:00
PSeitz
1e9fc51535
update ahash ( #2344 )
2024-04-09 06:35:39 +02:00
PSeitz
92c32979d2
fix postcard compatibility for top_hits, add postcard test ( #2346 )
...
* fix postcard compatibility for top_hits, add postcard test
* fix top_hits naming, delay data fetch
closes #2347
* fix import
2024-04-09 06:17:25 +02:00
PSeitz
b644d78a32
fix null byte handling in JSON paths ( #2345 )
...
* fix null byte handling in JSON paths
closes https://github.com/quickwit-oss/tantivy/issues/2193
closes https://github.com/quickwit-oss/tantivy/issues/2340
* avoid repeated term truncation
* fix test
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* add comment
---------
Co-authored-by: Paul Masurel <paul@quickwit.io >
2024-04-05 09:53:35 +02:00
PSeitz
4e79e11007
add collect_block to BoxableSegmentCollector ( #2331 )
2024-03-21 09:10:25 +01:00
PSeitz
67ebba3c3c
expose collect_block buffer size ( #2326 )
...
* expose buffer of collect_block
* flip shard_size segment_size
2024-03-15 08:02:08 +01:00
PSeitz
7ce950f141
add method to fetch block of first vals in columnar ( #2330 )
...
* add method to fetch block of first vals in columnar
add method to fetch block of first vals in columnar (this is way faster
than single calls for full columns)
add benchmark
fix import warnings
```
test bench_get_block_first_on_full_column ... bench: 56 ns/iter (+/- 26)
test bench_get_block_first_on_full_column_single_calls ... bench: 311 ns/iter (+/- 6)
test bench_get_block_first_on_multi_column ... bench: 378 ns/iter (+/- 15)
test bench_get_block_first_on_multi_column_single_calls ... bench: 546 ns/iter (+/- 13)
test bench_get_block_first_on_optional_column ... bench: 291 ns/iter (+/- 6)
test bench_get_block_first_on_optional_column_single_calls ... bench: 362 ns/iter (+/- 8)
```
* use remainder
2024-03-15 08:01:47 +01:00
dependabot[bot]
0cffe5fb09
Update base64 requirement from 0.21.0 to 0.22.0 ( #2324 )
...
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64 ) to permit the latest version.
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.21.0...v0.22.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-15 15:50:34 +09:00
PSeitz
b0e65560a1
handle IP addresses in term aggregation ( #2319 )
...
* handle IP addresses in term aggregation
Store IP addresses during the segment term aggregation via their u64 representation
and convert to u128 (IPv6 address) via downcast when converting to intermediate results.
Enable Downcasting on `ColumnValues`
Expose u64 variant for u128 encoded data via `open_u64_lenient` method.
Remove lifetime in VecColumn, to avoid 'static lifetime requirement coming
from downcast trait.
* rename method
2024-03-14 09:41:18 +01:00
PSeitz
ec37295b2f
add fast path for full columns in fetch_block ( #2328 )
...
Spotted in `range_date_histogram` query in quickwit benchmark:
5% of time copying docs around, which is not needed in the full index case
remove Column to ColumnIndex deref
2024-03-14 04:07:11 +01:00
trinity-1686a
f6b0cc1aab
allow some mixing of occur and bool in strict query parser ( #2323 )
...
* allow some mixing of occur and bool in strict query parser
* allow all mixing of binary and occur in strict parser
2024-03-07 15:17:48 +01:00
PSeitz
7e41d31c6e
agg: support to deserialize f64 from string ( #2311 )
...
* agg: support to deserialize f64 from string
* remove visit_string
* disallow NaN
2024-03-05 05:49:41 +01:00
Adam Reichold
40aa4abfe5
Make FacetCounts defaultable and cloneable. ( #2322 )
2024-03-05 04:11:11 +01:00
dependabot[bot]
2650317622
Update fs4 requirement from 0.7.0 to 0.8.0 ( #2321 )
...
Updates the requirements on [fs4](https://github.com/al8n/fs4-rs ) to permit the latest version.
- [Release notes](https://github.com/al8n/fs4-rs/releases )
- [Commits](https://github.com/al8n/fs4-rs/commits )
---
updated-dependencies:
- dependency-name: fs4
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-27 03:38:04 +01:00
Paul Masurel
6739357314
Removing split_size and adding split_size and shard_size as segment_size ( #2320 )
...
aliases.
2024-02-26 11:35:22 +01:00
PSeitz
d57622d54b
support bool type in term aggregation ( #2318 )
...
* support bool type in term aggregation
* add Bool to Intermediate Key
2024-02-20 03:22:22 +01:00
PSeitz
f745dbc054
fix Clone for TopNComputer, add top_hits bench ( #2315 )
...
* fix Clone for TopNComputer, add top_hits bench
add top_hits agg bench
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_sub_agg ... bench: 123,475,175 ns/iter (+/- 30,608,889)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_sub_agg_multi ... bench: 194,170,414 ns/iter (+/- 36,495,516)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_sub_agg_opt ... bench: 179,742,809 ns/iter (+/- 29,976,507)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_sub_agg_sparse ... bench: 27,592,534 ns/iter (+/- 2,672,370)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_top_hits_agg ... bench: 552,851,227 ns/iter (+/- 71,975,886)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_top_hits_agg_multi ... bench: 558,616,384 ns/iter (+/- 100,890,124)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_top_hits_agg_opt ... bench: 554,031,368 ns/iter (+/- 165,452,650)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_with_top_hits_agg_sparse ... bench: 46,435,919 ns/iter (+/- 13,681,935)
* add comment
2024-02-20 03:22:00 +01:00
PSeitz
79b041f81f
clippy ( #2314 )
2024-02-13 05:56:31 +01:00
PSeitz
0e16ed9ef7
Fix serde for TopNComputer ( #2313 )
...
* Fix serde for TopNComputer
The top hits aggregation changed the TopNComputer to be serializable,
but the capacity needs to be carried over, because pushing elements
checks against it (capacity == 0 is not allowed).
* use serde from deser
* remove pub, clippy
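A simplified sketch of why the capacity has to survive a serialization round trip. This is not tantivy's actual `TopNComputer`: the `TopN` type, its `from_parts` constructor, and the `2 * top_n` buffer policy are assumptions made for illustration. The point is that the limit is kept as an explicit field and restored on reconstruction, rather than inferred from whatever capacity a deserialized `Vec` happens to have.

```rust
/// Hypothetical simplified top-n buffer (not tantivy's TopNComputer).
struct TopN {
    top_n: usize,     // logical capacity, must be carried over explicitly
    buffer: Vec<u64>, // holds up to 2 * top_n items before truncation
}

impl TopN {
    fn new(top_n: usize) -> Self {
        assert!(top_n > 0, "a capacity of 0 is not allowed");
        TopN { top_n, buffer: Vec::with_capacity(2 * top_n) }
    }

    /// Rebuilds the state from "deserialized" parts, restoring the capacity.
    fn from_parts(top_n: usize, items: Vec<u64>) -> Self {
        let mut top = TopN::new(top_n);
        for item in items {
            top.push(item);
        }
        top
    }

    fn push(&mut self, value: u64) {
        // The check depends on top_n, not on Vec::capacity(), which a
        // deserialized Vec would not preserve.
        if self.buffer.len() >= 2 * self.top_n {
            self.truncate();
        }
        self.buffer.push(value);
    }

    /// Keeps only the top_n largest values, in descending order.
    fn truncate(&mut self) {
        self.buffer.sort_unstable_by(|a, b| b.cmp(a));
        self.buffer.truncate(self.top_n);
    }

    fn into_sorted_vec(mut self) -> Vec<u64> {
        self.truncate();
        self.buffer
    }
}
```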
2024-02-07 12:52:06 +01:00
mochi
88a3275dbb
add shared search executor ( #2312 )
2024-02-05 09:33:00 +01:00
PSeitz
1223a87eb2
add fuzz test for hashmap ( #2310 )
2024-01-31 10:30:21 +01:00
PSeitz
48630ceec9
move into new index module ( #2259 )
...
move core modules to index module
2024-01-31 10:30:04 +01:00
Adam Reichold
72002e8a89
Make test builds Clippy clean. ( #2277 )
2024-01-31 02:47:06 +01:00
trinity-1686a
3c9297dd64
report if posting list was actually loaded when warming it up ( #2309 )
2024-01-29 15:23:16 +01:00