Paul Masurel
519e5d2ed1
clippy warnings
2025-03-05 11:15:06 +01:00
Paul Masurel
df2d52a84e
follow up on the fix of multiply with overflow
2025-03-05 11:15:05 +01:00
Paul Masurel
371dba9414
Merge pull request #2591 from quickwit-oss/cargo-fmt
...
Cargo fmt
2025-03-05 11:08:06 +01:00
Paul Masurel
0afabad494
Cargo fmt
2025-03-05 11:07:46 +01:00
Remi Dettai
89b052cd42
Catch panics during merges (#2582)
...
* Adding panic handler for the rayon merge thread pool
* Return panic message in error
---------
Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2025-03-05 10:36:48 +01:00
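A minimal sketch of the idea behind the two bullets in the commit above (catch a panic, then surface its message as an error). It uses only `std::panic::catch_unwind`; the helper name `run_catching_panic` is hypothetical and this is not the code from the PR, which installs a handler on the rayon merge thread pool.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical helper (not tantivy's API): runs `task` and converts a
/// panic into `Err(message)` instead of unwinding into the caller.
fn run_catching_panic<T>(task: impl FnOnce() -> T) -> Result<T, String> {
    panic::catch_unwind(AssertUnwindSafe(task)).map_err(|payload| {
        // Panic payloads are usually a `&'static str` (literal message)
        // or a `String` (formatted message).
        if let Some(msg) = payload.downcast_ref::<&str>() {
            (*msg).to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    // A task that completes normally passes its value through.
    assert_eq!(run_catching_panic(|| 1 + 1), Ok(2));

    // A panicking task is reported as an error carrying the panic message.
    let res = run_catching_panic(|| -> u32 { panic!("merge failed on segment {}", 7) });
    assert_eq!(res, Err("merge failed on segment 7".to_string()));
}
```

The default panic hook still prints the message to stderr; only the unwinding is intercepted.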
SteveLauC
c48c649436
refactor: use std AtomicU64 and remove wrapper (#2585)
2025-02-24 03:56:15 +01:00
Paul Masurel
58c0739953
Merge pull request #2581 from quickwit-oss/merge_dict_column_repro
...
use usize in bitpacker
2025-02-21 10:53:07 +09:00
Pascal Seitz
e7daf69de9
use usize in bitpacker
...
use usize in bitpacker to enable larger columns in the columnar store
Godbolt comparison with u32 vs u64 for get access: https://godbolt.org/z/cjf7nenYP
Add a mini-tool to inspect columnar files created by tantivy. (very basic functionality which can be extended later)
2025-02-20 15:39:10 +01:00
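To illustrate why the index width matters in a bit-packed column, here is a small sketch under stated assumptions: `pack` and `get` are hypothetical helpers with a simple little-endian layout, not tantivy's bitpacker. The bit address of a value is `idx * num_bits`, and that product can exceed what a `u32` holds for large columns.

```rust
/// Hypothetical packer: stores each value in `num_bits` bits, little-endian.
fn pack(values: &[u64], num_bits: usize) -> Vec<u8> {
    let mut out = vec![0u8; (values.len() * num_bits + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        for b in 0..num_bits {
            if (v >> b) & 1 == 1 {
                let bit = i * num_bits + b;
                out[bit / 8] |= 1 << (bit % 8);
            }
        }
    }
    out
}

/// Hypothetical reader: returns the `idx`-th value of width `num_bits`.
/// The bit address is computed in `usize`, so it does not wrap at 2^32
/// on 64-bit targets.
fn get(data: &[u8], num_bits: usize, idx: usize) -> u64 {
    assert!(num_bits <= 56, "value plus bit shift must fit in one u64 read");
    let bit_addr = idx * num_bits;
    let byte_addr = bit_addr / 8;
    let shift = bit_addr % 8;
    // Copy up to 8 bytes, zero-padding at the end of the buffer.
    let mut buf = [0u8; 8];
    let end = (byte_addr + 8).min(data.len());
    buf[..end - byte_addr].copy_from_slice(&data[byte_addr..end]);
    let mask = (1u64 << num_bits) - 1;
    (u64::from_le_bytes(buf) >> shift) & mask
}

fn main() {
    let values = [5u64, 1023, 0, 777];
    let packed = pack(&values, 10);
    for (i, &v) in values.iter().enumerate() {
        assert_eq!(get(&packed, 10, i), v);
    }

    // With 32-bit arithmetic the bit address overflows for large columns...
    assert_eq!(300_000_000u32.checked_mul(20), None);
    // ...while 64-bit arithmetic represents it without trouble.
    assert_eq!(300_000_000u64 * 20, 6_000_000_000);
}
```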
trinity-1686a
f060e86bc6
Merge pull request #2578 from quickwit-oss/1686a/buildable-histo-agg
...
make DateHistogramAggregationReq buildable
2025-02-18 15:30:54 +01:00
trinity Pointard
0368162ef0
make DateHistogramAggregationReq buildable
2025-02-18 11:45:24 +01:00
trinity-1686a
e843c71015
Merge pull request #2568 from quickwit-oss/trinity/wildcard-query-parser
...
allow term starting with wildcard in query parser
2025-02-12 16:47:25 +01:00
trinity Pointard
5cea16ef9f
improve handling of special char after exist query
2025-01-22 16:04:31 +01:00
dependabot[bot]
4aa8cd2470
Update downcast-rs requirement from 1.2.1 to 2.0.1 (#2566)
...
Updates the requirements on [downcast-rs](https://github.com/marcianx/downcast-rs) to permit the latest version.
- [Changelog](https://github.com/marcianx/downcast-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/marcianx/downcast-rs/compare/v1.2.1...v2.0.1)
---
updated-dependencies:
- dependency-name: downcast-rs
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-22 10:32:24 +01:00
trinity Pointard
4d4ee1b0ac
allow term starting with wildcard in query parser
2025-01-15 10:27:48 +01:00
dependabot[bot]
43c89b4360
Update itertools requirement from 0.13.0 to 0.14.0 (#2563)
...
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.13.0...v0.14.0)
---
updated-dependencies:
- dependency-name: itertools
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-08 17:11:46 +01:00
trinity-1686a
d281ca3e65
Merge pull request #2559 from quickwit-oss/trinity/sstable-partial-automaton
...
allow partially warming up an sstable for an automaton
2025-01-08 16:35:35 +01:00
trinity Pointard
be17daf658
split iterator
2025-01-08 16:24:34 +01:00
trinity Pointard
6ca84a61fa
make termdict always clone
2025-01-08 16:19:54 +01:00
trinity Pointard
037d12c9c9
fix deadlocking on automaton warmup
2025-01-06 11:58:58 +01:00
Remi Dettai
71cf19870b
Exist queries match subpath fields (#2558)
...
* Exist queries match subpath fields
* Make subpath check optional
* Add async subpath listing
2025-01-06 10:17:39 +01:00
trinity Pointard
175a529c41
use executor for cpu-heavy sstable decompression for automaton
2025-01-03 19:14:07 +01:00
trinity Pointard
fe0c7c5408
change rangebound style
2025-01-02 11:56:05 +01:00
Harrison Burt
148594f0f9
Improve IndexWriter customisation via builder (#2562)
...
* Improve `IndexWriter` customisation via builder
* Remove change noise from PR
* Correct documentation
* Resolve comments and add test
2025-01-02 09:43:22 +01:00
dependabot[bot]
8edb439440
Update rustc-hash requirement from 1.1.0 to 2.1.0 (#2551)
...
---
updated-dependencies:
- dependency-name: rustc-hash
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-26 10:25:05 +01:00
trinity Pointard
dfff5f3bcb
rename merge_holes_under => merge_holes_under_bytes
2024-12-23 16:17:44 +01:00
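The new name suggests a threshold in bytes below which the hole between two byte ranges is considered cheap enough to merge them into a single read. The following is only a sketch of that reading: `merge_ranges` is a hypothetical function, and the exact semantics in the sstable code may differ.

```rust
use std::ops::Range;

/// Hypothetical illustration: merges sorted byte ranges whenever the hole
/// between two neighbours is strictly smaller than `merge_holes_under_bytes`.
fn merge_ranges(ranges: &[Range<usize>], merge_holes_under_bytes: usize) -> Vec<Range<usize>> {
    let mut merged: Vec<Range<usize>> = Vec::new();
    for r in ranges {
        match merged.last_mut() {
            // Hole size is the distance from the previous end to this start.
            Some(last) if r.start.saturating_sub(last.end) < merge_holes_under_bytes => {
                last.end = last.end.max(r.end);
            }
            _ => merged.push(r.clone()),
        }
    }
    merged
}

fn main() {
    let ranges = [0..10, 12..20, 100..110];
    // The 2-byte hole between 10 and 12 is under the threshold, the 80-byte hole is not.
    assert_eq!(merge_ranges(&ranges, 5), vec![0..20, 100..110]);
}
```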
trinity-1686a
ebf4d84553
add comment about cpu-intensive operation in async context
2024-12-20 12:23:49 +01:00
trinity-1686a
42efc7f7c8
clippy
2024-12-20 11:00:11 +01:00
trinity-1686a
192395c311
attempt at simplifying can_block_match_automaton
2024-12-20 10:25:38 +01:00
trinity-1686a
a1447cc9c2
remove breaking change in sstable public api
2024-12-19 17:30:05 +01:00
trinity-1686a
c39d91f827
Merge pull request #2547 from quickwit-oss/trinity/count-str
...
add support for counting non-integer values in aggregation
2024-12-17 15:27:30 +01:00
trinity Pointard
32b6e9711b
add tests
2024-12-13 16:06:24 +01:00
trinity-1686a
24c5dc2398
allow warming up automaton
2024-12-10 13:32:12 +01:00
trinity-1686a
9e2ddec4b3
merge adjacent blocks when building delta for automaton
2024-12-10 13:32:12 +01:00
trinity-1686a
1f6a8e74bb
support iterating over partially loaded sstable
2024-12-10 13:32:12 +01:00
trinity-1686a
7e901f523b
get iter for blocks of sstable matching automaton
2024-12-10 13:32:12 +01:00
trinity-1686a
3c30a41c14
add helper to figure out if a block can match an automaton
2024-12-10 13:32:12 +01:00
dependabot[bot]
0f99d4f420
Update measure_time requirement from 0.8.2 to 0.9.0 (#2557)
...
---
updated-dependencies:
- dependency-name: measure_time
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-09 21:39:01 +01:00
Pierre Barre
6e02c5cb25
Make NUM_MERGE_THREADS configurable (#2535)
...
* Make `NUM_MERGE_THREADS` configurable
* Remove unused import
* Reword comment src/index/index.rs
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
---------
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2024-12-09 16:53:11 +08:00
PSeitz
876a579e5d
queryparser: add field respecification test (#2550)
2024-12-02 14:17:12 +01:00
PSeitz
4c52499622
clippy (#2549)
2024-11-29 16:08:21 +08:00
trinity-1686a
0bac391291
add support for counting non-integer values in aggregation
2024-11-28 19:52:47 +01:00
PSeitz
52d4e81e70
update CHANGELOG (#2546)
2024-11-27 20:49:35 +08:00
dependabot[bot]
c71ea7b2ef
Update thiserror requirement from 1.0.30 to 2.0.1 (#2542)
...
Updates the requirements on [thiserror](https://github.com/dtolnay/thiserror) to permit the latest version.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.30...2.0.1)
---
updated-dependencies:
- dependency-name: thiserror
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-09 08:08:34 +08:00
Paul Masurel
c35a782747
Updating rustc-hash and clippy fixes (#2532)
...
* Updating rustc-hash and clippy fixes
* fix terms_aggregation_min_doc_count_special_case
---------
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-11-01 13:46:26 +08:00
dependabot[bot]
c66af2c0a9
Update binggan requirement from 0.12.0 to 0.14.0 (#2530)
...
* Update binggan requirement from 0.12.0 to 0.14.0
---
updated-dependencies:
- dependency-name: binggan
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* fix build
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-10-24 09:41:35 +08:00
Joan Antoni RE
f9ac055847
Fix some links in architecture docs (#2528)
2024-10-23 21:06:54 +09:00
PSeitz
21d057059e
clippy (#2527)
...
* clippy
* clippy
* clippy
* clippy
* convert allow to expect and remove unused
* cargo fmt
* cleanup
* export sample
* clippy
2024-10-22 09:26:54 +08:00
PSeitz
dca508b4ca
remove read_postings_no_deletes (#2526)
...
closes #2525
2024-10-22 09:52:43 +09:00
PSeitz
aebae9965d
add RegexPhraseQuery (#2516)
...
* add RegexPhraseQuery
RegexPhraseQuery supports phrase queries with regex. It supports regex
and wildcards. E.g. a query with wildcards:
"b* b* wolf" matches "big bad wolf"
Slop is supported as well:
"b* wolf"~2 matches "big bad wolf"
Regex queries may match a lot of terms, and we still need to
keep track of which term hit in order to load the positions.
The phrase query algorithm groups terms by their frequency
together in the union to prefilter groups early.
This PR comes with some new data structures:
SimpleUnion - A union docset for a list of docsets. It doesn't do any
caching and is therefore well suited for datasets with lots of skipping
(phrase search, but intersections in general).
LoadedPostings - Like SegmentPostings, but all docs and positions are loaded in
memory. SegmentPostings uses 1840 bytes per instance with its caches,
which is equivalent to 460 docids (at 4 bytes per docid).
LoadedPostings is used for terms which have fewer than 100 docs.
LoadedPostings is only used to reduce memory consumption.
BitSetPostingUnion - Creates a `Posting` that uses the bitset for docid
hits and the docsets for positions. The BitSet is the precalculated
union of the docsets.
In the RegexPhraseQuery there is a size limit of 512 docsets per PreAggregatedUnion,
before creating a new one.
Renamed Union to BufferedUnionScorer.
Added proptests to test different union types.
* cleanup
* use Box instead of Vec
* use RefCell instead of term_freq(&mut)
* remove wildcard mode
* move RefCell to outer
* clippy
2024-10-21 18:29:17 +08:00
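The wildcard and slop examples quoted in the RegexPhraseQuery commit above can be reproduced with a toy matcher. This is a simplified sketch, not the query implementation: `token_matches` and `phrase_matches` are hypothetical names, only a trailing `*` is handled (no general regex), tokens are split on whitespace, and slop is modelled as the total number of extra tokens allowed between consecutive phrase terms, which may not match tantivy's exact slop semantics.

```rust
/// Hypothetical helper: a pattern ending in `*` is a prefix match,
/// anything else must match the token exactly.
fn token_matches(pattern: &str, token: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => token.starts_with(prefix),
        None => pattern == token,
    }
}

/// Tries to match `patterns` in order, starting at token index `start`.
/// The first term may begin anywhere; later terms may skip at most
/// `slop_left` tokens in total.
fn search(patterns: &[&str], tokens: &[&str], start: usize, slop_left: usize, first: bool) -> bool {
    let (head, rest) = match patterns.split_first() {
        Some(parts) => parts,
        None => return true, // every term has been matched
    };
    let max_skip = if first { tokens.len() } else { slop_left };
    for skip in 0..=max_skip {
        let pos = start + skip;
        if pos >= tokens.len() {
            break;
        }
        if token_matches(head, tokens[pos]) {
            let left = if first { slop_left } else { slop_left - skip };
            if search(rest, tokens, pos + 1, left, false) {
                return true;
            }
        }
    }
    false
}

/// Hypothetical entry point: does `text` contain the wildcard phrase?
fn phrase_matches(pattern: &str, slop: usize, text: &str) -> bool {
    let patterns: Vec<&str> = pattern.split_whitespace().collect();
    let tokens: Vec<&str> = text.split_whitespace().collect();
    !patterns.is_empty() && search(&patterns, &tokens, 0, slop, true)
}

fn main() {
    // The two examples from the commit message.
    assert!(phrase_matches("b* b* wolf", 0, "big bad wolf"));
    assert!(phrase_matches("b* wolf", 2, "big bad wolf"));
    // Without slop, "big" and "wolf" must be adjacent.
    assert!(!phrase_matches("big wolf", 0, "big bad wolf"));
    assert!(phrase_matches("big wolf", 1, "big bad wolf"));
}
```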
Marvin
e7e3e3f44c
make casing in docs more consistent (#2524)
...
* make casing in docs more consistent
* more
* lowercase tantivy
2024-10-21 17:59:41 +09:00