Commit Graph

497 Commits

Author SHA1 Message Date
PSeitz
6a7a1106d6 work in batches of docs (#1937)
* work in batches of docs

* add fill_buffer test
2023-03-21 06:57:44 +01:00
trinity-1686a
064518156f refactor tokenization pipeline to use GATs (#1924)
* refactor tokenization pipeline to use GATs

* fix doctests

* fix clippy lints

* remove commented code
2023-03-09 09:39:37 +01:00
Paul Masurel
7fae4d98d7 Adapting for quickwit2 (#1912)
* Adapting tantivy to make it possible to be plugged to quickwit.

* Apply suggestions from code review

Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>

* Added unit test

---------

Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2023-03-01 16:27:46 +09:00
trinity-1686a
8a71e00da3 allow limiting the number of matched term in range query (#1899) 2023-02-27 10:44:08 +01:00
Paul Masurel
d25fc155b2 Making some of the column/termdict operations async-friendly (#1902) 2023-02-27 15:34:47 +09:00
Paul Masurel
66ff53b0f4 Various minor code cleanup (#1909) 2023-02-27 13:48:34 +09:00
Paul Masurel
d002698008 Re-export of query grammar. (#1908) 2023-02-27 12:26:34 +09:00
trinity-1686a
533ad99cd5 add PhrasePrefixQuery (#1842)
* add PhrasePrefixQuery
2023-02-22 11:18:33 +01:00
PSeitz
111f25a8f7 clippy (#1879)
* fix clippy

* fix clippy

* fmt
2023-02-17 11:34:21 +01:00
Alex Cole
f2f38c43ce Make BM25 scoring more flexible (#1855)
* Introduce Bm25StatisticsProvider to inject statistics

* fix formatting I accidentally changed
2023-02-16 19:14:12 +09:00
PSeitz
1cfb9ce59a improve range query performance (#1864)
fix RowId vs DocId naming
fixes #1863
2023-02-14 13:25:39 +09:00
PSeitz
36c6138e7f fix: auto downgrade index record option, instead of vint error (#1857)
Prev: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: IoError(Custom { kind: InvalidData, error: "Reach end of buffer while reading VInt" })', src/main.rs:46:14
Now: Automatic downgrade to next available level
2023-02-10 13:45:23 +01:00
Paul Masurel
b7bfa20e38 Fixed test performance. 2023-02-09 17:39:55 +01:00
Paul Masurel
bd5eea9852 Integrated columnar work. 2023-02-09 13:14:31 +01:00
PSeitz
f687b3a5aa start migrate Field to &str (#1772)
start migrate Field to &str in preparation of columnar
return Result for get_field
2023-01-18 16:12:07 +09:00
Shikhar Bhushan
2650111b76 EnableScoring::Disabled - optional Searcher (#1780) 2023-01-12 09:26:50 -05:00
PSeitz
1176555eff handle user input on get_docid_for_value_range (#1760)
* handle user input on get_docid_for_value_range

fixes #1757

* pass range as parameter
2023-01-12 14:20:16 +01:00
Adrien Guillo
e17996f2fd Allow range queries via fast fields on non-indexed fields 2023-01-11 09:56:13 -05:00
Adrien Guillo
14222a47a3 Fix typo (#1776) 2023-01-11 00:49:13 +09:00
Adam Reichold
8312c882a5 More cosmetic fixes for upcoming Clippy lints. (#1771) 2023-01-10 10:32:45 +01:00
Paul Masurel
7a8fce0ae7 Minor mini fixes 2023-01-10 14:15:30 +09:00
PSeitz
7c6cc818ae enable range query on fast field for u64 compatible types (#1762)
* enable range query on fast field for u64 compatible types

* rename, update benches
2023-01-10 04:08:26 +01:00
Adam Reichold
1afa5bf3db Make construction of LevenshteinAutomatonBuilder for FuzzyTermQuery instances lazy. (#1756) 2023-01-06 12:44:49 +09:00
PSeitz
07a51eb7c8 refactor multivalue fastfield, refactor range query (#1749)
Introduce MakeZero trait, remove make_zero from FastValue
Merge two multivalue fastfield implementations into one
prepare range query on fastfield for different types
2023-01-05 12:09:50 +01:00
Adam Reichold
2080c370c2 Enable usage of FuzzyTermQuery for specific fields via QueryParser (#1750)
* Make nightly Clippy mostly happy.

* Document how to produce TermSetQuery queries using QueryParser.

* Enable construction of queries using FuzzyTermQuery via the QueryParser

* Use FxHashMap instead of HashMap in the QueryParser as these hash tables are not exposed to DoS attacks.

* Use a struct instead of a tuple to improve readability.
2023-01-04 18:11:27 +09:00
Hasnain Lakhani
f4804ce2f5 Adjust spelling of "returns" in docs for DisjunctionMaxQuery (#1733) 2022-12-22 14:04:07 +09:00
Paul Masurel
4a6bf50e78 Clippy 2022-12-21 15:43:34 +09:00
PSeitz
f9171a3981 fix clippy (#1725)
* fix clippy

* fix clippy fastfield codecs

* fix clippy bitpacker

* fix clippy common

* fix clippy stacker

* fix clippy sstable

* fmt
2022-12-20 07:30:06 +01:00
boraarslan
495824361a Move split_full_path to Schema (#1692) 2022-11-29 20:56:13 +09:00
Paul Masurel
0b40a7fe43 Added a expand_dots JsonObjectOptions. (#1687)
Related with quickwit#2345.
2022-11-21 23:03:00 +09:00
trinity-1686a
e758080465 add support for TermSetQuery in query parser (#1683) 2022-11-17 16:49:49 +01:00
Paul Masurel
2a39289a1b Handle escaped dot in json path in the QueryParser. (#1682) 2022-11-16 07:18:34 +09:00
Pascal Seitz
8641155cbb remove column from MultiValuedU128FastFieldReader 2022-11-14 18:49:15 +08:00
Pascal Seitz
b7d0dd154a fmt 2022-11-14 14:49:15 +08:00
Pascal Seitz
e034328a8b Improve position_to_docid, refactor, add tests 2022-11-14 14:21:53 +08:00
Pascal Seitz
f811d1616b add support for ip range query on multivalue fastfields 2022-11-14 14:21:52 +08:00
Pascal Seitz
9e8a0c2cca Allow range query on fastfield without INDEXED 2022-11-10 15:56:08 +08:00
Paul Masurel
3edf0a2724 Using the manual reload policy in IndexWriter. (#1667) 2022-11-09 11:20:41 +01:00
Pascal Seitz
38ad46e580 fix clippy 2022-11-07 16:09:55 +08:00
PSeitz
0f98d91a39 Merge pull request #1646 from quickwit-oss/no_score_calls
No score calls if score is not requested
2022-11-01 20:09:32 +08:00
PSeitz
2af6b01c17 Update src/query/boolean_query/boolean_weight.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-11-01 16:13:00 +08:00
Pascal Seitz
43df356010 rename to docset 2022-10-27 16:53:38 +08:00
PSeitz
7a80851e36 Merge pull request #1645 from quickwit-oss/ip_field_range_query
add ip range query benchmark, add seek behaviour
2022-10-27 16:13:52 +08:00
Pascal Seitz
dfab201191 for_each_docset to iterate without score 2022-10-26 17:25:05 +08:00
PSeitz
0c2bd36fe3 Panic on duplicate field names (#1647)
fixes #1601
2022-10-26 16:17:33 +09:00
Pascal Seitz
af839753e0 No score calls if score is not requested 2022-10-26 12:18:35 +08:00
Pascal Seitz
fec2b63571 improve bench by adding more blanks in compact space 2022-10-25 22:09:01 +08:00
Pascal Seitz
6213ea476a pass positions parameter 2022-10-25 17:44:51 +08:00
Pascal Seitz
5e159c26bf add ip range query benchmark, add seek behaviour 2022-10-25 15:57:19 +08:00
Pascal Seitz
e772d3170d switch get_val() to u32
Fixes #1638
2022-10-24 19:05:57 +08:00