tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-01-06 01:02:55 +00:00

Author	SHA1	Message	Date
trinity-1686a	85395d942a	fix clippy lints from 1.80-1.81 (#2488) * fix some clippy lints * fix clippy::doc_lazy_continuation * fix some lints for 1.82	2024-09-05 14:33:05 +02:00
PSeitz	27be6aed91	lift clauses in LogicalAst (#2449) (a OR b) OR (c OR d) can be simplified to (a OR b OR c OR d) (a AND b) AND (c AND d) can be simplified to (a AND b AND c AND d) This directly affects how queries are executed remove unused SumWithCoordsCombiner the number of fields is unused and private	2024-08-14 19:21:26 +02:00
PSeitz	56d79cb203	fix cardinality aggregation performance (#2446) * fix cardinality aggregation performance fix cardinality performance by fetching multiple terms at once. This avoids decompressing the same block and keeps the buffer state between terms. add cardinality aggregation benchmark bump rust version to 1.66 Performance comparison to before (AllQuery) ``` full cardinality_agg Memory: 3.5 MB (-0.00%) Avg: 21.2256ms (-97.78%) Median: 21.0042ms (-97.82%) [20.4717ms .. 23.6206ms] terms_few_with_cardinality_agg Memory: 10.6 MB Avg: 81.9293ms (-97.37%) Median: 81.5526ms (-97.38%) [79.7564ms .. 88.0374ms] dense cardinality_agg Memory: 3.6 MB (-0.00%) Avg: 25.9372ms (-97.24%) Median: 25.7744ms (-97.25%) [24.7241ms .. 27.8793ms] terms_few_with_cardinality_agg Memory: 10.6 MB Avg: 93.9897ms (-96.91%) Median: 92.7821ms (-96.94%) [90.3312ms .. 117.4076ms] sparse cardinality_agg Memory: 895.4 KB (-0.00%) Avg: 22.5113ms (-95.01%) Median: 22.5629ms (-94.99%) [22.1628ms .. 22.9436ms] terms_few_with_cardinality_agg Memory: 680.2 KB Avg: 26.4250ms (-94.85%) Median: 26.4135ms (-94.86%) [26.3210ms .. 26.6774ms] ``` * clippy * assert for sorted ordinals	2024-07-02 15:29:00 +08:00
落叶乌龟	f9ae295507	feat(query): Make `BooleanQuery` supports `minimum_number_should_match` (#2405) * feat(query): Make `BooleanQuery` supports `minimum_number_should_match`. see issue #2398 In this commit, a novel scorer named DisjunctionScorer is introduced, which performs the union of inverted chains with the minimal required elements. BTW, it's implemented via a min-heap. Necessary modifications on `BooleanQuery` and `BooleanWeight` are performed as well. * fixup! fix test * fixup!: refactor code. 1. More meaningful names. 2. Add Cache for `Disjunction`'s scorers, and fix bug. 3. Optimize `BooleanWeight::complex_scorer` Thanks Paul Masurel <paul@quickwit.io> * squash!: come up with better variable naming. * squash!: fix naming issues. * squash!: fix typo. * squash!: Remove CombinationMethod::FullIntersection	2024-07-01 15:39:41 +08:00
Harrison Burt	1c7c6fd591	POC: Tantivy documents as a trait (#2071 ) * fix windows build (#1) * Fix windows build * Add doc traits * Add field value iter * Add value and serialization * Adjust order * Fix bug * Correct type * Fix generic bugs * Reformat code * Add generic to index writer which I forgot about * Fix missing generics on single segment writer * Add missing type export * Add default methods for convenience * Cleanup * Fix more-like-this query to use standard types * Update API and fix tests * Add doc traits * Add field value iter * Add value and serialization * Adjust order * Fix bug * Correct type * Rebase main and fix conflicts * Reformat code * Merge upstream * Fix missing generics on single segment writer * Add missing type export * Add default methods for convenience * Cleanup * Fix more-like-this query to use standard types * Update API and fix tests * Add tokenizer improvements from previous commits * Add tokenizer improvements from previous commits * Reformat * Fix unit tests * Fix unit tests * Use enum in changes * Stage changes * Add new deserializer logic * Add serializer integration * Add document deserializer * Implement new (de)serialization api for existing types * Fix bugs and type errors * Add helper implementations * Fix errors * Reformat code * Add unit tests and some code organisation for serialization * Add unit tests to deserializer * Add some small docs * Add support for deserializing serde values * Reformat * Fix typo * Fix typo * Change repr of facet * Remove unused trait methods * Add child value type * Resolve comments * Fix build * Fix more build errors * Fix more build errors * Fix the tests I missed * Fix examples * fix numerical order, serialize PreTok Str * fix coverage * rename Document to TantivyDocument, rename DocumentAccess to Document add Binary prefix to binary de/serialization * fix coverage --------- Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>	2023-10-02 10:01:16 +02:00
PSeitz	2d7390341c	increase min memory to 15MB for indexing (#2176) With tantivy 0.20 the minimum memory consumption per SegmentWriter increased to 12MB. 7MB are for the different fast field collectors types (they could be lazily created). Increase the minimum memory from 3MB to 15MB. Change memory variable naming from arena to budget. closes #2156	2023-09-13 07:38:34 +02:00
RT_Enzyme	ff3d3313c4	fix BooleanQuery document (#1999) * fix BooleanQuery document * Update src/query/boolean_query/boolean_query.rs --------- Co-authored-by: Paul Masurel <paul@quickwit.io>	2023-04-20 11:37:20 +02:00
Adam Reichold	2080c370c2	Enable usage of FuzzyTermQuery for specific fields via QueryParser (#1750) * Make nightly Clippy mostly happy. * Document how to produce TermSetQuery queries using QueryParser. * Enable construction of queries using FuzzyTermQuery via the QueryParser * Use FxHashMap instead of HashMap in the QueryParser as these hash tables are not exposed to DoS attacks. * Use a struct instead of a tuple to improve readability.	2023-01-04 18:11:27 +09:00
Paul Masurel	3edf0a2724	Using the manual reload policy in IndexWriter. (#1667)	2022-11-09 11:20:41 +01:00
Adam Reichold	71ab482720	RFC: Use a more general but still object-safe signature for Query::query_terms. (#1468) * Use a more general but still object-safe signature for Query::query_terms. * Further constraint the generalized Query::query_terms signature to allow extracting references to terms.	2022-08-24 06:34:07 +09:00
Pasha Podolsky	09aae134e6	[feat] Implement `DisjunctionMaxQuery` and refactor `ScoreCombiner`	2022-07-28 20:47:20 +03:00
Ryan Russell	b33b4c0092	Fix various `occurrence` var names and references (#1385) Thank you Ryan! Signed-off-by: Ryan Russell <git@ryanrussell.org>	2022-06-07 11:08:19 +09:00
Paul Masurel	d7b46d2137	Added JSON Type (#1270) - Removed useless copy when ingesting JSON. - Bugfix in phrase query with a missing field norms. - Disabled range query on default fields Closes #1251	2022-02-24 16:25:22 +09:00
Paul Masurel	eca6628b3c	Minor refactoring (#1266)	2022-01-28 15:55:55 +09:00
Paul Masurel	7234bef0eb	Issue/1198 (#1201) * Unit test reproducing #1198 * Fixing unit test to handle the error from add_document. * Bump project version	2021-11-11 16:42:19 +09:00
Paul Masurel	6e4b61154f	Issue/1070 (#1071) Add a boolean flag in the Query::query_terms informing on whether position information is required. Closes #1070	2021-06-03 22:33:20 +09:00
Stéphane Campinas	a0ec6e1e9d	Expand the DocAddress struct with named fields	2021-03-28 19:00:23 +02:00
Paul Masurel	9e27da8b4e	Added CR comments. Added Unit tests.	2020-10-28 17:35:34 +09:00
Adrien Guillo	7f373f232a	Add helper methods for BooleanQuery	2020-10-28 17:35:34 +09:00
Paul Masurel	ae14022bf0	Removed `use::Result`. (#771)	2020-01-31 18:47:02 +09:00
petr-tik	4a8f7712f3	Add a doctest to BooleanQuery (#630) * Add a doctest to BooleanQuery Closes #446 Mark a function that is only used in tests to be compiled for tests only Fix doc-comments in a couple of related files * Minor corrections remove whitespace, fix typos, add explicit dyn marker * WIP: BooleanQuery doc test Trying to nest several BooleanQueries together * Addressed old review rust 2018 edition + make function available to everyone * Box the previous query to resolve the type error * Rework wording in DocAdress document strings * Reworded and restructured the docstring	2019-10-07 10:05:12 +09:00
Paul Masurel	462774b15c	Tiqb feature/2018 (#583) * rust 2018 * Added CHANGELOG comment	2019-07-01 10:01:46 +09:00
Paul Masurel	279a9eb5e3	Closes #449 (#450) Clippy working on stable. Clippy warnings addressed	2018-12-10 12:20:59 +09:00
Paul Masurel	10f6c07c53	Clippy (#422) * Cargo Format * Clippy	2018-09-15 20:20:22 +09:00
Paul Masurel	06e7bd18e7	Clippy (#421) * Cargo Format * Clippy * bugfix * still clippy stuff * clippy step 2	2018-09-15 14:56:14 +09:00
Paul Masurel	37e4280c0a	Cargo Format (#420)	2018-09-15 07:44:22 +09:00
Paul Masurel	e32dba1a97	Phrase weight	2018-09-10 09:26:33 +09:00
Paul Masurel	8ebbf6b336	Issue/325 (#330) * Introducing a SegmentMea inventory. * Depending on census=0.1 * Cargo fmt	2018-06-30 13:11:41 +09:00
Paul Masurel	8ccbfdea5d	Preparing for release	2018-06-22 14:27:46 +09:00
Jason Wolfe	0cea706f10	Add docs to new Query methods (#307)	2018-05-18 13:53:29 +09:00
Jason Wolfe	72acad0921	Add box_clone() and downcast::Any to Query (#303)	2018-05-18 09:53:11 +09:00
Paul Masurel	78673172d0	Cargo fmt	2018-04-21 20:05:36 +09:00
Paul Masurel	e44782bf14	No more	2018-04-12 13:01:11 +09:00
Paul Masurel	ef94582203	Rustfmt	2018-02-19 12:12:10 +09:00
Paul Masurel	e608e0a1df	Removed half baked usage of Any	2018-02-18 10:01:14 +09:00
Paul Masurel	292bb17346	Disable scoring - Disabling scoring is an argument of the `.weight()` method - Collectors declare whether they need scoring	2018-02-17 12:43:16 +09:00
Paul Masurel	1da06d867b	Using the same logic when score is enabled.	2018-02-16 17:36:33 +09:00
Paul Masurel	c4125bda59	Backmerging master	2018-02-12 11:08:57 +09:00
Paul Masurel	1fc7afa90a	Issue/range query (#242) BitSet and RangeQuery	2018-02-05 09:33:25 +09:00
Paul Masurel	fb5476d5de	Query optimization: phrase query + union	2018-02-02 16:39:17 +09:00
Paul Masurel	dd8332c327	Added disabling scoring	2018-02-02 12:11:56 +09:00
Paul Masurel	1947a19700	Added bitse	2018-01-31 23:56:54 +09:00
Paul Masurel	f24e5f405e	NOBUG intellij misc lint	2017-12-14 18:23:35 +09:00
Paul Masurel	426cc436da	Test passing	2017-09-10 17:48:41 +09:00
Paul Masurel	f8710bd4b0	Format	2017-08-28 18:22:41 +09:00
Ashley Mannix	2b2703cf51	run cargo fmt	2017-05-29 18:29:39 +09:00
Paul Masurel	4c8f9742f8	format	2017-05-15 22:30:18 +09:00
Paul Masurel	69e11d3779	issue/57 Cleaning. Closes #57 Closes #56 Closes #23	2016-11-17 23:18:24 +09:00
Paul Masurel	f7c882f3da	issue/50 Added use case for BooleanQuery	2016-11-04 12:23:27 +09:00
Paul Masurel	f2df0bf0e9	issue/50 Small formatting change.	2016-11-04 00:11:46 +09:00

52 Commits