tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2025-12-27 20:42:54 +00:00

Author	SHA1	Message	Date
Paul Masurel	bd5eea9852	Integrated columnar work.	2023-02-09 13:14:31 +01:00
Shikhar Bhushan	2650111b76	EnableScoring::Disabled - optional Searcher (#1780 )	2023-01-12 09:26:50 -05:00
Paul Masurel	3edf0a2724	Using the manual reload policy in IndexWriter. (#1667 )	2022-11-09 11:20:41 +01:00
Pascal Seitz	dfab201191	for_each_docset to iterate without score	2022-10-26 17:25:05 +08:00
Pascal Seitz	af839753e0	No score calls if score is not requested	2022-10-26 12:18:35 +08:00
Pascal Seitz	6800fdec9d	add indexing for ip field Closes #1595	2022-10-18 10:07:48 +08:00
Adam Reichold	71ab482720	RFC: Use a more general but still object-safe signature for Query::query_terms. (#1468 ) * Use a more general but still object-safe signature for Query::query_terms. * Further constrain the generalized Query::query_terms signature to allow extracting references to terms.	2022-08-24 06:34:07 +09:00
PSeitz	23fe73a6c0	remove searcher pool and make Searcher cloneable (#1411 ) * remove searcher pool and make Searcher cloneable closes #1410 * use SearcherInner in InnerIndexReader	2022-07-12 18:07:48 +09:00
Paul Masurel	46d5de920d	Removes all usage of block_on, and use a oneshot channel instead. (#1315 ) * Removes all usage of block_on, and use a oneshot channel instead. Calling `block_on` panics in certain contexts. For instance, it panics when it is called in the context of another call to block. Using it in tantivy is unnecessary. We replace it by a thin wrapper around a oneshot channel that supports both async/sync. * Removing needless uses of async in the API. Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>	2022-03-18 16:54:58 +09:00
Paul Masurel	2ead010c83	Tantivy quickwit (#1293 ) * Added sstable and enabling it by default, and parallel boolean query. * Added async API for FileSlice. * Added async get_doc * Reduce blocksize to 32_000 * Added debug logs Quickwit-specific features are hidden behind the quickwit feature flag.	2022-02-25 17:32:49 +09:00
Paul Masurel	d7b46d2137	Added JSON Type (#1270 ) - Removed useless copy when ingesting JSON. - Bugfix in phrase query with a missing field norms. - Disabled range query on default fields Closes #1251	2022-02-24 16:25:22 +09:00
Paul Masurel	2069e3e52b	Fixing clippy comments	2022-02-01 10:24:05 +09:00
Paul Masurel	eca6628b3c	Minor refactoring (#1266 )	2022-01-28 15:55:55 +09:00
Paul Masurel	732f6847c0	Field type with codes (#1255 ) * Terms are now typed. This change is backward compatible: While the Term has a byte representation that is modified, a Term itself is a transient object that is not serialized as is in the index. Its .field() and .value_bytes() on the other hand are unchanged. This change offers better Debug information for terms. While not necessary it also will help in the support for JSON types. * Renamed Hierarchical Facet -> Facet	2022-01-07 20:49:00 +09:00
Paul Masurel	c81b3030fa	Issue/922b (#1233 ) * Add a NORMED option on field Make fieldnorm indexation optional: * for all types except text => added a NORMED option * for text field if STRING, field has no fieldnorm retained if TEXT, field has fieldnorm computed * Finalize making fieldnorm optional for all field types. - Using Option for fieldnorm readers.	2021-12-10 21:12:29 +09:00
Paul Masurel	7234bef0eb	Issue/1198 (#1201 ) * Unit test reproducing #1198 * Fixing unit test to handle the error from add_document. * Bump project version	2021-11-11 16:42:19 +09:00
François Massot	0462754673	Optimize block wand for one and several TermScorer. (#1190 ) * Added optimisation using block wand for single TermScorer. A proptest was also added. * Fix block wand algorithm by taking the last doc id of scores until the pivot scorer (included). * In block wand, when block max score is lower than the threshold, advance the scorer with best score. * Fix wrong condition in block_wand_single_scorer and add debug_assert to have an equality check on doc to break the loop.	2021-11-01 09:18:05 +09:00
Pascal Seitz	d7a6a409a1	renames	2021-09-23 20:33:11 +08:00
Pascal Seitz	a1ac63ee1c	fix clippy	2021-07-01 18:06:03 +02:00
Pascal Seitz	1e4df54ab3	fix clippy	2021-07-01 17:41:53 +02:00
Paul Masurel	6e4b61154f	Issue/1070 (#1071 ) Add a boolean flag in the Query::query_terms informing on whether position information is required. Closes #1070	2021-06-03 22:33:20 +09:00
Paul Masurel	39dd8cfe24	Cargo clippy. Acronym should not be full uppercase apparently.	2021-04-26 11:49:18 +09:00
Stéphane Campinas	a0ec6e1e9d	Expand the DocAddress struct with named fields	2021-03-28 19:00:23 +02:00
Paul Masurel	9c3cabce40	Updated version of the rand crate.	2021-01-06 18:09:00 +09:00
Paul Masurel	80a99539ce	Several TermDict operations now return an io::Result	2020-12-03 13:13:11 +09:00
Paul Masurel	e9aa27dace	Avoid computing the BM25 weight if scoring is disabled	2020-11-25 14:35:49 +09:00
Paul Masurel	d23aee76c9	Avoid loading fieldnorms when not necessary	2020-11-09 15:50:16 +09:00
Paul Masurel	36a0520a48	Added failing proptest and fixed it.	2020-11-05 15:40:00 +09:00
Paul Masurel	730ccefffb	Fixes a bug in TermQuery::explain. Closes #915	2020-10-28 22:29:15 +09:00
Paul Masurel	c23a03ad81	Large API Change in the Directory API. (#901 ) Tantivy used to assume that all files could be somehow memory mapped. After this change, Directory returns a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long and blocking io operations are still required but they do not span over the entire file.	2020-10-08 16:36:51 +09:00
Paul Masurel	439d6956a9	Returning Result in some of the API (#880 ) * Returning Result in some of the API * Introducing `.writer_for_test(..)`	2020-09-07 15:52:34 +09:00
Paul Masurel	e04f47e922	Using block wand for term queries too.	2020-08-20 15:51:21 +09:00
Paul Masurel	2481c87be8	Block wand (#856 )	2020-08-19 22:36:36 +09:00
Paul Masurel	6db8bb49d6	Assert nearly equals macro (#853 ) * Assert nearly equals macro * Renamed specialized_scorer in TermScorer	2020-07-17 16:40:41 +09:00
Paul Masurel	f71b04acb0	Bugfix. (#849 ) go_to_first_doc was typically calling seek with a target smaller than doc. Since SegmentPostings typically do a linear search on the full block, regardless of the current position, it could have our segment postings go backward.	2020-07-16 10:57:51 +09:00
Paul Masurel	c0f5645cd9	Move for_each functions from Scorer to Weight. (#836 ) * Move for_each functions from Scorer to Weight. * Specialized foreach / foreach_pruning for union of termscorer.	2020-06-01 11:31:18 +09:00
Paul Masurel	e25284bafe	Major change in the DocSet/Scorer API (#824 ) - Change in the DocSet and Scorer API. (@fulmicoton). A freshly created DocSet points directly to its first doc. A sentinel value called TERMINATED marks the end of a DocSet. `.advance()` returns the new DocId. `Scorer::skip(target)` has been replaced by `Scorer::seek(target)` and returns the resulting DocId. As a result, iterating through a DocSet now looks as follows ```rust let mut doc = docset.doc(); while doc != TERMINATED { // ... doc = docset.advance(); } ``` The change made it possible to greatly simplify a lot of the docset's code. - Misc internal optimization and introduction of the `Scorer::for_each_pruning` function. (@fulmicoton)	2020-05-16 16:33:36 +09:00
Paul Masurel	7d6cfa58e1	[WIP] Alternative take on boosted queries (#772 ) * Alternative take on boosted queries * Fixing unit test * Added boosting to the query grammar. * Made BoostQuery public. * Added support for boosting field in QueryParser Closes #547	2020-02-19 11:04:38 +09:00
Paul Masurel	ae14022bf0	Removed `use::Result`. (#771 )	2020-01-31 18:47:02 +09:00
Paul Masurel	ef3eddf3da	clippy first stab (#711 )	2019-11-22 13:09:35 +09:00
Paul Masurel	7b21b3f25a	Refactoring around Field (#673 ) * Refactoring around Field Removing the contract about the order of the field, and the field id allocation. * Update delete_queue.rs * Update field.rs	2019-10-25 09:06:44 +09:00
Joshua Dutton	9f74786db2	Update import statements in examples, doctests (#633 ) Update import statements to edition 2018, including removing `extern crate` and `#[macro_use]`. Alphabetize the statements.	2019-08-19 07:26:35 +09:00
Paul Masurel	b3b0138b82	Change for tantivy-py Schema.convert_named_doc Better Debug string for Terms and TermQueries	2019-08-14 17:44:25 +09:00
Paul Masurel	462774b15c	Tiqb feature/2018 (#583 ) * rust 2018 * Added CHANGELOG comment	2019-07-01 10:01:46 +09:00
Paul Masurel	4822940b19	Issue/36 (#559 ) * Added explanation * Explain * Splitting weight and idf * Added comments Closes #36	2019-06-06 10:03:54 +09:00
Paul Masurel	b7c2d0de97	Clippy2 (#534 ) * Clippy comments Clippy complains about the cast of &[u32] to a const __m128i, because of the lack of alignment constraints. This commit passes the OutputBuffer object (which enforces proper alignment) instead of `&[u32]`. Clippy. Block alignment * Code simplification * Added comment. Code simplification * Removed the extraneous freq block len hack.	2019-04-24 12:31:32 +09:00
Paul Masurel	d823163d52	Closes #527 . (#529 ) Fixing the bug that affects the result of `query.count()` in presence of deletes.	2019-04-19 09:19:50 +09:00
Paul Masurel	663dd89c05	Feature/reader (#517 ) Adding IndexReader to the API. Making it possible to watch for changes. * Closes #500	2019-03-20 08:39:22 +09:00
Paul Masurel	63b593bd0a	Lower RAM usage in tests.	2019-01-24 09:10:38 +09:00
Paul Masurel	279a9eb5e3	Closes #449 (#450 ) Clippy working on stable. Clippy warnings addressed	2018-12-10 12:20:59 +09:00
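
Note on commit e25284bafe (#824): the iteration contract described in that message (a fresh DocSet already points at its first doc, `TERMINATED` marks the end, `advance()` returns the new DocId, `seek(target)` returns the resulting DocId) can be sketched as follows. This is a minimal, standalone illustration and not tantivy's actual code: the `DocSet` trait below is a simplified stand-in, `VecDocSet` and `collect_docs` are hypothetical helpers, the value chosen for `TERMINATED` is an assumption, and `seek` is placed on the same trait for brevity even though the message names `Scorer::seek`.

```rust
/// Document identifier, mirroring the `DocId` name used in the commit message.
type DocId = u32;

/// Sentinel marking the end of a DocSet (the concrete value is an assumption of this sketch).
const TERMINATED: DocId = u32::MAX;

/// Simplified stand-in for the trait described in the commit message.
trait DocSet {
    /// Current doc; a freshly created DocSet already points at its first doc.
    fn doc(&self) -> DocId;

    /// Moves to the next doc and returns it, or TERMINATED once exhausted.
    fn advance(&mut self) -> DocId;

    /// Moves to the first doc >= target and returns it (naive linear default).
    fn seek(&mut self, target: DocId) -> DocId {
        let mut doc = self.doc();
        while doc < target {
            doc = self.advance();
        }
        doc
    }
}

/// Hypothetical DocSet backed by a sorted vector of doc ids.
struct VecDocSet {
    docs: Vec<DocId>,
    cursor: usize,
}

impl VecDocSet {
    fn new(docs: Vec<DocId>) -> Self {
        VecDocSet { docs, cursor: 0 }
    }
}

impl DocSet for VecDocSet {
    fn doc(&self) -> DocId {
        self.docs.get(self.cursor).copied().unwrap_or(TERMINATED)
    }

    fn advance(&mut self) -> DocId {
        if self.cursor < self.docs.len() {
            self.cursor += 1;
        }
        self.doc()
    }
}

/// The loop from the commit message, collecting every remaining doc.
fn collect_docs<D: DocSet>(docset: &mut D) -> Vec<DocId> {
    let mut out = Vec::new();
    let mut doc = docset.doc();
    while doc != TERMINATED {
        out.push(doc);
        doc = docset.advance();
    }
    out
}

fn main() {
    let mut docset = VecDocSet::new(vec![1, 4, 9]);
    // seek lands on the first doc >= 3, which is 4.
    assert_eq!(docset.seek(3), 4);
    // Iteration continues from the current position.
    println!("{:?}", collect_docs(&mut docset)); // prints [4, 9]
}
```

Because a DocSet starts on its first doc, the loop reads `doc()` before calling `advance()`; an empty DocSet simply reports `TERMINATED` immediately and the loop body never runs.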

111 Commits