tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-01-12 12:02:54 +00:00

Author	SHA1	Message	Date
François Massot	0cb53207ec	Fix tests.	2023-06-11 12:13:35 +02:00
PSeitz	fdecb79273	tokenizer-api: reduce Tokenizer overhead (#2062 ) * tokenizer-api: reduce Tokenizer overhead Previously a new `Token` for each text encountered was created, which contains `String::with_capacity(200)` In the new API the token_stream gets mutable access to the tokenizer, this allows state to be shared (in this PR Token is shared). Ideally the allocation for the BoxTokenStream would also be removed, but this may require some lifetime tricks. * simplify api * move lowercase and ascii folding buffer to global * empty Token text as default	2023-06-08 18:37:58 +08:00
Adam Reichold	b325d569ad	Expose phrase-prefix queries via the built-in query parser (#2044 ) * Expose phrase-prefix queries via the built-in query parser This proposes the less-than-imaginative syntax `field:"phrase ter"` to perform a phrase prefix query against `field` using `phrase` and `ter` as the terms. The aim of this is to make this type of query more discoverable and simplify manual testing. I did consider exposing the `max_expansions` parameter similar to how slop is handled, but I think that this is rather something that should be configured via the query parser (similar to `set_field_boost` and `set_field_fuzzy`) as choosing it requires rather intimate knowledge of the backing index. Prevent construction of zero or one term phrase-prefix queries via the query parser. * Add example using phrase-prefix search via surface API to improve feature discoverability.	2023-06-01 13:03:16 +02:00
Paul Masurel	62709b8094	Change in the query grammar. (#2050 ) * Change in the query grammar. Quotation mark can now be used for phrase queries. The delimiter is part of the `UserInputLeaf`. That information is meant to be used in Quickwit to solve #3364. This PR also adds support for quotation marks escaping in phrase queries. * Apply suggestions from code review	2023-05-19 12:07:10 +09:00
Adam Reichold	fedd9559e7	Expose create a query from a user input AST. (#2039 )	2023-05-11 21:53:18 +09:00
Yuri Astrakhan	74275b76a6	Inline format arguments where makes sense (#2038 ) Applied this command to the code, making it a bit shorter and slightly more readable. ``` cargo +nightly clippy --all-features --benches --tests --workspace --fix -- -A clippy::all -W clippy::uninlined_format_args cargo +nightly fmt --all ```	2023-05-10 18:03:59 +09:00
Paul Masurel	f28ddb711e	Exposing u64-based FastFieldRangeWeight (#2024 )	2023-05-03 18:32:00 +09:00
PSeitz	74f9eafefc	refactor Term (#2006 ) * refactor Term add ValueBytes for serialized term values add missing debug for ip skip unnecessary json path validation remove code duplication add DATE_TIME_PRECISION_INDEXED constant add missing Term clarification remove weird value_bytes_mut() API * fix naming	2023-04-20 15:31:43 +02:00
trinity-1686a	064518156f	refactor tokenization pipeline to use GATs (#1924 ) * refactor tokenization pipeline to use GATs * fix doctests * fix clippy lints * remove commented code	2023-03-09 09:39:37 +01:00
Paul Masurel	7fae4d98d7	Adapting for quickwit2 (#1912 ) * Adapting tantivy to make it possible to be plugged to quickwit. * Apply suggestions from code review Co-authored-by: PSeitz <PSeitz@users.noreply.github.com> * Added unit test --------- Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>	2023-03-01 16:27:46 +09:00
trinity-1686a	8a71e00da3	allow limiting the number of matched term in range query (#1899 )	2023-02-27 10:44:08 +01:00
Paul Masurel	d002698008	Re-export of query grammar. (#1908 )	2023-02-27 12:26:34 +09:00
Paul Masurel	bd5eea9852	Integrated columnar work.	2023-02-09 13:14:31 +01:00
PSeitz	f687b3a5aa	start migrate Field to &str (#1772 ) start migrate Field to &str in preparation of columnar return Result for get_field	2023-01-18 16:12:07 +09:00
Adrien Guillo	e17996f2fd	Allow range queries via fast fields on non-indexed fields	2023-01-11 09:56:13 -05:00
Adrien Guillo	14222a47a3	Fix typo (#1776 )	2023-01-11 00:49:13 +09:00
Paul Masurel	7a8fce0ae7	Minor mini fixes	2023-01-10 14:15:30 +09:00
Adam Reichold	2080c370c2	Enable usage of FuzzyTermQuery for specific fields via QueryParser (#1750 ) * Make nightly Clippy mostly happy. * Document how to produce TermSetQuery queries using QueryParser. * Enable construction of queries using FuzzyTermQuery via the QueryParser * Use FxHashMap instead of HashMap in the QueryParser as these hash tables are not exposed to DoS attacks. * Use a struct instead of a tuple to improve readability.	2023-01-04 18:11:27 +09:00
boraarslan	495824361a	Move `split_full_path` to `Schema` (#1692 )	2022-11-29 20:56:13 +09:00
Paul Masurel	0b40a7fe43	Added a `expand_dots` JsonObjectOptions. (#1687 ) Related with quickwit#2345.	2022-11-21 23:03:00 +09:00
trinity-1686a	e758080465	add support for TermSetQuery in query parser (#1683 )	2022-11-17 16:49:49 +01:00
Paul Masurel	2a39289a1b	Handle escaped dot in json path in the QueryParser. (#1682 )	2022-11-16 07:18:34 +09:00
Pascal Seitz	9e8a0c2cca	Allow range query on fastfield without INDEXED	2022-11-10 15:56:08 +08:00
Pascal Seitz	1082ff60f9	add range query handling for ip via term dictionary since IPs are mapped monotonically we can use the term dictionary for range queries	2022-10-18 13:08:27 +08:00
Pascal Seitz	6800fdec9d	add indexing for ip field Closes #1595	2022-10-18 10:07:48 +08:00
Pascal Seitz	8d75e451bd	fix truncate, remove mutable access from term	2022-10-17 12:14:35 +08:00
Pascal Seitz	fcfd76ec55	refactor Term fixes some issues with Term Remove duplicate calls to truncate or resize Replace Magic Number 5 with constant Enforce minimum size of 5 for metadata Fix broken truncate docs use constructor instead new + set calls normalize constructor stack replace assert on internal behavior fixes #1585	2022-10-17 12:14:34 +08:00
Pascal Seitz	e50e74acf8	remove u128 type	2022-10-07 16:25:01 +08:00
Pascal Seitz	0b86658389	rename ip addr, use buffer	2022-10-07 16:25:01 +08:00
Pascal Seitz	309449dba3	rename to IpAddr	2022-10-07 16:25:01 +08:00
Pascal Seitz	400a20b7af	add ip field add u128 multivalue reader and writer add ip to schema add ip writers, handle merge	2022-10-07 16:25:01 +08:00
Bruce Mitchener	cb252a42af	docs: "associated to" -> "associated with" (#1557 ) This reads better this way.	2022-09-26 20:23:37 +09:00
Bruce Mitchener	cf02e32578	Improvements to doc linking, grammar, etc.	2022-09-19 18:10:22 +07:00
Paul Masurel	4a3169011d	clippy (#1452 )	2022-08-20 20:01:33 +09:00
Kian-Meng Ang	625bcb4877	Fix typos and markdowns Found via these commands: codespell -L crate,ser,panting,beauti,hart,ue,atleast,childs,ond,pris,hel,mot markdownlint .md doc/src/.md --disable MD013 MD025 MD033 MD001 MD024 MD036 MD041 MD003	2022-08-13 18:25:47 +08:00
Evance Soumaoro	a4be239d38	Updated DateTime to hold timestamp in microseconds, while making date field precision configurable (#1396 )	2022-07-12 10:04:28 +09:00
Antoine G	437cd350a2	Add support for phrase slop in query language (#1393 ) Closes #1390	2022-06-28 13:55:47 +09:00
Antoine G	11e4225f23	doc fix (#1391 ) Documentation fix.	2022-06-21 15:53:33 +09:00
boraarslan	811b91ecb3	Edit and add tests	2022-06-07 10:09:37 +03:00
boraarslan	2981e6c1df	First commit	2022-06-07 10:09:37 +03:00
PSeitz	58c0cb5fc4	Merge pull request #1357 from saroh/1302-json-term-writer-API Expose helpers to generate json field writer terms	2022-05-10 11:02:05 +08:00
saroh	0ade871126	rename constructor to be more explicit	2022-05-06 13:29:07 +02:00
Paul Masurel	ed26552296	Minor changes in query parsing for quickwit#1334. (#1356 ) Quickwit still relies heavily on generating field names containing a '.' for nested objects, yet allows user-defined field names to contain a dot. In order to reuse the tantivy query parser, we will end up using quickwit field names directly in tantivy. Only '.' will be escaped. This PR makes minor changes in how the tantivy query parser parses a field name and resolves it to a field. Some of the new edge-case behavior is hacky. Closes #1355	2022-05-06 13:20:10 +09:00
Saroh	65d129afbd	better function names	2022-05-05 10:12:28 +02:00
Saroh	14cb66ee00	move helper to indexer module	2022-05-04 18:01:57 +02:00
Saroh	9e38343352	expose helpers for json field writer manipulation closes #1302	2022-05-04 18:01:45 +02:00
Pascal Seitz	b5b16948b0	print whole query on syntax error	2022-04-27 12:48:30 +08:00
PSeitz	038d234ff1	Merge pull request #1347 from quickwit-oss/query_parser_error fix query parser error field not found	2022-04-26 07:01:48 +02:00
Pascal Seitz	824d6f96fe	return query on parse error	2022-04-22 16:11:36 +08:00
Pascal Seitz	7cf821bac0	fix query parser error field not found	2022-04-22 12:40:00 +08:00

145 Commits