tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-05-26 21:20:40 +00:00

Author	SHA1	Message	Date
François Massot	d73706dede	Ngram tokenizer now returns an error with invalid arguments.	2023-06-25 20:13:24 +02:00
Adam Reichold	3b0cbf8102	Cosmetic updates to the warmer example. (#2095 ) Just some cosmetic tweaks to make the example easier on the eyes as a colleague was staring at this for quite some time this week.	2023-06-22 11:25:01 +09:00
PSeitz	fdecb79273	tokenizer-api: reduce Tokenizer overhead (#2062 ) * tokenizer-api: reduce Tokenizer overhead Previously a new `Token` for each text encountered was created, which contains `String::with_capacity(200)` In the new API the token_stream gets mutable access to the tokenizer, this allows state to be shared (in this PR Token is shared). Ideally the allocation for the BoxTokenStream would also be removed, but this may require some lifetime tricks. * simplify api * move lowercase and ascii folding buffer to global * empty Token text as default	2023-06-08 18:37:58 +08:00
Adam Reichold	b325d569ad	Expose phrase-prefix queries via the built-in query parser (#2044 ) * Expose phrase-prefix queries via the built-in query parser This proposes the less-than-imaginative syntax `field:"phrase ter"` to perform a phrase prefix query against `field` using `phrase` and `ter` as the terms. The aim of this is to make this type of query more discoverable and simplify manual testing. I did consider exposing the `max_expansions` parameter similar to how slop is handled, but I think that this is rather something that should be configured via the query parser (similar to `set_field_boost` and `set_field_fuzzy`) as choosing it requires rather intimate knowledge of the backing index. Prevent construction of zero or one term phrase-prefix queries via the query parser. * Add example using phrase-prefix search via surface API to improve feature discoverability.	2023-06-01 13:03:16 +02:00
Paul Masurel	7ee78bda52	Readding s in datetime precision variant names (#2065 ) There is no clear win and it changes some serialization in quickwit.	2023-06-01 06:39:46 +02:00
Adrien Guillo	a789ad9aee	Rename `DatePrecision` to `DateTimePrecision` (#2051 )	2023-05-23 17:09:11 +02:00
Yuri Astrakhan	74275b76a6	Inline format arguments where makes sense (#2038 ) Applied this command to the code, making it a bit shorter and slightly more readable. ``` cargo +nightly clippy --all-features --benches --tests --workspace --fix -- -A clippy::all -W clippy::uninlined_format_args cargo +nightly fmt --all ```	2023-05-10 18:03:59 +09:00
PSeitz	2e369db936	switch to Aggregation without serde_untagged (#2003 ) * refactor result handling * remove Internal stuff * merge different accessors * switch to Aggregation without serde_untagged * fix doctests	2023-04-25 08:54:51 +02:00
PSeitz	e522163a1c	use json in agg tests (#1998 ) * switch to JSON in tests, add flat aggregation types * use method * clippy * remove commented file	2023-04-17 14:08:48 +02:00
PSeitz	41af70799d	add percentiles aggregations (#1984 ) * add percentiles aggregations add percentiles aggregation fix disabled agg benchmark * Update src/aggregation/metric/percentiles.rs Co-authored-by: Paul Masurel <paul@quickwit.io> * Apply suggestions from code review Co-authored-by: Paul Masurel <paul@quickwit.io> * fix import * fix import --------- Co-authored-by: Paul Masurel <paul@quickwit.io>	2023-04-07 07:18:28 +02:00
PSeitz	5c4ea6a708	tokenizer option on text fastfield (#1945 ) * tokenizer option on text fastfield allow to set tokenizer option on text fastfield (fixes #1901) handle PreTokenized strings in fast field * change visibility * remove custom de/serialization	2023-03-31 10:03:38 +02:00
PSeitz	9e2faecf5b	add memory limit for aggregations (#1942 ) * add memory limit for aggregations introduce AggregationLimits to set memory consumption limit and bucket limits memory limit is checked during aggregation, bucket limit is checked before returning the aggregation request. * Apply suggestions from code review Co-authored-by: Paul Masurel <paul@quickwit.io> * add ByteCount with human readable format --------- Co-authored-by: Paul Masurel <paul@quickwit.io>	2023-03-16 06:21:07 +01:00
PSeitz	61cfd8dc57	fix clippy (#1927 )	2023-03-13 03:12:02 +01:00
trinity-1686a	064518156f	refactor tokenization pipeline to use GATs (#1924 ) * refactor tokenization pipeline to use GATs * fix doctests * fix clippy lints * remove commented code	2023-03-09 09:39:37 +01:00
Paul Masurel	06850719dc	Renaming .values(DocId) to .values_for_doc(DocId) (#1906 )	2023-02-27 12:15:13 +09:00
PSeitz	c7278b3258	remove schema in aggs (#1888 ) * switch to ColumnType, move tests * remove Schema dependency in agg	2023-02-22 04:50:28 +01:00
Paul Masurel	e2aa5af075	Clippy warnings fixes (#1885 )	2023-02-20 19:04:13 +09:00
PSeitz	bf1449b22d	update examples for literate docs (#1880 )	2023-02-17 11:48:22 +01:00
PSeitz	01e5a22759	switch to new ff api (#1868 )	2023-02-14 15:57:32 +08:00
trinity-1686a	3120147a76	re-enable examples (#1860 )	2023-02-10 14:51:37 +01:00
Paul Masurel	bd5eea9852	Integrated columnar work.	2023-02-09 13:14:31 +01:00
PSeitz	f687b3a5aa	start migrate Field to &str (#1772 ) start migrate Field to &str in preparation of columnar return Result for get_field	2023-01-18 16:12:07 +09:00
Adrien Guillo	c51d9f9f83	Fix some Clippy warnings	2023-01-17 10:17:51 -05:00
PSeitz	4bac945709	add ip field example (#1775 )	2023-01-16 06:06:11 +01:00
Daw-Chih Liou	b22f96624e	doc: update comments in the faceted search example (#1737 ) * doc: update comments in the faceted search example * chore: format	2023-01-02 11:07:30 +01:00
PSeitz	ee1f2c1f28	add aggregation support for date type (#1693 ) * add aggregation support for date type fixes #1332 * serialize key_as_string as rfc3339 in date histogram * update docs * enable date for range aggregation	2022-11-28 09:12:08 +09:00
Pascal Seitz	e772d3170d	switch get_val() to u32 Fixes #1638	2022-10-24 19:05:57 +08:00
Bruce Mitchener	cb252a42af	docs: "associated to" -> "associated with" (#1557 ) This reads better this way.	2022-09-26 20:23:37 +09:00
Bruce Mitchener	6a88ac3fe3	Documentation improvements. Fix some linking, some grammar, some typos, etc.	2022-09-18 18:05:37 +07:00
Paul Masurel	8e775b6c3d	Refactoring dyn Column (#1502 )	2022-09-02 17:26:30 +09:00
Paul Masurel	5331be800b	Introducing a column trait	2022-08-28 14:14:27 +02:00
Kian-Meng Ang	625bcb4877	Fix typos and markdowns Found via these commands: codespell -L crate,ser,panting,beauti,hart,ue,atleast,childs,ond,pris,hel,mot markdownlint .md doc/src/.md --disable MD013 MD025 MD033 MD001 MD024 MD036 MD041 MD003	2022-08-13 18:25:47 +08:00
k-yomo	5b564916f0	Add support for keyed parameter in range and histogram aggregations	2022-07-26 04:28:21 +09:00
PSeitz	23fe73a6c0	remove searcher pool and make Searcher cloneable (#1411 ) * remove searcher pool and make Searcher cloneable closes #1410 * use SearcherInner in InnerIndexReader	2022-07-12 18:07:48 +09:00
Evance Soumaoro	a4be239d38	Updated DateTime to hold timestamp in microseconds, while making date field precision configurable (#1396 )	2022-07-12 10:04:28 +09:00
PSeitz	6ca5f77466	Merge pull request #1363 from quickwit-oss/refactor_aggregation Add aggregation bucket limit	2022-06-23 10:27:57 +08:00
saroh	e766375700	remove useless example	2022-05-23 19:49:31 +02:00
PSeitz	496b4a4fdb	Update examples/json_field.rs	2022-05-23 12:24:36 +02:00
PSeitz	93cc8498b3	Update examples/json_field.rs	2022-05-23 11:59:42 +02:00
PSeitz	0aa3d63a9f	Update examples/json_field.rs	2022-05-23 11:39:45 +02:00
PSeitz	4e2a053b69	Update examples/json_field.rs	2022-05-23 11:27:05 +02:00
saroh	b2e97e266a	more examples to explain default field handling	2022-05-21 17:36:39 +02:00
Pascal Seitz	b114e553cd	Revert "return result from segment collector" This reverts commit `a99e5459e3`.	2022-05-19 16:57:55 +08:00
Pascal Seitz	44ea7313ca	set max bucket size as parameter	2022-05-13 13:21:52 +08:00
Pascal Seitz	11ac451250	abort aggregation when too many buckets are created Validation happens on different phases depending on the aggregation Term: During segment collection Histogram: At the end when converting in intermediate buckets (we preallocate empty buckets for the range) Revisit after #1370 Range: When validating the request update CHANGELOG	2022-05-12 12:26:43 +08:00
Pascal Seitz	a99e5459e3	return result from segment collector	2022-05-12 12:26:43 +08:00
Pascal Seitz	ab6b532cc4	add comments	2022-04-14 12:06:36 +08:00
PSeitz	b105bf72e1	use defaults in meta.json (#1310 ) This change allows fields in meta.json to be left unset and fall back to their defaults. Currently it is required to explicitly put e.g. fieldnorms: false	2022-03-14 13:54:06 +09:00
Paul Masurel	d7b46d2137	Added JSON Type (#1270 ) - Removed useless copy when ingesting JSON. - Bugfix in phrase query with a missing field norms. - Disabled range query on default fields Closes #1251	2022-02-24 16:25:22 +09:00
Pascal Seitz	704498a1ac	rename IntOptions to NumericOptions keep IntOptions with deprecation warning Fixes #1286	2022-02-21 22:20:07 +01:00

148 Commits