Commit Graph

161 Commits

Author SHA1 Message Date
PSeitz
496b4a4fdb Update examples/json_field.rs 2022-05-23 12:24:36 +02:00
PSeitz
93cc8498b3 Update examples/json_field.rs 2022-05-23 11:59:42 +02:00
PSeitz
0aa3d63a9f Update examples/json_field.rs 2022-05-23 11:39:45 +02:00
PSeitz
4e2a053b69 Update examples/json_field.rs 2022-05-23 11:27:05 +02:00
saroh
b2e97e266a more examples to explain default field handling 2022-05-21 17:36:39 +02:00
Pascal Seitz
b114e553cd Revert "return result from segment collector"
This reverts commit a99e5459e3.
2022-05-19 16:57:55 +08:00
Pascal Seitz
44ea7313ca set max bucket size as parameter 2022-05-13 13:21:52 +08:00
Pascal Seitz
11ac451250 abort aggregation when too many buckets are created
Validation happens on different phases depending on the aggregation
Term: During segment collection
Histogram: At the end when converting in intermediate buckets (we preallocate empty buckets for the range) Revisit after #1370
Range: When validating the request

update CHANGELOG
2022-05-12 12:26:43 +08:00
Pascal Seitz
a99e5459e3 return result from segment collector 2022-05-12 12:26:43 +08:00
Pascal Seitz
ab6b532cc4 add comments 2022-04-14 12:06:36 +08:00
PSeitz
b105bf72e1 use defaults in meta.json (#1310)
This change allows to have unset fields in meta.json and fall back to their defaults
Currently it is required to explicitly put e.g. fieldnorms: false
2022-03-14 13:54:06 +09:00
Paul Masurel
d7b46d2137 Added JSON Type (#1270)
- Removed useless copy when ingesting JSON.
- Bugfix in phrase query with a missing field norms.
- Disabled range query on default fields

Closes #1251
2022-02-24 16:25:22 +09:00
Pascal Seitz
704498a1ac rename IntOptions to NumericOptions
keep IntOptions with deprecation warning
Fixes #1286
2022-02-21 22:20:07 +01:00
PSeitz
972cb6c26d Aggregation (#1276)
Added support for aggregation compatible with Elasticsearch's API.
2022-02-21 09:59:11 +09:00
Paul Masurel
bdedefe07d Adding an IndexingContext object (#1268) 2022-02-04 15:08:01 +09:00
Paul Masurel
eca6628b3c Minor refactoring (#1266) 2022-01-28 15:55:55 +09:00
Shikhar Bhushan
5a2497b6fd Avoid exposing TrackedObject from Warmer API (#1264) 2022-01-25 10:04:08 +09:00
Shikhar Bhushan
99d4b1a177 Searcher Warming API (#1261)
Adds an API to register Warmers in the IndexReader.

Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-01-20 23:40:25 +09:00
Liam Warfield
17e00df112 Change Snippet.fragments -> Snippet.fragment (#1243)
* Change Snippet.fragments -> Snippet.fragment
* Apply suggestions from code review

Co-authored-by: Liam Warfield <lwarfield@arista.com>
2022-01-03 22:23:51 +09:00
Paul Masurel
dde49ac8e2 Closes #1195 (#1222)
Removes the indexed option for facets.
Facets are now always indexed.

Closes #1195
2021-12-02 14:37:19 +09:00
Paul Masurel
7234bef0eb Issue/1198 (#1201)
* Unit test reproducing #1198
* Fixing unit test to handle the error from add_document.
* Bump project version
2021-11-11 16:42:19 +09:00
sigaloid
096ce7488e Resolve some clippys, format (#1144)
* cargo +nightly clippy --fix -Z unstable-options
2021-08-26 08:46:00 +09:00
Pascal Seitz
0062fe705d cargo fmt 2021-07-01 18:17:08 +02:00
Pascal Seitz
1e4df54ab3 fix clippy 2021-07-01 17:41:53 +02:00
Bernard Swart
9f32d40b27 Misspelling of misspelled was fixed (#1078) 2021-06-14 16:29:12 +09:00
PSeitz
dff0ffd38a prepare for multiple fastfield codecs (#1063)
* prepare for multiple fastfield codecs

 prepare for multiple fastfield codecs by wrapping the codecs in an enum #1042

* add FastFieldSerializer trait, add DynamicFastFieldSerializer

add FastFieldSerializer trait
add DynamicFastFieldSerializer enum to wrap all implementors of the FastFieldSerializer trait

* add estimation for fastfield bitpacker
2021-05-31 23:14:14 +09:00
Moriyoshi Koizumi
4afba005f9 Provide a means to deal with malformed facet text representation for the query parser (#1056)
* Provide a means to deal with malformed facet text representation for the query parser.
* Specific error enum for the facet parse error.
2021-05-27 12:16:49 +09:00
Laurent Pouget
4b34231f28 Make facet indexation and storage optional
Added a FacetOptions for HierarchicalFacet which add indexed and stored flags to it.
Propagate change and update tests accordingly
Added a test to ensure that a not indexed flag was taken care of.
Added on Value implem the `path()` function to return the stored facet.
2021-03-24 14:56:27 +01:00
Paul Masurel
31137beea6 Replacing (start, end) by Range 2021-03-10 14:06:21 +09:00
Vishal Sodani
6ae96038c2 Fixed spelling 2021-02-08 20:18:45 +05:30
Vishal Sodani
4bcdca8545 Fixed spelling 2021-02-08 19:51:36 +05:30
Paul Masurel
1b4be24dca Fast field are not loaded on the opening of a segment.
They are instead loaded lazily when they are request.
2021-01-21 18:13:08 +09:00
Paul Masurel
80a99539ce Several TermDict operation now returns an io::Result 2020-12-03 13:13:11 +09:00
Paul Masurel
c23a03ad81 Large API Change in the Directory API. (#901)
Tantivy used to assume that all files could be somehow memory mapped. After this change, Directory return a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long and blocking io operation are still required by they do not span over the entire file.
2020-10-08 16:36:51 +09:00
Paul Masurel
848afa43ee Merge branch 'issue/896' into main 2020-10-01 20:43:42 +09:00
Paul Masurel
7720d21265 Closes #896 - Facet reader related
Bugfix. Acquiring a facet reader on a segment that does not contain any
doc with this facet returns `None`.
2020-10-01 20:25:28 +09:00
stephenlagree
5e06e7de5a Update basic_search.rs (#865)
Remove duplicated document entry.
2020-08-21 11:23:09 +09:00
Paul Masurel
2481c87be8 Block wand (#856) 2020-08-19 22:36:36 +09:00
Paul Masurel
f71b04acb0 Bugfix. (#849)
go_to_first_doc was typically calling seek with a target smaller than
doc.

Since SegmentPostings typically do a linear search on the full block,
regardless of the current position, it could have our segment postings
go backward.
2020-07-16 10:57:51 +09:00
Paul Masurel
e25284bafe Major change in the DocSet/Scorer API (#824)
- Change in the DocSet and Scorer API. (@fulmicoton). 
A freshly created DocSet point directly to their first doc. A sentinel value called TERMINATED marks the end of a DocSet.
`.advance()` returns the new DocId. `Scorer::skip(target)` has been replaced by `Scorer::seek(target)` and returns the resulting DocId.
As a result, iterating through DocSet now looks as follows
```rust
let mut doc = docset.doc();
while doc != TERMINATED {
   // ...
   doc = docset.advance();
}
```
The change made it possible to greatly simplify a lot of the docset's code.
- Misc internal optimization and introduction of the `Scorer::for_each_pruning` function. (@fulmicoton)
2020-05-16 16:33:36 +09:00
Rob Young
77f363987a Make TweakScore and CustomScore mutable at the segment level (#807)
* Make TweakScore and CustomScore mutable

Make TweakScore and CustomScore mutable at the segment level.

Addresses issue #806

* Add example to show tweak_score working for facets
2020-04-19 15:54:00 +09:00
Paul Masurel
486b8fa9c5 Removing serde-derive dependency (#786) 2020-03-06 23:33:58 +09:00
Paul Masurel
811fd0cb9e Dynamic analyzer (#755)
* Removed generics in tokenizers

* lowercaser

* Added TokenizerExt

* Introducing BoxedTokenizer

* Introducing BoxXXXXX helper struct

* Closes #762.

* Introducing a TextAnalyzer
2020-01-29 18:23:37 +09:00
Paul Masurel
c1400f25a7 Handle facet search in the QueryParser. (#741)
Closes #738
2019-12-25 17:43:33 +09:00
Paul Masurel
afe0134d0f Kkoziara remove tokens from doc store (#715)
* Prevent tokens from being stored in the document store.

Commit adds prepare_for_store method to Document, which changes all
PreTokenizedString values into String values. The method is called
before adding document to the document store to prevent tokens from
being saved there. Commit also adds small changes to comments in
pre_tokenized_text example.

* Avoid storing the pretokenized text.
2019-11-25 22:39:12 +09:00
Paul Masurel
fb3d6fa332 Adding Value::From<PretokenizedText> (#697) 2019-11-10 14:39:44 +09:00
kkoziara
0519056bd8 Added handling of pre-tokenized text fields (#642). (#669)
* Added handling of pre-tokenized text fields (#642).

* * Updated changelog and examples concerning #642.
* Added tokenized_text method to Value implementation.
* Implemented From<TokenizedString> for TokenizedStream.

* * Removed tokenized flag from TextOptions and code reliance on the flag.
* Changed naming to use word "pre-tokenized" instead of "tokenized".
* Updated example code.
* Fixed comments.

* Minor code refactoring. Test improvements.
2019-11-07 10:10:56 +09:00
Paul Masurel
4b9c1dce69 Moving queyr grammar to a different crate. (#645) 2019-09-05 09:37:28 +09:00
Joshua Dutton
9f74786db2 Update import statements in examples, doctests (#633)
Update import statements to edition 2018, including removing
`extern crate` and  `#[macro_use]`. Alphabetize the statements.
2019-08-19 07:26:35 +09:00
Joshua Dutton
84c615cff1 Fixing typos (#634) 2019-08-19 07:24:05 +09:00