* prepare for multiple fastfield codecs
prepare for multiple fastfield codecs by wrapping the codecs in an enum #1042
* add FastFieldSerializer trait, add DynamicFastFieldSerializer
add FastFieldSerializer trait
add DynamicFastFieldSerializer enum to wrap all implementors of the FastFieldSerializer trait
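A hedged sketch of the enum-wrapping pattern these commits describe, reusing the names from the commits; the `serialize` method is a hypothetical stand-in, not the real trait surface.
```rust
trait FastFieldSerializer {
    // Hypothetical stand-in method; the real trait surface differs.
    fn serialize(&mut self, value: u64);
}

struct BitpackedFastFieldSerializer;

impl FastFieldSerializer for BitpackedFastFieldSerializer {
    fn serialize(&mut self, _value: u64) {
        // bitpack the value (elided)
    }
}

// Wrapping all implementors in one enum keeps dispatch static over a
// closed set of codecs, rather than boxing a trait object.
enum DynamicFastFieldSerializer {
    Bitpacked(BitpackedFastFieldSerializer),
}

impl FastFieldSerializer for DynamicFastFieldSerializer {
    fn serialize(&mut self, value: u64) {
        match self {
            DynamicFastFieldSerializer::Bitpacked(inner) => inner.serialize(value),
        }
    }
}
```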
* add estimation for fastfield bitpacker
Added a FacetOptions for HierarchicalFacet, which adds indexed and stored flags to it.
Propagated the change and updated tests accordingly.
Added a test to ensure that a facet field without the indexed flag is handled correctly.
Added a `path()` function to the `Value` implementation to return the stored facet.
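A minimal usage sketch, assuming the schema builder takes the new FacetOptions; exact flag setters may differ across versions.
```rust
use tantivy::schema::{FacetOptions, Schema};

fn main() {
    let mut schema_builder = Schema::builder();
    // FacetOptions carries the new indexed/stored flags for facet fields.
    let _category = schema_builder.add_facet_field(
        "category",
        FacetOptions::default().set_stored(),
    );
    let _schema = schema_builder.build();
}
```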
Tantivy used to assume that all files could somehow be memory-mapped. After this change, Directory returns a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long, blocking IO operations are still required, but they no longer span the entire file.
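A short sketch of the resulting read path, assuming the `FileSlice` API introduced here (`open_read`, `slice`, `read_bytes`); exact signatures may differ across versions.
```rust
use std::path::Path;
use tantivy::directory::Directory;

fn read_prefix(dir: &dyn Directory, n: usize) -> tantivy::Result<()> {
    // open_read no longer implies that the whole file gets memory-mapped.
    let file_slice = dir.open_read(Path::new("some_segment_file"))?;
    // Narrow the slice to the range we actually need...
    let prefix = file_slice.slice(0..n);
    // ...and read only that range into memory as OwnedBytes.
    let bytes = prefix.read_bytes()?;
    assert_eq!(bytes.len(), n);
    Ok(())
}
```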
go_to_first_doc was typically calling seek with a target smaller than
the current doc.
Since SegmentPostings typically does a linear search on the full block,
regardless of the current position, this could make our segment postings
go backward.
- Change in the DocSet and Scorer API. (@fulmicoton).
A freshly created DocSet points directly to its first doc. A sentinel value called TERMINATED marks the end of a DocSet.
`.advance()` returns the new DocId. `Scorer::skip(target)` has been replaced by `Scorer::seek(target)` and returns the resulting DocId.
As a result, iterating through a DocSet now looks as follows:
```rust
let mut doc = docset.doc();
while doc != TERMINATED {
    // ...
    doc = docset.advance();
}
```
The change made it possible to greatly simplify a lot of the docset's code.
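A small sketch of seeking under the same API, assuming `seek` positions the DocSet on the first doc greater than or equal to the target and returns it:
```rust
// seek never goes backward: target must be >= the current doc.
let doc = docset.seek(target);
if doc == TERMINATED {
    // no document at or beyond target in this docset
}
```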
- Misc internal optimization and introduction of the `Scorer::for_each_pruning` function. (@fulmicoton)
* Make TweakScore and CustomScore mutable
Make TweakScore and CustomScore mutable at the segment level.
Addresses issue #806
* Add example to show tweak_score working for facets
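A hedged sketch of what this enables, following the shape of `TopDocs::tweak_score`: the per-segment closure may now capture and mutate state. Exact trait bounds may differ across versions.
```rust
use tantivy::collector::TopDocs;
use tantivy::{DocId, Score, SegmentReader};

fn main() {
    let _collector = TopDocs::with_limit(10).tweak_score(
        |_segment_reader: &SegmentReader| {
            // Per-segment mutable state; allowed now that the inner
            // closure may be FnMut.
            let mut docs_scored: u64 = 0;
            move |_doc: DocId, original_score: Score| {
                docs_scored += 1;
                original_score
            }
        },
    );
}
```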
* Prevent tokens from being stored in the document store.
The commit adds a prepare_for_store method to Document, which changes all
PreTokenizedString values into String values. The method is called
before adding a document to the document store, to prevent tokens from
being saved there. The commit also makes small changes to comments in the
pre_tokenized_text example.
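A minimal sketch of the idea, assuming tantivy's `Value::PreTokStr` variant and the `PreTokenizedString { text, tokens }` layout; the actual prepare_for_store implementation may differ.
```rust
use tantivy::schema::Value;

// Replace a pre-tokenized value with its plain text so that only the
// text, not the token list, ends up in the document store.
fn strip_tokens(value: Value) -> Value {
    match value {
        Value::PreTokStr(pre_tokenized) => Value::Str(pre_tokenized.text),
        other => other,
    }
}
```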
* Avoid storing the pretokenized text.
* Added handling of pre-tokenized text fields (#642).
* Updated changelog and examples concerning #642.
* Added tokenized_text method to Value implementation.
* Implemented From<TokenizedString> for TokenizedStream.
* Removed tokenized flag from TextOptions and code reliance on the flag.
* Changed naming to use word "pre-tokenized" instead of "tokenized".
* Updated example code.
* Fixed comments.
* Minor code refactoring. Test improvements.
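A short usage sketch of the feature from #642, assuming the `PreTokenizedString`/`Token` types and `Document::add_pre_tokenized_text`; exact shapes may differ across versions.
```rust
use tantivy::schema::Field;
use tantivy::tokenizer::{PreTokenizedString, Token};
use tantivy::Document;

fn pre_tokenized_doc(text_field: Field) -> Document {
    // The tokens are supplied by the caller instead of a tantivy tokenizer.
    let pre_tokenized = PreTokenizedString {
        text: "hello world".to_string(),
        tokens: vec![
            Token {
                offset_from: 0,
                offset_to: 5,
                position: 0,
                text: "hello".to_string(),
                position_length: 1,
            },
            Token {
                offset_from: 6,
                offset_to: 11,
                position: 1,
                text: "world".to_string(),
                position_length: 1,
            },
        ],
    };
    let mut doc = Document::default();
    doc.add_pre_tokenized_text(text_field, pre_tokenized);
    doc
}
```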
* Split Collector into an overall Collector and a per-segment SegmentCollector. This is a precursor to cross-segment parallelism and, as a side benefit, cleans up per-segment fields from being Option<T> to just T.
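A sketch of the shape of the split; the signatures follow tantivy's later published traits and may have differed at the time of this change.
```rust
use tantivy::{DocId, Score, SegmentReader};

// Lives for the whole search: builds one child per segment, then merges.
trait Collector: Send + Sync {
    type Child: SegmentCollector;
    type Fruit;

    fn for_segment(
        &self,
        segment_ord: u32,
        reader: &SegmentReader,
    ) -> tantivy::Result<Self::Child>;

    fn merge_fruits(
        &self,
        fruits: Vec<<Self::Child as SegmentCollector>::Fruit>,
    ) -> tantivy::Result<Self::Fruit>;
}

// Per-segment state: plain T instead of Option<T>, because each child is
// constructed for exactly one segment.
trait SegmentCollector: 'static {
    type Fruit: Send;
    fn collect(&mut self, doc: DocId, score: Score);
    fn harvest(self) -> Self::Fruit;
}
```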
* Attempt to add MultiCollector back
* Working. Chained collector is broken, though
* Fix chained collector
* Fix test
* Make Weight Send+Sync for parallelization purposes
* Expose parameters of RangeQuery for external usage
* Removed &mut self
* fixing tests
* Restored TestCollectors
* multicollector working
* chained collector working
* test broken
* fixing unit test
* simplifying API
* better syntax
* Simplifying top_collector
* refactoring
* Sync with master
* Added multithread search
* Collector refactoring
* Schema::builder
* CR and rustdoc
* CR comments
* Added an executor (see the sketch after this list)
* Sorted the segment readers in the searcher
* Update searcher.rs
* Fixed unit tests
* Moved the sort-segment-by-count heuristic
* using crossbeam::channel
* inlining
* Comments about panics propagating
* Added unit test for executor panicking
* Readded default
* Removed Default impl
* Added unit test for executor
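An illustrative fan-out in the spirit of the executor mentioned above, using crossbeam::channel as in these commits; this is a hypothetical sketch, not tantivy's actual Executor API.
```rust
use crossbeam::channel;
use std::thread;

// Fan per-segment work out over threads and gather results in input order.
fn map_parallel<T, R, F>(f: F, inputs: Vec<T>) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let (tx, rx) = channel::unbounded();
    thread::scope(|scope| {
        for (idx, input) in inputs.into_iter().enumerate() {
            let tx = tx.clone();
            let f = &f;
            scope.spawn(move || {
                // A panic here propagates when the scope joins the thread.
                tx.send((idx, f(input))).expect("receiver is alive");
            });
        }
        drop(tx); // close the channel once all work is queued
    });
    let mut results: Vec<(usize, R)> = rx.into_iter().collect();
    results.sort_by_key(|&(idx, _)| idx);
    results.into_iter().map(|(_, result)| result).collect()
}
```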