Applied this command to the code, making it a bit shorter and slightly
more readable.
```
cargo +nightly clippy --all-features --benches --tests --workspace --fix -- -A clippy::all -W clippy::uninlined_format_args
cargo +nightly fmt --all
```
* add memory limit for aggregations
introduce AggregationLimits to set memory consumption limit and bucket limits
memory limit is checked during aggregation, bucket limit is checked before returning the aggregation request.
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io>
* add ByteCount with human readable format
---------
Co-authored-by: Paul Masurel <paul@quickwit.io>
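The two checks described above can be sketched roughly as follows. This is a minimal illustration, not tantivy's actual `AggregationLimits` API: `SketchLimits`, `LimitGuard`, and `LimitError` are hypothetical names invented for this example.

```rust
/// Hypothetical limits type for this sketch; not tantivy's `AggregationLimits`.
#[derive(Debug, Clone, Copy)]
pub struct SketchLimits {
    pub max_memory_bytes: u64,
    pub max_buckets: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LimitError {
    MemoryExceeded { used: u64, limit: u64 },
    BucketLimitExceeded { buckets: u32, limit: u32 },
}

/// Tracks memory and bucket usage for a single aggregation run.
pub struct LimitGuard {
    limits: SketchLimits,
    used_bytes: u64,
    buckets: u32,
}

impl LimitGuard {
    pub fn new(limits: SketchLimits) -> Self {
        LimitGuard { limits, used_bytes: 0, buckets: 0 }
    }

    /// Called while aggregating: fails as soon as the memory budget is exceeded.
    pub fn add_memory(&mut self, bytes: u64) -> Result<(), LimitError> {
        self.used_bytes = self.used_bytes.saturating_add(bytes);
        if self.used_bytes > self.limits.max_memory_bytes {
            return Err(LimitError::MemoryExceeded {
                used: self.used_bytes,
                limit: self.limits.max_memory_bytes,
            });
        }
        Ok(())
    }

    /// Records one more bucket; no check happens here.
    pub fn add_bucket(&mut self) {
        self.buckets += 1;
    }

    /// Called once at the end, before anything is returned to the caller.
    pub fn check_buckets(&self) -> Result<(), LimitError> {
        if self.buckets > self.limits.max_buckets {
            return Err(LimitError::BucketLimitExceeded {
                buckets: self.buckets,
                limit: self.limits.max_buckets,
            });
        }
        Ok(())
    }
}
```

Checking memory eagerly stops a runaway aggregation early, while the bucket count only needs to be validated once the final set of buckets is known.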
* Make nightly Clippy mostly happy.
* Document how to produce TermSetQuery queries using QueryParser.
* Enable construction of queries using FuzzyTermQuery via the QueryParser
* Use FxHashMap instead of HashMap in the QueryParser as these hash tables are not exposed to DoS attacks.
* Use a struct instead of a tuple to improve readability.
When building without default features (so without mmap, etc.),
there are some warnings about unused items. This fixes the
ones related to `ArcBytes` and `WeakArcBytes`, which are only
used with the `mmap_directory` code.
The file offsets were recorded incorrectly in some cases, e.g. when the recording looked like this: [(Field 1, Index 0, Offset 0), (Field 1, Index 1, Offset 14), (Field 0, Index 0, Offset 14)]. The last file, which belongs to Field 0, spans from offset 14 to the end of the file. But the data was converted to a vec and sorted, which made a Field 1 entry the last file instead.
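The effect of the sort can be reproduced with a small sketch. The `(field, index, offset)` tuple layout and both function names are assumptions made for illustration; this is not the actual patch.

```rust
/// (field, index, start offset), mirroring the example above.
/// The tuple layout is an assumption of this sketch.
type Entry = (u32, u32, u64);

/// Buggy variant: sorting first means "last" is the last entry by
/// (field, index), not the last entry that was written.
pub fn last_after_sort(mut entries: Vec<Entry>) -> Option<Entry> {
    entries.sort();
    entries.last().copied()
}

/// Variant that keeps the recorded write order, so the last entry is
/// the one that really extends to the end of the file.
pub fn last_in_write_order(entries: &[Entry]) -> Option<Entry> {
    entries.last().copied()
}
```

With the recording from the example, the sorted variant picks the Field 1 entry, while the write-order variant picks the Field 0 entry at offset 14.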
* FileHandle: Change from boxed to Arc.
Changing from a Box<dyn FileHandle> to an Arc<dyn FileHandle>
allows users of tantivy to manage file handles outside of tantivy
and to control their life cycle.
* Fix: Rust linter
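A minimal sketch of why shared ownership helps here. `SketchFileHandle`, `InMemoryFile`, and `SketchFileSlice` are stand-ins invented for this example; tantivy's real `FileHandle` trait has a different surface.

```rust
use std::sync::Arc;

/// Stand-in trait for this sketch; not tantivy's real `FileHandle`.
pub trait SketchFileHandle: Send + Sync {
    fn byte_len(&self) -> usize;
}

/// A trivial handle backed by an in-memory buffer.
pub struct InMemoryFile(pub Vec<u8>);

impl SketchFileHandle for InMemoryFile {
    fn byte_len(&self) -> usize {
        self.0.len()
    }
}

/// Library-side type that holds a shared handle instead of owning a Box.
pub struct SketchFileSlice {
    handle: Arc<dyn SketchFileHandle>,
}

impl SketchFileSlice {
    pub fn new(handle: Arc<dyn SketchFileHandle>) -> Self {
        SketchFileSlice { handle }
    }

    pub fn byte_len(&self) -> usize {
        self.handle.byte_len()
    }
}
```

Because the caller keeps its own `Arc` clone, it can observe or outlive the library's use of the handle, which a `Box` moved into the library would not allow.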
* Remove all usage of `block_on` and use a oneshot channel instead.
Calling `block_on` panics in certain contexts.
For instance, it panics when it is called in the context of another
call to `block_on`.
Using it in tantivy is unnecessary. We replace it with a thin wrapper
around a oneshot channel that supports both async and sync use.
* Removing needless uses of async in the API.
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
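The idea of a oneshot receiver that can be consumed either synchronously or asynchronously can be sketched with the standard library alone. This is an illustration under simplifying assumptions, not the wrapper used in tantivy: the names are hypothetical, and a sender dropped without sending is not handled (the blocking `wait` would never return).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Condvar, Mutex};
use std::task::{Context, Poll, Waker};

struct State<T> {
    value: Option<T>,
    waker: Option<Waker>,
}

struct Shared<T> {
    state: Mutex<State<T>>,
    ready: Condvar,
}

pub struct Sender<T>(Arc<Shared<T>>);
pub struct Receiver<T>(Arc<Shared<T>>);

/// Creates a connected sender/receiver pair for exactly one value.
pub fn oneshot<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        state: Mutex::new(State { value: None, waker: None }),
        ready: Condvar::new(),
    });
    (Sender(Arc::clone(&shared)), Receiver(shared))
}

impl<T> Sender<T> {
    /// Stores the value, then wakes both an async task and a blocked thread.
    pub fn send(self, value: T) {
        let mut state = self.0.state.lock().unwrap();
        state.value = Some(value);
        if let Some(waker) = state.waker.take() {
            waker.wake();
        }
        drop(state);
        self.0.ready.notify_one();
    }
}

impl<T> Receiver<T> {
    /// Sync path: blocks the current thread until the value arrives.
    pub fn wait(self) -> T {
        let mut state = self.0.state.lock().unwrap();
        loop {
            if let Some(value) = state.value.take() {
                return value;
            }
            state = self.0.ready.wait(state).unwrap();
        }
    }
}

/// Async path: the receiver is itself a future, so no `block_on` is needed.
impl<T> Future for Receiver<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut state = self.0.state.lock().unwrap();
        match state.value.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Remember who to wake once the sender delivers the value.
                state.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}
```

Sync callers use `rx.wait()`, async callers use `rx.await`; neither path has to start an executor from inside library code.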
* Added sstable and enabling it by default, and parallel boolean query.
* Added async API for FileSlice.
* Added async get_doc
* Reduce blocksize to 32_000
* Added debug logs
Quickwit-specific features are hidden behind the `quickwit` feature flag.
This works by introducing a new API method in the Directory
trait. The user needs to explicitly call this method.
(In particular, once before a commit)
Closes #1225
In addition this PR:
- removes unnecessary flushes and fsyncs on files.
- replaces all fsync calls with fdatasync. The latter still triggers
a metadata sync if metadata required to read the file
has changed. It is therefore sufficient for us.
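In Rust's standard library, the distinction above corresponds to `File::sync_all` versus `File::sync_data`. The helper below is a hypothetical sketch, not code from this PR; the exact system call behind `sync_data` is platform-dependent (on Linux it is typically `fdatasync`).

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: writes `payload` and makes the data durable
/// without forcing a full metadata sync.
pub fn write_durably(path: &Path, payload: &[u8]) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(payload)?;
    // `sync_data` may skip metadata that is not needed to read the file
    // back (e.g. timestamps); `sync_all` would flush that as well.
    file.sync_data()?;
    Ok(())
}
```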
Closes #1224