* add missing aggregation part 2
Add missing support for:
- Mixed-type columns
- Key of type string on numerical fields
The special aggregation is slower than the integrated one in TermsAggregation and therefore not
chosen by default, although it can cover all use cases.
* simplify, add num_docs to empty
* Improve aggregation error message
Improve aggregation error message by wrapping the deserialization with a
custom struct. This deserialization variant is slower, since we need to
keep the deserialized data around twice with this approach.
For now the valid variants list is manually updated. This could be
replaced with a proc macro.
closes #2143
* Simpler implementation
---------
Co-authored-by: Paul Masurel <paul@quickwit.io>
* lazy columnar merge
This is the first part of addressing #3633
Instead of loading all columns into memory for the merge, only the current column_name
group is loaded. This works because the sstable streams the columns in lexicographic order.
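
A minimal sketch of the idea, not the actual implementation: assuming each input yields `(column_name, column)` pairs in lexicographic order, only the columns that share the smallest pending name need to be held in memory at once. The `Column` alias and the `merge_lazily` function are hypothetical names used for illustration only.

```rust
use std::iter::Peekable;

/// Hypothetical stand-in for the data of one column.
type Column = Vec<u64>;

/// Walks several streams that are each sorted by column name and hands one
/// column_name group at a time to `on_group`. Only the current group is
/// materialized; everything else stays in the streams.
fn merge_lazily<I, F>(streams: Vec<I>, mut on_group: F)
where
    I: Iterator<Item = (String, Column)>,
    F: FnMut(&str, Vec<Column>),
{
    let mut streams: Vec<Peekable<I>> =
        streams.into_iter().map(Iterator::peekable).collect();
    loop {
        // The smallest name at the head of any stream is the next group.
        let current: Option<String> = streams
            .iter_mut()
            .filter_map(|s| s.peek().map(|(name, _)| name.clone()))
            .min();
        let Some(current) = current else { break };

        // Pull only the columns that belong to the current group.
        let mut group = Vec::new();
        for stream in streams.iter_mut() {
            while let Some((_, column)) = stream.next_if(|(name, _)| *name == current) {
                group.push(column);
            }
        }
        on_group(&current, group);
    }
}
```

Because every stream is sorted, a name that has been emitted can never reappear later, so each group can be dropped as soon as it has been merged.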
* refactor
* add rustdoc
* replace iterator with BTreeMap
* move query parser to nom
* add support for term grouping
* initial work on infallible parser
* fmt
* add tests and fix minor parsing bugs
* address review comments
* add support for lenient queries in tantivy
* make lenient parser report errors
* allow mixing occur and bool in query
* alternative mixed field aggregation collection
Instead of having multiple accessors in one AggregationWithAccessor, split it into
multiple independent AggregationWithAccessor instances.
* Update src/aggregation/agg_req_with_accessor.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
---------
Co-authored-by: Paul Masurel <paul@quickwit.io>
This makes it obvious where the `StableDerefTrait` is invoked and avoids
`transmute` when only a lifetime needs to be extended. Furthermore, it makes use
of `slice::split_at` where that seemed appropriate.
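
The two techniques can be sketched as follows. This is an illustrative example rather than the code touched by the change; `split_header` and `extend_lifetime` are hypothetical helpers.

```rust
/// Splits a buffer into a fixed-size header and the remaining payload using
/// `slice::split_at` instead of two manual range expressions.
fn split_header(bytes: &[u8], header_len: usize) -> Option<(&[u8], &[u8])> {
    if header_len > bytes.len() {
        // `split_at` panics when the index is out of bounds, so check first.
        return None;
    }
    Some(bytes.split_at(header_len))
}

/// Extends the lifetime of a shared borrow through a raw pointer cast.
/// Unlike `transmute`, the cast cannot accidentally change the pointee type.
///
/// # Safety
/// The caller must guarantee that the referent stays alive, at the same
/// address and without mutation, for all of `'a`.
unsafe fn extend_lifetime<'a, T: ?Sized>(value: &T) -> &'a T {
    unsafe { &*(value as *const T) }
}
```

Keeping the lifetime extension in one small `unsafe` function makes the place where the stable-address guarantee is relied upon easy to audit.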
LZ4 provides fast and simple compression, whereas Zstd is exceptionally flexible,
so the additional support for Brotli and Snappy does not really add
any distinct functionality on top of those two algorithms.
Removing them reduces our maintenance burden and reduces the number of choices
users have to make when setting up their project based on Tantivy.
* Include only built-in compression algorithms as enum variants
This enables compile-time errors when a compression algorithm is requested which
is not actually enabled for the current Cargo project. The cost is that indexes
using other compression algorithms cannot even be loaded (even though they
are not fully accessible in any case).
As a drive-by, this also fixes `--no-default-features` on `cfg(unix)`.
* Provide more instructive error messages for unsupported, but not unknown compression variants.
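
A rough sketch of how feature-gated variants and the two error cases could fit together. The type names, the function, and the feature names `lz4-compression` and `zstd-compression` are assumptions made for illustration, not a copy of the real definitions.

```rust
use std::fmt;

/// Only algorithms compiled into the binary exist as variants, so code that
/// names a disabled algorithm fails to compile.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compressor {
    None,
    #[cfg(feature = "lz4-compression")]
    Lz4,
    #[cfg(feature = "zstd-compression")]
    Zstd,
}

#[derive(Debug, PartialEq, Eq)]
enum CompressorError {
    /// A known algorithm whose Cargo feature is disabled in this build.
    Unsupported { name: &'static str, feature: &'static str },
    /// A name that is not recognised at all.
    Unknown(String),
}

impl fmt::Display for CompressorError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CompressorError::Unsupported { name, feature } => write!(
                f,
                "compression `{name}` is not available; enable the `{feature}` feature"
            ),
            CompressorError::Unknown(name) => write!(f, "unknown compression `{name}`"),
        }
    }
}

/// Resolves a stored algorithm name, distinguishing "known but disabled"
/// from "never heard of it".
fn compressor_from_name(name: &str) -> Result<Compressor, CompressorError> {
    match name {
        "none" => Ok(Compressor::None),
        #[cfg(feature = "lz4-compression")]
        "lz4" => Ok(Compressor::Lz4),
        #[cfg(not(feature = "lz4-compression"))]
        "lz4" => Err(CompressorError::Unsupported { name: "lz4", feature: "lz4-compression" }),
        #[cfg(feature = "zstd-compression")]
        "zstd" => Ok(Compressor::Zstd),
        #[cfg(not(feature = "zstd-compression"))]
        "zstd" => Err(CompressorError::Unsupported { name: "zstd", feature: "zstd-compression" }),
        other => Err(CompressorError::Unknown(other.to_string())),
    }
}
```

With this shape, an index written with a disabled algorithm yields an error that tells the user which feature to enable, while a corrupt or foreign name is reported as unknown.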
* feat: order_by_fast_field allows sorting using parameter order
* chore: change the corresponding values back to the original ones
* chore: fix formatting issues
* fix: first_or_default_col should also sort by order
* chore: add empty doc to test case and fix doctests
* chore: fix failing tests
* chore: add empty document without fast field
* chore: fix fmt
* chore: change variable name
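
The effect of an explicit order parameter can be illustrated with a self-contained sketch. It does not use the real collector API; `Order` and `top_docs_by_fast_field` are hypothetical here, and treating a document without a fast field value as the type's default is an assumption of this example.

```rust
/// Hypothetical sort direction parameter.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Order {
    Asc,
    Desc,
}

/// Returns at most `limit` `(doc_id, value)` pairs sorted by value in the
/// requested order. Documents without a value fall back to `u64::default()`
/// (an assumption of this sketch); ties are broken by ascending doc id.
fn top_docs_by_fast_field(
    docs: &[(u32, Option<u64>)],
    limit: usize,
    order: Order,
) -> Vec<(u32, u64)> {
    let mut scored: Vec<(u32, u64)> = docs
        .iter()
        .map(|&(doc, value)| (doc, value.unwrap_or_default()))
        .collect();
    scored.sort_by(|a, b| {
        let by_value = match order {
            Order::Asc => a.1.cmp(&b.1),
            Order::Desc => b.1.cmp(&a.1),
        };
        by_value.then(a.0.cmp(&b.0))
    });
    scored.truncate(limit);
    scored
}
```

A full sort keeps the sketch short; a bounded heap would avoid sorting documents that cannot make it into the top `limit`.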