* read path for new fst-based index
* implement BlockAddrStoreWriter
* extract slop/derivation computation
* use better linear approximator and allow negative correction to approximator
* document format and reorder some fields
* optimize single block sstable size
* plug backward compat
* Do some Clippy- and Cargo-related boy-scouting.
* Add BytesFilterCollector to support filtering based on a bytes fast field
This is basically a copy of the existing FilterCollector but modified and
specialised to work on a bytes fast field.
* Changed semantics of filter collectors to consider multi-valued fields
Applied this command to the code, making it a bit shorter and slightly
more readable.
```
cargo +nightly clippy --all-features --benches --tests --workspace --fix -- -A clippy::all -W clippy::uninlined_format_args
cargo +nightly fmt --all
```
* compress sstable with zstd
* add some details to sstable readme
* compress only blocks which benefit from it
* multiple changes to sstable
make compression optional
use OwnedBytes instead of impl Read in sstable, required for next point
use zstd bulk api, which is much faster on small records
* cleanup and use bulk api for compression
* use dedicated byte for compression
* switch block len and compression flag
* change default zstd level in sstable
* Added proptest on columnar merge with a shuffle
Made column serialization more explicit.
Bugfix when a bytes column is missing, and with a shuffle.
Improved the cardinality detection logic / column detection.
* Code review
* CR comments
* Following CR
* document a new sstable format
* add support for changing target block size
* use new format for sstable index
* handle sstable version error
* use very small blocks for proptests
* add a footer structure
* add a way to get up to `limit` terms from sstable
* make some functions of sstable load less data
* add some tests to sstable
* add tests on sstable dictionary
* fix some bugs with sstable
* Added support for dynamic fast field.
See README for more information.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
* Refactoring to prepare for the addition of dynamic fast field
- Exposing insert_key / insert_value
- Renamed SSTable::{Reader/Writer} -> SSTable::{ValueReader/ValueWriter}
- Added a generic Dictionary object in the sstable crate
- Removing the TermDictionary wrapper from tantivy, relying directly on
an alias of the generic Dictionary object.
- Dropped the use of byteorder in sstable.
- Stopped scanning / reading the entire dictionary when streaming a range.
* Added a benchmark for streaming sstable ranges.
* CR comments.
Rename deserialize_u64 -> deserialize_vint_u64
* Removed needless allocation, split serialize into serialize and clear.
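
For readers unfamiliar with the "vint" in the `deserialize_vint_u64` rename, here is a minimal sketch of a variable-length u64 encoding. It assumes a LEB128-style layout (7 payload bits per byte, high bit set on every byte except the last). The function bodies and signatures are illustrative only and may not match the byte layout or API actually used in the sstable crate.

```rust
/// Appends `val` to `out` using 7 payload bits per byte.
/// Assumption: the high bit marks "more bytes follow" (LEB128-style).
fn serialize_vint_u64(mut val: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (val & 0x7f) as u8;
        val >>= 7;
        if val == 0 {
            // Last byte: continuation bit left clear.
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Decodes one value from the start of `data`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before a terminating byte or exceeds 10 bytes.
fn deserialize_vint_u64(data: &[u8]) -> Option<(u64, usize)> {
    let mut result = 0u64;
    // A u64 needs at most 10 groups of 7 bits.
    for (i, &byte) in data.iter().enumerate().take(10) {
        result |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result, i + 1));
        }
    }
    None
}
```

Returning the consumed length alongside the value lets a caller advance through a buffer of consecutive records without an intermediate reader.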