mirror of
https://github.com/quickwit-oss/tantivy.git
synced 2026-01-07 09:32:54 +00:00
* Refactoring to prepare for the addition of dynamic fast field
- Exposing insert_key / insert_value
- Renamed SSTable::{Reader/Writer} -> SSTable::{ValueReader/ValueWriter}
- Added a generic Dictionary object in the sstable crate
- Removing the TermDictionary wrapper from tantivy, relying directly on
an alias of the generic Dictionary object.
- Dropped the use of byteorder in sstable.
- Stopped scanning / reading the entire dictionary when streaming a range.
* Added a benchmark for streaming sstable ranges.
* CR comments.
Rename deserialize_u64 -> deserialize_vint_u64
* Removed needless allocation, split serialize into serialize and clear.
29 lines
935 B
Markdown
# SSTable

The `tantivy-sstable` crate is yet another sstable crate.

It has been designed to be used in `quickwit`:
- as an alternative to the default tantivy fst dictionary.
- as a way to store the column index for dynamic fast fields.

The benefit compared to the fst crate is locality.
Searching for a key in the fst crate requires downloading the entire dictionary.

Once the sstable index is downloaded, running a `get` in the sstable
crate only requires a single fetch.

Right now, the block index and the default block size have been designed
with quickwit in mind, and the performance of a `get` is very poor.
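
To make the locality argument concrete, here is a minimal sketch of how a `get` can be narrowed down to a single block fetch once the block index is in memory. It is an illustration only, not the crate's actual API: `BlockMeta`, `BlockIndex` and `locate` are hypothetical names, and the sketch assumes the index stores the last key of each block together with the byte range of that block.

```rust
use std::ops::Range;

/// Hypothetical block index entry: the last key stored in a block,
/// and the byte range of that block in the sstable file.
#[derive(Debug, Clone, PartialEq)]
pub struct BlockMeta {
    pub last_key: Vec<u8>,
    pub byte_range: Range<u64>,
}

/// Hypothetical in-memory block index, sorted by `last_key`.
pub struct BlockIndex {
    pub blocks: Vec<BlockMeta>,
}

impl BlockIndex {
    /// Returns the byte range of the only block that may contain `key`,
    /// or `None` if `key` is greater than every key in the table.
    pub fn locate(&self, key: &[u8]) -> Option<Range<u64>> {
        // Binary search for the first block whose last key is >= `key`.
        let idx = self
            .blocks
            .partition_point(|block| block.last_key.as_slice() < key);
        self.blocks.get(idx).map(|block| block.byte_range.clone())
    }
}
```

Under these assumptions, the caller fetches only the returned byte range and scans that one block for the key, which is the "single fetch" described above.
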

# Sorted strings?

SSTable stands for Sorted String Table.
Strings have to be inserted in sorted order.

That sorted order is used in different ways:
- it makes gets and streaming ranges of keys
  possible.
- it allows incremental encoding of the keys.
- the front compression that results from this encoding is leveraged
  to optimize the intersection with an automaton.
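
The incremental encoding of sorted keys can be sketched as follows: each key is stored as the number of leading bytes it shares with the previous key, plus the remaining suffix. This is a minimal illustration and not the crate's on-disk format; `DeltaKey`, `DeltaWriter` and `decode` are hypothetical names, and rejecting out-of-order keys with an error is an assumption made for the example.

```rust
/// Hypothetical front-coded entry: how many leading bytes are shared
/// with the previous key, followed by the bytes that differ.
#[derive(Debug, Clone, PartialEq)]
pub struct DeltaKey {
    pub shared_len: usize,
    pub suffix: Vec<u8>,
}

/// Hypothetical writer that only accepts keys in strictly increasing order.
#[derive(Default)]
pub struct DeltaWriter {
    previous_key: Vec<u8>,
    pub entries: Vec<DeltaKey>,
}

impl DeltaWriter {
    /// Encodes `key` relative to the previously inserted key.
    /// Returns an error if `key` does not sort after the previous key.
    pub fn insert(&mut self, key: &[u8]) -> Result<(), String> {
        if !self.entries.is_empty() && key <= self.previous_key.as_slice() {
            return Err("keys must be inserted in strictly increasing order".to_string());
        }
        // Length of the common prefix with the previous key.
        let shared_len = self
            .previous_key
            .iter()
            .zip(key)
            .take_while(|(a, b)| a == b)
            .count();
        self.entries.push(DeltaKey {
            shared_len,
            suffix: key[shared_len..].to_vec(),
        });
        self.previous_key.clear();
        self.previous_key.extend_from_slice(key);
        Ok(())
    }
}

/// Rebuilds the full keys by replaying the deltas in order.
pub fn decode(entries: &[DeltaKey]) -> Vec<Vec<u8>> {
    let mut current: Vec<u8> = Vec::new();
    let mut keys = Vec::with_capacity(entries.len());
    for entry in entries {
        // Keep the shared prefix, then append the new suffix.
        current.truncate(entry.shared_len);
        current.extend_from_slice(&entry.suffix);
        keys.push(current.clone());
    }
    keys
}
```

For example, inserting `abc`, `abcd` and `abe` yields the entries `(0, "abc")`, `(3, "d")` and `(2, "e")`. The same `shared_len` value suggests how an automaton intersection can benefit: a reader could resume from the automaton state reached after the shared prefix instead of restarting from the first byte of every key.
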