Mirror of https://github.com/quickwit-oss/tantivy.git
Synced 2026-06-03 17:10:48 +00:00
This breaks the docs on docs.rs, as the type seems to shadow the struct: https://docs.rs/tantivy/latest/tantivy/termdict/type.TermDictionary.html. Introduced by #1293, which may not have been up to date with what was done in #1242.
39 lines
1.5 KiB
Rust
//! The term dictionary's main role is to associate the sorted [`Term`s](../struct.Term.html) with
//! a [`TermInfo`](../postings/struct.TermInfo.html) struct that contains some meta-information
//! about the term.
//!
//! Internally, the term dictionary relies on the `fst` crate to store
//! a sorted mapping that associates each term with its rank in the lexicographical order.
//! For instance, in a dictionary containing the sorted terms "abba", "bjork", "blur" and "donovan",
//! the [`TermOrdinal`]s are respectively `0`, `1`, `2`, and `3`.
//!
//! For `u64`-terms, tantivy explicitly uses a `BigEndian` representation to ensure that the
//! lexicographical order matches the natural order of integers.
//!
//! `i64`-terms are transformed to `u64` using a continuous mapping `val ⟶ val - i64::min_value()`
//! and then treated as a `u64`.
//!
//! `f64`-terms are transformed to `u64` using a mapping that preserves order, and are then treated
//! as `u64`.
//!
//! A second data structure makes it possible to access a
//! [`TermInfo`](../postings/struct.TermInfo.html).

#[cfg(not(feature = "quickwit"))]
mod fst_termdict;
#[cfg(not(feature = "quickwit"))]
use fst_termdict as termdict;

#[cfg(feature = "quickwit")]
mod sstable_termdict;
#[cfg(feature = "quickwit")]
use sstable_termdict as termdict;

#[cfg(test)]
mod tests;

/// Position of the term in the sorted list of terms.
pub type TermOrdinal = u64;

pub use self::termdict::{TermDictionary, TermDictionaryBuilder, TermMerger, TermStreamer};
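The encodings described in the module documentation can be illustrated with a minimal, standalone sketch. The function names below (`i64_to_u64`, `f64_to_u64`, `u64_term_bytes`, `term_ordinals`) are hypothetical and are not taken from tantivy. The `f64` mapping shown is one common order-preserving bit trick; the documentation above only says that the mapping preserves order, so whether it matches tantivy's exact implementation is an assumption.

```rust
// Illustrative sketch only: all names here are hypothetical, not tantivy's API.

/// Maps an `i64` to a `u64` so that integer order is preserved.
/// Flipping the sign bit is the same as computing `val - i64::MIN`
/// with wrapping arithmetic and reinterpreting the result as `u64`.
fn i64_to_u64(val: i64) -> u64 {
    (val as u64) ^ (1u64 << 63)
}

/// Maps an `f64` to a `u64` so that numeric order is preserved (NaN aside).
/// Assumption: a common bit trick, not necessarily tantivy's exact mapping.
fn f64_to_u64(val: f64) -> u64 {
    let bits = val.to_bits();
    if bits >> 63 == 0 {
        // Non-negative: set the sign bit so these sort above all negatives.
        bits ^ (1u64 << 63)
    } else {
        // Negative: flip every bit so larger magnitudes sort lower.
        !bits
    }
}

/// Big-endian bytes compare lexicographically in the same order as the
/// integers they encode, which is what a byte-sorted dictionary needs.
fn u64_term_bytes(val: u64) -> [u8; 8] {
    val.to_be_bytes()
}

/// Assigns each distinct term its rank in lexicographical order,
/// mirroring the "abba", "bjork", "blur", "donovan" example.
fn term_ordinals(terms: &[&str]) -> Vec<(String, u64)> {
    let mut sorted: Vec<&str> = terms.to_vec();
    sorted.sort_unstable();
    sorted.dedup();
    sorted
        .into_iter()
        .enumerate()
        .map(|(ord, term)| (term.to_string(), ord as u64))
        .collect()
}
```

With these helpers, `i64_to_u64(-1) < i64_to_u64(0)` holds, and `u64_term_bytes(1) < u64_term_bytes(256)` holds under byte-wise comparison. The same comparison on little-endian bytes (`to_le_bytes`) would give the opposite result, which is why a big-endian representation is needed.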