mirror of
https://github.com/quickwit-oss/tantivy.git
synced 2026-01-14 04:52:54 +00:00
Compare commits
19 Commits
debugging-
...
0.14
| Author | SHA1 | Date | |
|---|---|---|---|
| | 784717749f | | |
| | 945bcc5bd3 | | |
| | 51aa9c319e | | |
| | 74d8d2946b | | |
| | 0a160cc16e | | |
| | f099f97daa | | |
| | 769e9ba14d | | |
| | a482c0e966 | | |
| | 86d92a72e7 | | |
| | ef618a5999 | | |
| | 94d3d7a89a | | |
| | aa9e79f957 | | |
| | 84a2f534db | | |
| | 1b4be24dca | | |
| | 824ccc37ae | | |
| | 5231651020 | | |
| | fa2c6f80c7 | | |
| | 43c7b3bfec | | |
| | b17a10546a | | |
25
CHANGELOG.md
@@ -1,6 +1,6 @@
|
|||||||
Tantivy 0.14.0
|
Tantivy 0.14.0
|
||||||
=========================
|
=========================
|
||||||
- Remove dependency to atomicwrites #833 .Implemented by @pmasurel upon suggestion and research from @asafigan).
|
- Remove dependency to atomicwrites #833 .Implemented by @fulmicoton upon suggestion and research from @asafigan).
|
||||||
- Migrated tantivy error from the now deprecated `failure` crate to `thiserror` #760. (@hirevo)
|
- Migrated tantivy error from the now deprecated `failure` crate to `thiserror` #760. (@hirevo)
|
||||||
- API Change. Accessing the typed value off a `Schema::Value` now returns an Option instead of panicking if the type does not match.
|
- API Change. Accessing the typed value off a `Schema::Value` now returns an Option instead of panicking if the type does not match.
|
||||||
- Large API Change in the Directory API. Tantivy used to assume that all files could be somehow memory mapped. After this change, Directory return a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long and blocking io operation are still required by they do not span over the entire file.
|
- Large API Change in the Directory API. Tantivy used to assume that all files could be somehow memory mapped. After this change, Directory return a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long and blocking io operation are still required by they do not span over the entire file.
|
||||||
@@ -9,8 +9,9 @@ Tantivy 0.14.0
|
|||||||
- Bugfix in `Query::explain`
|
- Bugfix in `Query::explain`
|
||||||
- Removed dependency on `notify` #924. Replaced with `FileWatcher` struct that polls meta file every 500ms in background thread. (@halvorboe @guilload)
|
- Removed dependency on `notify` #924. Replaced with `FileWatcher` struct that polls meta file every 500ms in background thread. (@halvorboe @guilload)
|
||||||
- Added `FilterCollector`, which wraps another collector and filters docs using a predicate over a fast field (@barrotsteindev)
|
- Added `FilterCollector`, which wraps another collector and filters docs using a predicate over a fast field (@barrotsteindev)
|
||||||
- Simplified the encoding of the skip reader struct. BlockWAND max tf is now encoded over a single byte. (@pmasurel)
|
- Simplified the encoding of the skip reader struct. BlockWAND max tf is now encoded over a single byte. (@fulmicoton)
|
||||||
- `FilterCollector` now supports all Fast Field value types (@barrotsteindev)
|
- `FilterCollector` now supports all Fast Field value types (@barrotsteindev)
|
||||||
|
- FastField are not all loaded when opening the segment reader. (@fulmicoton)
|
||||||
|
|
||||||
This version breaks compatibility and requires users to reindex everything.
|
This version breaks compatibility and requires users to reindex everything.
|
||||||
|
|
||||||
@@ -65,7 +66,7 @@ Tantivy 0.12.0
|
|||||||
## How to update?
|
## How to update?
|
||||||
|
|
||||||
Crates relying on custom tokenizer, or registering tokenizer in the manager will require some
|
Crates relying on custom tokenizer, or registering tokenizer in the manager will require some
|
||||||
minor changes. Check https://github.com/tantivy-search/tantivy/blob/master/examples/custom_tokenizer.rs
|
minor changes. Check https://github.com/tantivy-search/tantivy/blob/main/examples/custom_tokenizer.rs
|
||||||
to check for some code sample.
|
to check for some code sample.
|
||||||
|
|
||||||
Tantivy 0.11.3
|
Tantivy 0.11.3
|
||||||
@@ -126,10 +127,10 @@ Tantivy 0.10.0
|
|||||||
*Tantivy 0.10.0 index format is compatible with the index format in 0.9.0.*
|
*Tantivy 0.10.0 index format is compatible with the index format in 0.9.0.*
|
||||||
|
|
||||||
- Added an API to easily tweak or entirely replace the
|
- Added an API to easily tweak or entirely replace the
|
||||||
default score. See `TopDocs::tweak_score`and `TopScore::custom_score` (@pmasurel)
|
default score. See `TopDocs::tweak_score`and `TopScore::custom_score` (@fulmicoton)
|
||||||
- Added an ASCII folding filter (@drusellers)
|
- Added an ASCII folding filter (@drusellers)
|
||||||
- Bugfix in `query.count` in presence of deletes (@pmasurel)
|
- Bugfix in `query.count` in presence of deletes (@fulmicoton)
|
||||||
- Added `.explain(...)` in `Query` and `Weight` to (@pmasurel)
|
- Added `.explain(...)` in `Query` and `Weight` to (@fulmicoton)
|
||||||
- Added an efficient way to `delete_all_documents` in `IndexWriter` (@petr-tik).
|
- Added an efficient way to `delete_all_documents` in `IndexWriter` (@petr-tik).
|
||||||
All segments are simply removed.
|
All segments are simply removed.
|
||||||
|
|
||||||
@@ -144,7 +145,7 @@ on segment postings should panic from now on.
|
|||||||
- `IndexMeta` is now public. (@hntd187)
|
- `IndexMeta` is now public. (@hntd187)
|
||||||
- `IndexWriter` `add_document`, `delete_term`. `IndexWriter` is `Sync`, making it possible to use it with a `
|
- `IndexWriter` `add_document`, `delete_term`. `IndexWriter` is `Sync`, making it possible to use it with a `
|
||||||
Arc<RwLock<IndexWriter>>`. `add_document` and `delete_term` can
|
Arc<RwLock<IndexWriter>>`. `add_document` and `delete_term` can
|
||||||
only require a read lock. (@pmasurel)
|
only require a read lock. (@fulmicoton)
|
||||||
- Introducing `Opstamp` as an expressive type alias for `u64`. (@petr-tik)
|
- Introducing `Opstamp` as an expressive type alias for `u64`. (@petr-tik)
|
||||||
- Stamper now relies on `AtomicU64` on all platforms (@petr-tik)
|
- Stamper now relies on `AtomicU64` on all platforms (@petr-tik)
|
||||||
- Bugfix - Files get deleted slightly earlier
|
- Bugfix - Files get deleted slightly earlier
|
||||||
@@ -268,10 +269,10 @@ Tantivy 0.6
|
|||||||
Special thanks to @drusellers and @jason-wolfe for their contributions
|
Special thanks to @drusellers and @jason-wolfe for their contributions
|
||||||
to this release!
|
to this release!
|
||||||
|
|
||||||
- Removed C code. Tantivy is now pure Rust. (@pmasurel)
|
- Removed C code. Tantivy is now pure Rust. (@fulmicoton)
|
||||||
- BM25 (@pmasurel)
|
- BM25 (@fulmicoton)
|
||||||
- Approximate field norms encoded over 1 byte. (@pmasurel)
|
- Approximate field norms encoded over 1 byte. (@fulmicoton)
|
||||||
- Compiles on stable rust (@pmasurel)
|
- Compiles on stable rust (@fulmicoton)
|
||||||
- Add &[u8] fastfield for associating arbitrary bytes to each document (@jason-wolfe) (#270)
|
- Add &[u8] fastfield for associating arbitrary bytes to each document (@jason-wolfe) (#270)
|
||||||
- Completely uncompressed
|
- Completely uncompressed
|
||||||
- Internally: One u64 fast field for indexes, one fast field for the bytes themselves.
|
- Internally: One u64 fast field for indexes, one fast field for the bytes themselves.
|
||||||
@@ -279,7 +280,7 @@ to this release!
|
|||||||
- Add Stopword Filter support (@drusellers)
|
- Add Stopword Filter support (@drusellers)
|
||||||
- Add a FuzzyTermQuery (@drusellers)
|
- Add a FuzzyTermQuery (@drusellers)
|
||||||
- Add a RegexQuery (@drusellers)
|
- Add a RegexQuery (@drusellers)
|
||||||
- Various performance improvements (@pmasurel)_
|
- Various performance improvements (@fulmicoton)_
|
||||||
|
|
||||||
|
|
||||||
Tantivy 0.5.2
|
Tantivy 0.5.2
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "tantivy"
|
name = "tantivy"
|
||||||
version = "0.14.0-dev"
|
version = "0.14.0"
|
||||||
authors = ["Paul Masurel <paul.masurel@gmail.com>"]
|
authors = ["Paul Masurel <paul.masurel@gmail.com>"]
|
||||||
license = "MIT"
|
license = "MIT"
|
||||||
categories = ["database-implementations", "data-structures"]
|
categories = ["database-implementations", "data-structures"]
|
||||||
@@ -33,7 +33,7 @@ levenshtein_automata = "0.2"
|
|||||||
uuid = { version = "0.8", features = ["v4", "serde"] }
|
uuid = { version = "0.8", features = ["v4", "serde"] }
|
||||||
crossbeam = "0.8"
|
crossbeam = "0.8"
|
||||||
futures = {version = "0.3", features=["thread-pool"] }
|
futures = {version = "0.3", features=["thread-pool"] }
|
||||||
tantivy-query-grammar = { version="0.14.0-dev", path="./query-grammar" }
|
tantivy-query-grammar = { version="0.14.0", path="./query-grammar" }
|
||||||
stable_deref_trait = "1"
|
stable_deref_trait = "1"
|
||||||
rust-stemmers = "1"
|
rust-stemmers = "1"
|
||||||
downcast-rs = "1"
|
downcast-rs = "1"
|
||||||
|
|||||||
@@ -1,9 +1,9 @@
|
|||||||
|
|
||||||
[](https://travis-ci.org/tantivy-search/tantivy)
|
[](https://travis-ci.org/tantivy-search/tantivy)
|
||||||
[](https://codecov.io/gh/tantivy-search/tantivy)
|
[](https://codecov.io/gh/tantivy-search/tantivy)
|
||||||
[](https://gitter.im/tantivy-search/tantivy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
|
[](https://gitter.im/tantivy-search/tantivy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
|
||||||
[](https://opensource.org/licenses/MIT)
|
[](https://opensource.org/licenses/MIT)
|
||||||
[](https://ci.appveyor.com/project/fulmicoton/tantivy/branch/master)
|
[](https://ci.appveyor.com/project/fulmicoton/tantivy/branch/main)
|
||||||
[](https://crates.io/crates/tantivy)
|
[](https://crates.io/crates/tantivy)
|
||||||
|
|
||||||

|

|
||||||
|
|||||||
@@ -14,7 +14,7 @@ use tantivy::fastfield::FastFieldReader;
|
|||||||
use tantivy::query::QueryParser;
|
use tantivy::query::QueryParser;
|
||||||
use tantivy::schema::Field;
|
use tantivy::schema::Field;
|
||||||
use tantivy::schema::{Schema, FAST, INDEXED, TEXT};
|
use tantivy::schema::{Schema, FAST, INDEXED, TEXT};
|
||||||
use tantivy::{doc, Index, Score, SegmentReader, TantivyError};
|
use tantivy::{doc, Index, Score, SegmentReader};
|
||||||
|
|
||||||
#[derive(Default)]
|
#[derive(Default)]
|
||||||
struct Stats {
|
struct Stats {
|
||||||
@@ -72,16 +72,7 @@ impl Collector for StatsCollector {
|
|||||||
_segment_local_id: u32,
|
_segment_local_id: u32,
|
||||||
segment_reader: &SegmentReader,
|
segment_reader: &SegmentReader,
|
||||||
) -> tantivy::Result<StatsSegmentCollector> {
|
) -> tantivy::Result<StatsSegmentCollector> {
|
||||||
let fast_field_reader = segment_reader
|
let fast_field_reader = segment_reader.fast_fields().u64(self.field)?;
|
||||||
.fast_fields()
|
|
||||||
.u64(self.field)
|
|
||||||
.ok_or_else(|| {
|
|
||||||
let field_name = segment_reader.schema().get_field_name(self.field);
|
|
||||||
TantivyError::SchemaError(format!(
|
|
||||||
"Field {:?} is not a u64 fast field.",
|
|
||||||
field_name
|
|
||||||
))
|
|
||||||
})?;
|
|
||||||
Ok(StatsSegmentCollector {
|
Ok(StatsSegmentCollector {
|
||||||
fast_field_reader,
|
fast_field_reader,
|
||||||
stats: Stats::default(),
|
stats: Stats::default(),
|
||||||
|
|||||||
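The hunk above drops the `.ok_or_else(...)` block at the call site and keeps only `?`, which suggests that the fast field accessor now returns a `Result` and builds the error itself. A minimal sketch of that pattern follows. `Readers`, `SchemaError`, and `sum_field` are hypothetical stand-ins for illustration, not tantivy's actual API.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type standing in for `TantivyError::SchemaError`.
#[derive(Debug, PartialEq)]
pub struct SchemaError(pub String);

impl fmt::Display for SchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "schema error: {}", self.0)
    }
}

impl std::error::Error for SchemaError {}

/// Hypothetical stand-in for a set of fast field readers, keyed by field id.
/// Each "reader" is just a column of values here.
pub struct Readers {
    u64_columns: HashMap<u32, Vec<u64>>,
}

impl Readers {
    pub fn new(u64_columns: HashMap<u32, Vec<u64>>) -> Self {
        Readers { u64_columns }
    }

    /// The accessor builds the error itself, so callers only need `?`.
    pub fn u64(&self, field: u32) -> Result<&[u64], SchemaError> {
        self.u64_columns
            .get(&field)
            .map(|column| column.as_slice())
            .ok_or_else(|| SchemaError(format!("Field {} is not a u64 fast field.", field)))
    }
}

/// Call site: the error-construction boilerplate is gone.
pub fn sum_field(readers: &Readers, field: u32) -> Result<u64, SchemaError> {
    let column = readers.u64(field)?;
    Ok(column.iter().sum())
}
```

Moving the error construction into the accessor keeps the message consistent across collectors, at the cost of call sites no longer being able to customize it.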
@@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "tantivy-query-grammar"
|
name = "tantivy-query-grammar"
|
||||||
version = "0.14.0-dev"
|
version = "0.14.0"
|
||||||
authors = ["Paul Masurel <paul.masurel@gmail.com>"]
|
authors = ["Paul Masurel <paul.masurel@gmail.com>"]
|
||||||
license = "MIT"
|
license = "MIT"
|
||||||
categories = ["database-implementations", "data-structures"]
|
categories = ["database-implementations", "data-structures"]
|
||||||
|
|||||||
@@ -398,6 +398,8 @@ impl<'a> Iterator for FacetChildIterator<'a> {
|
|||||||
}
|
}
|
||||||
|
|
||||||
impl FacetCounts {
|
impl FacetCounts {
|
||||||
|
/// Returns an iterator over all of the facet count pairs inside this result.
|
||||||
|
/// See the documentation for `FacetCollector` for a usage example.
|
||||||
pub fn get<T>(&self, facet_from: T) -> FacetChildIterator<'_>
|
pub fn get<T>(&self, facet_from: T) -> FacetChildIterator<'_>
|
||||||
where
|
where
|
||||||
Facet: From<T>,
|
Facet: From<T>,
|
||||||
@@ -417,6 +419,8 @@ impl FacetCounts {
|
|||||||
FacetChildIterator { underlying }
|
FacetChildIterator { underlying }
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Returns a vector of top `k` facets with their counts, sorted highest-to-lowest by counts.
|
||||||
|
/// See the documentation for `FacetCollector` for a usage example.
|
||||||
pub fn top_k<T>(&self, facet: T, k: usize) -> Vec<(&Facet, u64)>
|
pub fn top_k<T>(&self, facet: T, k: usize) -> Vec<(&Facet, u64)>
|
||||||
where
|
where
|
||||||
Facet: From<T>,
|
Facet: From<T>,
|
||||||
|
|||||||
@@ -124,13 +124,7 @@ where
|
|||||||
|
|
||||||
let fast_field_reader = segment_reader
|
let fast_field_reader = segment_reader
|
||||||
.fast_fields()
|
.fast_fields()
|
||||||
.typed_fast_field_reader(self.field)
|
.typed_fast_field_reader(self.field)?;
|
||||||
.ok_or_else(|| {
|
|
||||||
TantivyError::SchemaError(format!(
|
|
||||||
"{:?} is not declared as a fast field in the schema.",
|
|
||||||
self.field
|
|
||||||
))
|
|
||||||
})?;
|
|
||||||
|
|
||||||
let segment_collector = self
|
let segment_collector = self
|
||||||
.collector
|
.collector
|
||||||
|
|||||||
@@ -109,6 +109,7 @@ pub use self::tweak_score_top_collector::{ScoreSegmentTweaker, ScoreTweaker};
|
|||||||
|
|
||||||
mod facet_collector;
|
mod facet_collector;
|
||||||
pub use self::facet_collector::FacetCollector;
|
pub use self::facet_collector::FacetCollector;
|
||||||
|
pub use self::facet_collector::FacetCounts;
|
||||||
use crate::query::Weight;
|
use crate::query::Weight;
|
||||||
|
|
||||||
mod docset_collector;
|
mod docset_collector;
|
||||||
|
|||||||
@@ -240,12 +240,7 @@ impl Collector for BytesFastFieldTestCollector {
|
|||||||
_segment_local_id: u32,
|
_segment_local_id: u32,
|
||||||
segment_reader: &SegmentReader,
|
segment_reader: &SegmentReader,
|
||||||
) -> crate::Result<BytesFastFieldSegmentCollector> {
|
) -> crate::Result<BytesFastFieldSegmentCollector> {
|
||||||
let reader = segment_reader
|
let reader = segment_reader.fast_fields().bytes(self.field)?;
|
||||||
.fast_fields()
|
|
||||||
.bytes(self.field)
|
|
||||||
.ok_or_else(|| {
|
|
||||||
crate::TantivyError::InvalidArgument("Field is not a bytes fast field.".to_string())
|
|
||||||
})?;
|
|
||||||
Ok(BytesFastFieldSegmentCollector {
|
Ok(BytesFastFieldSegmentCollector {
|
||||||
vals: Vec::new(),
|
vals: Vec::new(),
|
||||||
reader,
|
reader,
|
||||||
|
|||||||
@@ -2,9 +2,9 @@ use crate::DocAddress;
|
|||||||
use crate::DocId;
|
use crate::DocId;
|
||||||
use crate::SegmentLocalId;
|
use crate::SegmentLocalId;
|
||||||
use crate::SegmentReader;
|
use crate::SegmentReader;
|
||||||
use serde::export::PhantomData;
|
|
||||||
use std::cmp::Ordering;
|
use std::cmp::Ordering;
|
||||||
use std::collections::BinaryHeap;
|
use std::collections::BinaryHeap;
|
||||||
|
use std::marker::PhantomData;
|
||||||
|
|
||||||
/// Contains a feature (field, score, etc.) of a document along with the document address.
|
/// Contains a feature (field, score, etc.) of a document along with the document address.
|
||||||
///
|
///
|
||||||
|
|||||||
@@ -146,15 +146,14 @@ impl CustomScorer<u64> for ScorerByField {
|
|||||||
type Child = ScorerByFastFieldReader;
|
type Child = ScorerByFastFieldReader;
|
||||||
|
|
||||||
fn segment_scorer(&self, segment_reader: &SegmentReader) -> crate::Result<Self::Child> {
|
fn segment_scorer(&self, segment_reader: &SegmentReader) -> crate::Result<Self::Child> {
|
||||||
let ff_reader = segment_reader
|
// We interpret this field as u64, regardless of its type, that way,
|
||||||
|
// we avoid needless conversion. Regardless of the fast field type, the
|
||||||
|
// mapping is monotonic, so it is sufficient to compute our top-K docs.
|
||||||
|
//
|
||||||
|
// The conversion will then happen only on the top-K docs.
|
||||||
|
let ff_reader: FastFieldReader<u64> = segment_reader
|
||||||
.fast_fields()
|
.fast_fields()
|
||||||
.u64_lenient(self.field)
|
.typed_fast_field_reader(self.field)?;
|
||||||
.ok_or_else(|| {
|
|
||||||
crate::TantivyError::SchemaError(format!(
|
|
||||||
"Field requested ({:?}) is not a fast field.",
|
|
||||||
self.field
|
|
||||||
))
|
|
||||||
})?;
|
|
||||||
Ok(ScorerByFastFieldReader { ff_reader })
|
Ok(ScorerByFastFieldReader { ff_reader })
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -994,9 +993,7 @@ mod tests {
|
|||||||
let segment = searcher.segment_reader(0);
|
let segment = searcher.segment_reader(0);
|
||||||
let top_collector = TopDocs::with_limit(4).order_by_u64_field(size);
|
let top_collector = TopDocs::with_limit(4).order_by_u64_field(size);
|
||||||
let err = top_collector.for_segment(0, segment).err().unwrap();
|
let err = top_collector.for_segment(0, segment).err().unwrap();
|
||||||
assert!(
|
assert!(matches!(err, crate::TantivyError::SchemaError(_)));
|
||||||
matches!(err, crate::TantivyError::SchemaError(msg) if msg == "Field requested (Field(0)) is not a fast field.")
|
|
||||||
);
|
|
||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
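The comment added in `segment_scorer` relies on the fast field encoding being monotonic: if the stored `u64` preserves the order of the original values, the top-K docs can be selected on raw `u64`s and only the winners need decoding. The sketch below illustrates this for `i64` using a sign-bit flip. That this matches tantivy's on-disk encoding is an assumption, and the function names are chosen here for illustration only.

```rust
/// Maps an `i64` to a `u64` such that `a < b` implies `map(a) < map(b)`.
/// Flipping the sign bit moves negative numbers below non-negative ones.
pub fn i64_to_u64(val: i64) -> u64 {
    (val as u64) ^ (1u64 << 63)
}

/// Inverse of `i64_to_u64`.
pub fn u64_to_i64(val: u64) -> i64 {
    (val ^ (1u64 << 63)) as i64
}

/// Selects the top `k` values by comparing the raw `u64` encodings only,
/// then decodes just those `k` winners back to `i64`.
pub fn top_k_i64(encoded: &[u64], k: usize) -> Vec<i64> {
    let mut sorted: Vec<u64> = encoded.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // descending order
    sorted.truncate(k);
    sorted.into_iter().map(u64_to_i64).collect()
}
```

A real collector would use a bounded heap rather than a full sort; the sort is used here only to keep the sketch short.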
@@ -35,12 +35,21 @@ fn load_metas(
|
|||||||
inventory: &SegmentMetaInventory,
|
inventory: &SegmentMetaInventory,
|
||||||
) -> crate::Result<IndexMeta> {
|
) -> crate::Result<IndexMeta> {
|
||||||
let meta_data = directory.atomic_read(&META_FILEPATH)?;
|
let meta_data = directory.atomic_read(&META_FILEPATH)?;
|
||||||
let meta_string = String::from_utf8_lossy(&meta_data);
|
let meta_string = String::from_utf8(meta_data).map_err(|_utf8_err| {
|
||||||
|
error!("Meta data is not valid utf8.");
|
||||||
|
DataCorruption::new(
|
||||||
|
META_FILEPATH.to_path_buf(),
|
||||||
|
"Meta file does not contain valid utf8 file.".to_string(),
|
||||||
|
)
|
||||||
|
})?;
|
||||||
IndexMeta::deserialize(&meta_string, &inventory)
|
IndexMeta::deserialize(&meta_string, &inventory)
|
||||||
.map_err(|e| {
|
.map_err(|e| {
|
||||||
DataCorruption::new(
|
DataCorruption::new(
|
||||||
META_FILEPATH.to_path_buf(),
|
META_FILEPATH.to_path_buf(),
|
||||||
format!("Meta file cannot be deserialized. {:?}.", e),
|
format!(
|
||||||
|
"Meta file cannot be deserialized. {:?}. Content: {:?}",
|
||||||
|
e, meta_string
|
||||||
|
),
|
||||||
)
|
)
|
||||||
})
|
})
|
||||||
.map_err(From::from)
|
.map_err(From::from)
|
||||||
|
|||||||
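The `load_metas` hunk above switches from `String::from_utf8_lossy` to `String::from_utf8`, so invalid bytes in the meta file are reported as corruption instead of being silently replaced. A minimal sketch of the difference, using only the standard library, is shown below. `Corruption` is a hypothetical stand-in for tantivy's `DataCorruption`, and both function names are illustrative.

```rust
/// Hypothetical error type standing in for tantivy's `DataCorruption`.
#[derive(Debug, PartialEq)]
pub struct Corruption {
    pub comment: String,
}

/// Strict decoding: invalid bytes are reported instead of being replaced.
pub fn decode_meta_strict(bytes: Vec<u8>) -> Result<String, Corruption> {
    String::from_utf8(bytes).map_err(|_utf8_err| Corruption {
        comment: "Meta file does not contain valid utf8 file.".to_string(),
    })
}

/// Lossy decoding (the previous behavior): each invalid sequence
/// is replaced with U+FFFD and no error is raised.
pub fn decode_meta_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

With lossy decoding, a corrupted file would only fail later during deserialization, or not at all, which makes the root cause harder to diagnose.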
@@ -114,12 +114,7 @@ impl SegmentReader {
|
|||||||
field_entry.name()
|
field_entry.name()
|
||||||
)));
|
)));
|
||||||
}
|
}
|
||||||
let term_ords_reader = self.fast_fields().u64s(field).ok_or_else(|| {
|
let term_ords_reader = self.fast_fields().u64s(field)?;
|
||||||
DataCorruption::comment_only(format!(
|
|
||||||
"Cannot find data for hierarchical facet {:?}",
|
|
||||||
field_entry.name()
|
|
||||||
))
|
|
||||||
})?;
|
|
||||||
let termdict = self
|
let termdict = self
|
||||||
.termdict_composite
|
.termdict_composite
|
||||||
.open_read(field)
|
.open_read(field)
|
||||||
@@ -183,8 +178,10 @@ impl SegmentReader {
|
|||||||
|
|
||||||
let fast_fields_data = segment.open_read(SegmentComponent::FASTFIELDS)?;
|
let fast_fields_data = segment.open_read(SegmentComponent::FASTFIELDS)?;
|
||||||
let fast_fields_composite = CompositeFile::open(&fast_fields_data)?;
|
let fast_fields_composite = CompositeFile::open(&fast_fields_data)?;
|
||||||
let fast_field_readers =
|
let fast_field_readers = Arc::new(FastFieldReaders::new(
|
||||||
Arc::new(FastFieldReaders::load_all(&schema, &fast_fields_composite)?);
|
schema.clone(),
|
||||||
|
fast_fields_composite,
|
||||||
|
)?);
|
||||||
|
|
||||||
let fieldnorm_data = segment.open_read(SegmentComponent::FIELDNORMS)?;
|
let fieldnorm_data = segment.open_read(SegmentComponent::FIELDNORMS)?;
|
||||||
let fieldnorm_readers = FieldNormReaders::open(fieldnorm_data)?;
|
let fieldnorm_readers = FieldNormReaders::open(fieldnorm_data)?;
|
||||||
@@ -310,7 +307,7 @@ impl SegmentReader {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/// Returns an iterator that will iterate over the alive document ids
|
/// Returns an iterator that will iterate over the alive document ids
|
||||||
pub fn doc_ids_alive<'a>(&'a self) -> impl Iterator<Item = DocId> + 'a {
|
pub fn doc_ids_alive(&self) -> impl Iterator<Item = DocId> + '_ {
|
||||||
(0u32..self.max_doc).filter(move |doc| !self.is_deleted(*doc))
|
(0u32..self.max_doc).filter(move |doc| !self.is_deleted(*doc))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
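The `doc_ids_alive` change above replaces the named lifetime `'a` with the anonymous lifetime `'_`. Both forms tie the returned iterator to the borrow of `self`; the second simply avoids declaring a lifetime parameter. A self-contained sketch, where `Segment` is a hypothetical, simplified stand-in for `SegmentReader`:

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a segment reader.
pub struct Segment {
    max_doc: u32,
    deleted: HashSet<u32>,
}

impl Segment {
    pub fn new(max_doc: u32, deleted: HashSet<u32>) -> Self {
        Segment { max_doc, deleted }
    }

    pub fn is_deleted(&self, doc: u32) -> bool {
        self.deleted.contains(&doc)
    }

    /// `'_` ties the iterator to the borrow of `self`
    /// without introducing a named lifetime parameter.
    pub fn doc_ids_alive(&self) -> impl Iterator<Item = u32> + '_ {
        (0u32..self.max_doc).filter(move |doc| !self.is_deleted(*doc))
    }
}
```

The behavior is unchanged; only the signature is shorter.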
@@ -226,13 +226,9 @@ impl Directory for RAMDirectory {
|
|||||||
)));
|
)));
|
||||||
let path_buf = PathBuf::from(path);
|
let path_buf = PathBuf::from(path);
|
||||||
|
|
||||||
// Reserve the path to prevent calls to .write() to succeed.
|
self.fs.write().unwrap().write(path_buf, data);
|
||||||
self.fs.write().unwrap().write(path_buf.clone(), &[]);
|
|
||||||
|
|
||||||
let mut vec_writer = VecWriter::new(path_buf, self.clone());
|
if path == *META_FILEPATH {
|
||||||
vec_writer.write_all(data)?;
|
|
||||||
vec_writer.flush()?;
|
|
||||||
if path == Path::new(&*META_FILEPATH) {
|
|
||||||
let _ = self.fs.write().unwrap().watch_router.broadcast();
|
let _ = self.fs.write().unwrap().watch_router.broadcast();
|
||||||
}
|
}
|
||||||
Ok(())
|
Ok(())
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
use super::MultiValueIntFastFieldReader;
|
use super::MultiValuedFastFieldReader;
|
||||||
use crate::error::DataCorruption;
|
use crate::error::DataCorruption;
|
||||||
use crate::schema::Facet;
|
use crate::schema::Facet;
|
||||||
use crate::termdict::TermDictionary;
|
use crate::termdict::TermDictionary;
|
||||||
@@ -20,7 +20,7 @@ use std::str;
|
|||||||
/// list of facets. This ordinal is segment local and
|
/// list of facets. This ordinal is segment local and
|
||||||
/// only makes sense for a given segment.
|
/// only makes sense for a given segment.
|
||||||
pub struct FacetReader {
|
pub struct FacetReader {
|
||||||
term_ords: MultiValueIntFastFieldReader<u64>,
|
term_ords: MultiValuedFastFieldReader<u64>,
|
||||||
term_dict: TermDictionary,
|
term_dict: TermDictionary,
|
||||||
buffer: Vec<u8>,
|
buffer: Vec<u8>,
|
||||||
}
|
}
|
||||||
@@ -29,12 +29,12 @@ impl FacetReader {
|
|||||||
/// Creates a new `FacetReader`.
|
/// Creates a new `FacetReader`.
|
||||||
///
|
///
|
||||||
/// A facet reader just wraps :
|
/// A facet reader just wraps :
|
||||||
/// - a `MultiValueIntFastFieldReader` that makes it possible to
|
/// - a `MultiValuedFastFieldReader` that makes it possible to
|
||||||
/// access the list of facet ords for a given document.
|
/// access the list of facet ords for a given document.
|
||||||
/// - a `TermDictionary` that helps associating a facet to
|
/// - a `TermDictionary` that helps associating a facet to
|
||||||
/// an ordinal and vice versa.
|
/// an ordinal and vice versa.
|
||||||
pub fn new(
|
pub fn new(
|
||||||
term_ords: MultiValueIntFastFieldReader<u64>,
|
term_ords: MultiValuedFastFieldReader<u64>,
|
||||||
term_dict: TermDictionary,
|
term_dict: TermDictionary,
|
||||||
) -> FacetReader {
|
) -> FacetReader {
|
||||||
FacetReader {
|
FacetReader {
|
||||||
|
|||||||
@@ -28,7 +28,7 @@ pub use self::delete::write_delete_bitset;
|
|||||||
pub use self::delete::DeleteBitSet;
|
pub use self::delete::DeleteBitSet;
|
||||||
pub use self::error::{FastFieldNotAvailableError, Result};
|
pub use self::error::{FastFieldNotAvailableError, Result};
|
||||||
pub use self::facet_reader::FacetReader;
|
pub use self::facet_reader::FacetReader;
|
||||||
pub use self::multivalued::{MultiValueIntFastFieldReader, MultiValueIntFastFieldWriter};
|
pub use self::multivalued::{MultiValuedFastFieldReader, MultiValuedFastFieldWriter};
|
||||||
pub use self::reader::FastFieldReader;
|
pub use self::reader::FastFieldReader;
|
||||||
pub use self::readers::FastFieldReaders;
|
pub use self::readers::FastFieldReaders;
|
||||||
pub use self::serializer::FastFieldSerializer;
|
pub use self::serializer::FastFieldSerializer;
|
||||||
|
|||||||
@@ -1,8 +1,8 @@
|
|||||||
mod reader;
|
mod reader;
|
||||||
mod writer;
|
mod writer;
|
||||||
|
|
||||||
pub use self::reader::MultiValueIntFastFieldReader;
|
pub use self::reader::MultiValuedFastFieldReader;
|
||||||
pub use self::writer::MultiValueIntFastFieldWriter;
|
pub use self::writer::MultiValuedFastFieldWriter;
|
||||||
|
|
||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
mod tests {
|
mod tests {
|
||||||
|
|||||||
@@ -10,29 +10,22 @@ use crate::DocId;
|
|||||||
/// The `idx_reader` associated, for each document, the index of its first value.
|
/// The `idx_reader` associated, for each document, the index of its first value.
|
||||||
///
|
///
|
||||||
#[derive(Clone)]
|
#[derive(Clone)]
|
||||||
pub struct MultiValueIntFastFieldReader<Item: FastValue> {
|
pub struct MultiValuedFastFieldReader<Item: FastValue> {
|
||||||
idx_reader: FastFieldReader<u64>,
|
idx_reader: FastFieldReader<u64>,
|
||||||
vals_reader: FastFieldReader<Item>,
|
vals_reader: FastFieldReader<Item>,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl<Item: FastValue> MultiValueIntFastFieldReader<Item> {
|
impl<Item: FastValue> MultiValuedFastFieldReader<Item> {
|
||||||
pub(crate) fn open(
|
pub(crate) fn open(
|
||||||
idx_reader: FastFieldReader<u64>,
|
idx_reader: FastFieldReader<u64>,
|
||||||
vals_reader: FastFieldReader<Item>,
|
vals_reader: FastFieldReader<Item>,
|
||||||
) -> MultiValueIntFastFieldReader<Item> {
|
) -> MultiValuedFastFieldReader<Item> {
|
||||||
MultiValueIntFastFieldReader {
|
MultiValuedFastFieldReader {
|
||||||
idx_reader,
|
idx_reader,
|
||||||
vals_reader,
|
vals_reader,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn into_u64s_reader(self) -> MultiValueIntFastFieldReader<u64> {
|
|
||||||
MultiValueIntFastFieldReader {
|
|
||||||
idx_reader: self.idx_reader,
|
|
||||||
vals_reader: self.vals_reader.into_u64_reader(),
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Returns `(start, stop)`, such that the values associated
|
/// Returns `(start, stop)`, such that the values associated
|
||||||
/// to the given document are `start..stop`.
|
/// to the given document are `start..stop`.
|
||||||
fn range(&self, doc: DocId) -> (u64, u64) {
|
fn range(&self, doc: DocId) -> (u64, u64) {
|
||||||
|
|||||||
@@ -18,7 +18,7 @@ use std::io;
|
|||||||
/// in your schema
|
/// in your schema
|
||||||
/// - add your document simply by calling `.add_document(...)`.
|
/// - add your document simply by calling `.add_document(...)`.
|
||||||
///
|
///
|
||||||
/// The `MultiValueIntFastFieldWriter` can be acquired from the
|
/// The `MultiValuedFastFieldWriter` can be acquired from the
|
||||||
/// fastfield writer, by calling [`.get_multivalue_writer(...)`](./struct.FastFieldsWriter.html#method.get_multivalue_writer).
|
/// fastfield writer, by calling [`.get_multivalue_writer(...)`](./struct.FastFieldsWriter.html#method.get_multivalue_writer).
|
||||||
///
|
///
|
||||||
/// Once acquired, writing is done by calling calls to
|
/// Once acquired, writing is done by calling calls to
|
||||||
@@ -29,17 +29,17 @@ use std::io;
|
|||||||
/// This makes it possible to push unordered term ids,
|
/// This makes it possible to push unordered term ids,
|
||||||
/// during indexing and remap them to their respective
|
/// during indexing and remap them to their respective
|
||||||
/// term ids when the segment is getting serialized.
|
/// term ids when the segment is getting serialized.
|
||||||
pub struct MultiValueIntFastFieldWriter {
|
pub struct MultiValuedFastFieldWriter {
|
||||||
field: Field,
|
field: Field,
|
||||||
vals: Vec<UnorderedTermId>,
|
vals: Vec<UnorderedTermId>,
|
||||||
doc_index: Vec<u64>,
|
doc_index: Vec<u64>,
|
||||||
is_facet: bool,
|
is_facet: bool,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl MultiValueIntFastFieldWriter {
|
impl MultiValuedFastFieldWriter {
|
||||||
/// Creates a new `IntFastFieldWriter`
|
/// Creates a new `IntFastFieldWriter`
|
||||||
pub(crate) fn new(field: Field, is_facet: bool) -> Self {
|
pub(crate) fn new(field: Field, is_facet: bool) -> Self {
|
||||||
MultiValueIntFastFieldWriter {
|
MultiValuedFastFieldWriter {
|
||||||
field,
|
field,
|
||||||
vals: Vec::new(),
|
vals: Vec::new(),
|
||||||
doc_index: Vec::new(),
|
doc_index: Vec::new(),
|
||||||
@@ -47,7 +47,7 @@ impl MultiValueIntFastFieldWriter {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Access the field associated to the `MultiValueIntFastFieldWriter`
|
/// Access the field associated to the `MultiValuedFastFieldWriter`
|
||||||
pub fn field(&self) -> Field {
|
pub fn field(&self) -> Field {
|
||||||
self.field
|
self.field
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -42,24 +42,6 @@ impl<Item: FastValue> FastFieldReader<Item> {
|
|||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn into_u64_reader(self) -> FastFieldReader<u64> {
|
|
||||||
FastFieldReader {
|
|
||||||
bit_unpacker: self.bit_unpacker,
|
|
||||||
min_value_u64: self.min_value_u64,
|
|
||||||
max_value_u64: self.max_value_u64,
|
|
||||||
_phantom: PhantomData,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
pub(crate) fn cast<TFastValue: FastValue>(self) -> FastFieldReader<TFastValue> {
|
|
||||||
FastFieldReader {
|
|
||||||
bit_unpacker: self.bit_unpacker,
|
|
||||||
min_value_u64: self.min_value_u64,
|
|
||||||
max_value_u64: self.max_value_u64,
|
|
||||||
_phantom: PhantomData,
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Return the value associated to the given document.
|
/// Return the value associated to the given document.
|
||||||
///
|
///
|
||||||
/// This accessor should return as fast as possible.
|
/// This accessor should return as fast as possible.
|
||||||
|
|||||||
@@ -1,28 +1,22 @@
|
|||||||
use crate::common::CompositeFile;
|
use crate::common::CompositeFile;
|
||||||
use crate::fastfield::MultiValueIntFastFieldReader;
|
use crate::directory::FileSlice;
|
||||||
|
use crate::fastfield::MultiValuedFastFieldReader;
|
||||||
use crate::fastfield::{BytesFastFieldReader, FastValue};
|
use crate::fastfield::{BytesFastFieldReader, FastValue};
|
||||||
use crate::fastfield::{FastFieldNotAvailableError, FastFieldReader};
|
use crate::fastfield::{FastFieldNotAvailableError, FastFieldReader};
|
||||||
use crate::schema::{Cardinality, Field, FieldType, Schema};
|
use crate::schema::{Cardinality, Field, FieldType, Schema};
|
||||||
use crate::space_usage::PerFieldSpaceUsage;
|
use crate::space_usage::PerFieldSpaceUsage;
|
||||||
use std::collections::HashMap;
|
use crate::TantivyError;
|
||||||
|
|
||||||
/// Provides access to all of the FastFieldReader.
|
/// Provides access to all of the FastFieldReader.
|
||||||
///
|
///
|
||||||
/// Internally, `FastFieldReaders` have preloaded fast field readers,
|
/// Internally, `FastFieldReaders` have preloaded fast field readers,
|
||||||
/// and just wraps several `HashMap`.
|
/// and just wraps several `HashMap`.
|
||||||
|
#[derive(Clone)]
|
||||||
pub struct FastFieldReaders {
|
pub struct FastFieldReaders {
|
||||||
fast_field_i64: HashMap<Field, FastFieldReader<i64>>,
|
schema: Schema,
|
||||||
fast_field_u64: HashMap<Field, FastFieldReader<u64>>,
|
|
||||||
fast_field_f64: HashMap<Field, FastFieldReader<f64>>,
|
|
||||||
fast_field_date: HashMap<Field, FastFieldReader<crate::DateTime>>,
|
|
||||||
fast_field_i64s: HashMap<Field, MultiValueIntFastFieldReader<i64>>,
|
|
||||||
fast_field_u64s: HashMap<Field, MultiValueIntFastFieldReader<u64>>,
|
|
||||||
fast_field_f64s: HashMap<Field, MultiValueIntFastFieldReader<f64>>,
|
|
||||||
fast_field_dates: HashMap<Field, MultiValueIntFastFieldReader<crate::DateTime>>,
|
|
||||||
fast_bytes: HashMap<Field, BytesFastFieldReader>,
|
|
||||||
fast_fields_composite: CompositeFile,
|
fast_fields_composite: CompositeFile,
|
||||||
}
|
}
|
||||||
|
#[derive(Eq, PartialEq, Debug)]
|
||||||
enum FastType {
|
enum FastType {
|
||||||
I64,
|
I64,
|
||||||
U64,
|
U64,
|
||||||
@@ -50,236 +44,167 @@ fn type_and_cardinality(field_type: &FieldType) -> Option<(FastType, Cardinality
|
|||||||
}
|
}
|
||||||
|
|
||||||
impl FastFieldReaders {
|
impl FastFieldReaders {
|
||||||
pub(crate) fn load_all(
|
pub(crate) fn new(
|
||||||
schema: &Schema,
|
schema: Schema,
|
||||||
fast_fields_composite: &CompositeFile,
|
fast_fields_composite: CompositeFile,
|
||||||
) -> crate::Result<FastFieldReaders> {
|
) -> crate::Result<FastFieldReaders> {
|
||||||
let mut fast_field_readers = FastFieldReaders {
|
Ok(FastFieldReaders {
|
||||||
fast_field_i64: Default::default(),
|
fast_fields_composite,
|
||||||
fast_field_u64: Default::default(),
|
schema,
|
||||||
fast_field_f64: Default::default(),
|
})
|
||||||
fast_field_date: Default::default(),
|
|
||||||
fast_field_i64s: Default::default(),
|
|
||||||
fast_field_u64s: Default::default(),
|
|
||||||
fast_field_f64s: Default::default(),
|
|
||||||
fast_field_dates: Default::default(),
|
|
||||||
fast_bytes: Default::default(),
|
|
||||||
fast_fields_composite: fast_fields_composite.clone(),
|
|
||||||
};
|
|
||||||
for (field, field_entry) in schema.fields() {
|
|
||||||
let field_type = field_entry.field_type();
|
|
||||||
if let FieldType::Bytes(bytes_option) = field_type {
|
|
||||||
if !bytes_option.is_fast() {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
let fast_field_idx_file = fast_fields_composite
|
|
||||||
.open_read_with_idx(field, 0)
|
|
||||||
.ok_or_else(|| FastFieldNotAvailableError::new(field_entry))?;
|
|
||||||
let idx_reader = FastFieldReader::open(fast_field_idx_file)?;
|
|
||||||
let data = fast_fields_composite
|
|
||||||
.open_read_with_idx(field, 1)
|
|
||||||
.ok_or_else(|| FastFieldNotAvailableError::new(field_entry))?;
|
|
||||||
let bytes_fast_field_reader = BytesFastFieldReader::open(idx_reader, data)?;
|
|
||||||
fast_field_readers
|
|
||||||
.fast_bytes
|
|
||||||
.insert(field, bytes_fast_field_reader);
|
|
||||||
} else if let Some((fast_type, cardinality)) = type_and_cardinality(field_type) {
|
|
||||||
match cardinality {
|
|
||||||
Cardinality::SingleValue => {
|
|
||||||
if let Some(fast_field_data) = fast_fields_composite.open_read(field) {
|
|
||||||
match fast_type {
|
|
||||||
FastType::U64 => {
|
|
||||||
let fast_field_reader = FastFieldReader::open(fast_field_data)?;
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_u64
|
|
||||||
.insert(field, fast_field_reader);
|
|
||||||
}
|
|
||||||
FastType::I64 => {
|
|
||||||
let fast_field_reader =
|
|
||||||
FastFieldReader::open(fast_field_data.clone())?;
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_i64
|
|
||||||
.insert(field, fast_field_reader);
|
|
||||||
}
|
|
||||||
FastType::F64 => {
|
|
||||||
let fast_field_reader =
|
|
||||||
FastFieldReader::open(fast_field_data.clone())?;
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_f64
|
|
||||||
.insert(field, fast_field_reader);
|
|
||||||
}
|
|
||||||
FastType::Date => {
|
|
||||||
let fast_field_reader =
|
|
||||||
FastFieldReader::open(fast_field_data.clone())?;
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_date
|
|
||||||
.insert(field, fast_field_reader);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
return Err(From::from(FastFieldNotAvailableError::new(field_entry)));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
Cardinality::MultiValues => {
|
|
||||||
let idx_opt = fast_fields_composite.open_read_with_idx(field, 0);
|
|
||||||
let data_opt = fast_fields_composite.open_read_with_idx(field, 1);
|
|
||||||
if let (Some(fast_field_idx), Some(fast_field_data)) = (idx_opt, data_opt) {
|
|
||||||
let idx_reader = FastFieldReader::open(fast_field_idx)?;
|
|
||||||
match fast_type {
|
|
||||||
FastType::I64 => {
|
|
||||||
let vals_reader = FastFieldReader::open(fast_field_data)?;
|
|
||||||
let multivalued_int_fast_field =
|
|
||||||
MultiValueIntFastFieldReader::open(idx_reader, vals_reader);
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_i64s
|
|
||||||
.insert(field, multivalued_int_fast_field);
|
|
||||||
}
|
|
||||||
FastType::U64 => {
|
|
||||||
let vals_reader = FastFieldReader::open(fast_field_data)?;
|
|
||||||
let multivalued_int_fast_field =
|
|
||||||
MultiValueIntFastFieldReader::open(idx_reader, vals_reader);
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_u64s
|
|
||||||
.insert(field, multivalued_int_fast_field);
|
|
||||||
}
|
|
||||||
FastType::F64 => {
|
|
||||||
let vals_reader = FastFieldReader::open(fast_field_data)?;
|
|
||||||
let multivalued_int_fast_field =
|
|
||||||
MultiValueIntFastFieldReader::open(idx_reader, vals_reader);
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_f64s
|
|
||||||
.insert(field, multivalued_int_fast_field);
|
|
||||||
}
|
|
||||||
FastType::Date => {
|
|
||||||
let vals_reader = FastFieldReader::open(fast_field_data)?;
|
|
||||||
let multivalued_int_fast_field =
|
|
||||||
MultiValueIntFastFieldReader::open(idx_reader, vals_reader);
|
|
||||||
fast_field_readers
|
|
||||||
.fast_field_dates
|
|
||||||
.insert(field, multivalued_int_fast_field);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
return Err(From::from(FastFieldNotAvailableError::new(field_entry)));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
Ok(fast_field_readers)
|
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn space_usage(&self) -> PerFieldSpaceUsage {
|
pub(crate) fn space_usage(&self) -> PerFieldSpaceUsage {
|
||||||
self.fast_fields_composite.space_usage()
|
self.fast_fields_composite.space_usage()
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the `u64` fast field reader reader associated to `field`.
|
fn fast_field_data(&self, field: Field, idx: usize) -> crate::Result<FileSlice> {
|
||||||
///
|
self.fast_fields_composite
|
||||||
/// If `field` is not a u64 fast field, this method returns `None`.
|
.open_read_with_idx(field, idx)
|
||||||
pub fn u64(&self, field: Field) -> Option<FastFieldReader<u64>> {
|
.ok_or_else(|| {
|
||||||
self.fast_field_u64.get(&field).cloned()
|
let field_name = self.schema.get_field_entry(field).name();
|
||||||
|
TantivyError::SchemaError(format!("Field({}) data was not found", field_name))
|
||||||
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
/// If the field is a u64-fast field return the associated reader.
|
fn check_type(
|
||||||
/// If the field is a i64-fast field, return the associated u64 reader. Values are
|
&self,
|
||||||
/// mapped from i64 to u64 using a (well the, it is unique) monotonic mapping. ///
|
field: Field,
|
||||||
///
|
expected_fast_type: FastType,
|
||||||
/// This method is useful when merging segment reader.
|
expected_cardinality: Cardinality,
|
||||||
pub(crate) fn u64_lenient(&self, field: Field) -> Option<FastFieldReader<u64>> {
|
) -> crate::Result<()> {
|
||||||
if let Some(u64_ff_reader) = self.u64(field) {
|
let field_entry = self.schema.get_field_entry(field);
|
||||||
return Some(u64_ff_reader);
|
let (fast_type, cardinality) =
|
||||||
|
type_and_cardinality(field_entry.field_type()).ok_or_else(|| {
|
||||||
|
crate::TantivyError::SchemaError(format!(
|
||||||
|
"Field {:?} is not a fast field.",
|
||||||
|
field_entry.name()
|
||||||
|
))
|
||||||
|
})?;
|
||||||
|
if fast_type != expected_fast_type {
|
||||||
|
return Err(crate::TantivyError::SchemaError(format!(
|
||||||
|
"Field {:?} is of type {:?}, expected {:?}.",
|
||||||
|
field_entry.name(),
|
||||||
|
fast_type,
|
||||||
|
expected_fast_type
|
||||||
|
)));
|
||||||
}
|
}
|
||||||
if let Some(i64_ff_reader) = self.i64(field) {
|
if cardinality != expected_cardinality {
|
||||||
return Some(i64_ff_reader.into_u64_reader());
|
return Err(crate::TantivyError::SchemaError(format!(
|
||||||
|
"Field {:?} is of cardinality {:?}, expected {:?}.",
|
||||||
|
field_entry.name(),
|
||||||
|
cardinality,
|
||||||
|
expected_cardinality
|
||||||
|
)));
|
||||||
}
|
}
|
||||||
if let Some(f64_ff_reader) = self.f64(field) {
|
Ok(())
|
||||||
return Some(f64_ff_reader.into_u64_reader());
|
|
||||||
}
|
|
||||||
if let Some(date_ff_reader) = self.date(field) {
|
|
||||||
return Some(date_ff_reader.into_u64_reader());
|
|
||||||
}
|
|
||||||
None
|
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn typed_fast_field_reader<TFastValue: FastValue>(
|
pub(crate) fn typed_fast_field_reader<TFastValue: FastValue>(
|
||||||
&self,
|
&self,
|
||||||
field: Field,
|
field: Field,
|
||||||
) -> Option<FastFieldReader<TFastValue>> {
|
) -> crate::Result<FastFieldReader<TFastValue>> {
|
||||||
self.u64_lenient(field)
|
let fast_field_slice = self.fast_field_data(field, 0)?;
|
||||||
.map(|fast_field_reader| fast_field_reader.cast())
|
FastFieldReader::open(fast_field_slice)
|
||||||
|
}
|
||||||
|
|
||||||
|
pub(crate) fn typed_fast_field_multi_reader<TFastValue: FastValue>(
|
||||||
|
&self,
|
||||||
|
field: Field,
|
||||||
|
) -> crate::Result<MultiValuedFastFieldReader<TFastValue>> {
|
||||||
|
let fast_field_slice_idx = self.fast_field_data(field, 0)?;
|
||||||
|
let fast_field_slice_vals = self.fast_field_data(field, 1)?;
|
||||||
|
let idx_reader = FastFieldReader::open(fast_field_slice_idx)?;
|
||||||
|
let vals_reader: FastFieldReader<TFastValue> =
|
||||||
|
FastFieldReader::open(fast_field_slice_vals)?;
|
||||||
|
Ok(MultiValuedFastFieldReader::open(idx_reader, vals_reader))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Returns the `u64` fast field reader associated with `field`.
|
||||||
|
///
|
||||||
|
/// If `field` is not a u64 fast field, this method returns an error.
|
||||||
|
pub fn u64(&self, field: Field) -> crate::Result<FastFieldReader<u64>> {
|
||||||
|
self.check_type(field, FastType::U64, Cardinality::SingleValue)?;
|
||||||
|
self.typed_fast_field_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the `i64` fast field reader reader associated to `field`.
|
/// Returns the `i64` fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a i64 fast field, this method returns `None`.
|
/// If `field` is not an i64 fast field, this method returns an error.
|
||||||
pub fn i64(&self, field: Field) -> Option<FastFieldReader<i64>> {
|
pub fn i64(&self, field: Field) -> crate::Result<FastFieldReader<i64>> {
|
||||||
self.fast_field_i64.get(&field).cloned()
|
self.check_type(field, FastType::I64, Cardinality::SingleValue)?;
|
||||||
|
self.typed_fast_field_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the `i64` fast field reader reader associated to `field`.
|
/// Returns the `crate::DateTime` fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a i64 fast field, this method returns `None`.
|
/// If `field` is not a `crate::DateTime` fast field, this method returns an error.
|
||||||
pub fn date(&self, field: Field) -> Option<FastFieldReader<crate::DateTime>> {
|
pub fn date(&self, field: Field) -> crate::Result<FastFieldReader<crate::DateTime>> {
|
||||||
self.fast_field_date.get(&field).cloned()
|
self.check_type(field, FastType::Date, Cardinality::SingleValue)?;
|
||||||
|
self.typed_fast_field_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the `f64` fast field reader reader associated to `field`.
|
/// Returns the `f64` fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a f64 fast field, this method returns `None`.
|
/// If `field` is not an f64 fast field, this method returns an error.
|
||||||
pub fn f64(&self, field: Field) -> Option<FastFieldReader<f64>> {
|
pub fn f64(&self, field: Field) -> crate::Result<FastFieldReader<f64>> {
|
||||||
self.fast_field_f64.get(&field).cloned()
|
self.check_type(field, FastType::F64, Cardinality::SingleValue)?;
|
||||||
|
self.typed_fast_field_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns a `u64s` multi-valued fast field reader reader associated to `field`.
|
/// Returns a `u64s` multi-valued fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a u64 multi-valued fast field, this method returns `None`.
|
/// If `field` is not a u64 multi-valued fast field, this method returns an error.
|
||||||
pub fn u64s(&self, field: Field) -> Option<MultiValueIntFastFieldReader<u64>> {
|
pub fn u64s(&self, field: Field) -> crate::Result<MultiValuedFastFieldReader<u64>> {
|
||||||
self.fast_field_u64s.get(&field).cloned()
|
self.check_type(field, FastType::U64, Cardinality::MultiValues)?;
|
||||||
}
|
self.typed_fast_field_multi_reader(field)
|
||||||
|
|
||||||
/// If the field is a u64s-fast field return the associated reader.
|
|
||||||
/// If the field is a i64s-fast field, return the associated u64s reader. Values are
|
|
||||||
/// mapped from i64 to u64 using a (well the, it is unique) monotonic mapping.
|
|
||||||
///
|
|
||||||
/// This method is useful when merging segment reader.
|
|
||||||
pub(crate) fn u64s_lenient(&self, field: Field) -> Option<MultiValueIntFastFieldReader<u64>> {
|
|
||||||
if let Some(u64s_ff_reader) = self.u64s(field) {
|
|
||||||
return Some(u64s_ff_reader);
|
|
||||||
}
|
|
||||||
if let Some(i64s_ff_reader) = self.i64s(field) {
|
|
||||||
return Some(i64s_ff_reader.into_u64s_reader());
|
|
||||||
}
|
|
||||||
if let Some(f64s_ff_reader) = self.f64s(field) {
|
|
||||||
return Some(f64s_ff_reader.into_u64s_reader());
|
|
||||||
}
|
|
||||||
None
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns a `i64s` multi-valued fast field reader reader associated to `field`.
|
/// Returns an `i64s` multi-valued fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a i64 multi-valued fast field, this method returns `None`.
|
/// If `field` is not an i64 multi-valued fast field, this method returns an error.
|
||||||
pub fn i64s(&self, field: Field) -> Option<MultiValueIntFastFieldReader<i64>> {
|
pub fn i64s(&self, field: Field) -> crate::Result<MultiValuedFastFieldReader<i64>> {
|
||||||
self.fast_field_i64s.get(&field).cloned()
|
self.check_type(field, FastType::I64, Cardinality::MultiValues)?;
|
||||||
|
self.typed_fast_field_multi_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns a `f64s` multi-valued fast field reader reader associated to `field`.
|
/// Returns an `f64s` multi-valued fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a f64 multi-valued fast field, this method returns `None`.
|
/// If `field` is not an f64 multi-valued fast field, this method returns an error.
|
||||||
pub fn f64s(&self, field: Field) -> Option<MultiValueIntFastFieldReader<f64>> {
|
pub fn f64s(&self, field: Field) -> crate::Result<MultiValuedFastFieldReader<f64>> {
|
||||||
self.fast_field_f64s.get(&field).cloned()
|
self.check_type(field, FastType::F64, Cardinality::MultiValues)?;
|
||||||
|
self.typed_fast_field_multi_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns a `crate::DateTime` multi-valued fast field reader reader associated to `field`.
|
/// Returns a `crate::DateTime` multi-valued fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a `crate::DateTime` multi-valued fast field, this method returns `None`.
|
/// If `field` is not a `crate::DateTime` multi-valued fast field, this method returns an error.
|
||||||
pub fn dates(&self, field: Field) -> Option<MultiValueIntFastFieldReader<crate::DateTime>> {
|
pub fn dates(
|
||||||
self.fast_field_dates.get(&field).cloned()
|
&self,
|
||||||
|
field: Field,
|
||||||
|
) -> crate::Result<MultiValuedFastFieldReader<crate::DateTime>> {
|
||||||
|
self.check_type(field, FastType::Date, Cardinality::MultiValues)?;
|
||||||
|
self.typed_fast_field_multi_reader(field)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns the `bytes` fast field reader associated to `field`.
|
/// Returns the `bytes` fast field reader associated with `field`.
|
||||||
///
|
///
|
||||||
/// If `field` is not a bytes fast field, returns `None`.
|
/// If `field` is not a bytes fast field, returns an error.
|
||||||
pub fn bytes(&self, field: Field) -> Option<BytesFastFieldReader> {
|
pub fn bytes(&self, field: Field) -> crate::Result<BytesFastFieldReader> {
|
||||||
self.fast_bytes.get(&field).cloned()
|
let field_entry = self.schema.get_field_entry(field);
|
||||||
|
if let FieldType::Bytes(bytes_option) = field_entry.field_type() {
|
||||||
|
if !bytes_option.is_fast() {
|
||||||
|
return Err(crate::TantivyError::SchemaError(format!(
|
||||||
|
"Field {:?} is not a fast field.",
|
||||||
|
field_entry.name()
|
||||||
|
)));
|
||||||
|
}
|
||||||
|
let fast_field_idx_file = self.fast_field_data(field, 0)?;
|
||||||
|
let idx_reader = FastFieldReader::open(fast_field_idx_file)?;
|
||||||
|
let data = self.fast_field_data(field, 1)?;
|
||||||
|
BytesFastFieldReader::open(idx_reader, data)
|
||||||
|
} else {
|
||||||
|
Err(FastFieldNotAvailableError::new(field_entry).into())
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
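The accessor changes above replace preloaded per-type maps with a type check followed by an on-demand open, and switch the return type from `Option` to `Result`. The following is a minimal, std-only sketch of that pattern. It does not use tantivy: `Readers`, `sample_readers`, the string field names and the `String` error type are hypothetical stand-ins for the schema, the composite file and `TantivyError`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FastType {
    U64,
    I64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cardinality {
    SingleValue,
    MultiValues,
}

/// Hypothetical stand-in for the schema plus the composite file.
struct Readers {
    /// Declared (type, cardinality) per field; a missing entry means "not a fast field".
    declared: HashMap<String, (FastType, Cardinality)>,
    /// Stored values per field.
    data: HashMap<String, Vec<u64>>,
}

impl Readers {
    /// Validates the declared type and cardinality before any data is opened.
    fn check_type(
        &self,
        field: &str,
        expected_type: FastType,
        expected_cardinality: Cardinality,
    ) -> Result<(), String> {
        let (fast_type, cardinality) = *self
            .declared
            .get(field)
            .ok_or_else(|| format!("Field {:?} is not a fast field.", field))?;
        if fast_type != expected_type {
            return Err(format!(
                "Field {:?} is of type {:?}, expected {:?}.",
                field, fast_type, expected_type
            ));
        }
        if cardinality != expected_cardinality {
            return Err(format!(
                "Field {:?} is of cardinality {:?}, expected {:?}.",
                field, cardinality, expected_cardinality
            ));
        }
        Ok(())
    }

    /// Opens the values on demand; every failure carries a reason instead of `None`.
    fn u64(&self, field: &str) -> Result<&[u64], String> {
        self.check_type(field, FastType::U64, Cardinality::SingleValue)?;
        self.data
            .get(field)
            .map(|values| values.as_slice())
            .ok_or_else(|| format!("Field({}) data was not found", field))
    }
}

/// Builds a small hypothetical schema with one field of each failure kind.
fn sample_readers() -> Readers {
    let mut declared = HashMap::new();
    declared.insert("price".to_string(), (FastType::U64, Cardinality::SingleValue));
    declared.insert("offset".to_string(), (FastType::I64, Cardinality::SingleValue));
    declared.insert("tags".to_string(), (FastType::U64, Cardinality::MultiValues));
    let mut data = HashMap::new();
    data.insert("price".to_string(), vec![4u64, 7u64]);
    Readers { declared, data }
}

fn main() {
    let readers = sample_readers();
    println!("{:?}", readers.u64("price"));
    println!("{:?}", readers.u64("offset"));
    println!("{:?}", readers.u64("tags"));
    println!("{:?}", readers.u64("title"));
}
```

With a `Result`, callers can propagate the failure with `?` and still learn whether the field was missing, of the wrong type, or of the wrong cardinality, which an `Option` could not express.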
|||||||
@@ -1,4 +1,4 @@
|
|||||||
use super::multivalued::MultiValueIntFastFieldWriter;
|
use super::multivalued::MultiValuedFastFieldWriter;
|
||||||
use crate::common;
|
use crate::common;
|
||||||
use crate::common::BinarySerializable;
|
use crate::common::BinarySerializable;
|
||||||
use crate::common::VInt;
|
use crate::common::VInt;
|
||||||
@@ -13,7 +13,7 @@ use std::io;
|
|||||||
/// The `FastFieldsWriter` groups all of the fast field writers.
|
/// The `FastFieldsWriter` groups all of the fast field writers.
|
||||||
pub struct FastFieldsWriter {
|
pub struct FastFieldsWriter {
|
||||||
single_value_writers: Vec<IntFastFieldWriter>,
|
single_value_writers: Vec<IntFastFieldWriter>,
|
||||||
multi_values_writers: Vec<MultiValueIntFastFieldWriter>,
|
multi_values_writers: Vec<MultiValuedFastFieldWriter>,
|
||||||
bytes_value_writers: Vec<BytesFastFieldWriter>,
|
bytes_value_writers: Vec<BytesFastFieldWriter>,
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -46,14 +46,14 @@ impl FastFieldsWriter {
|
|||||||
single_value_writers.push(fast_field_writer);
|
single_value_writers.push(fast_field_writer);
|
||||||
}
|
}
|
||||||
Some(Cardinality::MultiValues) => {
|
Some(Cardinality::MultiValues) => {
|
||||||
let fast_field_writer = MultiValueIntFastFieldWriter::new(field, false);
|
let fast_field_writer = MultiValuedFastFieldWriter::new(field, false);
|
||||||
multi_values_writers.push(fast_field_writer);
|
multi_values_writers.push(fast_field_writer);
|
||||||
}
|
}
|
||||||
None => {}
|
None => {}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
FieldType::HierarchicalFacet => {
|
FieldType::HierarchicalFacet => {
|
||||||
let fast_field_writer = MultiValueIntFastFieldWriter::new(field, true);
|
let fast_field_writer = MultiValuedFastFieldWriter::new(field, true);
|
||||||
multi_values_writers.push(fast_field_writer);
|
multi_values_writers.push(fast_field_writer);
|
||||||
}
|
}
|
||||||
FieldType::Bytes(bytes_option) => {
|
FieldType::Bytes(bytes_option) => {
|
||||||
@@ -87,7 +87,7 @@ impl FastFieldsWriter {
|
|||||||
pub fn get_multivalue_writer(
|
pub fn get_multivalue_writer(
|
||||||
&mut self,
|
&mut self,
|
||||||
field: Field,
|
field: Field,
|
||||||
) -> Option<&mut MultiValueIntFastFieldWriter> {
|
) -> Option<&mut MultiValuedFastFieldWriter> {
|
||||||
// TODO optimize
|
// TODO optimize
|
||||||
self.multi_values_writers
|
self.multi_values_writers
|
||||||
.iter_mut()
|
.iter_mut()
|
||||||
|
|||||||
@@ -7,7 +7,7 @@ use crate::fastfield::BytesFastFieldReader;
|
|||||||
use crate::fastfield::DeleteBitSet;
|
use crate::fastfield::DeleteBitSet;
|
||||||
use crate::fastfield::FastFieldReader;
|
use crate::fastfield::FastFieldReader;
|
||||||
use crate::fastfield::FastFieldSerializer;
|
use crate::fastfield::FastFieldSerializer;
|
||||||
use crate::fastfield::MultiValueIntFastFieldReader;
|
use crate::fastfield::MultiValuedFastFieldReader;
|
||||||
use crate::fieldnorm::FieldNormsSerializer;
|
use crate::fieldnorm::FieldNormsSerializer;
|
||||||
use crate::fieldnorm::FieldNormsWriter;
|
use crate::fieldnorm::FieldNormsWriter;
|
||||||
use crate::fieldnorm::{FieldNormReader, FieldNormReaders};
|
use crate::fieldnorm::{FieldNormReader, FieldNormReaders};
|
||||||
@@ -246,7 +246,7 @@ impl IndexMerger {
|
|||||||
for reader in &self.readers {
|
for reader in &self.readers {
|
||||||
let u64_reader: FastFieldReader<u64> = reader
|
let u64_reader: FastFieldReader<u64> = reader
|
||||||
.fast_fields()
|
.fast_fields()
|
||||||
.u64_lenient(field)
|
.typed_fast_field_reader(field)
|
||||||
.expect("Failed to find a reader for single fast field. This is a tantivy bug and it should never happen.");
|
.expect("Failed to find a reader for single fast field. This is a tantivy bug and it should never happen.");
|
||||||
if let Some((seg_min_val, seg_max_val)) =
|
if let Some((seg_min_val, seg_max_val)) =
|
||||||
compute_min_max_val(&u64_reader, reader.max_doc(), reader.delete_bitset())
|
compute_min_max_val(&u64_reader, reader.max_doc(), reader.delete_bitset())
|
||||||
@@ -290,7 +290,7 @@ impl IndexMerger {
|
|||||||
fast_field_serializer: &mut FastFieldSerializer,
|
fast_field_serializer: &mut FastFieldSerializer,
|
||||||
) -> crate::Result<()> {
|
) -> crate::Result<()> {
|
||||||
let mut total_num_vals = 0u64;
|
let mut total_num_vals = 0u64;
|
||||||
let mut u64s_readers: Vec<MultiValueIntFastFieldReader<u64>> = Vec::new();
|
let mut u64s_readers: Vec<MultiValuedFastFieldReader<u64>> = Vec::new();
|
||||||
|
|
||||||
// In the first pass, we compute the total number of vals.
|
// In the first pass, we compute the total number of vals.
|
||||||
//
|
//
|
||||||
@@ -298,9 +298,8 @@ impl IndexMerger {
|
|||||||
// what should be the bit length use for bitpacking.
|
// what should be the bit length use for bitpacking.
|
||||||
for reader in &self.readers {
|
for reader in &self.readers {
|
||||||
let u64s_reader = reader.fast_fields()
|
let u64s_reader = reader.fast_fields()
|
||||||
.u64s_lenient(field)
|
.typed_fast_field_multi_reader(field)
|
||||||
.expect("Failed to find index for multivalued field. This is a bug in tantivy, please report.");
|
.expect("Failed to find index for multivalued field. This is a bug in tantivy, please report.");
|
||||||
|
|
||||||
if let Some(delete_bitset) = reader.delete_bitset() {
|
if let Some(delete_bitset) = reader.delete_bitset() {
|
||||||
for doc in 0u32..reader.max_doc() {
|
for doc in 0u32..reader.max_doc() {
|
||||||
if delete_bitset.is_alive(doc) {
|
if delete_bitset.is_alive(doc) {
|
||||||
@@ -353,7 +352,7 @@ impl IndexMerger {
|
|||||||
for (segment_ord, segment_reader) in self.readers.iter().enumerate() {
|
for (segment_ord, segment_reader) in self.readers.iter().enumerate() {
|
||||||
let term_ordinal_mapping: &[TermOrdinal] =
|
let term_ordinal_mapping: &[TermOrdinal] =
|
||||||
term_ordinal_mappings.get_segment(segment_ord);
|
term_ordinal_mappings.get_segment(segment_ord);
|
||||||
let ff_reader: MultiValueIntFastFieldReader<u64> = segment_reader
|
let ff_reader: MultiValuedFastFieldReader<u64> = segment_reader
|
||||||
.fast_fields()
|
.fast_fields()
|
||||||
.u64s(field)
|
.u64s(field)
|
||||||
.expect("Could not find multivalued u64 fast value reader.");
|
.expect("Could not find multivalued u64 fast value reader.");
|
||||||
@@ -397,8 +396,10 @@ impl IndexMerger {
|
|||||||
// We go through a complete first pass to compute the minimum and the
|
// We go through a complete first pass to compute the minimum and the
|
||||||
// maximum value and initialize our Serializer.
|
// maximum value and initialize our Serializer.
|
||||||
for reader in &self.readers {
|
for reader in &self.readers {
|
||||||
let ff_reader: MultiValueIntFastFieldReader<u64> =
|
let ff_reader: MultiValuedFastFieldReader<u64> = reader
|
||||||
reader.fast_fields().u64s_lenient(field).expect(
|
.fast_fields()
|
||||||
|
.typed_fast_field_multi_reader(field)
|
||||||
|
.expect(
|
||||||
"Failed to find multivalued fast field reader. This is a bug in \
|
"Failed to find multivalued fast field reader. This is a bug in \
|
||||||
tantivy. Please report.",
|
tantivy. Please report.",
|
||||||
);
|
);
|
||||||
@@ -445,11 +446,7 @@ impl IndexMerger {
|
|||||||
let mut bytes_readers: Vec<BytesFastFieldReader> = Vec::new();
|
let mut bytes_readers: Vec<BytesFastFieldReader> = Vec::new();
|
||||||
|
|
||||||
for reader in &self.readers {
|
for reader in &self.readers {
|
||||||
let bytes_reader = reader.fast_fields().bytes(field).ok_or_else(|| {
|
let bytes_reader = reader.fast_fields().bytes(field)?;
|
||||||
crate::TantivyError::InvalidArgument(
|
|
||||||
"Bytes fast field {:?} not found in segment.".to_string(),
|
|
||||||
)
|
|
||||||
})?;
|
|
||||||
if let Some(delete_bitset) = reader.delete_bitset() {
|
if let Some(delete_bitset) = reader.delete_bitset() {
|
||||||
for doc in 0u32..reader.max_doc() {
|
for doc in 0u32..reader.max_doc() {
|
||||||
if delete_bitset.is_alive(doc) {
|
if delete_bitset.is_alive(doc) {
|
||||||
|
|||||||
26
src/lib.rs
@@ -96,7 +96,7 @@
|
|||||||
//! A good place for you to get started is to check out
|
//! A good place for you to get started is to check out
|
||||||
//! the example code (
|
//! the example code (
|
||||||
//! [literate programming](https://tantivy-search.github.io/examples/basic_search.html) /
|
//! [literate programming](https://tantivy-search.github.io/examples/basic_search.html) /
|
||||||
//! [source code](https://github.com/tantivy-search/tantivy/blob/master/examples/basic_search.rs))
|
//! [source code](https://github.com/tantivy-search/tantivy/blob/main/examples/basic_search.rs))
|
||||||
|
|
||||||
#[cfg_attr(test, macro_use)]
|
#[cfg_attr(test, macro_use)]
|
||||||
extern crate serde_json;
|
extern crate serde_json;
|
||||||
@@ -866,39 +866,39 @@ mod tests {
|
|||||||
let searcher = reader.searcher();
|
let searcher = reader.searcher();
|
||||||
let segment_reader: &SegmentReader = searcher.segment_reader(0);
|
let segment_reader: &SegmentReader = searcher.segment_reader(0);
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().u64(text_field);
|
let fast_field_reader_res = segment_reader.fast_fields().u64(text_field);
|
||||||
assert!(fast_field_reader_opt.is_none());
|
assert!(fast_field_reader_res.is_err());
|
||||||
}
|
}
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().u64(stored_int_field);
|
let fast_field_reader_opt = segment_reader.fast_fields().u64(stored_int_field);
|
||||||
assert!(fast_field_reader_opt.is_none());
|
assert!(fast_field_reader_opt.is_err());
|
||||||
}
|
}
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().u64(fast_field_signed);
|
let fast_field_reader_opt = segment_reader.fast_fields().u64(fast_field_signed);
|
||||||
assert!(fast_field_reader_opt.is_none());
|
assert!(fast_field_reader_opt.is_err());
|
||||||
}
|
}
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().u64(fast_field_float);
|
let fast_field_reader_opt = segment_reader.fast_fields().u64(fast_field_float);
|
||||||
assert!(fast_field_reader_opt.is_none());
|
assert!(fast_field_reader_opt.is_err());
|
||||||
}
|
}
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().u64(fast_field_unsigned);
|
let fast_field_reader_opt = segment_reader.fast_fields().u64(fast_field_unsigned);
|
||||||
assert!(fast_field_reader_opt.is_some());
|
assert!(fast_field_reader_opt.is_ok());
|
||||||
let fast_field_reader = fast_field_reader_opt.unwrap();
|
let fast_field_reader = fast_field_reader_opt.unwrap();
|
||||||
assert_eq!(fast_field_reader.get(0), 4u64)
|
assert_eq!(fast_field_reader.get(0), 4u64)
|
||||||
}
|
}
|
||||||
|
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().i64(fast_field_signed);
|
let fast_field_reader_res = segment_reader.fast_fields().i64(fast_field_signed);
|
||||||
assert!(fast_field_reader_opt.is_some());
|
assert!(fast_field_reader_res.is_ok());
|
||||||
let fast_field_reader = fast_field_reader_opt.unwrap();
|
let fast_field_reader = fast_field_reader_res.unwrap();
|
||||||
assert_eq!(fast_field_reader.get(0), 4i64)
|
assert_eq!(fast_field_reader.get(0), 4i64)
|
||||||
}
|
}
|
||||||
|
|
||||||
{
|
{
|
||||||
let fast_field_reader_opt = segment_reader.fast_fields().f64(fast_field_float);
|
let fast_field_reader_res = segment_reader.fast_fields().f64(fast_field_float);
|
||||||
assert!(fast_field_reader_opt.is_some());
|
assert!(fast_field_reader_res.is_ok());
|
||||||
let fast_field_reader = fast_field_reader_opt.unwrap();
|
let fast_field_reader = fast_field_reader_res.unwrap();
|
||||||
assert_eq!(fast_field_reader.get(0), 4f64)
|
assert_eq!(fast_field_reader.get(0), 4f64)
|
||||||
}
|
}
|
||||||
Ok(())
|
Ok(())
|
||||||
|
|||||||
@@ -132,7 +132,7 @@ impl PositionReader {
|
|||||||
"offset arguments should be increasing."
|
"offset arguments should be increasing."
|
||||||
);
|
);
|
||||||
let delta_to_block_offset = offset as i64 - self.block_offset as i64;
|
let delta_to_block_offset = offset as i64 - self.block_offset as i64;
|
||||||
if delta_to_block_offset < 0 || delta_to_block_offset >= 128 {
|
if !(0..128).contains(&delta_to_block_offset) {
|
||||||
// The first position is not within the first block.
|
// The first position is not within the first block.
|
||||||
// We need to decompress the first block.
|
// We need to decompress the first block.
|
||||||
let delta_to_anchor_offset = offset - self.anchor_offset;
|
let delta_to_anchor_offset = offset - self.anchor_offset;
|
||||||
|
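The `PositionReader` hunk above swaps a pair of comparisons for a range check. As a small standalone sketch (the function names are illustrative and not part of tantivy), the two forms can be compared directly:

```rust
/// Returns true when `delta` falls outside the half-open range `[0, 128)`.
fn outside_block(delta: i64) -> bool {
    !(0..128).contains(&delta)
}

/// The explicit comparison form, kept only to show the equivalence.
fn outside_block_explicit(delta: i64) -> bool {
    delta < 0 || delta >= 128
}

fn main() {
    // Both forms agree on every value around the two boundaries.
    for delta in -2i64..=130 {
        assert_eq!(outside_block(delta), outside_block_explicit(delta));
    }
    println!("range check and explicit comparisons agree");
}
```

Because `0..128` excludes its upper bound, 127 is still inside the block and 128 is the first value outside it.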
|||||||
@@ -1,14 +1,11 @@
|
|||||||
use crate::common::HasLen;
|
use crate::common::HasLen;
|
||||||
use crate::directory::FileSlice;
|
|
||||||
use crate::docset::DocSet;
|
use crate::docset::DocSet;
|
||||||
use crate::fastfield::DeleteBitSet;
|
use crate::fastfield::DeleteBitSet;
|
||||||
use crate::positions::PositionReader;
|
use crate::positions::PositionReader;
|
||||||
use crate::postings::compression::COMPRESSION_BLOCK_SIZE;
|
use crate::postings::compression::COMPRESSION_BLOCK_SIZE;
|
||||||
use crate::postings::serializer::PostingsSerializer;
|
|
||||||
use crate::postings::BlockSearcher;
|
use crate::postings::BlockSearcher;
|
||||||
use crate::postings::BlockSegmentPostings;
|
use crate::postings::BlockSegmentPostings;
|
||||||
use crate::postings::Postings;
|
use crate::postings::Postings;
|
||||||
use crate::schema::IndexRecordOption;
|
|
||||||
use crate::{DocId, TERMINATED};
|
use crate::{DocId, TERMINATED};
|
||||||
|
|
||||||
/// `SegmentPostings` represents the inverted list or postings associated to
|
/// `SegmentPostings` represents the inverted list or postings associated to
|
||||||
@@ -68,7 +65,11 @@ impl SegmentPostings {
|
|||||||
/// It serializes the doc ids using tantivy's codec
|
/// It serializes the doc ids using tantivy's codec
|
||||||
/// and returns a `SegmentPostings` object that embeds a
|
/// and returns a `SegmentPostings` object that embeds a
|
||||||
/// buffer with the serialized data.
|
/// buffer with the serialized data.
|
||||||
|
#[cfg(test)]
|
||||||
pub fn create_from_docs(docs: &[u32]) -> SegmentPostings {
|
pub fn create_from_docs(docs: &[u32]) -> SegmentPostings {
|
||||||
|
use crate::directory::FileSlice;
|
||||||
|
use crate::postings::serializer::PostingsSerializer;
|
||||||
|
use crate::schema::IndexRecordOption;
|
||||||
let mut buffer = Vec::new();
|
let mut buffer = Vec::new();
|
||||||
{
|
{
|
||||||
let mut postings_serializer =
|
let mut postings_serializer =
|
||||||
@@ -97,6 +98,9 @@ impl SegmentPostings {
|
|||||||
doc_and_tfs: &[(u32, u32)],
|
doc_and_tfs: &[(u32, u32)],
|
||||||
fieldnorms: Option<&[u32]>,
|
fieldnorms: Option<&[u32]>,
|
||||||
) -> SegmentPostings {
|
) -> SegmentPostings {
|
||||||
|
use crate::directory::FileSlice;
|
||||||
|
use crate::postings::serializer::PostingsSerializer;
|
||||||
|
use crate::schema::IndexRecordOption;
|
||||||
use crate::fieldnorm::FieldNormReader;
|
use crate::fieldnorm::FieldNormReader;
|
||||||
use crate::Score;
|
use crate::Score;
|
||||||
let mut buffer: Vec<u8> = Vec::new();
|
let mut buffer: Vec<u8> = Vec::new();
|
||||||
|
|||||||
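The hunks above move the `FileSlice`, `PostingsSerializer` and `IndexRecordOption` imports from module level into the bodies of the helper functions that use them. A minimal sketch of that pattern follows. `DocList` and its methods are hypothetical stand-ins rather than tantivy's `SegmentPostings`, and the `#[cfg(test)]` attribute that the diff adds to `create_from_docs` is left out so the sketch compiles outside a test build.

```rust
/// Hypothetical stand-in type used only for illustration.
pub struct DocList {
    docs: Vec<u32>,
}

impl DocList {
    /// Returns the stored doc ids.
    pub fn docs(&self) -> &[u32] {
        &self.docs
    }

    /// Builds a sorted, de-duplicated list of doc ids.
    /// (In the diff, the equivalent helper is also marked `#[cfg(test)]`.)
    pub fn create_from_docs(docs: &[u32]) -> DocList {
        // Function-scoped import: the name is only visible inside this body,
        // so the import disappears together with the function when the
        // function is compiled out.
        use std::collections::BTreeSet;

        let unique: BTreeSet<u32> = docs.iter().copied().collect();
        DocList {
            docs: unique.into_iter().collect(),
        }
    }
}
```

Scoping the `use` to the function body keeps the import next to its only consumer, which is one plausible reason for the change: a module-level import used only by a test-only helper would be reported as unused in non-test builds.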
@@ -28,8 +28,7 @@ pub struct Checkpoint {
|
|||||||
|
|
||||||
impl Checkpoint {
|
impl Checkpoint {
|
||||||
pub(crate) fn follows(&self, other: &Checkpoint) -> bool {
|
pub(crate) fn follows(&self, other: &Checkpoint) -> bool {
|
||||||
(self.start_doc == other.end_doc) &&
|
(self.start_doc == other.end_doc) && (self.start_offset == other.end_offset)
|
||||||
(self.start_offset == other.end_offset)
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -96,7 +95,7 @@ mod tests {
|
|||||||
Checkpoint {
|
Checkpoint {
|
||||||
start_doc: 0,
|
start_doc: 0,
|
||||||
end_doc: 3,
|
end_doc: 3,
|
||||||
start_offset: 4,
|
start_offset: 0,
|
||||||
end_offset: 9,
|
end_offset: 9,
|
||||||
},
|
},
|
||||||
Checkpoint {
|
Checkpoint {
|
||||||
@@ -201,19 +200,21 @@ mod tests {
|
|||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
|
|
||||||
fn integrate_delta(mut vals: Vec<u64>) -> Vec<u64> {
|
fn integrate_delta(vals: Vec<u64>) -> Vec<u64> {
|
||||||
|
let mut output = Vec::with_capacity(vals.len() + 1);
|
||||||
|
output.push(0u64);
|
||||||
let mut prev = 0u64;
|
let mut prev = 0u64;
|
||||||
for val in vals.iter_mut() {
|
for val in vals {
|
||||||
let new_val = *val + prev;
|
let new_val = val + prev;
|
||||||
prev = new_val;
|
prev = new_val;
|
||||||
*val = new_val;
|
output.push(new_val);
|
||||||
}
|
}
|
||||||
vals
|
output
|
||||||
}
|
}
|
||||||
|
|
||||||
// Generates a sequence of n valid checkpoints, with n < max_len.
|
// Generates a sequence of n valid checkpoints, with n < max_len.
|
||||||
fn monotonic_checkpoints(max_len: usize) -> BoxedStrategy<Vec<Checkpoint>> {
|
fn monotonic_checkpoints(max_len: usize) -> BoxedStrategy<Vec<Checkpoint>> {
|
||||||
(1..max_len)
|
(0..max_len)
|
||||||
.prop_flat_map(move |len: usize| {
|
.prop_flat_map(move |len: usize| {
|
||||||
(
|
(
|
||||||
proptest::collection::vec(1u64..20u64, len as usize).prop_map(integrate_delta),
|
proptest::collection::vec(1u64..20u64, len as usize).prop_map(integrate_delta),
|
||||||
|
|||||||
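The rewritten `integrate_delta` above returns the running sums with a leading `0`, so `n` deltas yield `n + 1` boundaries. The sketch below shows one reading of how such boundaries relate to `Checkpoint::follows`. The `Checkpoint` field types and the `checkpoints_from_boundaries` helper are assumptions made for illustration and do not appear in the diff.

```rust
/// Stand-in for the `Checkpoint` struct in the diff; field types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct Checkpoint {
    pub start_doc: u32,
    pub end_doc: u32,
    pub start_offset: u64,
    pub end_offset: u64,
}

impl Checkpoint {
    /// Same predicate as in the diff: `self` starts where `other` ends.
    pub fn follows(&self, other: &Checkpoint) -> bool {
        (self.start_doc == other.end_doc) && (self.start_offset == other.end_offset)
    }
}

/// New version from the diff: prefix sums with a leading zero.
pub fn integrate_delta(vals: Vec<u64>) -> Vec<u64> {
    let mut output = Vec::with_capacity(vals.len() + 1);
    output.push(0u64);
    let mut prev = 0u64;
    for val in vals {
        let new_val = val + prev;
        prev = new_val;
        output.push(new_val);
    }
    output
}

/// Hypothetical helper: turns consecutive boundaries into checkpoints.
/// `n + 1` boundaries give `n` checkpoints; a single boundary gives none.
pub fn checkpoints_from_boundaries(docs: &[u64], offsets: &[u64]) -> Vec<Checkpoint> {
    docs.windows(2)
        .zip(offsets.windows(2))
        .map(|(d, o)| Checkpoint {
            start_doc: d[0] as u32,
            end_doc: d[1] as u32,
            start_offset: o[0],
            end_offset: o[1],
        })
        .collect()
}
```

Under these assumptions, the first checkpoint always starts at doc `0` and offset `0`, each checkpoint follows its predecessor, and an empty delta vector (now reachable because the range starts at `0`) produces no checkpoints at all.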
@@ -35,11 +35,11 @@ struct Layer {
|
|||||||
}
|
}
|
||||||
|
|
||||||
impl Layer {
|
impl Layer {
|
||||||
fn cursor<'a>(&'a self) -> impl Iterator<Item = Checkpoint> + 'a {
|
fn cursor(&self) -> impl Iterator<Item = Checkpoint> + '_ {
|
||||||
self.cursor_at_offset(0u64)
|
self.cursor_at_offset(0u64)
|
||||||
}
|
}
|
||||||
|
|
||||||
fn cursor_at_offset<'a>(&'a self, start_offset: u64) -> impl Iterator<Item = Checkpoint> + 'a {
|
fn cursor_at_offset(&self, start_offset: u64) -> impl Iterator<Item = Checkpoint> + '_ {
|
||||||
let data = &self.data.as_slice();
|
let data = &self.data.as_slice();
|
||||||
LayerCursor {
|
LayerCursor {
|
||||||
remaining: &data[start_offset as usize..],
|
remaining: &data[start_offset as usize..],
|
||||||
@@ -77,7 +77,7 @@ impl SkipIndex {
|
|||||||
SkipIndex { layers }
|
SkipIndex { layers }
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn checkpoints<'a>(&'a self) -> impl Iterator<Item = Checkpoint> + 'a {
|
pub(crate) fn checkpoints(&self) -> impl Iterator<Item = Checkpoint> + '_ {
|
||||||
self.layers
|
self.layers
|
||||||
.last()
|
.last()
|
||||||
.into_iter()
|
.into_iter()
|
||||||
|
|||||||
@@ -46,7 +46,7 @@ impl StoreReader {
|
|||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn block_checkpoints<'a>(&'a self) -> impl Iterator<Item = Checkpoint> + 'a {
|
pub(crate) fn block_checkpoints(&self) -> impl Iterator<Item = Checkpoint> + '_ {
|
||||||
self.skip_index.checkpoints()
|
self.skip_index.checkpoints()
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
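The signature changes in `Layer`, `SkipIndex` and `StoreReader` above replace an explicit `<'a>(&'a self) ... + 'a` with the anonymous lifetime `'_`. A minimal sketch of the two spellings, using a hypothetical `Layer` that holds a plain `Vec<u64>` instead of tantivy's data, suggests they describe the same borrow:

```rust
/// Hypothetical stand-in for the `Layer` type in the diff.
pub struct Layer {
    pub data: Vec<u64>,
}

impl Layer {
    /// Explicit form: the returned iterator borrows `self` for `'a`.
    pub fn cursor_explicit<'a>(&'a self) -> impl Iterator<Item = u64> + 'a {
        self.data.iter().copied()
    }

    /// Elided form: `'_` refers to the lifetime of the `&self` borrow.
    pub fn cursor_elided(&self) -> impl Iterator<Item = u64> + '_ {
        self.data.iter().copied()
    }
}
```

Both methods yield the same items and tie the iterator to the borrow of `self`; the elided form only drops the named lifetime parameter.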