add lz4 block compressor using lz4_flex, add lz4-block-compression feature flag
add snappy-compression feature flag for snap compressor, make snap crate optional
set lz4-block-compression as default feature flag
Tantivy used to assume that all files could somehow be memory mapped. After this change, a Directory returns a `FileSlice` that can be narrowed and eventually read into an `OwnedBytes` object. Long, blocking I/O operations are still required, but they no longer span the entire file.
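A minimal sketch of the idea, assuming a simplified design: a slice holds a handle and a byte range, narrowing it performs no I/O, and only the final read touches the bytes. The `FileHandle` trait, the `InMemoryFile` type, and all method names here are hypothetical stand-ins for illustration; `Vec<u8>` stands in for `OwnedBytes`, and the real API may differ.

```rust
use std::io;
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical storage backend: reads only the requested byte range.
trait FileHandle: Send + Sync {
    fn len(&self) -> usize;
    fn read_bytes(&self, range: Range<usize>) -> io::Result<Vec<u8>>;
}

/// A lazily evaluated view over part of a file (simplified, illustrative).
#[derive(Clone)]
struct FileSlice {
    handle: Arc<dyn FileHandle>,
    range: Range<usize>,
}

impl FileSlice {
    fn new(handle: Arc<dyn FileHandle>) -> FileSlice {
        let len = handle.len();
        FileSlice { handle, range: 0..len }
    }

    fn len(&self) -> usize {
        self.range.end - self.range.start
    }

    /// Narrows the view. No I/O happens here, only offset arithmetic.
    fn slice(&self, range: Range<usize>) -> FileSlice {
        assert!(range.start <= range.end && range.end <= self.len());
        let start = self.range.start;
        FileSlice {
            handle: self.handle.clone(),
            range: (start + range.start)..(start + range.end),
        }
    }

    /// Performs the (possibly blocking) read, limited to this slice's range.
    fn read_bytes(&self) -> io::Result<Vec<u8>> {
        self.handle.read_bytes(self.range.clone())
    }
}

/// Toy backend used for the example: the "file" lives in memory.
struct InMemoryFile(Vec<u8>);

impl FileHandle for InMemoryFile {
    fn len(&self) -> usize {
        self.0.len()
    }

    fn read_bytes(&self, range: Range<usize>) -> io::Result<Vec<u8>> {
        self.0
            .get(range)
            .map(|bytes| bytes.to_vec())
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "range out of bounds"))
    }
}
```

With this shape, a reader can call `slice` repeatedly to reach the region it needs, and the backend is only asked for that final range.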
* WIP implemented is_compatible
hide Footer::from_bytes from public consumption - only found Footer::extract
used outside the module
Add a new error type for IncompatibleIndex
add a prototypical call to footer.is_compatible() in ManagedDirectory::open_read
to make sure we error before reading it further
* Make error handling more ergonomic
Add an error subtype for OpenReadError and converters to TantivyError
* Remove an unnecessary assert
it's followed by the same check, which returns an error instead of panicking
* Correct the compatibility check logic
Leave a defensive versioned footer check to make sure we add new logic handling
when we add possible footer versions
Restricted VersionedFooter::from_bytes to be used inside the crate only
remove a half-baked test
* WIP.
* Return an error if index incompatible - closes #662
Enrich the error type with incompatibility
Change return type to Result<bool, TantivyError>, instead of bool
Add an Incompatibility enum that enriches the IncompatibleIndex error variant
with information, which then allows us to generate a developer-friendly hint on how
to upgrade library version or switch feature flags for a different compression
algorithm
Updated changelog
Change the signature of is_compatible
Added documentation to the Incompatibility
Added a conditional test on a Footer with lz4 erroring
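The error enrichment described above could look roughly like the following sketch. The variant names, the `Footer` fields, the `SUPPORTED_FORMAT` constant, and the `Result<(), Incompatibility>` return type are all assumptions made for illustration; the actual types in the crate may be shaped differently.

```rust
use std::fmt;

/// Hypothetical reasons why an index cannot be opened by this build.
#[derive(Debug, PartialEq)]
enum Incompatibility {
    CompressionMismatch { library: String, index: String },
    IndexMismatch { library_format: u32, index_format: u32 },
}

impl fmt::Display for Incompatibility {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Incompatibility::CompressionMismatch { library, index } => write!(
                f,
                "index was written with {} compression but the library was built with {}; \
                 rebuild with the matching feature flag",
                index, library
            ),
            Incompatibility::IndexMismatch { library_format, index_format } => write!(
                f,
                "index uses footer format {} but the library supports format {}; \
                 switch to a matching library version",
                index_format, library_format
            ),
        }
    }
}

/// Simplified footer metadata (illustrative fields only).
struct Footer {
    format: u32,
    compression: String,
}

/// Assumed footer format understood by this hypothetical build.
const SUPPORTED_FORMAT: u32 = 1;

/// Checks the footer before any further bytes of the file are interpreted.
fn is_compatible(footer: &Footer, library_compression: &str) -> Result<(), Incompatibility> {
    if footer.format != SUPPORTED_FORMAT {
        return Err(Incompatibility::IndexMismatch {
            library_format: SUPPORTED_FORMAT,
            index_format: footer.format,
        });
    }
    if footer.compression.as_str() != library_compression {
        return Err(Incompatibility::CompressionMismatch {
            library: library_compression.to_string(),
            index: footer.compression.clone(),
        });
    }
    Ok(())
}
```

Carrying both sides of the mismatch in the error lets the message say which feature flag or library version to switch to, rather than only reporting that the index is incompatible.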
* add checksum check in ManagedDirectory
fix #400
* flush after writing checksum
* don't checksum atomic file access and clone managed_paths
* implement a footer storing metadata about a file
this is more of a PoC; it requires some refactoring into multiple files
`terminate(self)` is implemented, but not used anywhere yet
* address comments and simplify things with new contract
use BitOrder for integer to raw byte conversion
assume that an atomic write implies an atomic read, which might not actually be true
use some indirection to have a boxable terminating writer
* implement TerminatingWrite and make terminate() be called where it should
add dependency on drop_bomb to help find where terminate() should be called
implement TerminatingWrite for wrapper writers
make tests pass
/!\ some tests seem to pass where they shouldn't
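As a rough illustration of the checksum footer and the `terminate()` contract described in these commits, here is a self-contained sketch. It assumes a footer consisting only of a little-endian CRC-32 (the footer described above stores more metadata), uses a hand-written bitwise CRC-32 instead of a crate, and simplifies `TerminatingWrite` to a single consuming method. `FooterWriter`, `Crc32`, and `verify` are hypothetical names.

```rust
use std::io::{self, Write};

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320), kept incremental.
struct Crc32 {
    state: u32,
}

impl Crc32 {
    fn new() -> Crc32 {
        Crc32 { state: 0xFFFF_FFFF }
    }

    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.state ^= byte as u32;
            for _ in 0..8 {
                // mask is all ones when the low bit is set, zero otherwise
                let mask = (self.state & 1).wrapping_neg();
                self.state = (self.state >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
    }

    fn finalize(&self) -> u32 {
        !self.state
    }
}

/// Simplified contract: a writer that must be explicitly finished.
trait TerminatingWrite: Write + Sized {
    fn terminate(self) -> io::Result<()>;
}

/// Wrapper that checksums everything written through it.
struct FooterWriter<W: Write> {
    inner: W,
    hasher: Crc32,
}

impl<W: Write> FooterWriter<W> {
    fn new(inner: W) -> FooterWriter<W> {
        FooterWriter { inner, hasher: Crc32::new() }
    }
}

impl<W: Write> Write for FooterWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let written = self.inner.write(buf)?;
        // Only hash the bytes the inner writer actually accepted.
        self.hasher.update(&buf[..written]);
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

impl<W: Write> TerminatingWrite for FooterWriter<W> {
    /// Appends the checksum footer, then flushes.
    fn terminate(mut self) -> io::Result<()> {
        let crc = self.hasher.finalize();
        self.inner.write_all(&crc.to_le_bytes())?;
        self.inner.flush()
    }
}

/// Read side: recompute the checksum of the body and compare with the footer.
fn verify(bytes: &[u8]) -> bool {
    if bytes.len() < 4 {
        return false;
    }
    let (body, footer) = bytes.split_at(bytes.len() - 4);
    let stored = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]);
    let mut hasher = Crc32::new();
    hasher.update(body);
    hasher.finalize() == stored
}
```

Because `terminate` takes `self` by value, a writer cannot be used after its footer has been written. Detecting a writer that is dropped without being terminated is a separate concern, which is what the drop_bomb experiment above was for.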
* remove usage of drop_bomb
* fmt
* add test for checksum
* address some review comments
* update changelog
* fmt
* Refactor deletes
* Removing generation from SegmentUpdater. These have been obsolete for a long time
* Number literal clippy
* Removed clippy useless allow statement
* Enables clearing the index
Closes #510
* Adds an example to clear and rebuild index
* Addressing code review
Moved the example from examples/ to docstring above `clear`
* Corrected minor typos and missed/duplicate words
* Added stamper.revert method to be used for rollback
Added type alias for Opstamp
Moved to AtomicU64 on stable rust (since 1.34)
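A minimal sketch of what such a stamper might look like, assuming it is a shared `AtomicU64` counter. The method names follow the commit message above (a later commit renames the method) and the memory ordering is an assumption.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Type alias for operation stamps, as mentioned in the commit message.
type Opstamp = u64;

/// Hands out increasing opstamps; clones share the same counter.
#[derive(Clone, Default)]
struct Stamper(Arc<AtomicU64>);

impl Stamper {
    fn new(first_opstamp: Opstamp) -> Stamper {
        Stamper(Arc::new(AtomicU64::new(first_opstamp)))
    }

    /// Returns the current opstamp and advances the counter by one.
    fn stamp(&self) -> Opstamp {
        self.0.fetch_add(1, Ordering::SeqCst)
    }

    /// Resets the counter, e.g. when rolling back to the last commit.
    /// The next call to `stamp` returns `to_opstamp`.
    fn revert(&self, to_opstamp: Opstamp) -> Opstamp {
        self.0.store(to_opstamp, Ordering::SeqCst);
        to_opstamp
    }
}
```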
* Change the method name and doc-string
* Remove rollback from delete_all_documents
test_add_then_delete_all_documents fails with --test-threads 2
* Passes all the tests with any number of test-threads
(ran locally 5 times)
* Addressed code review
Deleted comments with debug info
changed ReloadPolicy to Manual
* Removing useless garbage_collect call and updated CHANGELOG
* Split Collector into an overall Collector and a per-segment SegmentCollector. Precursor to cross-segment parallelism, and as a side benefit cleans up any per-segment fields from being Option<T> to just T.
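The split can be sketched as two traits: an overall collector that spawns one child per segment, and a per-segment child that owns its state directly instead of through `Option<T>`. The trait shapes below (`Output`, `for_segment`, `harvest`, `merge`) are illustrative assumptions, not the signatures introduced by this commit.

```rust
/// Per-segment half: created once per segment, so its fields can be plain `T`.
trait SegmentCollector {
    type Output;
    fn collect(&mut self, doc: u32, score: f32);
    fn harvest(self) -> Self::Output;
}

/// Overall half: builds children and combines their results.
trait Collector {
    type Output;
    type Child: SegmentCollector;
    fn for_segment(&self, segment_ord: u32) -> Self::Child;
    fn merge(&self, outputs: Vec<<Self::Child as SegmentCollector>::Output>) -> Self::Output;
}

/// Example collector that counts matching documents.
struct CountCollector;

struct SegmentCountCollector {
    count: usize,
}

impl SegmentCollector for SegmentCountCollector {
    type Output = usize;

    fn collect(&mut self, _doc: u32, _score: f32) {
        self.count += 1;
    }

    fn harvest(self) -> usize {
        self.count
    }
}

impl Collector for CountCollector {
    type Output = usize;
    type Child = SegmentCountCollector;

    fn for_segment(&self, _segment_ord: u32) -> SegmentCountCollector {
        SegmentCountCollector { count: 0 }
    }

    fn merge(&self, outputs: Vec<usize>) -> usize {
        outputs.into_iter().sum()
    }
}

/// Toy search loop: each inner Vec stands in for the matches of one segment.
fn search<C: Collector>(collector: &C, segments: &[Vec<(u32, f32)>]) -> C::Output {
    let mut outputs = Vec::new();
    for (ord, matches) in segments.iter().enumerate() {
        let mut child = collector.for_segment(ord as u32);
        for &(doc, score) in matches {
            child.collect(doc, score);
        }
        outputs.push(child.harvest());
    }
    collector.merge(outputs)
}
```

Since each child only sees one segment and results are combined at the end, the loop body in `search` is the part that could later run on separate threads.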
* Attempt to add MultiCollector back
* working. Chained collector is broken though
* Fix chained collector
* Fix test
* Make Weight Send+Sync for parallelization purposes
* Expose parameters of RangeQuery for external usage
* Removed &mut self
* fixing tests
* Restored TestCollectors
* blop
* multicollector working
* chained collector working
* test broken
* fixing unit test
* blop
* blop
* Blop
* simplifying API
* blop
* better syntax
* Simplifying top_collector
* refactoring
* blop
* Sync with master
* Added multithread search
* Collector refactoring
* Schema::builder
* CR and rustdoc
* CR comments
* blop
* Added an executor
* Sorted the segment readers in the searcher
* Update searcher.rs
* Fixed unit tests
* changed the place where we have the sort-segment-by-count heuristic
* using crossbeam::channel
* inlining
* Comments about panics propagating
* Added unit test for executor panicking
* Readded default
* Removed Default impl
* Added unit test for executor
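For illustration, an executor of this kind could be sketched as below. This version uses `std::thread::scope` and `std::sync::mpsc` so that it stays dependency-free, whereas the commits above use crossbeam::channel; the `Executor` variants and the `map` signature are assumptions.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs a function over a list of arguments, optionally on several threads.
enum Executor {
    SingleThread,
    MultiThread { num_threads: usize },
}

impl Executor {
    /// Applies `f` to every argument and returns the results in input order.
    /// If `f` panics on a worker thread, the panic propagates to the caller
    /// once the scope has joined all workers.
    fn map<A, R, F>(&self, f: F, args: Vec<A>) -> Vec<R>
    where
        A: Send,
        R: Send,
        F: Fn(A) -> R + Sync,
    {
        match self {
            Executor::SingleThread => args.into_iter().map(f).collect(),
            Executor::MultiThread { num_threads } => {
                let num_threads = (*num_threads).max(1);
                // Distribute the arguments round-robin, remembering positions.
                let mut buckets: Vec<Vec<(usize, A)>> =
                    (0..num_threads).map(|_| Vec::new()).collect();
                for (idx, arg) in args.into_iter().enumerate() {
                    buckets[idx % num_threads].push((idx, arg));
                }
                let (sender, receiver) = mpsc::channel();
                let f = &f;
                thread::scope(|scope| {
                    for bucket in buckets {
                        let sender = sender.clone();
                        scope.spawn(move || {
                            for (idx, arg) in bucket {
                                // The receiver outlives the scope, so a send
                                // error is not expected; ignore it if it occurs.
                                let _ = sender.send((idx, f(arg)));
                            }
                        });
                    }
                });
                // Drop the last sender so the receiving iterator terminates.
                drop(sender);
                let mut indexed: Vec<(usize, R)> = receiver.into_iter().collect();
                indexed.sort_by_key(|(idx, _)| *idx);
                indexed.into_iter().map(|(_, result)| result).collect()
            }
        }
    }
}
```

In a search context, `args` would be the per-segment work items and `f` the per-segment collection, with the results merged afterwards.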
* Compute space usage of a Searcher / SegmentReader / CompositeFile
* Fix typo
* Add serde Serialize/Deserialize for all the SpaceUsage structs
* Fix indexing
* Public methods for consuming space usage information
* #281: Add a space usage method that takes a SegmentComponent to support code that is unaware of particular segment components, and to make it more likely to update methods when a new component type is added.
* Add support for space usage computation of positions skip index file (#281)
* Add some tests for space usage computation (#281)
* Add skip information for posting list (skip to doc ids)
* Separate num bits from data for positions (skip n positions)
* Address into the positions using an n-position offset
* Added a long skip structure to allow efficient opening of the position for a given term.
* Replaced lz4 by a pure rust implementation of snappy.
Closes #257
* snappy is the default compression. One can use lz4 by enabling the lz4 feature flag.
* Removed Compression trait
* Changed the heap to a paged memory arena.
* Trying to simplify the indexing term hashmap
* Exploding datastruct
* Removed some complexity in bitpacker