Tantivy 0.11.0
- Added f64 field. Internally reuse u64 code the same way i64 does (@fdb-hiroshima)
- Various bugfixes in the query parser.
- Better handling of hyphens in query parser. (#609)
- Better handling of whitespaces.
- Closes #498 - add support for Elastic-style unbounded range queries for alphanumeric types eg. "title:>hello", "weight:>=70.5", "height:<200" (@petr-tik)
- API change around `Box<BoxableTokenizer>`. See details in #629.
- Avoid rebuilding the Regex automaton whenever a regex query is reused. #639 (@brainlock)
- Add footer with some metadata to index files. #605 (@fdb-hiroshima)
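
The f64 entry above says the field reuses the u64 code path the same way i64 does. One common way to do that is an order-preserving mapping from `f64` bits to `u64`. The sketch below illustrates the idea under that assumption; the function names are hypothetical and this is not claimed to be tantivy's exact implementation. NaN handling is out of scope.

```rust
/// Maps an f64 to a u64 such that, for non-NaN inputs, numeric order of the
/// floats matches unsigned integer order of the results.
/// Hypothetical helper, written for illustration only.
fn f64_to_sortable_u64(value: f64) -> u64 {
    let bits = value.to_bits();
    if bits >> 63 == 0 {
        // Non-negative: set the sign bit so it sorts above every negative.
        bits | (1u64 << 63)
    } else {
        // Negative: flip all bits so larger magnitudes sort lower.
        !bits
    }
}

/// Inverse of `f64_to_sortable_u64`.
fn sortable_u64_to_f64(encoded: u64) -> f64 {
    let bits = if encoded >> 63 == 1 {
        // Was non-negative: clear the sign bit that encoding set.
        encoded & !(1u64 << 63)
    } else {
        // Was negative: undo the bit flip.
        !encoded
    };
    f64::from_bits(bits)
}

fn main() {
    let values = [-70.5, -1.0, 0.0, 1.5, 200.0];
    let encoded: Vec<u64> = values.iter().map(|&v| f64_to_sortable_u64(v)).collect();
    // The encoded values are strictly increasing, like the inputs.
    assert!(encoded.windows(2).all(|pair| pair[0] < pair[1]));
    // The mapping round-trips.
    for &v in &values {
        assert_eq!(sortable_u64_to_f64(f64_to_sortable_u64(v)), v);
    }
    println!("order preserved for {} values", values.len());
}
```

With a mapping like this, range queries and fast fields written for `u64` can serve `f64` values without a separate code path.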
How to update?
- `Box<dyn BoxableTokenizer>` has been replaced by a `BoxedTokenizer` struct.
- Regexes are now compiled when the `RegexQuery` instance is built. As a result, building it can now return an error, and handling the `Result` is required.
Tantivy 0.10.2
- Closes #656. Solving memory leak.
Tantivy 0.10.1
- Closes #544. A few users experienced problems with the directory watching system. Avoid watching the mmap directory until someone actually creates a reader that uses this functionality.
Tantivy 0.10.0
Tantivy 0.10.0 index format is compatible with the index format in 0.9.0.
- Added an API to easily tweak or entirely replace the default score. See `TopDocs::tweak_score` and `TopDocs::custom_score` (@pmasurel)
- Added an ASCII folding filter (@drusellers)
- Bugfix in `query.count` in presence of deletes (@pmasurel)
- Added `.explain(...)` in `Query` and `Weight` (@pmasurel)
- Added an efficient way to `delete_all_documents` in `IndexWriter` (@petr-tik). All segments are simply removed.
Minor
- Switched to Rust 2018 (@uvd)
- Small simplification of the code. Calling .freq() or .doc() when .advance() has never been called on segment postings should panic from now on.
- Tokens exceeding `u16::max_value() - 4` chars are discarded silently instead of panicking.
- Fast fields are now preloaded when the `SegmentReader` is created.
- `IndexMeta` is now public. (@hntd187)
- `IndexWriter::add_document` and `IndexWriter::delete_term`: `IndexWriter` is `Sync`, making it possible to use it with an `Arc<RwLock<IndexWriter>>`. `add_document` and `delete_term` only require a read lock. (@pmasurel)
- Introducing `Opstamp` as an expressive type alias for `u64`. (@petr-tik)
- Stamper now relies on `AtomicU64` on all platforms (@petr-tik)
- Bugfix - Files get deleted slightly earlier
- Compilation resources improved (@fdb-hiroshima)
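
The `Arc<RwLock<IndexWriter>>` pattern mentioned above can be sketched with the standard library alone. `ToyWriter` below is a hypothetical stand-in, not tantivy's `IndexWriter`: it only assumes that adding a document takes `&self` (so a read lock suffices) while committing needs exclusive access, and it uses an `AtomicU64` counter to hand out `Opstamp` values.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

/// Expressive alias, mirroring the `Opstamp` alias named in the changelog.
type Opstamp = u64;

/// Hypothetical stand-in for an index writer (illustration only).
struct ToyWriter {
    stamper: AtomicU64,
}

impl ToyWriter {
    fn new() -> Self {
        ToyWriter { stamper: AtomicU64::new(0) }
    }

    /// Takes `&self`, so holding a read lock is enough to call it.
    fn add_document(&self, _doc: &str) -> Opstamp {
        self.stamper.fetch_add(1, Ordering::SeqCst)
    }

    /// Takes `&mut self`, so callers need the write lock.
    fn commit(&mut self) -> Opstamp {
        *self.stamper.get_mut()
    }
}

/// Adds documents from several threads through read locks, then commits
/// through the write lock and returns the resulting opstamp.
fn index_concurrently(threads: usize, docs_per_thread: usize) -> Opstamp {
    let writer = Arc::new(RwLock::new(ToyWriter::new()));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let writer = Arc::clone(&writer);
            thread::spawn(move || {
                for i in 0..docs_per_thread {
                    // Many threads can hold the read lock at the same time.
                    let guard = writer.read().unwrap();
                    guard.add_document(&format!("doc {t}-{i}"));
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let stamp = writer.write().unwrap().commit();
    stamp
}

fn main() {
    let stamp = index_concurrently(4, 100);
    println!("opstamp after commit: {stamp}");
}
```

The design point is that writers contend only on the atomic counter, not on the lock, as long as nobody holds the write lock.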
How to update?
Your program should be usable as is.
Fast fields
Fast fields used to be accessed directly from the `SegmentReader`.
The API has changed: you are now required to acquire your fast field reader via `segment_reader.fast_fields()` and use one of the typed methods:
- `.u64()`, `.i64()` if your field is single-valued;
- `.u64s()`, `.i64s()` if your field is multi-valued;
- `.bytes()` if your field is a bytes fast field.
Tantivy 0.9.0
0.9.0 index format is not compatible with the previous index format.
- MAJOR BUGFIX: Some `Mmap` objects were being leaked, and would never get released. (@fulmicoton)
- Removed most unsafe (@fulmicoton)
- Indexer memory footprint improved (VInt comp, inlining the first block). (@fulmicoton)
- Stemming in other languages is now possible (@pentlander)
- Segments with no docs are deleted earlier (@barrotsteindev)
- Added grouped add and delete operations. They are guaranteed to happen together (i.e. they cannot be split by a commit). In addition, adds are guaranteed to happen on the same segment. (@elbow-jason)
- Removed `INT_STORED` and `INT_INDEXED`. It is now possible to use `STORED` and `INDEXED` for int fields. (@fulmicoton)
- Added DateTime field (@barrotsteindev)
- Added IndexReader. By default, index is reloaded automatically upon new commits (@fulmicoton)
- SIMD linear search within blocks (@fulmicoton)
How to update?
Tantivy 0.9 brought some breaking API changes. To update from tantivy 0.8, you will need to go through the following steps.
- `schema::INT_INDEXED` and `schema::INT_STORED` should be replaced by `schema::INDEXED` and `schema::STORED`.
- The index no longer holds the pool of searchers. You are required to create an intermediary object called `IndexReader` for this.

  ```rust
  // Create the reader. You typically need to create one reader for the
  // entire lifetime of your program.
  let reader = index.reader()?;

  // Acquiring a searcher (previously `index.searcher()`) is now written:
  let searcher = reader.searcher();

  // With the default setting of the reader, you are not required to
  // call `index.load_searchers()` anymore.
  //
  // The IndexReader will pick up that change automatically, regardless
  // of whether the update was done in a different process or not.
  // If this behavior is not wanted, you can create your reader with
  // the `ReloadPolicy::Manual`, and manually decide when to reload the index
  // by calling `reader.reload()?`.
  ```
Tantivy 0.8.2
Fixing build for x86_64 platforms. (#496) No need to update from 0.8.1 if tantivy is building on your platform.
Tantivy 0.8.1
Hotfix of #476.
Merge was reflecting deletes before commit was passed. Thanks @barrotsteindev for reporting the bug.
Tantivy 0.8.0
No change in the index format
- API Breaking change in the collector API. (@jwolfe, @fulmicoton)
- Multithreaded search (@jwolfe, @fulmicoton)
Tantivy 0.7.1
No change in the index format
- Bugfix: NGramTokenizer panics on non ascii chars
- Added a space usage API
Tantivy 0.7
- Skip data for doc ids and positions (@fulmicoton), greatly improving performance
- Tantivy errors now rely on the failure crate (@drusellers)
- Added support for `AND`, `OR`, `NOT` syntax in addition to the `+`, `-` syntax
- Added a snippet generator with highlight (@vigneshsarma, @fulmicoton)
- Added a `TopFieldCollector` (@pentlander)
Tantivy 0.6.1
- Bugfix #324. GC was removing files that were still in use
- Added support for parsing AllQuery and RangeQuery via QueryParser
  - AllQuery: `*`
  - RangeQuery:
    - Inclusive `field:[startIncl to endIncl]`
    - Exclusive `field:{startExcl to endExcl}`
    - Mixed `field:[startIncl to endExcl}` and vice versa
    - Unbounded `field:[start to *]`, `field:[* to end]`
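
To make the bracket rules above concrete, here is a small standalone sketch that turns the part after `field:` into a pair of `std::ops::Bound` values. The function `parse_range` is hypothetical and is not the tantivy `QueryParser`; it only assumes the syntax listed above (square brackets inclusive, curly braces exclusive, `*` unbounded, lowercase `to` as separator).

```rust
use std::ops::Bound;

/// Parses a range expression such as "[a to b]", "{a to b}" or "[a to *]".
/// Returns `None` when the expression does not follow that shape.
/// Hypothetical helper, written for illustration only.
fn parse_range(expr: &str) -> Option<(Bound<String>, Bound<String>)> {
    let mut chars = expr.trim().chars();
    let open = chars.next()?;
    let close = chars.next_back()?;
    if !matches!(open, '[' | '{') || !matches!(close, ']' | '}') {
        return None;
    }
    // What remains between the delimiters is "<start> to <end>".
    let (start, end) = chars.as_str().split_once(" to ")?;
    let lower = make_bound(start.trim(), open == '[')?;
    let upper = make_bound(end.trim(), close == ']')?;
    Some((lower, upper))
}

/// Builds one side of the range; `*` means the side is unbounded.
fn make_bound(term: &str, inclusive: bool) -> Option<Bound<String>> {
    if term.is_empty() {
        return None;
    }
    if term == "*" {
        return Some(Bound::Unbounded);
    }
    Some(if inclusive {
        Bound::Included(term.to_string())
    } else {
        Bound::Excluded(term.to_string())
    })
}

fn main() {
    for expr in ["[a to b]", "{a to b}", "[a to b}", "[* to z]", "(a to b)"] {
        println!("{expr} -> {:?}", parse_range(expr));
    }
}
```

Mixed delimiters fall out naturally because each side is decided by its own bracket.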
Tantivy 0.6
Special thanks to @drusellers and @jason-wolfe for their contributions to this release!
- Removed C code. Tantivy is now pure Rust. (@pmasurel)
- BM25 (@pmasurel)
- Approximate field norms encoded over 1 byte. (@pmasurel)
- Compiles on stable rust (@pmasurel)
- Add &[u8] fastfield for associating arbitrary bytes to each document (@jason-wolfe) (#270)
  - Completely uncompressed
  - Internally: One u64 fast field for indexes, one fast field for the bytes themselves.
- Add NGram token support (@drusellers)
- Add Stopword Filter support (@drusellers)
- Add a FuzzyTermQuery (@drusellers)
- Add a RegexQuery (@drusellers)
- Various performance improvements (@pmasurel)
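
The `&[u8]` fast field entry above describes a two-part layout: one u64 column of indexes and one column holding the bytes themselves. The following is a toy in-memory model of that idea, assuming the indexes are cumulative end offsets into a single byte buffer. `BytesColumn` is a hypothetical name and this is not tantivy's on-disk format.

```rust
/// Toy model of a bytes fast field (illustration only).
struct BytesColumn {
    /// `offsets[doc]..offsets[doc + 1]` delimits the bytes of `doc`.
    offsets: Vec<u64>,
    /// All payloads, concatenated and uncompressed.
    data: Vec<u8>,
}

impl BytesColumn {
    fn new() -> Self {
        // A leading 0 lets every document read a (start, end) pair.
        BytesColumn { offsets: vec![0], data: Vec::new() }
    }

    /// Appends the payload of the next document and returns its doc id.
    fn push(&mut self, payload: &[u8]) -> usize {
        self.data.extend_from_slice(payload);
        self.offsets.push(self.data.len() as u64);
        self.offsets.len() - 2
    }

    /// Returns the bytes associated with `doc`, if that document exists.
    fn get(&self, doc: usize) -> Option<&[u8]> {
        let start = *self.offsets.get(doc)? as usize;
        let end = *self.offsets.get(doc + 1)? as usize;
        Some(&self.data[start..end])
    }
}

fn main() {
    let mut column = BytesColumn::new();
    column.push(b"abc");
    column.push(b"");
    column.push(b"de");
    for doc in 0..4 {
        println!("doc {doc}: {:?}", column.get(doc));
    }
}
```

Lookups cost two offset reads and one slice, which is why no per-document decompression is needed.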
Tantivy 0.5.2
- bugfix #274
- bugfix #280
- bugfix #289
Tantivy 0.5.1
- bugfix #254 : tantivy failed if no documents in a segment contained a specific field.
Tantivy 0.5
- Faceting
- RangeQuery
- Configurable tokenization pipeline
- Bugfix in PhraseQuery
- Various query optimisations
- Allowing very large indexes
  - 64 bits file address
  - Smarter encoding of the `TermInfo` objects
Tantivy 0.4.3
- Bugfix race condition when deleting files. (#198)
Tantivy 0.4.2
- Prevent usage of AVX2 instructions (#201)
Tantivy 0.4.1
- Bugfix for non-indexed fields. (#199)
Tantivy 0.4.0
- Raise the limit of number of fields (previously 256 fields) (@fulmicoton)
- Removed u32 fields. They are replaced by u64 and i64 fields (#65) (@fulmicoton)
- Optimized skip in SegmentPostings (#130) (@lnicola)
- Replacing rustc_serialize by serde. Kudos to @KodrAus and @lnicola
- Using error-chain (@KodrAus)
- QueryParser: (@fulmicoton)
  - Explicit error returned when searching for a term that is not indexed
  - Searching for an int term via the query parser was broken (`age:1`)
  - Searching for a non-indexed field returns an explicit Error
  - Phrase queries for non-tokenized fields are not tokenized by the query parser.
- Faster/Better indexing (@fulmicoton)
  - using murmurhash2
  - faster merging
  - more memory efficient fast field writer (@lnicola)
  - better handling of collisions
  - lower memory usage
- Added API, most notably to iterate over ranges of terms (@fulmicoton)
- Fixed a bug that prevented segment files from being unmapped on index drop (@fulmicoton)
- Made the doc! macro public (@fulmicoton)
- Added an alternative implementation of the streaming dictionary (@fulmicoton)
Tantivy 0.3.1
- Expose a method to trigger files garbage collection
Tantivy 0.3
Special thanks to @Kodraus @lnicola @Ameobea @manuel-woelker @celaus for their contributions to this release.
Thanks also to everyone in the tantivy gitter chat for their advice and company :)
https://gitter.im/tantivy-search/tantivy
Warning:
Tantivy 0.3 is NOT backward compatible with tantivy 0.2 code and index format. You should not expect backward compatibility before tantivy 1.0.
New Features
- Delete. You can now delete documents from an index.
- Support for windows (Thanks to @lnicola)
Various Bugfixes & small improvements
- Added CI for Windows (https://ci.appveyor.com/project/fulmicoton/tantivy) Thanks to @KodrAus ! (#108)
- Various dependency version updates (Thanks to @Ameobea) #76
- Fixed several race conditions in `Index.wait_merge_threads`
- Fixed #72. Mmap were never released.
- Fixed #80. Fast field used to take an amplitude of 32 bits after a merge. (Ouch!)
- Fixed #92. u32 are now encoded using big endian in the fst in order to make their enumeration consistent with the natural ordering.
- Building binary targets for tantivy-cli (Thanks to @KodrAus)
- Misc invisible bug fixes, and code cleanup.
- Use