mirror of
https://github.com/quickwit-oss/tantivy.git
synced 2026-06-02 08:30:41 +00:00
Tantivy is a full-text search engine library written in Rust.
It is closer to Apache Lucene than to Elasticsearch and Apache Solr in the sense that it is not an off-the-shelf search engine server, but rather a crate that can be used to build such a search engine.
Tantivy is, in fact, strongly inspired by Lucene's design.
Features
- Full-text search
- Fast (check out the 🐎 ✨ benchmark ✨ 🐎)
- Tiny startup time (<10ms), perfect for command line tools
- BM25 scoring (the same as Lucene)
- Natural query language, e.g. (michael AND jackson) OR "king of pop"
- Phrase queries search (e.g. "michael jackson")
- Incremental indexing
- Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
- Mmap directory
- SIMD integer compression when the platform/CPU includes the SSE2 instruction set.
- Single valued and multivalued u64 and i64 fast fields (equivalent of doc values in Lucene)
- &[u8] fast fields
- LZ4 compressed document store
- Range queries
- Faceted search
- Configurable indexing (optional term frequency and position indexing)
- Cheesy logo with a horse
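To make the BM25 item above concrete, here is a minimal, self-contained sketch of the commonly used BM25 formula for a single term. It is an illustration only, not tantivy's actual scoring code: the function names `idf` and `bm25` are hypothetical, and the constants `K1 = 1.2` and `B = 0.75` are the usual defaults, assumed here.

```rust
// Illustrative BM25 sketch; these names are hypothetical, not tantivy's API.
// K1 and B are the commonly used default parameters (an assumption).
const K1: f64 = 1.2;
const B: f64 = 0.75;

/// Inverse document frequency: rarer terms get a larger weight.
/// `doc_count` is the number of documents in the index,
/// `doc_freq` the number of documents containing the term.
fn idf(doc_count: u64, doc_freq: u64) -> f64 {
    let n = doc_count as f64;
    let df = doc_freq as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 score of one term for one document.
/// Longer-than-average documents are penalized through `doc_len / avg_doc_len`.
fn bm25(term_freq: u32, doc_len: u32, avg_doc_len: f64, idf: f64) -> f64 {
    let tf = term_freq as f64;
    let norm = K1 * (1.0 - B + B * doc_len as f64 / avg_doc_len);
    idf * tf * (K1 + 1.0) / (tf + norm)
}

fn main() {
    let weight = idf(1_000, 10);
    // Same term frequency, but the second document is four times longer.
    let short_doc = bm25(3, 100, 100.0, weight);
    let long_doc = bm25(3, 400, 100.0, weight);
    println!("short: {:.3}, long: {:.3}", short_doc, long_doc);
}
```

The score grows with term frequency but saturates, and for the same term frequency a shorter document scores higher than a longer one.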
Non-features
- Distributed search is out of the scope of tantivy. That being said, tantivy is meant as a library upon which one could build a distributed search engine. Serializable/mergeable collector state, for instance, is within the scope of tantivy.
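As a rough illustration of what "mergeable collector state" could mean, the sketch below keeps the top-K hits of one shard (or segment) and merges two such states into one. Everything here is hypothetical: `TopK`, `collect` and `merge` are names invented for this example and are not tantivy's API, and document ids are assumed to be globally unique.

```rust
use std::cmp::Ordering;

// Hypothetical sketch: none of these names come from tantivy.
#[derive(Debug, Clone, PartialEq)]
struct TopK {
    limit: usize,
    // (score, doc id) pairs, kept sorted by descending score.
    hits: Vec<(f32, u32)>,
}

impl TopK {
    fn new(limit: usize) -> TopK {
        TopK { limit, hits: Vec::new() }
    }

    /// Record one matching document and keep only the best `limit` hits.
    fn collect(&mut self, doc: u32, score: f32) {
        self.hits.push((score, doc));
        self.truncate();
    }

    /// Combine the state of two collectors, e.g. coming from two shards.
    fn merge(mut self, other: TopK) -> TopK {
        self.hits.extend(other.hits);
        self.truncate();
        self
    }

    /// Sort by descending score, then drop everything beyond `limit`.
    fn truncate(&mut self) {
        self.hits
            .sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(Ordering::Equal));
        self.hits.truncate(self.limit);
    }
}

fn main() {
    let mut shard_a = TopK::new(2);
    shard_a.collect(1, 0.5);
    shard_a.collect(2, 2.0);
    shard_a.collect(3, 1.0);

    let mut shard_b = TopK::new(2);
    shard_b.collect(10, 1.5);

    // The merged state holds the two best hits across both shards.
    let merged = shard_a.merge(shard_b);
    println!("{:?}", merged.hits);
}
```

Because merging only needs the two partial states, each shard can be searched independently. Making the state serializable (not shown here) is what would let it cross a network boundary.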
Supported OS and compiler
Tantivy works on stable Rust (>= 1.27) and supports Linux, macOS and Windows.
Getting started
- tantivy's simple search example
- tantivy-cli and its tutorial. tantivy-cli is an actual command line interface that makes it easy for you to create a search engine, index documents and search via the CLI or a small server with a REST API. It will walk you through getting a Wikipedia search engine up and running in a few minutes.
- reference doc
Compiling
Development
Tantivy compiles on stable Rust but requires Rust >= 1.27.
To check out and build the project, you can simply run:
git clone git@github.com:tantivy-search/tantivy.git
cd tantivy
cargo build
Running tests
Some tests will not run with just cargo test because of fail-rs.
To run the tests exhaustively, run ./run-tests.sh.
Contribute
Send me an email (paul.masurel at gmail.com) if you want to contribute to tantivy.
