mirror of
https://github.com/quickwit-oss/tantivy.git
synced 2025-12-23 02:29:57 +00:00
make casing in docs more consistent (#2524)
* make casing in docs more consistent
* more
* lowercase tantivy
@@ -2,7 +2,7 @@
 
 > Tantivy is a **search** engine **library** for Rust.
 
-If you are familiar with Lucene, it's an excellent approximation to consider tantivy as Lucene for rust. tantivy is heavily inspired by Lucene's design and
+If you are familiar with Lucene, it's an excellent approximation to consider tantivy as Lucene for Rust. Tantivy is heavily inspired by Lucene's design and
 they both have the same scope and targeted use cases.
 
 If you are not familiar with Lucene, let's break down our little tagline.
@@ -17,7 +17,7 @@ relevancy, collapsing, highlighting, spatial search.
 experience. But keep in mind this is just a toolbox.
 Which bring us to the second keyword...
 
-- **Library** means that you will have to write code. tantivy is not an *all-in-one* server solution like elastic search for instance.
+- **Library** means that you will have to write code. Tantivy is not an *all-in-one* server solution like Elasticsearch for instance.
 
 Sometimes a functionality will not be available in tantivy because it is too
 specific to your use case. By design, tantivy should make it possible to extend
@@ -31,4 +31,4 @@ relevancy, collapsing, highlighting, spatial search.
 index from a different format.
 
 Tantivy exposes a lot of low level API to do all of these things.
 
@@ -11,7 +11,7 @@ directory shipped with tantivy is the `MmapDirectory`.
 While this design has some downsides, this greatly simplifies the source code of
 tantivy. Caching is also entirely delegated to the OS.
 
-`tantivy` works entirely (or almost) by directly reading the datastructures as they are laid on disk. As a result, the act of opening an indexing does not involve loading different datastructures from the disk into random access memory : starting a process, opening an index, and performing your first query can typically be done in a matter of milliseconds.
+Tantivy works entirely (or almost) by directly reading the datastructures as they are laid on disk. As a result, the act of opening an indexing does not involve loading different datastructures from the disk into random access memory : starting a process, opening an index, and performing your first query can typically be done in a matter of milliseconds.
 
 This is an interesting property for a command line search engine, or for some multi-tenant log search engine : spawning a new process for each new query can be a perfectly sensible solution in some use case.
 
@@ -31,13 +31,13 @@ Compression ratio is mainly affected on the fast field of the sorted property, e
 When data is presorted by a field and search queries request sorting by the same field, we can leverage the natural order of the documents.
 E.g. if the data is sorted by timestamp and want the top n newest docs containing a term, we can simply leveraging the order of the docids.
 
-Note: Tantivy 0.16 does not do this optimization yet.
+Note: tantivy 0.16 does not do this optimization yet.
 
 ### Pruning
 
 Let's say we want all documents and want to apply the filter `>= 2010-08-11`. When the data is sorted, we could make a lookup in the fast field to find the docid range and use this as the filter.
 
-Note: Tantivy 0.16 does not do this optimization yet.
+Note: tantivy 0.16 does not do this optimization yet.
 
 ### Other?
 
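The two ideas in the documentation text shown in this hunk (taking the top n newest documents from docid order, and pruning a `>=` filter to a docid range) can be sketched in plain Rust. This is an illustration only and not tantivy code: the slice of timestamps stands in for a sorted fast field, the slice of docids stands in for a term's posting list, and the names `top_n_newest` and `docid_range_from` are hypothetical. It assumes the index is sorted by ascending timestamp, so a larger docid always means a newer or equally new document.

```rust
use std::ops::Range;

type DocId = u32;

/// Top n newest documents containing a term.
/// `postings` is the term's docid list in ascending order (hypothetical
/// stand-in for a posting list). Because docid order matches timestamp
/// order, the newest matches are the last `n` entries, newest first.
fn top_n_newest(postings: &[DocId], n: usize) -> Vec<DocId> {
    postings.iter().rev().take(n).copied().collect()
}

/// Pruning for a `>= min_ts` filter.
/// `timestamps[i]` is the value of docid `i` (hypothetical stand-in for a
/// fast field). The slice is sorted, so a binary search finds the first
/// matching docid; every docid after it matches as well.
fn docid_range_from(timestamps: &[i64], min_ts: i64) -> Range<DocId> {
    let start = timestamps.partition_point(|&ts| ts < min_ts);
    start as DocId..timestamps.len() as DocId
}

fn main() {
    // Six documents, already sorted by ascending timestamp.
    let timestamps = [100, 120, 120, 150, 170, 200];
    // Docids of the documents containing some term.
    let postings = [0, 2, 3, 5];

    println!("newest two matches: {:?}", top_n_newest(&postings, 2));
    println!("docids with ts >= 150: {:?}", docid_range_from(&timestamps, 150));
}
```

Neither function needs to read a timestamp for every candidate document: the first touches only the tail of the posting list, and the second performs a logarithmic number of lookups before the range can be used as a filter.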
@@ -45,7 +45,7 @@ In principle there are many algorithms possible that exploit the monotonically i
 
 ## Usage
 
-The index sorting can be configured setting [`sort_by_field`](https://github.com/quickwit-oss/tantivy/blob/000d76b11a139a84b16b9b95060a1c93e8b9851c/src/core/index_meta.rs#L238) on `IndexSettings` and passing it to a `IndexBuilder`. As of Tantivy 0.16 only fast fields are allowed to be used.
+The index sorting can be configured setting [`sort_by_field`](https://github.com/quickwit-oss/tantivy/blob/000d76b11a139a84b16b9b95060a1c93e8b9851c/src/core/index_meta.rs#L238) on `IndexSettings` and passing it to a `IndexBuilder`. As of tantivy 0.16 only fast fields are allowed to be used.
 
 ```rust
 let settings = IndexSettings {
@@ -39,7 +39,7 @@ Its representation is done by separating segments by a unicode char `\x01`, and
 - `value`: The value representation is just the regular Value representation.
 
 This representation is designed to align the natural sort of Terms with the lexicographical sort
-of their binary representation (Tantivy's dictionary (whether fst or sstable) is sorted and does prefix encoding).
+of their binary representation (tantivy's dictionary (whether fst or sstable) is sorted and does prefix encoding).
 
 In the example above, the terms will be sorted as
 