Kian-Meng Ang
625bcb4877
Fix typos and markdowns
...
Found via these commands:
codespell -L crate,ser,panting,beauti,hart,ue,atleast,childs,ond,pris,hel,mot
markdownlint *.md doc/src/*.md --disable MD013 MD025 MD033 MD001 MD024 MD036 MD041 MD003
2022-08-13 18:25:47 +08:00
Evance Soumaoro
fad3faefe2
added InvertedIndexReader::doc_freq_async and SnippetGenerator::new methods
2022-08-12 06:39:10 +00:00
boraarslan
d4b2b7de8b
Expose inner file slice
2022-08-04 18:13:17 +03:00
PSeitz
23fe73a6c0
remove searcher pool and make Searcher cloneable (#1411)
...
* remove searcher pool and make Searcher cloneable
closes #1410
* use SearcherInner in InnerIndexReader
2022-07-12 18:07:48 +09:00
Pascal Seitz
5750224d4c
set docstore cache size at construction
2022-07-04 14:27:55 +08:00
Pascal Seitz
9db2f0e82b
expose doc store cache size
...
expose lru doc store cache size
optimize doc store cache size
2022-07-04 13:54:41 +08:00
PSeitz
de178a1901
Merge pull request #1395 from PSeitz/fix_clippy
...
fix clippy
2022-06-21 16:30:59 +08:00
Antoine G
11e4225f23
doc fix (#1391)
...
Documentation fix.
2022-06-21 15:53:33 +09:00
Pascal Seitz
1440f3243b
fix clippy
2022-06-21 14:47:01 +08:00
Kanji Yomoda
83d0c13fb0
Fix outdated variable naming and comments to alive bitset (#1387)
...
* Fix outdated variables and comments for alive bitset
* Fix expired link to delete bitset
2022-06-14 15:59:15 +09:00
Pascal Seitz
4d9d2b6db0
split into compressor/decompressor
...
use custom de/serializer for compressor
accept parameters like zstd(compression_level=5) as compressor
2022-06-02 23:29:24 +08:00
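A parameterized compressor string such as zstd(compression_level=5) could be parsed roughly as follows. This is a minimal sketch, not the code from the commit: the `Compressor` enum, the `parse_compressor` function, the error strings, and the set of accepted names are assumptions made for illustration.

```rust
/// Hypothetical compressor setting (illustrative, not the crate's actual type).
#[derive(Debug, PartialEq)]
enum Compressor {
    None,
    Lz4,
    Zstd { compression_level: Option<i32> },
}

/// Parses strings such as "lz4", "zstd" or "zstd(compression_level=5)".
fn parse_compressor(input: &str) -> Result<Compressor, String> {
    let input = input.trim();
    match input {
        "none" => return Ok(Compressor::None),
        "lz4" => return Ok(Compressor::Lz4),
        "zstd" => return Ok(Compressor::Zstd { compression_level: None }),
        _ => {}
    }
    // Anything else must look like "zstd(<params>)".
    let params = input
        .strip_prefix("zstd(")
        .and_then(|rest| rest.strip_suffix(')'))
        .ok_or_else(|| format!("unknown compressor: {}", input))?;
    let mut compression_level = None;
    // Parameters are comma-separated key=value pairs.
    for param in params.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        let (key, value) = param
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got: {}", param))?;
        match key.trim() {
            "compression_level" => {
                let level = value
                    .trim()
                    .parse::<i32>()
                    .map_err(|err| format!("invalid compression_level: {}", err))?;
                compression_level = Some(level);
            }
            other => return Err(format!("unknown zstd parameter: {}", other)),
        }
    }
    Ok(Compressor::Zstd { compression_level })
}

fn main() {
    println!("{:?}", parse_compressor("zstd(compression_level=5)"));
    println!("{:?}", parse_compressor("lz4"));
}
```

Unknown parameter names are rejected rather than ignored, so a misspelled setting fails loudly instead of silently falling back to a default.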
Pascal Seitz
ed868f93a3
enable setting compression level
2022-06-02 16:47:29 +08:00
Paul Masurel
f0a2b1cc44
Bumped tantivy and subcrate versions.
2022-05-25 22:50:33 +09:00
Kryesh
03040ed81d
Add Zstd compression support
2022-05-18 14:04:43 +10:00
Kryesh
aaa22ad225
Make block size configurable to allow for better compression ratios on large documents
2022-05-18 11:13:15 +10:00
PSeitz
7f45a6ac96
allow setting tokenizer manager on index (#1362)
...
handle json in tokenizer_for_field
2022-05-09 18:15:45 +09:00
Paul Masurel
be70804d17
Removed AtomicUsize.
2022-05-04 16:45:24 +09:00
PSeitz
a1afc80600
Update src/core/executor.rs
...
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-05-04 08:39:44 +02:00
Pascal Seitz
4db655ae82
update dependencies, update edition
2022-04-28 22:50:55 +08:00
PSeitz
ae83fc8298
bump uuid to 1.0 (#1345)
2022-04-22 10:02:24 +09:00
Pascal Seitz
bb5254de12
always serialize, use enum as param
2022-04-04 13:50:23 +08:00
Pascal Seitz
8807bfd13d
fast field on string
...
enables FAST on string fields, which creates a fastfield containing the term ordinals
2022-03-29 12:40:10 +08:00
Paul Masurel
46d5de920d
Removes all usage of block_on, and use a oneshot channel instead. (#1315)
...
* Removes all usage of block_on, and use a oneshot channel instead.
Calling `block_on` panics in certain contexts.
For instance, it panics when it is called in the context of another
call to block.
Using it in tantivy is unnecessary. We replace it by a thin wrapper
around a oneshot channel that supports both async/sync.
* Removing needless uses of async in the API.
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2022-03-18 16:54:58 +09:00
PSeitz
b105bf72e1
use defaults in meta.json (#1310)
...
This change allows fields to be left unset in meta.json, in which case they fall back to their defaults.
Currently, each one has to be set explicitly, e.g. fieldnorms: false
2022-03-14 13:54:06 +09:00
Antoine G
e37775fe21
iff->if or if and only if (#1298)
...
* has_xxx is_xxx -> if, these functions usually define equivalence
xxx returns bool -> specify equivalence when appropriate
* fix doc
2022-03-02 11:00:00 +09:00
Paul Masurel
5004290daa
Return an error on certain types of corruption. (#1296)
2022-03-01 11:35:56 +09:00
Paul Masurel
2ead010c83
Tantivy quickwit (#1293)
...
* Added sstable and enabling it by default, and parallel boolean query.
* Added async API for FileSlice.
* Added async get_doc
* Reduce blocksize to 32_000
* Added debug logs
Quickwit-specific features are hidden behind the quickwit feature flag.
2022-02-25 17:32:49 +09:00
Paul Masurel
d7b46d2137
Added JSON Type (#1270)
...
- Removed useless copy when ingesting JSON.
- Bugfix in phrase query with missing field norms.
- Disabled range query on default fields
Closes #1251
2022-02-24 16:25:22 +09:00
Pascal Seitz
704498a1ac
rename IntOptions to NumericOptions
...
keep IntOptions with deprecation warning
Fixes #1286
2022-02-21 22:20:07 +01:00
Paul Masurel
bdedefe07d
Adding an IndexingContext object (#1268)
2022-02-04 15:08:01 +09:00
Paul Masurel
2069e3e52b
Fixing clippy comments
2022-02-01 10:24:05 +09:00
Paul Masurel
eca6628b3c
Minor refactoring (#1266)
2022-01-28 15:55:55 +09:00
Shikhar Bhushan
99d4b1a177
Searcher Warming API (#1261)
...
Adds an API to register Warmers in the IndexReader.
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-01-20 23:40:25 +09:00
Paul Masurel
732f6847c0
Field type with codes (#1255)
...
* Terms are now typed.
This change is backward compatible:
While the Term has a byte representation that is modified, a Term itself
is a transient object that is not serialized as is in the index.
Its .field() and .value_bytes() on the other hand are unchanged.
This change offers better Debug information for terms.
While not strictly necessary, it will also help with the support for JSON types.
* Renamed Hierarchical Facet -> Facet
2022-01-07 20:49:00 +09:00
Shikhar Bhushan
e5e252cbc0
LogMergePolicy knob del_docs_percentage_before_merge (#1238)
...
Add a knob to LogMergePolicy to always merge segments that exceed a threshold of deleted docs
Closes #115
2021-12-20 13:14:56 +09:00
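The selection rule described in this commit can be sketched as below. This is an illustration under stated assumptions, not the LogMergePolicy code: `SegmentInfo`, its fields, the function name, and expressing the threshold as a ratio between 0 and 1 are all hypothetical.

```rust
/// Hypothetical per-segment summary (field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
struct SegmentInfo {
    id: u32,
    /// Total number of documents in the segment, deleted ones included.
    max_doc: u32,
    num_deleted_docs: u32,
}

/// Returns the ids of the segments whose share of deleted documents is
/// strictly greater than `del_docs_ratio_threshold` (assumed to be in 0..=1).
fn segments_exceeding_delete_threshold(
    segments: &[SegmentInfo],
    del_docs_ratio_threshold: f32,
) -> Vec<u32> {
    segments
        .iter()
        // Skip empty segments to avoid dividing by zero.
        .filter(|segment| segment.max_doc > 0)
        .filter(|segment| {
            let ratio = segment.num_deleted_docs as f32 / segment.max_doc as f32;
            ratio > del_docs_ratio_threshold
        })
        .map(|segment| segment.id)
        .collect()
}

fn main() {
    let segments = vec![
        SegmentInfo { id: 1, max_doc: 100, num_deleted_docs: 10 },
        SegmentInfo { id: 2, max_doc: 100, num_deleted_docs: 60 },
    ];
    println!("{:?}", segments_exceeding_delete_threshold(&segments, 0.5));
}
```

In a real merge policy these segments would be added to the merge candidates regardless of their size level; the sketch only covers the threshold check.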
Paul Masurel
c81b3030fa
Issue/922b (#1233)
...
* Add a NORMED option on fields
Make fieldnorm indexation optional:
* for all types except text => added a NORMED option
* for text fields
** if STRING, no fieldnorm is retained for the field
** if TEXT, the fieldnorm is computed for the field
* Finalize making fieldnorm optional for all field types.
- Using Option for fieldnorm readers.
2021-12-10 21:12:29 +09:00
Paul Masurel
7234bef0eb
Issue/1198 (#1201)
...
* Unit test reproducing #1198
* Fixing unit test to handle the error from add_document.
* Bump project version
2021-11-11 16:42:19 +09:00
PSeitz
e2fbbc08ca
Merge pull request #1182 from PSeitz/remove_directory_generic
...
use Box<dyn Directory> as parameter to open/create an Index
2021-10-25 12:49:55 +08:00
Pascal Seitz
99cd25beae
use <T: Into<Box<dyn Directory>>> as parameter to open/create an Index
...
This is done in order to support Box<dyn Directory> in addition to generic implementations of the trait Directory.
Remove boxing in ManagedDirectory.
2021-10-25 12:34:40 +08:00
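The `T: Into<Box<dyn Directory>>` bound lets a function accept either a concrete directory type or an already boxed trait object. The sketch below shows the pattern with stand-in types: the `Directory` trait with its `name` method, `RamDirectory`, and this `Index` struct are simplified placeholders for illustration, not the crate's real definitions.

```rust
/// Simplified stand-in for a directory trait (illustrative only).
trait Directory {
    fn name(&self) -> String;
}

struct RamDirectory;

impl Directory for RamDirectory {
    fn name(&self) -> String {
        "ram".to_string()
    }
}

/// Blanket conversion: any concrete directory can be boxed into a trait object.
impl<T: Directory + 'static> From<T> for Box<dyn Directory> {
    fn from(dir: T) -> Self {
        Box::new(dir)
    }
}

struct Index {
    directory: Box<dyn Directory>,
}

impl Index {
    /// Accepts a concrete directory (via the blanket `From` impl above)
    /// or a `Box<dyn Directory>` (via the reflexive `From<T> for T`).
    fn open<T: Into<Box<dyn Directory>>>(dir: T) -> Index {
        Index { directory: dir.into() }
    }
}

fn main() {
    let from_concrete = Index::open(RamDirectory);
    let boxed: Box<dyn Directory> = Box::new(RamDirectory);
    let from_boxed = Index::open(boxed);
    println!("{} {}", from_concrete.directory.name(), from_boxed.directory.name());
}
```

Callers that already hold a trait object pay no extra boxing, since converting a `Box<dyn Directory>` into itself is a no-op.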
Kanji Yomoda
737ecc7015
Fix outdated comment for IndexWriter::new (#1183)
2021-10-25 10:59:18 +09:00
Paul Masurel
02cffa4dea
Code simplification. (#1169)
...
Code simplification and Clippy
2021-10-07 14:11:44 +09:00
PSeitz
352e0cc58d
Add demux operation (#1150)
...
* add merge for DeleteBitSet, allow custom DeleteBitSet on merge
* forward delete bitsets on merge, add tests
* add demux operation and tests
2021-10-06 16:05:16 +09:00
Paul Masurel
0855649986
Leaning more on the alive (vs delete) semantics. (#1164)
2021-10-05 18:53:29 +09:00
Pascal Seitz
5ee5037934
create and use ReadSerializedBitSet
2021-09-24 12:53:33 +08:00
Pascal Seitz
d7a6a409a1
renames
2021-09-23 20:33:11 +08:00
Pascal Seitz
a1f5cead96
AliveBitSet instead of DeleteBitSet
2021-09-23 20:03:57 +08:00
Pascal Seitz
93cbd52bf0
move code to bitset, add inline, add benchmark
2021-09-18 17:35:22 +08:00
Pascal Seitz
c22177a005
add iterator
2021-09-17 15:29:27 +08:00
Pascal Seitz
4ae1d87632
add DeleteBitSet iterator
2021-09-15 23:10:04 +08:00
Kanji Yomoda
9d87b89718
Fix incorrect comment for Index::create_in_dir (#1148)
...
* Fix incorrect comment for Index::create_in_dir
2021-09-03 10:37:16 +09:00