Commit Graph

117 Commits

Author SHA1 Message Date
Antoine G
e37775fe21 iff->if or if and only if (#1298)
* has_xxx is_xxx -> if, these function usualy define equivalence
xxx returns bool -> specify equivalence when appropriate

* fix doc
2022-03-02 11:00:00 +09:00
Paul Masurel
2ead010c83 Tantivy quickwit (#1293)
* Added sstable and enabling it by default, and parallel boolean query.
* Added async API for FileSlice.
* Added async get_doc
* Reduce blocksize to 32_000
* Added debug logs

Quickwit specific feature a hidden behind the quickwit feature flag.
2022-02-25 17:32:49 +09:00
Paul Masurel
eca6628b3c Minor refactoring (#1266) 2022-01-28 15:55:55 +09:00
Paul Masurel
3ea6800ac5 Pleasing clippy (#1253) 2022-01-06 16:41:24 +09:00
Antoine G
3129d86743 doc(termdict) expose structs (#1242)
* doc(termdict) expose structs
also add merger doc + lint
refs #1232
2022-01-03 22:20:31 +09:00
PSeitz
352e0cc58d Adde demux operation (#1150)
* add merge for DeleteBitSet, allow custom DeleteBitSet on merge
* forward delete bitsets on merge, add tests
* add demux operation and tests
2021-10-06 16:05:16 +09:00
Pascal Seitz
3265f7bec3 dissolve common module 2021-08-19 23:26:34 +01:00
Pascal Seitz
1e4df54ab3 fix clippy 2021-07-01 17:41:53 +02:00
Pascal Seitz
10f056fbb4 apply clippy fixes 2021-07-01 17:08:44 +02:00
Stéphane Campinas
41ea14840d add benchmark of term streams merge (#1024)
* add benchmark of term streams merge
* use union based on FST for merging the term dictionaries
* Rename TermMerger benchmark
2021-05-31 23:15:01 +09:00
Pascal Seitz
c200d59d1e add blocked bitpacker, add benches 2021-04-29 19:53:54 +02:00
Pascal Seitz
daa53522b5 move tantivy bitpacker to crate, refator bitpacker
remove byteorder dependency
2021-04-29 16:40:11 +02:00
Paul Masurel
2dc5403e7b Closes #1022 2021-04-26 14:01:14 +09:00
Paul Masurel
aead5d4068 First stab 2021-04-26 12:46:06 +09:00
Paul Masurel
39dd8cfe24 Cargo clippy. Acronym should not be full uppercase apparently. 2021-04-26 11:49:18 +09:00
Paul Masurel
868f4fd174 Removing TermMerger::next().
Closing #933
2021-04-14 12:06:04 +09:00
Paul Masurel
31137beea6 Replacing (start, end) by Range 2021-03-10 14:06:21 +09:00
Paul Masurel
be626083a0 Reorganized and added termdict unit tests. 2020-12-07 12:50:36 +09:00
Paul Masurel
af6dfa1856 Small refactoring 2020-12-03 14:27:05 +09:00
Paul Masurel
654c400a0b TermDictionary.finish does not flush 2020-12-03 13:36:25 +09:00
Paul Masurel
80a99539ce Several TermDict operation now returns an io::Result 2020-12-03 13:13:11 +09:00
Paul Masurel
3491645e69 Moved the term merger 2020-12-03 10:24:04 +09:00
Paul Masurel
b4b3bc7acd Cargo fmt 2020-12-03 10:08:38 +09:00
Paul Masurel
521c7b271b Isolated fst impl of termdictionary in a specific module. 2020-12-02 21:18:33 +09:00
Paul Masurel
6d4b982417 Marked blockwand test as ignored.
- Using impl trait for iterating `matching_segments` in the termdict
merger
2020-11-16 13:44:14 +09:00
Paul Masurel
d23aee76c9 Avoid loading fieldnorms when not necessary 2020-11-09 15:50:16 +09:00
Paul Masurel
b5f3dcdc8b TermInfo contain the end_offset of the postings.
We slice the ReadOnlySource tightly.
2020-11-06 15:18:51 +09:00
Paul Masurel
01b4aa9adc Refactoring dir (#905) 2020-10-11 22:22:56 +09:00
Paul Masurel
c23a03ad81 Large API Change in the Directory API. (#901)
Tantivy used to assume that all files could be somehow memory mapped. After this change, Directory return a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long and blocking io operation are still required by they do not span over the entire file.
2020-10-08 16:36:51 +09:00
Paul Masurel
439d6956a9 Returning Result in some of the API (#880)
* Returning Result in some of the API

* Introducing `.writer_for_test(..)`
2020-09-07 15:52:34 +09:00
Paul Masurel
1e5ebdbf3c Format and remove useless import (#819) 2020-04-27 11:56:49 +09:00
Paul Masurel
186d7fc20e Fix build 2020-04-01 09:32:45 +09:00
Paul Masurel
6227a0555a Added unit test for empty dictionaries. 2020-01-30 10:08:27 +09:00
Audun Halland
f85d0a522a Optimize TermDictionary::empty by precomputed data source (#767) 2020-01-30 10:04:58 +09:00
Halvor Fladsrud Bø
5795488ba7 Backward iteration for termdict range (#757)
* Added backwards iteration to termdict

* Ran formatter

* Updated fst dependency

* Updated dependency

* Changelog and version

* Fixed version

* Made it part of 12.0
2020-01-30 09:59:21 +09:00
Paul Masurel
5c6580eb15 fmt (#661) 2019-10-04 12:10:01 +09:00
fdb-hiroshima
6eb4e08636 add support for float (#603)
* add basic support for float

as for i64, they are mapped to u64 for indexing
query parser don't work yet

* Update value.rs

* implement support for float in query parser

* Update README.md
2019-07-27 17:57:33 +09:00
Paul Masurel
498057c5b7 Refactor deletes (#597)
* Refactor deletes

* Removing generation from SegmentUpdater. These have been obsolete for a long time

* Number literal clippy

* Removed clippy useless allow statement
2019-07-17 13:06:44 +09:00
Paul Masurel
462774b15c Tiqb feature/2018 (#583)
* rust 2018

* Added CHANGELOG comment
2019-07-01 10:01:46 +09:00
Paul Masurel
663dd89c05 Feature/reader (#517)
Adding IndexReader to the API. Making it possible to watch for changes.

* Closes #500
2019-03-20 08:39:22 +09:00
Paul Masurel
774fcecf23 cargo fmt 2019-02-26 10:44:59 +09:00
Paul Masurel
b422f9c389 Partially addresses #500 (#502)
Using `tantivy_fst`. Storing `Weak<Mmap>` in the Mmap cache.
2019-02-23 10:33:59 +09:00
Paul Masurel
63b593bd0a Lower RAM usage in tests. 2019-01-24 09:10:38 +09:00
Paul Masurel
98ca703daa More efficient indexing (#462)
* Using unrolled u32 VInt and caching Vec s

* cargo fmt

* Exposing a io::Write in the Expull thing

* expull as a writer. clippy + format

* inline the first block

* simplified -if let Some-

* vint reader iterator
2019-01-13 14:41:56 +09:00
Paul Masurel
b9d25cda5d Using LittleEndian explicitely 2019-01-08 12:41:58 +09:00
Paul Masurel
beb4289ec2 Less unsafe 2019-01-08 00:48:14 +09:00
Paul Masurel
a3042e956b Facet remove unsafe (#454)
* Removing some unsafe

* Removing some unsafe (2)
2018-12-17 09:31:09 +09:00
Paul Masurel
279a9eb5e3 Closes #449 (#450)
Clippy working on stable.
Clippy warnings addressed
2018-12-10 12:20:59 +09:00
Paul Masurel
07d87e154b Collector refactoring and multithreaded search (#437)
* Split Collector into an overall Collector and a per-segment SegmentCollector. Precursor to cross-segment parallelism, and as a side benefit cleans up any per-segment fields from being Option<T> to just T.

* Attempt to add MultiCollector back

* working. Chained collector is broken though

* Fix chained collector

* Fix test

* Make Weight Send+Sync for parallelization purposes

* Expose parameters of RangeQuery for external usage

* Removed &mut self

* fixing tests

* Restored TestCollectors

* blop

* multicollector working

* chained collector working

* test broken

* fixing unit test

* blop

* blop

* Blop

* simplifying APi

* blop

* better syntax

* Simplifying top_collector

* refactoring

* blop

* Sync with master

* Added multithread search

* Collector refactoring

* Schema::builder

* CR and rustdoc

* CR comments

* blop

* Added an executor

* Sorted the segment readers in the searcher

* Update searcher.rs

* Fixed unit testst

* changed the place where we have the sort-segment-by-count heuristic

* using crossbeam::channel

* inlining

* Comments about panics propagating

* Added unit test for executor panicking

* Readded default

* Removed Default impl

* Added unit test for executor
2018-11-30 22:46:59 +09:00
Paul Masurel
10f6c07c53 Clippy (#422)
* Cargo Format
* Clippy
2018-09-15 20:20:22 +09:00