1686 Commits

Author SHA1 Message Date
Paul Masurel
61e955039d Added JSON type
JSON field
2022-02-21 13:55:16 +09:00
Paul Masurel
9815067171 Minor changes 2022-02-21 13:55:01 +09:00
PSeitz
972cb6c26d Aggregation (#1276)
Added support for aggregation compatible with Elasticsearch's API.
2022-02-21 09:59:11 +09:00
Paul Masurel
4dc80cfa25 Removes TokenStream chain. (#1283)
This change is mostly motivated by the introduction of json object.

We need to be able to inject a position object to make the position
shift.
2022-02-21 09:51:27 +09:00
PSeitz
cef145790c Fix opening bytes index with dynamic codec (#1279)
* Fix opening bytes index with dynamic codec

Fix #1278

* extend proptest to cover bytes field codec bug
2022-02-18 20:44:21 +09:00
Paul Masurel
e028515caf Simplified expull code. (#1281) 2022-02-18 18:57:10 +09:00
Shikhar Bhushan
505e6a440c Remove test assertion sensitive to background segment merging (#1274) 2022-02-17 10:59:46 +09:00
Paul Masurel
bdedefe07d Adding an IndexingContext object (#1268) 2022-02-04 15:08:01 +09:00
Paul Masurel
2069e3e52b Fixing clippy comments 2022-02-01 10:24:05 +09:00
Paul Masurel
65b365b81c Fixing all-features build. 2022-01-31 14:41:14 +09:00
Paul Masurel
eca6628b3c Minor refactoring (#1266) 2022-01-28 15:55:55 +09:00
Paul Masurel
9679c5f306 Rename quickwit-inc -> quickwit-oss 2022-01-27 15:37:09 +09:00
Shikhar Bhushan
5a2497b6fd Avoid exposing TrackedObject from Warmer API (#1264) 2022-01-25 10:04:08 +09:00
Shikhar Bhushan
99d4b1a177 Searcher Warming API (#1261)
Adds an API to register Warmers in the IndexReader.

Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-01-20 23:40:25 +09:00
Paul Masurel
732f6847c0 Field type with codes (#1255)
* Terms are now typed.

This change is backward compatible:
While the Term has a byte representation that is modified, a Term itself
is a transient object that is not serialized as is in the index.

Its .field() and .value_bytes() on the other hand are unchanged.
This change offers better Debug information for terms.

While not strictly necessary, it will also help support JSON types.

* Renamed Hierarchical Facet -> Facet
2022-01-07 20:49:00 +09:00
Paul Masurel
1c6d9bdc6a Comparison of Value based on serialization. (#1250) 2022-01-07 20:31:26 +09:00
Paul Masurel
3ea6800ac5 Pleasing clippy (#1253) 2022-01-06 16:41:24 +09:00
Antoine G
395303b644 Collector + directory doc fixes (#1247)
* doc(collector)

* doc(directory)

* doc(misc)

* wording
2022-01-04 09:22:58 +09:00
Daniel Müller
2c200b46cb Use test-log instead of test-env-log (#1248)
The test-env-log crate has been renamed to test-log to better reflect
its intent of not only catering to env_logger specific initialization
but also tracing (and potentially others in the future).
This change updates the crate to use test-log instead of the now
deprecated test-env-log.
2022-01-04 09:20:30 +09:00
Liam Warfield
17e00df112 Change Snippet.fragments -> Snippet.fragment (#1243)
* Change Snippet.fragments -> Snippet.fragment
* Apply suggestions from code review

Co-authored-by: Liam Warfield <lwarfield@arista.com>
2022-01-03 22:23:51 +09:00
Antoine G
3129d86743 doc(termdict) expose structs (#1242)
* doc(termdict) expose structs
also add merger doc + lint
refs #1232
2022-01-03 22:20:31 +09:00
Shikhar Bhushan
e5e252cbc0 LogMergePolicy knob del_docs_percentage_before_merge (#1238)
Add a knob to LogMergePolicy to always merge segments that exceed a threshold of deleted docs

Closes #115
2021-12-20 13:14:56 +09:00
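The deleted-docs knob described in the commit above can be illustrated with a minimal sketch. Everything here is hypothetical: `SegmentStats`, its fields, and `exceeds_deleted_docs_threshold` are illustrative names, not the library's API, and the threshold is assumed to be a ratio in `[0, 1]`.

```rust
/// Hypothetical per-segment summary (not the library's actual type).
pub struct SegmentStats {
    /// Total number of document slots in the segment, deleted ones included.
    pub max_doc: u32,
    /// Number of documents marked as deleted.
    pub num_deleted_docs: u32,
}

/// Returns true if the share of deleted docs is strictly above `threshold`.
/// Assumption: `threshold` is a ratio between 0.0 and 1.0.
pub fn exceeds_deleted_docs_threshold(stats: &SegmentStats, threshold: f32) -> bool {
    if stats.max_doc == 0 {
        // An empty segment has nothing to reclaim.
        return false;
    }
    let deleted_ratio = stats.num_deleted_docs as f32 / stats.max_doc as f32;
    deleted_ratio > threshold
}
```

Under this assumption, a merge policy would select any segment for which the function returns true, regardless of its size level.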
Paul Masurel
b2da82f151 Making MergeCandidate public in order to allow the usage of custom merge (#1237)
policies.

Closes #1235
2021-12-13 09:54:21 +09:00
Paul Masurel
c81b3030fa Issue/922b (#1233)
* Add a NORMED option on field

Make fieldnorm indexation optional:

* for all types except text => added a NORMED option
* for text field
** if STRING, field has no fieldnorm retained
** if TEXT, field has fieldnorm computed

* Finalize making fieldnorm optional for all field types.

- Using Option for fieldnorm readers.
2021-12-10 21:12:29 +09:00
Paul Masurel
ebdbb6bd2e Fixing compilation warnings & clippy comments. 2021-12-10 16:47:59 +09:00
Antoine G
c980b19dd9 canonicalize path when opening MmapDirectory (#1231)
* canonicalize path when opening `MmapDirectory`
fixes #1229
2021-12-09 10:19:52 +09:00
Paul Masurel
098eea843a Reducing the number of calls to fsync on the directory. (#1228)
This works by introducing a new API method in the Directory
trait. The user needs to explicitly call this method.
(In particular, once before a commit)

Closes #1225
2021-12-03 03:10:52 +00:00
Paul Masurel
466dc8233c Cargo fmt 2021-12-02 18:46:28 +09:00
Paul Masurel
03c2f6ece2 We are missing 4 bytes in the LZ4 compression buffer. (#1226)
Closes #831
2021-12-02 16:00:29 +09:00
Paul Masurel
1d4e9a29db Cargo fmt 2021-12-02 15:51:44 +09:00
Paul Masurel
f378d9a57b Pleasing clippy 2021-12-02 14:48:33 +09:00
Paul Masurel
dde49ac8e2 Closes #1195 (#1222)
Removes the indexed option for facets.
Facets are now always indexed.

Closes #1195
2021-12-02 14:37:19 +09:00
Paul Masurel
c3cc93406d Bugfix: adds missing fdatasync on atomic_write.
In addition this PR:
- removes unnecessary flushes and fsyncs on files.
- replaces all fsync calls with fdatasync. The latter triggers
a meta sync if a metadata required to read the file
has changed. It is therefore sufficient for us.

Closes #1224
2021-12-02 13:42:44 +09:00
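The atomic-write fix described in the commit above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `atomic_write` signature and the `.tmp` suffix are assumptions. It relies on `File::sync_data` from the Rust standard library, which flushes file contents without necessarily syncing all metadata (on Linux this typically maps to `fdatasync`).

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `data` to `path` so that readers see either the old or the new
/// content, never a partial file. Hypothetical helper for illustration.
pub fn atomic_write(path: &Path, data: &[u8]) -> io::Result<()> {
    // Assumed temp-file naming; a real implementation would pick a unique name.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp_path)?;
    file.write_all(data)?;
    // Persist the file contents before the rename makes them visible.
    file.sync_data()?;
    // Swap the fully written temp file into place.
    fs::rename(&tmp_path, path)?;
    Ok(())
}
```

The sketch leaves out syncing the parent directory, which is needed for the rename itself to be durable.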
Kanji Yomoda
bd0f9211da Remove unused sort for segment meta list (#1218)
* Remove unused sort for segment meta list
* Fix segment meta order dependent test
2021-12-01 11:18:17 +09:00
PSeitz
c503c6e4fa Switch to non-strict schema (#1216)
Fixes #1211
2021-11-29 10:38:59 +09:00
Shikhar Bhushan
72cef12db1 Add none compression (#1208) 2021-11-16 10:50:42 +09:00
Paul Masurel
8802d125f8 Prepare commit is public again (#1202)
- Simplified some of the prepare commit & segment updater code using
async.
- Made PrepareCommit public again.
2021-11-12 23:25:39 +09:00
Paul Masurel
7234bef0eb Issue/1198 (#1201)
* Unit test reproducing #1198
* Fixing unit test to handle the error from add_document.
* Bump project version
2021-11-11 16:42:19 +09:00
azerowall
fcff91559b Fix the deserialization error of FieldEntry when the 'options' field appears before the 'type' field (#1199)
Co-authored-by: quel <azerowall>
2021-11-10 18:39:58 +09:00
Paul Masurel
b75d4e59d1 Remove the broken panic on drop unit test. (#1200) 2021-11-10 18:39:37 +09:00
Paul Masurel
c6b5ab1dbe Replacing the panic check in the RAM Directory on lack of flush. 2021-11-09 11:04:31 +09:00
PSeitz
7dc0dc1c9b extend proptests with adding case (#1191)
This extends the proptest to cover a case where up to 100 documents are added to an index.
2021-11-01 09:27:10 +09:00
François Massot
0462754673 Optimize block wand for one and several TermScorer. (#1190)
* Added optimisation using block wand for single TermScorer.

A proptest was also added.

* Fix block wand algorithm by taking the last doc id of scores until the pivot scorer (included).
* In block wand, when block max score is lower than the threshold, advance the scorer with best score.
* Fix wrong condition in block_wand_single_scorer and add debug_assert to have an equality check on doc to break the loop.
2021-11-01 09:18:05 +09:00
PSeitz
dbaf4f3623 Merge pull request #1187 from PSeitz/sort_issue
check searcher num docs in proptest
2021-10-29 16:19:24 +08:00
Pascal Seitz
4808648322 check searcher num docs in proptest 2021-10-29 14:38:30 +08:00
Paul Masurel
54afb9b34a Made PrepareCommit private 2021-10-29 14:13:14 +09:00
Dan Cecile
6317982876 Make indexer::prepared_commit public (#1184)
* Make indexer::prepared_commit public

* Add PreparedCommit to lib
2021-10-26 12:21:24 +09:00
PSeitz
e2fbbc08ca Merge pull request #1182 from PSeitz/remove_directory_generic
use Box<dyn Directory> as parameter to open/create an Index
2021-10-25 12:49:55 +08:00
Pascal Seitz
99cd25beae use <T: Into<Box<dyn Directory>>> as parameter to open/create an Index
This is done in order to support Box<dyn Directory> in addition to generic implementations of the trait Directory.
Remove boxing in ManagedDirectory.
2021-10-25 12:34:40 +08:00
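The `<T: Into<Box<dyn Directory>>>` parameter from the commit above can be sketched with stand-in types. The `Directory` trait, `RamDirectory`, and `Index` below are simplified placeholders written for this example, not the library's actual definitions. The sketch shows how one generic function accepts both a concrete directory and an already boxed trait object.

```rust
/// Simplified stand-in for a directory abstraction (assumption, not the real trait).
pub trait Directory {
    fn name(&self) -> String;
}

/// Placeholder concrete directory.
pub struct RamDirectory;

impl Directory for RamDirectory {
    fn name(&self) -> String {
        "ram".to_string()
    }
}

// Lets a concrete directory be passed where Into<Box<dyn Directory>> is expected.
impl From<RamDirectory> for Box<dyn Directory> {
    fn from(dir: RamDirectory) -> Self {
        Box::new(dir)
    }
}

/// Placeholder index that stores its directory as a trait object.
pub struct Index {
    directory: Box<dyn Directory>,
}

impl Index {
    /// Accepts a concrete directory or a Box<dyn Directory>; the boxed case
    /// works through the standard reflexive `From<T> for T` impl.
    pub fn open<T: Into<Box<dyn Directory>>>(dir: T) -> Index {
        Index { directory: dir.into() }
    }

    pub fn directory_name(&self) -> String {
        self.directory.name()
    }
}
```

With this shape, `Index::open(RamDirectory)` and `Index::open(boxed)` for a `boxed: Box<dyn Directory>` both compile, and the index itself no longer needs a generic type parameter.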
Kanji Yomoda
737ecc7015 Fix outdated comment for IndexWriter::new (#1183) 2021-10-25 10:59:18 +09:00