Commit Graph

2773 Commits

Author SHA1 Message Date
Paul Masurel
848b795b9f Apply suggestions from code review 2022-03-01 18:37:51 +09:00
Pascal Seitz
091b668624 fix clippy issues 2022-03-01 08:58:51 +01:00
Paul Masurel
5004290daa Return an error on certain type of corruption. (#1296) 2022-03-01 11:35:56 +09:00
StyMaar
5d2c2b804c Fix link to RamDirectory and MMapDirectory in Directory's documentation (#1295) 2022-03-01 09:46:53 +09:00
PSeitz
1a92b588e0 Merge pull request #1294 from PSeitz/aggregation
fix intermediate result de/serialization
2022-02-28 08:39:23 +01:00
Pascal Seitz
010e92c118 fix intermediate result de/serialization
return None for empty average/stats metric
add test for de/serialization of intermediate result
add test for metric on empty result
2022-02-25 16:39:57 +01:00
Paul Masurel
2ead010c83 Tantivy quickwit (#1293)
* Added sstable and enabling it by default, and parallel boolean query.
* Added async API for FileSlice.
* Added async get_doc
* Reduce blocksize to 32_000
* Added debug logs

Quickwit specific feature a hidden behind the quickwit feature flag.
2022-02-25 17:32:49 +09:00
PSeitz
c4f66eb185 improve validation in aggregation, extend invalid field test (#1292)
* improve validation in aggregation, extend invalid field test

improve validation in aggregation
extend invalid field test
Fixes #1291

* collect fast field names on request structure

* fix visibility of AggregationSegmentCollector
2022-02-25 15:21:19 +09:00
Paul Masurel
d7b46d2137 Added JSON Type (#1270)
- Removed useless copy when ingesting JSON.
- Bugfix in phrase query with a missing field norms.
- Disabled range query on default fields

Closes #1251
2022-02-24 16:25:22 +09:00
PSeitz
d042ce74c7 Merge pull request #1289 from PSeitz/numeric_options
rename IntOptions to NumericOptions
2022-02-23 14:04:40 +01:00
PSeitz
7ba9e662b8 Merge pull request #1290 from PSeitz/improve_docs
improve aggregation docs
2022-02-23 14:04:20 +01:00
Pascal Seitz
fdd5ef85e5 improve aggregation docs 2022-02-22 10:37:54 +01:00
Pascal Seitz
704498a1ac rename IntOptions to NumericOptions
keep IntOptions with deprecation warning
Fixes #1286
2022-02-21 22:20:07 +01:00
PSeitz
1232af7928 fix docs (#1288) 2022-02-21 23:15:58 +09:00
Paul Masurel
d37633e034 Minor changes in indexing. (#1285) 2022-02-21 17:16:52 +09:00
Paul Masurel
9815067171 Minor changes 2022-02-21 13:55:01 +09:00
PSeitz
972cb6c26d Aggregation (#1276)
Added support for aggregation compatible with Elasticsearch's API.
2022-02-21 09:59:11 +09:00
Paul Masurel
4dc80cfa25 Removes TokenStream chain. (#1283)
This change is mostly motivated by the introduction of json object.

We need to be able to inject a position object to make the position
shift.
2022-02-21 09:51:27 +09:00
PSeitz
cef145790c Fix opening bytes index with dynamic codec (#1279)
* Fix opening bytes index with dynamic codec

Fix #1278

* extend proptest to cover bytes field codec bug
2022-02-18 20:44:21 +09:00
Paul Masurel
e05e2a0c51 Added profiling to indexing bench (#1282) 2022-02-18 20:43:28 +09:00
Paul Masurel
e028515caf Simplified expull code. (#1281) 2022-02-18 18:57:10 +09:00
Paul Masurel
850b9eaea4 added a bench to measure the perf of indexing logs (#1275) 2022-02-18 16:48:29 +09:00
Shikhar Bhushan
505e6a440c Remove test assertion sensitive to background segment merging (#1274) 2022-02-17 10:59:46 +09:00
Koichi Akabe
fcd651f6a9 Add Vaporetto tokenizer to README (#1271)
* Add Vaporetto tokenizer to README

* Update README.md
2022-02-14 18:19:57 +09:00
Paul Masurel
e6653228a9 Renamed github workflows (#1269) 2022-02-04 15:10:24 +09:00
Paul Masurel
bdedefe07d Adding an IndexingContext object (#1268) 2022-02-04 15:08:01 +09:00
Paul Masurel
13a4473faa Removing obsolete clippy allow thingy. 2022-02-01 11:54:01 +09:00
Paul Masurel
2069e3e52b Fixing clippy comments 2022-02-01 10:24:05 +09:00
Paul Masurel
0d8263cba1 Using nightly to format 2022-01-31 16:10:11 +09:00
Paul Masurel
65b365b81c Fixing all-features build. 2022-01-31 14:41:14 +09:00
dependabot[bot]
4c1366da87 Update fastdivide requirement from 0.3 to 0.4 (#1265)
Updates the requirements on fastdivide to permit the latest version.

---
updated-dependencies:
- dependency-name: fastdivide
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-31 11:26:50 +09:00
Paul Masurel
eca6628b3c Minor refactoring (#1266) 2022-01-28 15:55:55 +09:00
Paul Masurel
9679c5f306 Rename quickwit-inc -> quickwit-oss 2022-01-27 15:37:09 +09:00
Shikhar Bhushan
5a2497b6fd Avoid exposing TrackedObject from Warmer API (#1264) 2022-01-25 10:04:08 +09:00
Shikhar Bhushan
99d4b1a177 Searcher Warming API (#1261)
Adds an API to register Warmers in the IndexReader.

Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-01-20 23:40:25 +09:00
Paul Masurel
732f6847c0 Field type with codes (#1255)
* Term are now typed.

This change is backward compatible:
While the Term has a byte representation that is modified, a Term itself
is a transient object that is not serialized as is in the index.

Its .field() and .value_bytes() on the other hand are unchanged.
This change offers better Debug information for terms.

While not necessary it also will help in the support for JSON types.

* Renamed Hierarchical Facet -> Facet
2022-01-07 20:49:00 +09:00
Paul Masurel
1c6d9bdc6a Comparison of Value based on serialization. (#1250) 2022-01-07 20:31:26 +09:00
Paul Masurel
3ea6800ac5 Pleasing clippy (#1253) 2022-01-06 16:41:24 +09:00
Antoine G
395303b644 Collector + directory doc fixes (#1247)
* doc(collector)

* doc(directory)

* doc(misc)

* wording
2022-01-04 09:22:58 +09:00
Daniel Müller
2c200b46cb Use test-log instead of test-env-log (#1248)
The test-env-log crate has been renamed to test-log to better reflect
its intent of not only catering to env_logger specific initialization
but also tracing (and potentially others in the future).
This change updates the crate to use test-log instead of the now
deprecated test-env-log.
2022-01-04 09:20:30 +09:00
Liam Warfield
17e00df112 Change Snippet.fragments -> Snippet.fragment (#1243)
* Change Snippet.fragments -> Snippet.fragment
* Apply suggestions from code review

Co-authored-by: Liam Warfield <lwarfield@arista.com>
2022-01-03 22:23:51 +09:00
Antoine G
3129d86743 doc(termdict) expose structs (#1242)
* doc(termdict) expose structs
also add merger doc + lint
refs #1232
2022-01-03 22:20:31 +09:00
Shikhar Bhushan
e5e252cbc0 LogMergePolicy knob del_docs_percentage_before_merge (#1238)
Add a knob to LogMergePolicy to always merge segments that exceed a threshold of deleted docs

Closes #115
2021-12-20 13:14:56 +09:00
Paul Masurel
b2da82f151 Making MergeCandidate public in order to allow the usage of custom merge (#1237)
policies.

Closes #1235
2021-12-13 09:54:21 +09:00
Paul Masurel
c81b3030fa Issue/922b (#1233)
* Add a NORMED options on field

Make fieldnorm indexation optional:

* for all types except text => added a NORMED options
* for text field
** if STRING, field has not fieldnorm retained
** if TEXT, field has fieldnorm computed

* Finalize making fieldnorm optional for all field types.

- Using Option for fieldnorm readers.
2021-12-10 21:12:29 +09:00
Paul Masurel
9e66c75fc6 Using stable in CI as rustc nightly seems broken 2021-12-10 18:45:23 +09:00
Paul Masurel
ebdbb6bd2e Fixing compilation warnings & clippy comments. 2021-12-10 16:47:59 +09:00
Antoine G
c980b19dd9 canonicalize path when opening MmapDirectory (#1231)
* canonicalize path when opening `MmapDirectory`
fixes #1229
2021-12-09 10:19:52 +09:00
Paul Masurel
098eea843a Reducing the number of call to fsync on the directory. (#1228)
This work by introducing a new API method in the Directory
trait. The user needs to explicitely call this method.
(In particular, once before a commmit)

Closes #1225
2021-12-03 03:10:52 +00:00
Paul Masurel
466dc8233c Cargo fmt 2021-12-02 18:46:28 +09:00