203 Commits

Author SHA1 Message Date
PSeitz
79e42d4a6d Update src/store/writer.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
PSeitz
0135fbc4c8 Update src/store/writer.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
PSeitz
449594f67a Update src/store/writer.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
Pascal Seitz
8b6647e908 move writer to compressor thread 2022-06-23 15:34:21 +08:00
PSeitz
efabcbcdf5 Update src/store/writer.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
Pascal Seitz
7bf5962554 merge match, explicit type 2022-06-23 15:34:21 +08:00
Pascal Seitz
4c7dedef29 use separate thread to compress block store
Use a separate thread to compress the block store for increased indexing performance. This allows using slower compressors with a higher compression ratio, with little or no performance impact (given enough cores).

A separate thread is spawned to compress the docstore, which handles single blocks and stacking from other docstores.
The spawned compressor thread does not write; instead it sends back the compressed data. This avoids writing to the same file from multiple threads.
2022-06-23 15:34:21 +08:00
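The compressor-thread design described in commit 4c7dedef29 can be illustrated with a minimal standard-library sketch. It is not the code from src/store/writer.rs: the names `toy_compress` and `write_blocks` are hypothetical, and a naive run-length encoder stands in for a real compressor such as lz4 or zstd. It only shows the shape of the idea, where one thread compresses blocks and the calling thread is the only one that writes.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a real compressor: naive run-length encoding
/// that emits (count, byte) pairs.
fn toy_compress(block: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = block.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut count: u8 = 1;
        while count < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            count += 1;
        }
        out.push(count);
        out.push(byte);
    }
    out
}

/// Compresses `blocks` on a separate thread. The compressor thread never
/// touches `writer`; it sends the compressed bytes back, and only the
/// calling thread writes them. Returns the number of blocks written.
fn write_blocks<W: Write>(blocks: Vec<Vec<u8>>, writer: &mut W) -> io::Result<usize> {
    let (block_tx, block_rx) = mpsc::channel::<Vec<u8>>();
    let (compressed_tx, compressed_rx) = mpsc::channel::<Vec<u8>>();

    let compressor = thread::spawn(move || {
        // Runs until the block sender is dropped.
        for block in block_rx {
            if compressed_tx.send(toy_compress(&block)).is_err() {
                break; // The writing side went away.
            }
        }
    });

    for block in blocks {
        block_tx
            .send(block)
            .expect("compressor thread terminated early");
    }
    // Closing the channel lets the compressor thread finish.
    drop(block_tx);

    let mut num_blocks = 0;
    // A single compressor thread and FIFO channels keep the block order.
    for compressed in compressed_rx {
        writer.write_all(&compressed)?;
        num_blocks += 1;
    }
    compressor.join().expect("compressor thread panicked");
    Ok(num_blocks)
}
```

Calling `write_blocks(vec![vec![7u8; 3], vec![1, 2]], &mut out)` with `out` a `Vec<u8>` writes both blocks in order. The sketch queues every block before writing, whereas a real writer would interleave sending and receiving.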
Antoine G
11e4225f23 doc fix (#1391)
Documentation fix.
2022-06-21 15:53:33 +09:00
Pascal Seitz
4d9d2b6db0 split into compressor/decompressor
use custom de/serializer for compressor
accept parameters like zstd(compression_level=5) as compressor
2022-06-02 23:29:24 +08:00
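Commit 4d9d2b6db0 mentions accepting parameters such as zstd(compression_level=5) as a compressor. One way such a string could be parsed is sketched below. The `Compressor` enum and `parse_compressor` function are hypothetical and are not the custom de/serializer from that commit; the variant names only mirror the none, lz4, and zstd compressors named elsewhere in this log.

```rust
/// Hypothetical compressor setting; variants mirror the compressors
/// named in this log.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Compressor {
    None,
    Lz4,
    Zstd { compression_level: Option<i32> },
}

/// Parses strings such as "lz4", "zstd" or "zstd(compression_level=5)".
pub fn parse_compressor(text: &str) -> Result<Compressor, String> {
    let text = text.trim();
    match text {
        "none" => return Ok(Compressor::None),
        "lz4" => return Ok(Compressor::Lz4),
        "zstd" => return Ok(Compressor::Zstd { compression_level: None }),
        _ => {}
    }
    // Anything else must look like `zstd(key=value)`.
    let params = text
        .strip_prefix("zstd(")
        .and_then(|rest| rest.strip_suffix(')'))
        .ok_or_else(|| format!("unknown compressor: {:?}", text))?;
    let (key, value) = params
        .split_once('=')
        .ok_or_else(|| format!("expected key=value, got {:?}", params))?;
    if key.trim() != "compression_level" {
        return Err(format!("unknown zstd parameter: {:?}", key.trim()));
    }
    let level = value
        .trim()
        .parse::<i32>()
        .map_err(|err| format!("invalid compression level {:?}: {}", value.trim(), err))?;
    Ok(Compressor::Zstd { compression_level: Some(level) })
}
```

The sketch handles a single parameter only. Supporting more would mean splitting `params` on commas first.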
Pascal Seitz
ed868f93a3 enable setting compression level 2022-06-02 16:47:29 +08:00
Pascal Seitz
314ae43a45 fix fmt 2022-06-02 14:54:23 +08:00
Pascal Seitz
fce91b2f3a vec without capacity 2022-06-02 13:50:18 +08:00
Pascal Seitz
9bcd2b8104 fix read_block_async 2022-06-02 13:37:52 +08:00
Pascal Seitz
0c9c257150 move cache handling into single function 2022-06-02 13:25:29 +08:00
Pascal Seitz
1af85a2956 accept usize instead &usize 2022-06-02 11:23:36 +08:00
Pascal Seitz
bc4c3d0c6b add peek_lru test 2022-06-02 11:13:17 +08:00
Pascal Seitz
6937c75f05 hide advanced doc store api 2022-06-02 11:13:17 +08:00
Pascal Seitz
e54429e827 expose doc store functions
expose doc store functions for advanced usage
refactor cache
expose cache statistics
remove unnecessary arc
deduplicate code
2022-06-02 11:13:17 +08:00
Kryesh
fc045e6bf9 Cleanup imports, remove unneeded error mapping 2022-05-19 10:34:02 +10:00
Kryesh
6837a4d468 Fix bench 2022-05-18 20:35:29 +10:00
Kryesh
0759bf9448 Cleanup zstd structure and serialise to u32 in line with lz4 2022-05-18 20:31:22 +10:00
Kryesh
152e8238d7 Fix silly errors from running tests without feature flag 2022-05-18 19:49:10 +10:00
Kryesh
d4e5b48437 Apply feedback - standardise on u64 and fix correct compression bounds 2022-05-18 19:37:28 +10:00
Kryesh
03040ed81d Add Zstd compression support 2022-05-18 14:04:43 +10:00
Kryesh
aaa22ad225 Make block size configurable to allow for better compression ratios on large documents 2022-05-18 11:13:15 +10:00
Paul Masurel
46d5de920d Removes all usage of block_on, and use a oneshot channel instead. (#1315)
* Removes all usage of block_on, and use a oneshot channel instead.

Calling `block_on` panics in certain context.
For instance, it panics when it is called in the context of another
call to block.

Using it in tantivy is unnecessary. We replace it by a thin wrapper
around a oneshot channel that supports both async/sync.

* Removing needless uses of async in the API.

Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2022-03-18 16:54:58 +09:00
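Commit 46d5de920d describes replacing `block_on` with a thin wrapper around a oneshot channel that supports both async and sync use. The standard-library sketch below illustrates that idea under stated assumptions. It is not tantivy's wrapper: the `oneshot` function and the `Sender`, `Receiver`, and `recv_blocking` names are hypothetical, and the channel is hand-rolled here rather than taken from a crate. The receiver can either block the current thread or be awaited as a future, so no executor is needed on the sync path.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Condvar, Mutex};
use std::task::{Context, Poll, Waker};

struct State<T> {
    value: Option<T>,
    sender_dropped: bool,
    waker: Option<Waker>,
}

struct Shared<T> {
    state: Mutex<State<T>>,
    ready: Condvar,
}

pub struct Sender<T>(Arc<Shared<T>>);
pub struct Receiver<T>(Arc<Shared<T>>);

/// Creates a single-use channel (hypothetical helper).
pub fn oneshot<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        state: Mutex::new(State { value: None, sender_dropped: false, waker: None }),
        ready: Condvar::new(),
    });
    (Sender(shared.clone()), Receiver(shared))
}

impl<T> Sender<T> {
    /// Stores the value; the `Drop` impl below then wakes the receiver.
    pub fn send(self, value: T) {
        self.0.state.lock().unwrap().value = Some(value);
    }
}

impl<T> Drop for Sender<T> {
    fn drop(&mut self) {
        let waker = {
            let mut state = self.0.state.lock().unwrap();
            state.sender_dropped = true;
            state.waker.take()
        };
        // Wake a blocked thread and a pending future, whichever is in use.
        self.0.ready.notify_one();
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

impl<T> Receiver<T> {
    /// Sync path: blocks the current thread without any executor.
    /// Returns `None` if the sender was dropped without sending.
    pub fn recv_blocking(self) -> Option<T> {
        let mut state = self.0.state.lock().unwrap();
        while state.value.is_none() && !state.sender_dropped {
            state = self.0.ready.wait(state).unwrap();
        }
        state.value.take()
    }
}

/// Async path: the same receiver can be `.await`ed.
impl<T> Future for Receiver<T> {
    type Output = Option<T>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut state = self.0.state.lock().unwrap();
        if state.value.is_some() || state.sender_dropped {
            Poll::Ready(state.value.take())
        } else {
            state.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}
```

A worker thread would call `tx.send(result)`, while the caller picks `rx.recv_blocking()` in sync code or `rx.await` in async code.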
Paul Masurel
2ead010c83 Tantivy quickwit (#1293)
* Added sstable and enabling it by default, and parallel boolean query.
* Added async API for FileSlice.
* Added async get_doc
* Reduce blocksize to 32_000
* Added debug logs

Quickwit-specific features are hidden behind the quickwit feature flag.
2022-02-25 17:32:49 +09:00
Paul Masurel
bdedefe07d Adding an IndexingContext object (#1268) 2022-02-04 15:08:01 +09:00
Paul Masurel
65b365b81c Fixing all-features build. 2022-01-31 14:41:14 +09:00
Paul Masurel
eca6628b3c Minor refactoring (#1266) 2022-01-28 15:55:55 +09:00
Paul Masurel
c81b3030fa Issue/922b (#1233)
* Add a NORMED option on fields

Make fieldnorm indexation optional:

* for all types except text => added a NORMED option
* for text field
** if STRING, field has no fieldnorm retained
** if TEXT, field has fieldnorm computed

* Finalize making fieldnorm optional for all field types.

- Using Option for fieldnorm readers.
2021-12-10 21:12:29 +09:00
Paul Masurel
ebdbb6bd2e Fixing compilation warnings & clippy comments. 2021-12-10 16:47:59 +09:00
Paul Masurel
466dc8233c Cargo fmt 2021-12-02 18:46:28 +09:00
Paul Masurel
03c2f6ece2 We are missing 4 bytes in the LZ4 compression buffer. (#1226)
Closes #831
2021-12-02 16:00:29 +09:00
Paul Masurel
f378d9a57b Pleasing clippy 2021-12-02 14:48:33 +09:00
Shikhar Bhushan
72cef12db1 Add none compression (#1208) 2021-11-16 10:50:42 +09:00
Paul Masurel
7234bef0eb Issue/1198 (#1201)
* Unit test reproducing #1198
* Fixing unit test to handle the error from add_document.
* Bump project version
2021-11-11 16:42:19 +09:00
PSeitz
352e0cc58d Add demux operation (#1150)
* add merge for DeleteBitSet, allow custom DeleteBitSet on merge
* forward delete bitsets on merge, add tests
* add demux operation and tests
2021-10-06 16:05:16 +09:00
Paul Masurel
0855649986 Leaning more on the alive (vs delete) semantics. (#1164) 2021-10-05 18:53:29 +09:00
PSeitz
0ce49c9dd4 use lz4_flex 0.9.0 (#1160) 2021-09-27 10:12:20 +09:00
Pascal Seitz
d7a6a409a1 renames 2021-09-23 20:33:11 +08:00
Pascal Seitz
a1f5cead96 AliveBitSet instead of DeleteBitSet 2021-09-23 20:03:57 +08:00
Pascal Seitz
3265f7bec3 dissolve common module 2021-08-19 23:26:34 +01:00
Pascal Seitz
0062fe705d cargo fmt 2021-07-01 18:17:08 +02:00
Pascal Seitz
e496ae0470 clippy fixes 2021-07-01 17:43:50 +02:00
Pascal Seitz
1e4df54ab3 fix clippy 2021-07-01 17:41:53 +02:00
Pascal Seitz
10f056fbb4 apply clippy fixes 2021-07-01 17:08:44 +02:00
Andre-Philippe Paquet
57ae5b27dc fix store reader iterator, take 2 2021-06-16 07:51:39 -04:00
Pascal Seitz
ebe55a7ae1 refactor test, fixes #1077
replace test with smaller test in doc_store
2021-06-14 10:10:05 +02:00
Andre-Philippe Paquet
473a346814 remove debugging 2021-06-13 16:49:44 -04:00