PSeitz
79e42d4a6d
Update src/store/writer.rs
...
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
PSeitz
0135fbc4c8
Update src/store/writer.rs
...
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
PSeitz
449594f67a
Update src/store/writer.rs
...
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
Pascal Seitz
8b6647e908
move writer to compressor thread
2022-06-23 15:34:21 +08:00
PSeitz
efabcbcdf5
Update src/store/writer.rs
...
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-06-23 15:34:21 +08:00
Pascal Seitz
7bf5962554
merge match, explicit type
2022-06-23 15:34:21 +08:00
Pascal Seitz
4c7dedef29
use separate thread to compress block store
...
Use a separate thread to compress the block store for increased indexing performance. This allows using slower compressors with a higher compression ratio, with little or no performance impact (given enough cores).
A separate thread is spawned to compress the docstore; it handles single blocks as well as stacking from other docstores.
The spawned compressor thread does not write to the file. Instead, it sends the compressed data back, so that the file is never written from multiple threads.
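The hand-off described above can be sketched with two standard-library channels. This is only an illustration under stated assumptions, not the code from this commit: `compress` and `write_blocks` are hypothetical names, and a naive run-length encoding stands in for a real compressor. Blocks go to the compressor thread on one channel, compressed bytes come back on the other, and only the calling thread touches the writer.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a real compressor: naive run-length encoding
/// that emits (run length, byte) pairs.
fn compress(block: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = block.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run: u8 = 1;
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    out
}

/// Compresses `blocks` on a separate thread and writes the compressed bytes
/// to `writer` from the calling thread only. Returns the bytes written.
fn write_blocks<W: Write>(blocks: Vec<Vec<u8>>, writer: &mut W) -> std::io::Result<usize> {
    let (block_tx, block_rx) = mpsc::channel::<Vec<u8>>();
    let (compressed_tx, compressed_rx) = mpsc::channel::<Vec<u8>>();

    // The compressor thread never sees the writer: it only sends data back.
    let compressor = thread::spawn(move || {
        for block in block_rx {
            if compressed_tx.send(compress(&block)).is_err() {
                break; // The writing side went away, so stop compressing.
            }
        }
    });

    for block in blocks {
        block_tx.send(block).expect("compressor thread hung up");
    }
    // Closing the channel lets the compressor thread finish its loop.
    drop(block_tx);

    let mut written = 0;
    for compressed in compressed_rx {
        writer.write_all(&compressed)?;
        written += compressed.len();
    }
    compressor.join().expect("compressor thread panicked");
    Ok(written)
}
```

A single compressor thread on an ordered channel keeps the blocks in submission order, so the writing side needs no reordering logic.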
2022-06-23 15:34:21 +08:00
Antoine G
11e4225f23
doc fix (#1391)
...
Documentation fix.
2022-06-21 15:53:33 +09:00
Pascal Seitz
4d9d2b6db0
split into compressor/decompressor
...
use custom de/serializer for compressor
accept parameters like zstd(compression_level=5) as compressor
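A minimal sketch of how a string such as `zstd(compression_level=5)` could be parsed into a typed value. The `CompressorSpec` enum and `parse_compressor` function are hypothetical names chosen for this illustration; the actual de/serializer in this commit may be structured differently.

```rust
/// Hypothetical parsed form of a compressor setting.
#[derive(Debug, PartialEq)]
enum CompressorSpec {
    None,
    Lz4,
    Zstd { compression_level: Option<i32> },
}

/// Parses strings such as "lz4", "zstd" or "zstd(compression_level=5)".
fn parse_compressor(input: &str) -> Result<CompressorSpec, String> {
    let input = input.trim();
    // Split "name(params)" into the name and the optional parameter list.
    let (name, params) = match input.split_once('(') {
        Some((name, rest)) => {
            let params = rest
                .strip_suffix(')')
                .ok_or_else(|| format!("missing closing parenthesis in {input:?}"))?;
            (name.trim(), Some(params))
        }
        None => (input, None),
    };
    match (name, params) {
        ("none", None) => Ok(CompressorSpec::None),
        ("lz4", None) => Ok(CompressorSpec::Lz4),
        ("zstd", None) => Ok(CompressorSpec::Zstd { compression_level: None }),
        ("zstd", Some(params)) => {
            let mut compression_level = None;
            for param in params.split(',').map(str::trim).filter(|p| !p.is_empty()) {
                let (key, value) = param
                    .split_once('=')
                    .ok_or_else(|| format!("expected key=value, got {param:?}"))?;
                match key.trim() {
                    "compression_level" => {
                        let level = value
                            .trim()
                            .parse::<i32>()
                            .map_err(|err| format!("invalid compression_level: {err}"))?;
                        compression_level = Some(level);
                    }
                    other => return Err(format!("unknown zstd parameter {other:?}")),
                }
            }
            Ok(CompressorSpec::Zstd { compression_level })
        }
        (other, _) => Err(format!("unknown or non-parameterized compressor {other:?}")),
    }
}
```

Rejecting unknown parameter names, rather than ignoring them, means a typo in a setting surfaces as an error instead of a silently applied default.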
2022-06-02 23:29:24 +08:00
Pascal Seitz
ed868f93a3
enable setting compression level
2022-06-02 16:47:29 +08:00
Pascal Seitz
314ae43a45
fix fmt
2022-06-02 14:54:23 +08:00
Pascal Seitz
fce91b2f3a
vec without capacity
2022-06-02 13:50:18 +08:00
Pascal Seitz
9bcd2b8104
fix read_block_async
2022-06-02 13:37:52 +08:00
Pascal Seitz
0c9c257150
move cache handling into single function
2022-06-02 13:25:29 +08:00
Pascal Seitz
1af85a2956
accept usize instead of &usize
2022-06-02 11:23:36 +08:00
Pascal Seitz
bc4c3d0c6b
add peek_lru test
2022-06-02 11:13:17 +08:00
Pascal Seitz
6937c75f05
hide advanced doc store api
2022-06-02 11:13:17 +08:00
Pascal Seitz
e54429e827
expose doc store functions
...
expose doc store functions for advanced usage
refactor cache
expose cache statistics
remove unnecessary arc
unduplicate code
2022-06-02 11:13:17 +08:00
Kryesh
fc045e6bf9
Cleanup imports, remove unneeded error mapping
2022-05-19 10:34:02 +10:00
Kryesh
6837a4d468
Fix bench
2022-05-18 20:35:29 +10:00
Kryesh
0759bf9448
Cleanup zstd structure and serialise to u32 in line with lz4
2022-05-18 20:31:22 +10:00
Kryesh
152e8238d7
Fix silly errors from running tests without feature flag
2022-05-18 19:49:10 +10:00
Kryesh
d4e5b48437
Apply feedback - standardise on u64 and fix correct compression bounds
2022-05-18 19:37:28 +10:00
Kryesh
03040ed81d
Add Zstd compression support
2022-05-18 14:04:43 +10:00
Kryesh
aaa22ad225
Make block size configurable to allow for better compression ratios on large documents
2022-05-18 11:13:15 +10:00
Paul Masurel
46d5de920d
Removes all usage of block_on, and use a oneshot channel instead. (#1315)
...
* Removes all usage of block_on, and use a oneshot channel instead.
Calling `block_on` panics in certain contexts.
For instance, it panics when it is called in the context of another
call to block.
Using it in tantivy is unnecessary. We replace it with a thin wrapper
around a oneshot channel that supports both async and sync use.
* Removing needless uses of async in the API.
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
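To make the idea of a oneshot channel usable from both sync and async code concrete, here is a small standard-library-only sketch. It is an assumption-laden illustration rather than the wrapper added in this commit: `oneshot`, `Sender`, `Receiver` and `recv` are hypothetical names. The receiver can either block on a condition variable or be awaited as a `Future`, so no `block_on` call is needed on either path.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Condvar, Mutex};
use std::task::{Context, Poll, Waker};

struct State<T> {
    value: Option<T>,
    sender_dropped: bool,
    waker: Option<Waker>,
}

struct Shared<T> {
    state: Mutex<State<T>>,
    ready: Condvar,
}

struct Sender<T>(Arc<Shared<T>>);
struct Receiver<T>(Arc<Shared<T>>);

/// Creates a channel that carries at most one value.
fn oneshot<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        state: Mutex::new(State { value: None, sender_dropped: false, waker: None }),
        ready: Condvar::new(),
    });
    (Sender(Arc::clone(&shared)), Receiver(shared))
}

impl<T> Sender<T> {
    /// Stores the value. Dropping `self` at the end of this call runs
    /// `Drop`, which wakes the receiver.
    fn send(self, value: T) {
        self.0.state.lock().unwrap().value = Some(value);
    }
}

impl<T> Drop for Sender<T> {
    fn drop(&mut self) {
        let mut state = self.0.state.lock().unwrap();
        state.sender_dropped = true;
        let waker = state.waker.take();
        drop(state);
        // Wake an async waiter, if any, and any thread blocked in `recv`.
        if let Some(waker) = waker {
            waker.wake();
        }
        self.0.ready.notify_all();
    }
}

impl<T> Receiver<T> {
    /// Sync path: blocks the current thread until the sender is done.
    /// Returns `None` if the sender was dropped without sending.
    fn recv(self) -> Option<T> {
        let mut state = self.0.state.lock().unwrap();
        while state.value.is_none() && !state.sender_dropped {
            state = self.0.ready.wait(state).unwrap();
        }
        let value = state.value.take();
        value
    }
}

/// Async path: awaiting the receiver yields the same `Option<T>`.
impl<T> Future for Receiver<T> {
    type Output = Option<T>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut state = self.0.state.lock().unwrap();
        if state.value.is_some() || state.sender_dropped {
            Poll::Ready(state.value.take())
        } else {
            // Remember the latest waker so the sender can wake this task.
            state.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}
```

Sync callers use `rx.recv()`, while async callers use `rx.await`; the sender side is identical in both cases.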
2022-03-18 16:54:58 +09:00
Paul Masurel
2ead010c83
Tantivy quickwit (#1293)
...
* Added sstable and enabling it by default, and parallel boolean query.
* Added async API for FileSlice.
* Added async get_doc
* Reduce blocksize to 32_000
* Added debug logs
Quickwit-specific features are hidden behind the quickwit feature flag.
2022-02-25 17:32:49 +09:00
Paul Masurel
bdedefe07d
Adding an IndexingContext object (#1268)
2022-02-04 15:08:01 +09:00
Paul Masurel
65b365b81c
Fixing all-features build.
2022-01-31 14:41:14 +09:00
Paul Masurel
eca6628b3c
Minor refactoring (#1266)
2022-01-28 15:55:55 +09:00
Paul Masurel
c81b3030fa
Issue/922b (#1233)
...
* Add a NORMED option on fields
Make fieldnorm indexing optional:
* for all types except text => added a NORMED option
* for text fields
** if STRING, the field has no fieldnorm retained
** if TEXT, the field has its fieldnorm computed
* Finalize making fieldnorm optional for all field types.
- Using Option for fieldnorm readers.
2021-12-10 21:12:29 +09:00
Paul Masurel
ebdbb6bd2e
Fixing compilation warnings & clippy comments.
2021-12-10 16:47:59 +09:00
Paul Masurel
466dc8233c
Cargo fmt
2021-12-02 18:46:28 +09:00
Paul Masurel
03c2f6ece2
We are missing 4 bytes in the LZ4 compression buffer. (#1226)
...
Closes #831
2021-12-02 16:00:29 +09:00
Paul Masurel
f378d9a57b
Pleasing clippy
2021-12-02 14:48:33 +09:00
Shikhar Bhushan
72cef12db1
Add none compression (#1208)
2021-11-16 10:50:42 +09:00
Paul Masurel
7234bef0eb
Issue/1198 (#1201)
...
* Unit test reproducing #1198
* Fixing unit test to handle the error from add_document.
* Bump project version
2021-11-11 16:42:19 +09:00
PSeitz
352e0cc58d
Add demux operation (#1150)
...
* add merge for DeleteBitSet, allow custom DeleteBitSet on merge
* forward delete bitsets on merge, add tests
* add demux operation and tests
2021-10-06 16:05:16 +09:00
Paul Masurel
0855649986
Leaning more on the alive (vs delete) semantics. (#1164)
2021-10-05 18:53:29 +09:00
PSeitz
0ce49c9dd4
use lz4_flex 0.9.0 (#1160)
2021-09-27 10:12:20 +09:00
Pascal Seitz
d7a6a409a1
renames
2021-09-23 20:33:11 +08:00
Pascal Seitz
a1f5cead96
AliveBitSet instead of DeleteBitSet
2021-09-23 20:03:57 +08:00
Pascal Seitz
3265f7bec3
dissolve common module
2021-08-19 23:26:34 +01:00
Pascal Seitz
0062fe705d
cargo fmt
2021-07-01 18:17:08 +02:00
Pascal Seitz
e496ae0470
clippy fixes
2021-07-01 17:43:50 +02:00
Pascal Seitz
1e4df54ab3
fix clippy
2021-07-01 17:41:53 +02:00
Pascal Seitz
10f056fbb4
apply clippy fixes
2021-07-01 17:08:44 +02:00
Andre-Philippe Paquet
57ae5b27dc
fix store reader iterator, take 2
2021-06-16 07:51:39 -04:00
Pascal Seitz
ebe55a7ae1
refactor test, fixes #1077
...
replace test with smaller test in doc_store
2021-06-14 10:10:05 +02:00
Andre-Philippe Paquet
473a346814
remove debugging
2021-06-13 16:49:44 -04:00