Commit Graph

122 Commits

Author SHA1 Message Date
Paul Masurel
85ebb3c420 Introducing ColumnReader.
Introducing a ColumnReader trait and .reader() to Column,
hence removing the dreaded Mutex in the `MultiValueStartIndex`
thingy.
2022-09-21 12:47:44 +09:00
Paul Masurel
1998111521 Minor refactoring fast fields (#1537) 2022-09-21 12:46:11 +09:00
Pascal Seitz
a06039dea8 fix benches
move some benches to lib.rs to test unexported items
2022-09-19 11:07:20 +08:00
Pascal Seitz
02599ebeb7 remove ip_to_u128 2022-09-16 18:16:16 +08:00
Pascal Seitz
a16b466460 merge ColumnExt with Column trait 2022-09-16 18:15:18 +08:00
Pascal Seitz
b8d8fdeb6e move benches, improve bench data 2022-09-16 16:42:23 +08:00
Pascal Seitz
12856d80fa change bench, update numbers 2022-09-16 16:41:01 +08:00
Pascal Seitz
e75472ec9a add serialize_u128, open_u128, refactor 2022-09-16 16:40:59 +08:00
Pascal Seitz
e2e6c94ba8 remove ColumnV2 2022-09-16 16:40:06 +08:00
Pascal Seitz
9f610b25af fix benches, add benches 2022-09-16 16:38:48 +08:00
Pascal Seitz
237b64025e take ColumnV2 as parameter
improve algorithm
stricter assertions
improve names
2022-09-16 16:38:48 +08:00
Pascal Seitz
592caeefa0 renames 2022-09-16 16:38:48 +08:00
Pascal Seitz
570009b5b1 move to mod.rs 2022-09-16 16:38:48 +08:00
Pascal Seitz
61b5110db7 use 0 as null in compact space 2022-09-16 16:38:48 +08:00
PSeitz
58af1235e4 Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-09-16 16:38:48 +08:00
Pascal Seitz
d3e7c41a1f refactor to range_mapping 2022-09-16 16:38:48 +08:00
Pascal Seitz
11275854ca unroll get range iteration 2022-09-16 16:38:48 +08:00
Pascal Seitz
3ca48cd826 fix test 2022-09-16 16:38:48 +08:00
Pascal Seitz
47dc511733 add inline 2022-09-16 16:38:48 +08:00
Pascal Seitz
cae6b28a8f remove num_vals param 2022-09-16 16:38:48 +08:00
Pascal Seitz
9aa9efe2a4 fix bench 2022-09-16 16:38:48 +08:00
Pascal Seitz
57570b38a2 use vint, forward errors, removed unused var 2022-09-16 16:38:48 +08:00
Pascal Seitz
584394db1e fix Cargo.toml 2022-09-16 16:38:48 +08:00
Pascal Seitz
3aeb026970 fix blank_size, add comments 2022-09-16 16:38:48 +08:00
Pascal Seitz
df32ee2df2 refactor, use BTreeSet for sorted deduped values 2022-09-16 16:38:48 +08:00
Pascal Seitz
762e662bfd extend proptest for get_range 2022-09-16 16:38:48 +08:00
Pascal Seitz
63b2420058 fix get_range
change blank handling
optimize blank collection
fix off by one errors
extend tests
fix get_range
dedupe values to save space
add bench
2022-09-16 16:38:47 +08:00
Pascal Seitz
ced21b8791 move tests 2022-09-16 16:38:02 +08:00
Pascal Seitz
bc85947105 add ip codec 2022-09-16 16:38:01 +08:00
Paul Masurel
64f08a1a5c Hiding useless symbols and removing code. (#1522) 2022-09-16 14:42:27 +09:00
Paul Masurel
e029fdfca7 Perf fix on the MonotonicMapping column (#1519)
The Monotonic mapping was using the default implementation
for `get_range` and `.iter`.

As a result, some of the column used in merge (e.g. multivalued
fast fields) were exhibiting a very strong performance regression.
2022-09-15 14:20:43 +09:00
Shikhar Bhushan
1eab12396d Make Column: Send + Sync (#1518) 2022-09-13 13:31:28 +09:00
Pascal Seitz
29d56111de refactor, fix api
refactor
fix clippy
fix docs
remove unused code
fix bytesfield index api flaw
2022-09-07 18:43:04 +08:00
Paul Masurel
c5d30a54bc CR 2022-09-06 00:16:41 +09:00
Paul Masurel
c632fc014e Refactoring fast fields codecs.
This removes the GCD part as a codec, and
makes it so that fastfield codecs all share
the same normalization part (shift + gcd).
2022-09-05 23:07:12 +09:00
Paul Masurel
ea72cf34d6 Int based linear interpol (#1482)
* Rename BlockwiseLinear to BlockwiseLinearLegacy

Reimplements the blockwise multilinear codec using integer arithmetics.
Added comments

* add estimate for blockwise

* Added one unit test

* use int based for linear interpol

* fix merge conflicts

* reuse code

* cargo fmt

* fix clippy

* fix test

* fix off by one

fix off by one to accurately interpolate autoincrement fields

* extend test, fix estimate

* remove legacy codec

Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2022-09-05 15:53:00 +09:00
Paul Masurel
26876d41d7 Moving the serialization logic to the fastfield_codecs crate. 2022-09-03 00:29:52 +09:00
Paul Masurel
8e775b6c3d Refactoring dyn Column (#1502) 2022-09-02 17:26:30 +09:00
Pascal Seitz
d3dd620048 fix clippy 2022-08-31 13:13:56 +02:00
Pascal Seitz
e89c220b56 custom num strategy, faster test
closes #1486
faster test with rand values
2022-08-31 12:08:44 +02:00
Pascal Seitz
7a26cc9022 add VecColumn 2022-08-29 15:49:43 +02:00
Pascal Seitz
54972caa7c remove Column impl on Vec
remove Column impl on Vec to avoid function shadowing
2022-08-29 11:57:41 +02:00
PSeitz
5d436759b0 Merge pull request #1480 from quickwit-oss/overflow_issue
fix overflow issue in interpolation
2022-08-28 16:44:00 -07:00
Paul Masurel
5331be800b Introducing a column trait 2022-08-28 14:14:27 +02:00
Paul Masurel
54cfd0d154 Removing Deserializer trait (#1489)
Removing Deserializer trait and renaming the `Serializer` trait `FastFieldCodec`.
Small refactoring estimate.
2022-08-28 04:54:55 +09:00
PSeitz
0dd62169c8 merge FastFieldCodecReader with FastFieldDataAccess (#1485)
* num_vals to FastFieldCodecReader

* split open_from_bytes to own trait

* rename get_u64 to get_val

* merge traits
2022-08-28 03:58:28 +09:00
Paul Masurel
3a9727aa91 Pleasing Clippy 2022-08-27 11:33:03 +02:00
Paul Masurel
4ae0317d68 Cargo fmt 2022-08-26 00:50:07 +02:00
Paul Masurel
107b19855f Fixing the fastfield codec benchmark (#1484) 2022-08-26 05:54:14 +09:00
Paul Masurel
d8f66ba07e Rename fastfield codecs (#1483) 2022-08-26 01:19:30 +09:00
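
Note on c632fc014e: that commit message describes a normalization step (shift + gcd) shared by all fast field codecs. The sketch below shows roughly what such a step could look like. It is not code from the repository. The names `Normalization`, `normalize` and `denormalize` are hypothetical, and it assumes that "shift" means subtracting the minimum value and that "gcd" means dividing by the greatest common divisor of the shifted values.

```rust
/// Greatest common divisor (Euclid's algorithm). Note that gcd(0, x) == x.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

/// Hypothetical header holding the two normalization parameters.
/// (Illustrative only; not a type from the repository.)
#[derive(Debug, PartialEq)]
struct Normalization {
    min_value: u64,
    gcd: u64,
}

/// Shift every value by the minimum, then divide by the gcd of the
/// shifted values. The output is what a codec would then have to encode.
fn normalize(values: &[u64]) -> (Normalization, Vec<u64>) {
    let min_value = values.iter().copied().min().unwrap_or(0);
    let g = values
        .iter()
        .fold(0u64, |acc, &v| gcd(acc, v - min_value));
    // An empty column or a column of identical values yields gcd == 0;
    // fall back to 1 so the division below is well defined.
    let g = if g == 0 { 1 } else { g };
    let normalized = values.iter().map(|&v| (v - min_value) / g).collect();
    (Normalization { min_value, gcd: g }, normalized)
}

/// Inverse of `normalize`: multiply by the gcd, then add the minimum back.
fn denormalize(norm: &Normalization, normalized: &[u64]) -> Vec<u64> {
    normalized
        .iter()
        .map(|&v| v * norm.gcd + norm.min_value)
        .collect()
}
```

Under these assumptions, the column `[1000, 1300, 1900, 1600]` has minimum 1000 and gcd 300, so it normalizes to `[0, 1, 3, 2]`, and `denormalize` restores the original values. A codec working on the normalized values only has to handle a much smaller range.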