Commit Graph

2861 Commits

Author SHA1 Message Date
Paul Masurel
7b06db062b Adding a special method to Arc<dyn ColumnValues> 2023-02-14 23:14:14 +09:00
Paul Masurel
097fd6138d Fix clippy comments (#1872) 2023-02-14 23:12:45 +09:00
PSeitz
01e5a22759 switch to new ff api (#1868) 2023-02-14 15:57:32 +08:00
Antoine Gauthier
b60b7d2afe fix(CI) enable coverage on doctest (#1839)
* fix(CI) enable coverage on doctest
⚠️ Marked as [unstable](https://github.com/taiki-e/cargo-llvm-cov/issues/2)
refs #1761

* remove obsolete CI directory
2023-02-14 16:42:44 +09:00
Yukun Guo
dfe4e95fde Make index compatible with virtual drives on Windows (#1843)
* Make index compatible with virtual drives on Windows

* Get rid of normpath
2023-02-14 16:41:48 +09:00
Paul Masurel
60cc2644d6 Fixing test_fail_on_flush_segment_but_one_worker_remains (#1869)
The new fast field code, based on columnar, had a larger minimum memory
footprint, causing the first docuemnt to trigger a flush of the asegment
in this unit test.

This PR prevents the allocation of a large capacity for the different hashmap tables
using in the columnar writer.

Closes #1859
2023-02-14 16:09:42 +09:00
Paul Masurel
10bccac61b Bugfix in parse_into_milliseconds (#1867) 2023-02-14 15:06:40 +09:00
PSeitz
1cfb9ce59a improve range query performance (#1864)
fix RowId vs DocId naming
fixes #1863
2023-02-14 13:25:39 +09:00
trinity-1686a
539ff08a79 move DateTime to tantivy_common (#1861)
* move DateTime to tantivy_common

* resolve imports of columnar::DateTime as import of common::DateTime
2023-02-11 17:03:06 +01:00
PSeitz
dab93df94e fix benchmarks (#1862) 2023-02-11 15:44:47 +09:00
trinity-1686a
3120147a76 re-enable examples (#1860) 2023-02-10 14:51:37 +01:00
PSeitz
cbcafae04c fix: doc store for files larger 4GB (#1856)
Fixes an issue in the skip list deserialization, which deserialized the byte start offset incorrectly as u32.
`get_doc` will fail for any docs that live in a block with start offset larger than u32::MAX (~4GB).
Causes index corruption, if a segment with a doc store larger 4GB is merged.

tantivy version 0.19 is affected
2023-02-10 14:29:43 +01:00
PSeitz
36c6138e7f fix: auto downgrade index record option, instead of vint error (#1857)
Prev: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: IoError(Custom { kind: InvalidData, error: "Reach end of buffer while reading VInt" })', src/main.rs:46:14
Now: Automatic downgrade to next available level
2023-02-10 13:45:23 +01:00
PSeitz
7a9befd18d fix sort order test for term aggregation (#1858)
fix sort order test for term aggregation
fix invalid request test
2023-02-10 10:26:58 +01:00
Paul Masurel
62c811df2b Added a columnar cli 2023-02-09 19:02:16 +01:00
PSeitz
03345f0aa2 fmt code, update lz4_flex (#1838)
formatting on nightly changed
2023-02-10 01:42:32 +09:00
Paul Masurel
b7bfa20e38 Fixed test performance. 2023-02-09 17:39:55 +01:00
Paul Masurel
db8583db75 Fixing unit test 2023-02-09 16:53:05 +01:00
trinity-1686a
1390834ae8 make Term::as_slice public (#1846) 2023-02-09 15:37:07 +01:00
trinity-1686a
3ac973bea4 fix invalid endianness in documentation (#1845)
* fix doc about term endianness

* rustfmt
2023-02-09 15:36:38 +01:00
Paul Masurel
405e2cf4d9 Merge with main 2023-02-09 14:28:57 +01:00
Paul Masurel
b63c6c27bc adding change from main 2023-02-09 14:18:46 +01:00
Paul Masurel
bd5eea9852 Integrated columnar work. 2023-02-09 13:14:31 +01:00
PSeitz
0f20787917 fix doc store cache docs (#1821)
* fix doc store cache docs

addresses an issue reported in #1820

* rename doc_store_cache_size
2023-01-23 07:06:49 +01:00
Paul Masurel
2874554ee4 Removed the sorting logic that forced column type to be sorted like (#1816)
* Removed the sorting logic that forced column type to be sorted like
ColumnTypes.

* add comments

Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2023-01-20 12:43:28 +01:00
PSeitz
cbc70a9eae Cargo.toml cleanup (#1817) 2023-01-20 12:30:35 +01:00
PSeitz
226d0f88bc add columnar to workspace (#1808) 2023-01-20 11:47:10 +01:00
Paul Masurel
9548570e88 Fixing broken test build 2023-01-20 18:18:32 +09:00
Paul Masurel
9a296b29b7 Renamed dense file to dense.rs 2023-01-20 17:22:25 +09:00
PSeitz
b31fd389d8 collect columns for merge (#1812)
* collect columns for merge

* return column_type from, fix visibility

* fix

Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-01-20 07:58:29 +01:00
Paul Masurel
89cec79813 Make it possible to force a column type and intricate bugfix. (#1815) 2023-01-20 14:30:56 +09:00
PSeitz
d09d91a856 fix tests (#1813) 2023-01-19 23:41:21 +09:00
PSeitz
50d8a8bc32 Update README (#1804)
Some parts are outdated

For the debugging tutorial, debugging is really easy now with VSCode, and there are plenty of other sources for debugging rust
2023-01-19 18:09:45 +09:00
Paul Masurel
08919a2900 Improvement on the scalar / random bitpacker code. (#1781)
* Improvement on the scalar / random bitpacker code.

Added proptesting
Added simple benchmark
Added assert and comments on the very non trivial hidden contract
Remove the need for an extra padding.

The last point introduces a small performance regression (~10%).

* Fixing unit tests
2023-01-19 18:09:13 +09:00
Lonre Wang
8ba333f1b4 Typo fix (#1803)
* Update text_options.rs

* Update src/schema/text_options.rs

Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-01-19 17:56:05 +09:00
PSeitz
a2ca12995e update aggregation docs (#1807) 2023-01-19 09:52:47 +01:00
Paul Masurel
e3d504d833 Minor code cleanup (#1810) 2023-01-19 17:47:26 +09:00
Paul Masurel
5a42c5aae9 Add support for multivalues (#1809) 2023-01-19 16:55:01 +09:00
Paul Masurel
a86b104a40 Differentiating between str and bytes, + unit test 2023-01-19 14:38:12 +09:00
PSeitz
f9abd256b7 add ip addr to columnar (#1805) 2023-01-19 05:36:06 +01:00
Paul Masurel
9f42b6440a Completed unit test for dictionary encoded column 2023-01-19 12:15:27 +09:00
Paul Masurel
c723ed3f0b Columnar merge (#1806) 2023-01-19 11:52:27 +09:00
trinity-1686a
d72ea7d353 modify getters for sstable metadata (#1793)
* add way to get up to `limit` terms from sstable

* make some function of sstable load less data

* add some tests to sstable

* add tests on sstable dictionary

* fix some bugs with sstable
2023-01-18 14:42:55 +01:00
Paul Masurel
5180b612ef Removing the demuxer code (#1799) 2023-01-18 16:12:35 +09:00
PSeitz
f687b3a5aa start migrate Field to &str (#1772)
start migrate Field to &str in preparation of columnar
return Result for get_field
2023-01-18 16:12:07 +09:00
PSeitz
c4af63e588 add rename (#1797) 2023-01-18 13:28:37 +09:00
Adrien Guillo
4b343b3189 Merge pull request #1802 from quickwit-oss/guilload/clippy-fixes
Fix some Clippy warnings
2023-01-17 10:39:55 -05:00
Adrien Guillo
c51d9f9f83 Fix some Clippy warnings 2023-01-17 10:17:51 -05:00
Adrien Guillo
c9cb3d04bf Merge pull request #1788 from quickwit-oss/guilload/remove-std-dev-from-stats-agg
Remove standard deviation from stats aggregation
2023-01-16 23:16:36 -05:00
Adrien Guillo
0caaf13a90 Remove standard deviation from stats aggregation 2023-01-16 22:58:23 -05:00