PSeitz
7ce8a65619
fix: doc store for files larger than 4GB ( #1856 )
...
Fixes an issue in the skip list deserialization, which incorrectly deserialized the byte start offset as u32.
`get_doc` fails for any doc that lives in a block with a start offset larger than u32::MAX (~4GB).
This causes index corruption if a segment with a doc store larger than 4GB is merged.
tantivy version 0.19 is affected.
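A minimal sketch of the failure mode described above, not the actual tantivy code: the fixed-width little-endian encoding and the function names `decode_offset_truncated` and `decode_offset_full` are assumptions made for illustration. It only shows why reading a 64-bit byte offset as a u32 silently points to the wrong block once the offset exceeds u32::MAX.

```rust
/// Hypothetical buggy decoder: only the low 4 bytes of the offset are read,
/// so any offset above u32::MAX wraps around.
fn decode_offset_truncated(bytes: &[u8; 8]) -> u64 {
    let low = [bytes[0], bytes[1], bytes[2], bytes[3]];
    u64::from(u32::from_le_bytes(low))
}

/// Hypothetical fixed decoder: all 8 bytes are read as a u64.
fn decode_offset_full(bytes: &[u8; 8]) -> u64 {
    u64::from_le_bytes(*bytes)
}
```

With a block starting at 5 GiB (`5 * 1024 * 1024 * 1024`), the full decoder returns the original offset, while the truncated one returns 1 GiB (the offset modulo 2^32). Offsets below 4 GiB decode identically in both versions, which is why the bug stays hidden on small doc stores.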
quickwit-0.5-rev
2023-03-13 15:07:55 +09:00
PSeitz
7bf0a14041
fix: auto downgrade index record option, instead of vint error ( #1857 )
...
Prev: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: IoError(Custom { kind: InvalidData, error: "Reach end of buffer while reading VInt" })', src/main.rs:46:14
Now: Automatic downgrade to next available level
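To illustrate the kind of error quoted above, here is a sketch of a generic LEB128-style variable-length integer reader. The byte layout (continuation bit set on every byte except the last) and the function name `read_vint` are assumptions and may differ from tantivy's own VInt encoding. The point is only that a decoder which runs out of bytes before seeing a terminating byte has to report an `InvalidData` error.

```rust
use std::io;

/// Reads one variable-length integer from the start of `buf`.
/// Returns the decoded value and the number of bytes consumed.
/// Assumed layout: 7 payload bits per byte, high bit set means "more bytes follow".
fn read_vint(buf: &[u8]) -> io::Result<(u64, usize)> {
    let mut value: u64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in buf.iter().enumerate() {
        if shift >= 64 {
            // More continuation bytes than a u64 can hold.
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "VInt is too long for a u64",
            ));
        }
        value |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            // High bit clear: this was the last byte of the integer.
            return Ok((value, i + 1));
        }
        shift += 7;
    }
    // The buffer ended while a continuation byte was still pending.
    Err(io::Error::new(
        io::ErrorKind::InvalidData,
        "Reach end of buffer while reading VInt",
    ))
}
```

Calling `.unwrap()` on the `Err` branch is what turns a short buffer into a panic like the one in the "Prev" line; propagating the error to the caller avoids it.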
2023-03-13 15:07:28 +09:00
PSeitz
c91d4e4e65
fix sort order test for term aggregation ( #1858 )
...
fix sort order test for term aggregation
fix invalid request test
2023-03-13 13:49:10 +08:00
PSeitz
6f6f639170
fmt code, update lz4_flex ( #1838 )
...
formatting on nightly changed
2023-03-13 14:14:15 +09:00
Paul Masurel
a022e97dc2
Bumped tantivy version
2023-03-13 14:10:41 +09:00
Paul Masurel
6474a0f58e
Created branch specifically for Quickwit 0.5
2023-03-11 12:27:20 +09:00
PSeitz
0f20787917
fix doc store cache docs ( #1821 )
...
* fix doc store cache docs
addresses an issue reported in #1820
* rename doc_store_cache_size
2023-01-23 07:06:49 +01:00
Paul Masurel
2874554ee4
Removed the sorting logic that forced column type to be sorted like ( #1816 )
...
* Removed the sorting logic that forced column type to be sorted like
ColumnTypes.
* add comments
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2023-01-20 12:43:28 +01:00
PSeitz
cbc70a9eae
Cargo.toml cleanup ( #1817 )
2023-01-20 12:30:35 +01:00
PSeitz
226d0f88bc
add columnar to workspace ( #1808 )
2023-01-20 11:47:10 +01:00
Paul Masurel
9548570e88
Fixing broken test build
2023-01-20 18:18:32 +09:00
Paul Masurel
9a296b29b7
Renamed dense file to dense.rs
2023-01-20 17:22:25 +09:00
PSeitz
b31fd389d8
collect columns for merge ( #1812 )
...
* collect columns for merge
* return column_type from, fix visibility
* fix
Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-01-20 07:58:29 +01:00
Paul Masurel
89cec79813
Make it possible to force a column type and intricate bugfix. ( #1815 )
2023-01-20 14:30:56 +09:00
PSeitz
d09d91a856
fix tests ( #1813 )
2023-01-19 23:41:21 +09:00
PSeitz
50d8a8bc32
Update README ( #1804 )
...
Some parts are outdated
For the debugging tutorial, debugging is really easy now with VSCode, and there are plenty of other resources on debugging Rust
2023-01-19 18:09:45 +09:00
Paul Masurel
08919a2900
Improvement on the scalar / random bitpacker code. ( #1781 )
...
* Improvement on the scalar / random bitpacker code.
Added proptesting
Added simple benchmark
Added asserts and comments on the highly non-trivial hidden contract
Remove the need for an extra padding.
The last point introduces a small performance regression (~10%).
* Fixing unit tests
2023-01-19 18:09:13 +09:00
Lonre Wang
8ba333f1b4
Typo fix ( #1803 )
...
* Update text_options.rs
* Update src/schema/text_options.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-01-19 17:56:05 +09:00
PSeitz
a2ca12995e
update aggregation docs ( #1807 )
2023-01-19 09:52:47 +01:00
Paul Masurel
e3d504d833
Minor code cleanup ( #1810 )
2023-01-19 17:47:26 +09:00
Paul Masurel
5a42c5aae9
Add support for multivalues ( #1809 )
2023-01-19 16:55:01 +09:00
Paul Masurel
a86b104a40
Differentiating between str and bytes, + unit test
2023-01-19 14:38:12 +09:00
PSeitz
f9abd256b7
add ip addr to columnar ( #1805 )
2023-01-19 05:36:06 +01:00
Paul Masurel
9f42b6440a
Completed unit test for dictionary encoded column
2023-01-19 12:15:27 +09:00
Paul Masurel
c723ed3f0b
Columnar merge ( #1806 )
2023-01-19 11:52:27 +09:00
trinity-1686a
d72ea7d353
modify getters for sstable metadata ( #1793 )
...
* add way to get up to `limit` terms from sstable
* make some functions of sstable load less data
* add some tests to sstable
* add tests on sstable dictionary
* fix some bugs with sstable
2023-01-18 14:42:55 +01:00
Paul Masurel
5180b612ef
Removing the demuxer code ( #1799 )
2023-01-18 16:12:35 +09:00
PSeitz
f687b3a5aa
start migrate Field to &str ( #1772 )
...
start migrating Field to &str in preparation for columnar
return Result for get_field
2023-01-18 16:12:07 +09:00
PSeitz
c4af63e588
add rename ( #1797 )
2023-01-18 13:28:37 +09:00
Adrien Guillo
4b343b3189
Merge pull request #1802 from quickwit-oss/guilload/clippy-fixes
...
Fix some Clippy warnings
2023-01-17 10:39:55 -05:00
Adrien Guillo
c51d9f9f83
Fix some Clippy warnings
2023-01-17 10:17:51 -05:00
Adrien Guillo
c9cb3d04bf
Merge pull request #1788 from quickwit-oss/guilload/remove-std-dev-from-stats-agg
...
Remove standard deviation from stats aggregation
2023-01-16 23:16:36 -05:00
Adrien Guillo
0caaf13a90
Remove standard deviation from stats aggregation
2023-01-16 22:58:23 -05:00
Adrien Guillo
a59bd965cc
Merge pull request #1794 from quickwit-oss/guilload/count-min-max-sum-aggs
...
Add count, min, max, and sum aggregations
2023-01-16 22:45:01 -05:00
Adrien Guillo
f2dad194ea
Add count, min, max, and sum aggregations
2023-01-16 12:22:20 -05:00
Paul Masurel
25bad784ad
Integrated fastfield codecs into columnar. ( #1782 )
...
Introduced asymmetric OptionalCodec / SerializableOptionalCodec
Removed cardinality from the columnar sstable.
Added DynamicColumn
Reorganized all files
Change DenseCodec serialization logic.
Renamed methods to rank/select
Moved versioning footer to the columnar level
2023-01-16 17:24:49 +09:00
PSeitz
4bac945709
add ip field example ( #1775 )
2023-01-16 06:06:11 +01:00
trinity-1686a
16b704e190
make file_slice_for_range on sstable public ( #1784 )
2023-01-16 13:59:57 +09:00
PSeitz
6ca9a477f3
reuse stats for average ( #1785 )
...
* reuse stats for average
* fix count type
2023-01-13 23:32:27 +08:00
Shikhar Bhushan
2650111b76
EnableScoring::Disabled - optional Searcher ( #1780 )
2023-01-12 09:26:50 -05:00
PSeitz
1176555eff
handle user input on get_docid_for_value_range ( #1760 )
...
* handle user input on get_docid_for_value_range
fixes #1757
* pass range as parameter
2023-01-12 14:20:16 +01:00
Adrien Guillo
f8d111a75e
Merge pull request #1777 from quickwit-oss/guilload/ff-range-query-on-not-indexed-fields
...
Allow range queries via fast fields on non-indexed fields
2023-01-11 10:14:32 -05:00
Adrien Guillo
e17996f2fd
Allow range queries via fast fields on non-indexed fields
2023-01-11 09:56:13 -05:00
Adrien Guillo
f3621c0487
Add license to tokenizer-api crate ( #1778 )
2023-01-11 05:26:41 +01:00
Adrien Guillo
14222a47a3
Fix typo ( #1776 )
2023-01-11 00:49:13 +09:00
Adam Reichold
8312c882a5
More cosmetic fixes for upcoming Clippy lints. ( #1771 )
2023-01-10 10:32:45 +01:00
Paul Masurel
7a8fce0ae7
Minor mini fixes
2023-01-10 14:15:30 +09:00
Michael Kleen
196e42f33e
Add regex tokenizer ( #1759 )
...
This adds a regex tokenizer, which tokenizes text by splitting it
on a regex pattern.
Co-authored-by: Michael Kleen <mkleen@gmailw.com>
2023-01-10 13:38:37 +09:00
Adam Reichold
82a183bc2d
Bump dependency on lru from version 0.7.5 to version 0.9.0. ( #1755 )
2023-01-10 13:35:37 +09:00
dependabot[bot]
3090d49615
Update base64 requirement from 0.20.0 to 0.21.0 ( #1769 )
...
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases)
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.20.0...v0.21.0)
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-10 13:35:05 +09:00