Paul Masurel
4640fae516
Added solution to force the type of a column.
2023-01-17 15:13:41 +09:00
Adrien Guillo
c9cb3d04bf
Merge pull request #1788 from quickwit-oss/guilload/remove-std-dev-from-stats-agg
...
Remove standard deviation from stats aggregation
2023-01-16 23:16:36 -05:00
Adrien Guillo
0caaf13a90
Remove standard deviation from stats aggregation
2023-01-16 22:58:23 -05:00
Adrien Guillo
a59bd965cc
Merge pull request #1794 from quickwit-oss/guilload/count-min-max-sum-aggs
...
Add count, min, max, and sum aggregations
2023-01-16 22:45:01 -05:00
Adrien Guillo
f2dad194ea
Add count, min, max, and sum aggregations
2023-01-16 12:22:20 -05:00
Paul Masurel
25bad784ad
Integrated fastfield codecs into columnar. ( #1782 )
...
Introduced asymmetric OptionalCodec / SerializableOptionalCodec
Removed cardinality from the columnar sstable.
Added DynamicColumn
Reorganized all files
Change DenseCodec serialization logic.
Renamed methods to rank/select
Moved versioning footer to the columnar level
2023-01-16 17:24:49 +09:00
PSeitz
4bac945709
add ip field example ( #1775 )
2023-01-16 06:06:11 +01:00
trinity-1686a
16b704e190
make file_slice_for_range on sstable public ( #1784 )
2023-01-16 13:59:57 +09:00
PSeitz
6ca9a477f3
reuse stats for average ( #1785 )
...
* reuse stats for average
* fix count type
2023-01-13 23:32:27 +08:00
Shikhar Bhushan
2650111b76
EnableScoring::Disabled - optional Searcher ( #1780 )
2023-01-12 09:26:50 -05:00
PSeitz
1176555eff
handle user input on get_docid_for_value_range ( #1760 )
...
* handle user input on get_docid_for_value_range
fixes #1757
* pass range as parameter
2023-01-12 14:20:16 +01:00
Adrien Guillo
f8d111a75e
Merge pull request #1777 from quickwit-oss/guilload/ff-range-query-on-not-indexed-fields
...
Allow range queries via fast fields on non-indexed fields
2023-01-11 10:14:32 -05:00
Adrien Guillo
e17996f2fd
Allow range queries via fast fields on non-indexed fields
2023-01-11 09:56:13 -05:00
Adrien Guillo
f3621c0487
Add license to tokenizer-api crate ( #1778 )
2023-01-11 05:26:41 +01:00
Adrien Guillo
14222a47a3
Fix typo ( #1776 )
2023-01-11 00:49:13 +09:00
Adam Reichold
8312c882a5
More cosmetic fixes for upcoming Clippy lints. ( #1771 )
2023-01-10 10:32:45 +01:00
Paul Masurel
7a8fce0ae7
Minor mini fixes
2023-01-10 14:15:30 +09:00
Michael Kleen
196e42f33e
Add regex tokenizer ( #1759 )
...
This adds a regex tokenizer which tokenizes the text by using a
regex pattern to split.
Co-authored-by: Michael Kleen <mkleen@gmailw.com >
2023-01-10 13:38:37 +09:00
Adam Reichold
82a183bc2d
Bump dependency on lru from version 0.7.5 to version 0.9.0. ( #1755 )
2023-01-10 13:35:37 +09:00
dependabot[bot]
3090d49615
Update base64 requirement from 0.20.0 to 0.21.0 ( #1769 )
...
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64 ) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases )
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.20.0...v0.21.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-10 13:35:05 +09:00
PSeitz
7c6cc818ae
enable range query on fast field for u64 compatible types ( #1762 )
...
* enable range query on fast field for u64 compatible types
* rename, update benches
2023-01-10 04:08:26 +01:00
PSeitz
514d23a20c
move tokenizer API to separate crate ( #1767 )
...
closes #1766
Finding tantivy tokenizers is a frustrating experience currently, since
they need to be updated for each tantivy version. That's unnecessary since
the API is rather stable anyway.
2023-01-09 06:37:38 +01:00
Paul Masurel
4f9efe654c
Support for columnar ( #1734 )
...
* Added support for dynamic fast field.
See README for more information.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
2023-01-07 17:37:00 +09:00
Adam Reichold
1afa5bf3db
Make construction of LevenshteinAutomatonBuilder for FuzzyTermQuery instances lazy. ( #1756 )
2023-01-06 12:44:49 +09:00
PSeitz
07a51eb7c8
refactor multivalue fastfield, refactor range query ( #1749 )
...
Introduce MakeZero trait, remove make_zero from FastValue
Merge two multivalue fastfield implementations into one
prepare range query on fastfield for different types
2023-01-05 12:09:50 +01:00
Adam Reichold
2080c370c2
Enable usage of FuzzyTermQuery for specific fields via QueryParser ( #1750 )
...
* Make nightly Clippy mostly happy.
* Document how to produce TermSetQuery queries using QueryParser.
* Enable construction of queries using FuzzyTermQuery via the QueryParser
* Use FxHashMap instead of HashMap in the QueryParser as these hash tables are not exposed to DoS attacks.
* Use a struct instead of a tuple to improve readability.
2023-01-04 18:11:27 +09:00
Daw-Chih Liou
b22f96624e
doc: update comments in the faceted search example ( #1737 )
...
* doc: update comments in the faceted search example
* chore: format
2023-01-02 11:07:30 +01:00
pinkforest(she/her)
b78dc5e313
Bump prettytables ( #1746 )
2022-12-31 15:01:39 +01:00
Paul Masurel
3f915925af
Fixing unit tests
2022-12-27 12:02:16 +09:00
Paul Masurel
9c5fef5af7
Fixing sstable proptest ( #1743 )
2022-12-26 16:29:33 +09:00
Paul Masurel
9948a84ebe
Simplifies the count_ones definition. ( #1742 )
2022-12-26 16:08:01 +09:00
PSeitz
45156fd869
use group_by in translate_codec_idx_to_original_id ( #1736 )
2022-12-26 06:13:29 +01:00
Paul Masurel
bc959006fa
Ooops. Removing ordered_floats.
2022-12-22 19:50:34 +09:00
Paul Masurel
7385a8f80c
Supporting PartialCmp in VectorColumn. ( #1735 )
...
* Supporting PartialCmp in VectorColumn.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
2022-12-22 17:47:25 +09:00
Paul Masurel
13b89cba17
Adding inlines.
2022-12-22 14:29:41 +09:00
Hasnain Lakhani
f4804ce2f5
Adjust spelling of "returns" in docs for DisjunctionMaxQuery ( #1733 )
2022-12-22 14:04:07 +09:00
Paul Masurel
2a6d1eaf78
Added missing license.
2022-12-22 12:47:43 +09:00
Paul Masurel
540a9972bd
Support for NotNaN in fast fields
2022-12-22 12:28:25 +09:00
Paul Masurel
bb48c3e488
Refactoring to prepare for the addition of dynamic fast field ( #1730 )
...
* Refactoring to prepare for the addition of dynamic fast field
- Exposing insert_key / insert_value
- Renamed SSTable::{Reader/Writer}-> SSTable::{ValueReader/ValueWriter}
- Added a generic Dictionary object in the sstable crate
- Removing the TermDictionary wrapper from tantivy, relying directly on
an alias of the generic Dictionary object.
- dropped the use of byteorder in sstable.
- Stopped scanning / reading the entire dictionary when streaming a range.
* Added a benchmark for streaming sstable ranges.
* CR comments.
Rename deserialize_u64 -> deserialize_vint_u64
* Removed needless allocation, split serialize into serialize and clear.
2022-12-22 12:25:46 +09:00
Paul Masurel
3339a3ec05
Removed feature(quickwit) in tantivy-common.
2022-12-22 10:19:57 +09:00
Paul Masurel
f39165e1e7
Moving FileSlice to tantivy-common ( #1729 )
2022-12-21 16:35:11 +09:00
Paul Masurel
32cb1d22da
Removed AsyncIoResult. ( #1728 )
2022-12-21 16:01:17 +09:00
Paul Masurel
4a6bf50e78
Clippy
2022-12-21 15:43:34 +09:00
PSeitz
2ac1cc2fc0
add sparse codec ( #1723 )
...
* add sparse codec
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* add the -1 u16 fix for metadata num_vals
* add dense block encoding to sparse codec
* add comment, refactor u16 reading
Co-authored-by: Paul Masurel <paul@quickwit.io >
2022-12-20 15:30:33 +01:00
PSeitz
f9171a3981
fix clippy ( #1725 )
...
* fix clippy
* fix clippy fastfield codecs
* fix clippy bitpacker
* fix clippy common
* fix clippy stacker
* fix clippy sstable
* fmt
2022-12-20 07:30:06 +01:00
PSeitz
a2cf6a79b4
Sparse dense index ( #1716 )
...
* add dense codec
* benchmark fix and important optimisation
* move code to DenseIndexBlock
improve benchmark
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* extend benchmarks
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
Co-authored-by: Paul Masurel <paul@quickwit.io >
2022-12-13 07:50:09 +01:00
Paul Masurel
f6e87a5319
Cargo fmt
2022-12-13 12:30:40 +09:00
Paul Masurel
f9971e15fe
Fixing unit test with sstable test.
2022-12-13 12:22:44 +09:00
PSeitz
3cdc8e7472
pass index info to serialize ( #1719 )
2022-12-13 04:20:31 +01:00
dependabot[bot]
fbb0f8b55d
Update base64 requirement from 0.13.0 to 0.20.0 ( #1720 )
...
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64 ) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases )
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.13.0...v0.20.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-13 11:46:23 +09:00