Pascal Seitz
e07f1970ea
fix count type
2023-01-13 20:10:23 +08:00
Pascal Seitz
78273bfb0d
reuse stats for average
2023-01-13 17:43:25 +08:00
Shikhar Bhushan
2650111b76
EnableScoring::Disabled - optional Searcher ( #1780 )
2023-01-12 09:26:50 -05:00
PSeitz
1176555eff
handle user input on get_docid_for_value_range ( #1760 )
...
* handle user input on get_docid_for_value_range
fixes #1757
* pass range as parameter
2023-01-12 14:20:16 +01:00
Adrien Guillo
f8d111a75e
Merge pull request #1777 from quickwit-oss/guilload/ff-range-query-on-not-indexed-fields
...
Allow range queries via fast fields on non-indexed fields
2023-01-11 10:14:32 -05:00
Adrien Guillo
e17996f2fd
Allow range queries via fast fields on non-indexed fields
2023-01-11 09:56:13 -05:00
Adrien Guillo
f3621c0487
Add license to tokenizer-api crate ( #1778 )
2023-01-11 05:26:41 +01:00
Adrien Guillo
14222a47a3
Fix typo ( #1776 )
2023-01-11 00:49:13 +09:00
Adam Reichold
8312c882a5
More cosmetic fixes for upcoming Clippy lints. ( #1771 )
2023-01-10 10:32:45 +01:00
Paul Masurel
7a8fce0ae7
Minor mini fixes
2023-01-10 14:15:30 +09:00
Michael Kleen
196e42f33e
Add regex tokenizer ( #1759 )
...
This adds a regex tokenizer which tokenizes the text by using a
regex pattern to split.
Co-authored-by: Michael Kleen <mkleen@gmailw.com >
2023-01-10 13:38:37 +09:00
Adam Reichold
82a183bc2d
Bump dependency on lru from version 0.7.5 to version 0.9.0. ( #1755 )
2023-01-10 13:35:37 +09:00
dependabot[bot]
3090d49615
Update base64 requirement from 0.20.0 to 0.21.0 ( #1769 )
...
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64 ) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases )
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.20.0...v0.21.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-10 13:35:05 +09:00
PSeitz
7c6cc818ae
enable range query on fast field for u64 compatible types ( #1762 )
...
* enable range query on fast field for u64 compatible types
* rename, update benches
2023-01-10 04:08:26 +01:00
PSeitz
514d23a20c
move tokenizer API to separate crate ( #1767 )
...
closes #1766
Finding tantivy tokenizers is currently a frustrating experience, since
they need to be updated for each tantivy version. That's unnecessary since
the API is rather stable anyway.
2023-01-09 06:37:38 +01:00
Paul Masurel
4f9efe654c
Support for columnar ( #1734 )
...
* Added support for dynamic fast field.
See README for more information.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
2023-01-07 17:37:00 +09:00
Adam Reichold
1afa5bf3db
Make construction of LevenshteinAutomatonBuilder for FuzzyTermQuery instances lazy. ( #1756 )
2023-01-06 12:44:49 +09:00
PSeitz
07a51eb7c8
refactor multivalue fastfield, refactor range query ( #1749 )
...
Introduce MakeZero trait, remove make_zero from FastValue
Merge two multivalue fastfield implementations into one
prepare range query on fastfield for different types
2023-01-05 12:09:50 +01:00
Adam Reichold
2080c370c2
Enable usage of FuzzyTermQuery for specific fields via QueryParser ( #1750 )
...
* Make nightly Clippy mostly happy.
* Document how to produce TermSetQuery queries using QueryParser.
* Enable construction of queries using FuzzyTermQuery via the QueryParser
* Use FxHashMap instead of HashMap in the QueryParser as these hash tables are not exposed to DoS attacks.
* Use a struct instead of a tuple to improve readability.
2023-01-04 18:11:27 +09:00
Daw-Chih Liou
b22f96624e
doc: update comments in the faceted search example ( #1737 )
...
* doc: update comments in the faceted search example
* chore: format
2023-01-02 11:07:30 +01:00
pinkforest(she/her)
b78dc5e313
Bump prettytables ( #1746 )
2022-12-31 15:01:39 +01:00
Paul Masurel
3f915925af
Fixing unit tests
2022-12-27 12:02:16 +09:00
Paul Masurel
9c5fef5af7
Fixing sstable proptest ( #1743 )
2022-12-26 16:29:33 +09:00
Paul Masurel
9948a84ebe
Simplifies the count_ones definition. ( #1742 )
2022-12-26 16:08:01 +09:00
PSeitz
45156fd869
use group_by in translate_codec_idx_to_original_id ( #1736 )
2022-12-26 06:13:29 +01:00
Paul Masurel
bc959006fa
Ooops. Removing ordered_floats.
2022-12-22 19:50:34 +09:00
Paul Masurel
7385a8f80c
Supporting PartialCmp in VectorColumn. ( #1735 )
...
* Supporting PartialCmp in VectorColumn.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
2022-12-22 17:47:25 +09:00
Paul Masurel
13b89cba17
Adding inlines.
2022-12-22 14:29:41 +09:00
Hasnain Lakhani
f4804ce2f5
Adjust spelling of "returns" in docs for DisjunctionMaxQuery ( #1733 )
2022-12-22 14:04:07 +09:00
Paul Masurel
2a6d1eaf78
Added missing license.
2022-12-22 12:47:43 +09:00
Paul Masurel
540a9972bd
Support for NotNaN in fast fields
2022-12-22 12:28:25 +09:00
Paul Masurel
bb48c3e488
Refactoring to prepare for the addition of dynamic fast field ( #1730 )
...
* Refactoring to prepare for the addition of dynamic fast field
- Exposing insert_key / insert_value
- Renamed SSTable::{Reader/Writer}-> SSTable::{ValueReader/ValueWriter}
- Added a generic Dictionary object in the sstable crate
- Removing the TermDictionary wrapper from tantivy, relying directly on
an alias of the generic Dictionary object.
- dropped the use of byteorder in sstable.
- Stopped scanning / reading the entire dictionary when streaming a range.
* Added a benchmark for streaming sstable ranges.
* CR comments.
Rename deserialize_u64 -> deserialize_vint_u64
* Removed needless allocation, split serialize into serialize and clear.
2022-12-22 12:25:46 +09:00
Paul Masurel
3339a3ec05
Removed feature(quickwit) in tantivy-common.
2022-12-22 10:19:57 +09:00
Paul Masurel
f39165e1e7
Moving FileSlice to tantivy-common ( #1729 )
2022-12-21 16:35:11 +09:00
Paul Masurel
32cb1d22da
Removed AsyncIoResult. ( #1728 )
2022-12-21 16:01:17 +09:00
Paul Masurel
4a6bf50e78
Clippy
2022-12-21 15:43:34 +09:00
PSeitz
2ac1cc2fc0
add sparse codec ( #1723 )
...
* add sparse codec
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* add the -1 u16 fix for metadata num_vals
* add dense block encoding to sparse codec
* add comment, refactor u16 reading
Co-authored-by: Paul Masurel <paul@quickwit.io >
2022-12-20 15:30:33 +01:00
PSeitz
f9171a3981
fix clippy ( #1725 )
...
* fix clippy
* fix clippy fastfield codecs
* fix clippy bitpacker
* fix clippy common
* fix clippy stacker
* fix clippy sstable
* fmt
2022-12-20 07:30:06 +01:00
PSeitz
a2cf6a79b4
Sparse dense index ( #1716 )
...
* add dense codec
* benchmark fix and important optimisation
* move code to DenseIndexBlock
improve benchmark
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* extend benchmarks
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
2022-12-13 07:50:09 +01:00
Paul Masurel
f6e87a5319
Cargo fmt
2022-12-13 12:30:40 +09:00
Paul Masurel
f9971e15fe
Fixing unit test with sstable test.
2022-12-13 12:22:44 +09:00
PSeitz
3cdc8e7472
pass index info to serialize ( #1719 )
2022-12-13 04:20:31 +01:00
dependabot[bot]
fbb0f8b55d
Update base64 requirement from 0.13.0 to 0.20.0 ( #1720 )
...
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64 ) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases )
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md )
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.13.0...v0.20.0 )
---
updated-dependencies:
- dependency-name: base64
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-13 11:46:23 +09:00
Paul Masurel
136a8f4124
Isolating sstable and stacker in independent crates. ( #1718 )
...
Both crates will be used in the new (optional + dynamic) fastfield work.
2022-12-13 11:44:17 +09:00
PSeitz
5d4535de83
Changelog fix ( #1717 )
2022-12-12 14:28:42 +09:00
PSeitz
2c50b02eb3
Fix max bucket limit in histogram ( #1703 )
...
* Fix max bucket limit in histogram
The max bucket limit in histogram was broken because some code introduced temporary filtering of buckets, which then resulted in an incorrect increment of the bucket count.
The provided solution covers more scenarios, but some scenarios are still unhandled (see #1702 ).
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
0.19
2022-12-12 04:40:15 +01:00
PSeitz
509adab79d
Bump version ( #1715 )
...
* group workspace deps
* update cargo.toml
* revert tant version
* chore: Release
2022-12-12 04:39:43 +01:00
PSeitz
96c93a6ba3
Merge pull request #1700 from quickwit-oss/PSeitz-patch-1
...
Update CHANGELOG.md
2022-12-02 16:31:11 +01:00
boraarslan
495824361a
Move split_full_path to Schema ( #1692 )
2022-11-29 20:56:13 +09:00
PSeitz
485a8f507e
Update CHANGELOG.md
2022-11-28 15:41:31 +01:00