436 Commits

Author SHA1 Message Date
PSeitz
47009ed2d3 remove unused deps (#2264)
found with cargo machete
remove pprof (doesn't work)
2023-11-20 02:59:59 +01:00
Paul Masurel
9caab45136 Preparing for 0.21.2 release. (#2256) 2023-11-15 10:43:36 +09:00
dependabot[bot]
7a2c5804b1 Update itertools requirement from 0.11.0 to 0.12.0 (#2255)
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-11-15 01:03:08 +01:00
PSeitz
927b4432c9 Perf: use term hashmap in fastfield (#2243)
* add shared arena hashmap

* bench fastfield indexing

* use shared arena hashmap in columnar

lower minimum resize in hashtable

* clippy

* add comments
2023-11-09 13:44:02 +01:00
PSeitz
2e7327205d fix coverage run (#2232)
coverage run uses the compare_hash_only feature which is not compativle
with the test_hashmap_size test
2023-11-06 11:18:38 +00:00
trinity-1686a
0d4589219b encode some part of posting list as -1 instead of direct values (#2185)
* add support for delta-1 encoding posting list

* encode term frequency minus one

* don't emit tf for json integer terms

* make skipreader not pub(crate) mutable
2023-10-20 16:58:26 +02:00
dependabot[bot]
337ffadefd Update lru requirement from 0.11.0 to 0.12.0 (#2208)
Updates the requirements on [lru](https://github.com/jeromefroe/lru-rs) to permit the latest version.
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.11.0...0.12.0)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-12 12:09:56 +02:00
dependabot[bot]
22aa4daf19 Update zstd requirement from 0.12 to 0.13 (#2214)
Updates the requirements on [zstd](https://github.com/gyscos/zstd-rs) to permit the latest version.
- [Release notes](https://github.com/gyscos/zstd-rs/releases)
- [Commits](https://github.com/gyscos/zstd-rs/compare/v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: zstd
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-12 04:24:44 +02:00
PSeitz
493f9b2f2a Read list of JSON fields encoded in dictionary (#2184)
* Read list of JSON fields encoded in dictionary

add method to get list of fields on InvertedIndexReader

* add field type
2023-10-09 12:06:22 +02:00
dependabot[bot]
166fc15239 Update memmap2 requirement from 0.7.1 to 0.9.0 (#2204)
Updates the requirements on [memmap2](https://github.com/RazrFalcon/memmap2-rs) to permit the latest version.
- [Changelog](https://github.com/RazrFalcon/memmap2-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/RazrFalcon/memmap2-rs/compare/v0.7.1...v0.9.0)

---
updated-dependencies:
- dependency-name: memmap2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-04 05:00:46 +02:00
dependabot[bot]
82d9127191 Update fs4 requirement from 0.6.3 to 0.7.0 (#2199)
Updates the requirements on [fs4](https://github.com/al8n/fs4-rs) to permit the latest version.
- [Release notes](https://github.com/al8n/fs4-rs/releases)
- [Commits](https://github.com/al8n/fs4-rs/commits/0.7.0)

---
updated-dependencies:
- dependency-name: fs4
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-03 04:43:09 +02:00
PSeitz
2d7390341c increase min memory to 15MB for indexing (#2176)
With tantivy 0.20 the minimum memory consumption per SegmentWriter increased to
12MB. 7MB are for the different fast field collectors types (they could be
lazily created). Increase the minimum memory from 3MB to 15MB.

Change memory variable naming from arena to budget.

closes #2156
2023-09-13 07:38:34 +02:00
PSeitz
49448b31c6 chore: Release (#2168)
* chore: Release

* update CHANGELOG
2023-09-01 13:58:58 +02:00
Harrison Burt
267dfe58d7 Fix testing on windows (#2155)
* Fix missing trait imports

* Fix building tests on windows

* Revert other PR change
2023-08-27 09:20:44 +09:00
Adam Reichold
820f126075 Remove support for Brotli and Snappy compression (#2123)
LZ4 provides fast and simple compression whereas Zstd is exceptionally flexible
so that the additional support for Brotli and Snappy does not really add
any distinct functionality on top of those two algorithms.

Removing them reduces our maintenance burden and reduces the number of choices
users have to make when setting up their project based on Tantivy.
2023-07-14 16:54:59 +09:00
dependabot[bot]
7f51d85bbd Update lru requirement from 0.10.0 to 0.11.0 (#2117)
Updates the requirements on [lru](https://github.com/jeromefroe/lru-rs) to permit the latest version.
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.10.0...0.11.0)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-13 09:42:21 +09:00
dependabot[bot]
7575f9bf1c Update itertools requirement from 0.10.3 to 0.11.0 (#2098)
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.10.5...v0.11.0)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-07 11:14:46 +02:00
PSeitz
040554f2f9 Update to lz4_flex 0.11 (#2106) 2023-06-29 14:16:00 +08:00
dependabot[bot]
1a1f252a3f Update memmap2 requirement from 0.6.0 to 0.7.1 (#2104)
Updates the requirements on [memmap2](https://github.com/RazrFalcon/memmap2-rs) to permit the latest version.
- [Changelog](https://github.com/RazrFalcon/memmap2-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/RazrFalcon/memmap2-rs/compare/v0.6.0...v0.7.1)

---
updated-dependencies:
- dependency-name: memmap2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-27 05:15:43 +02:00
PSeitz
44850e1036 move fail dep to dev only (#2094)
wasm compilation fails with dep only
2023-06-22 06:59:11 +02:00
PSeitz
8199aa7de7 bump version to 0.20.2 (#2089) 2023-06-12 18:56:54 +08:00
PSeitz
862f367f9e release without Alice in Wonderland, bump version to 0.20.1 (#2087)
* Release without Alice in Wonderland

* bump version to 0.20.1
2023-06-12 10:54:03 +09:00
PSeitz
e3eacb4388 release tantivy (#2083)
* prerelease

* chore: Release
2023-06-09 10:47:46 +02:00
PSeitz
fdecb79273 tokenizer-api: reduce Tokenizer overhead (#2062)
* tokenizer-api: reduce Tokenizer overhead

Previously a new `Token` for each text encountered was created, which
contains `String::with_capacity(200)`
In the new API the token_stream gets mutable access to the tokenizer,
this allows state to be shared (in this PR Token is shared).
Ideally the allocation for the BoxTokenStream would also be removed, but
this may require some lifetime tricks.

* simplify api

* move lowercase and ascii folding buffer to global

* empty Token text as default
2023-06-08 18:37:58 +08:00
dependabot[bot]
4be6f83b0a Update criterion requirement from 0.4 to 0.5 (#2056)
Updates the requirements on [criterion](https://github.com/bheisler/criterion.rs) to permit the latest version.
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bheisler/criterion.rs/compare/0.4.0...0.5.0)

---
updated-dependencies:
- dependency-name: criterion
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-24 15:59:51 +09:00
PSeitz
00c5df610c update termmap benchmark (#2040) 2023-05-12 07:35:06 +02:00
dependabot[bot]
f479840a1b Update memmap2 requirement from 0.5.3 to 0.6.0 (#2033)
Updates the requirements on [memmap2](https://github.com/RazrFalcon/memmap2-rs) to permit the latest version.
- [Changelog](https://github.com/RazrFalcon/memmap2-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/RazrFalcon/memmap2-rs/compare/v0.5.3...v0.6.0)

---
updated-dependencies:
- dependency-name: memmap2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-10 03:50:14 +02:00
tottoto
73452284ae Remove unused crates from dependencies (#2018)
* Remove unused crates from dependencies

* Revert rand to columnar

* Revert criterion to stacker
2023-05-02 12:34:20 +02:00
trinity-1686a
9c93bfeb51 optimise warmup code path (#2007)
* optimise warmup code path

* better function naming
2023-04-21 11:23:09 +02:00
Adam Reichold
c1defdda05 Bump aho-corasick dependency to version 1.0 and adjust to API changes (#2002)
* Drop additional Arc-layer as the automaton itself is now cheap-to-clone.
* Drop state ID type parameter as it is not exposed by the library any more.
2023-04-18 07:34:30 +02:00
PSeitz
b0ef9a6252 use crates.io dependency (#1990) 2023-04-14 09:35:20 +08:00
PSeitz
41af70799d add percentiles aggregations (#1984)
* add percentiles aggregations

add percentiles aggregation
fix disabled agg benchmark

* Update src/aggregation/metric/percentiles.rs

Co-authored-by: Paul Masurel <paul@quickwit.io>

* Apply suggestions from code review

Co-authored-by: Paul Masurel <paul@quickwit.io>

* fix import

* fix import

---------

Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-04-07 07:18:28 +02:00
Till Wegmüller
1a35f6573d Switch fs2 to fs4 as it is now unmaintained and does not support illumos (#1944)
Signed-off-by: Till Wegmueller <toasterson@gmail.com>
2023-03-22 13:48:49 +09:00
dependabot[bot]
c0a5b28fd3 Update lru requirement from 0.9.0 to 0.10.0 (#1932)
Updates the requirements on [lru](https://github.com/jeromefroe/lru-rs) to permit the latest version.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.9.0...0.10.0)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-07 15:09:02 +09:00
Paul Masurel
ed5a3b3172 Bumped murmurhash version 2023-03-03 21:24:32 +09:00
PSeitz
850a0d7ae2 add agg benchmark for optional and multi value (#1916)
closes #1870
2023-03-01 17:01:52 +09:00
Paul Masurel
d002698008 Re-export of query grammar. (#1908) 2023-02-27 12:26:34 +09:00
PSeitz
03345f0aa2 fmt code, update lz4_flex (#1838)
formatting on nightly changed
2023-02-10 01:42:32 +09:00
Paul Masurel
405e2cf4d9 Merge with main 2023-02-09 14:28:57 +01:00
Paul Masurel
bd5eea9852 Integrated columnar work. 2023-02-09 13:14:31 +01:00
PSeitz
226d0f88bc add columnar to workspace (#1808) 2023-01-20 11:47:10 +01:00
Paul Masurel
25bad784ad Integrated fastfield codecs into columnar. (#1782)
Introduced asymetric OptionalCodec / SerializableOptionalCodec
Removed cardinality from the columnar sstable.
Added DynamicColumn
Reorganized all files
Change DenseCodec serialization logic.
Renamed methods to rank/select
Moved versioning footer to the columnar level
2023-01-16 17:24:49 +09:00
Adam Reichold
82a183bc2d Bump dependency on lru to from version 0.7.5 to version 0.9.0. (#1755) 2023-01-10 13:35:37 +09:00
dependabot[bot]
3090d49615 Update base64 requirement from 0.20.0 to 0.21.0 (#1769)
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases)
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.20.0...v0.21.0)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-10 13:35:05 +09:00
PSeitz
514d23a20c move tokenizer API to seperate crate (#1767)
closes #1766

Finding tantivy tokenizers is a frustrating experience currently, since
they need be updated for each tantivy version. That's unnecessary since
the API is rather stable anyway.
2023-01-09 06:37:38 +01:00
Paul Masurel
4f9efe654c Support for columnar (#1734)
* Added support for dynamic fast field.

See README for more information.

* Apply suggestions from code review

Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2023-01-07 17:37:00 +09:00
Paul Masurel
f39165e1e7 Moving FileSlice to tantivy-common (#1729) 2022-12-21 16:35:11 +09:00
dependabot[bot]
fbb0f8b55d Update base64 requirement from 0.13.0 to 0.20.0 (#1720)
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64) to permit the latest version.
- [Release notes](https://github.com/marshallpierce/rust-base64/releases)
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.13.0...v0.20.0)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-13 11:46:23 +09:00
Paul Masurel
136a8f4124 Isolating sstable and stacker in independant crates. (#1718)
Both crate will be used in the new (optional + dynamic) fastfield work.
2022-12-13 11:44:17 +09:00
PSeitz
509adab79d Bump version (#1715)
* group workspace deps

* update cargo.toml

* revert tant version

* chore: Release
2022-12-12 04:39:43 +01:00