436 Commits

Author SHA1 Message Date
Paul Masurel
63c66005db Lazy scorers (#2726)
* Refactoring of the score tweaker into `SortKeyComputer`s to unlock two features.

- Allow lazy evaluation of the score. As soon as we identify that a doc won't
reach the top-K threshold, we can stop the evaluation.
- Allow for a different segment level score, segment level score and their conversion.
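
The lazy evaluation described above can be illustrated with a minimal sketch. It is not the `SortKeyComputer` API from this PR: `LazyTopK`, the `upper_bound` argument and the `exact` closure are hypothetical names, and the two-stage split (cheap bound first, expensive score second) is an assumption made for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical top-K collector that only runs the expensive scorer
/// when a cheap upper bound says the doc can still enter the top K.
pub struct LazyTopK {
    k: usize,
    /// Min-heap of (score, doc): the smallest retained score is on top.
    heap: BinaryHeap<Reverse<(u64, u32)>>,
    /// Number of times the expensive scorer actually ran.
    pub exact_evaluations: usize,
}

impl LazyTopK {
    pub fn new(k: usize) -> Self {
        assert!(k > 0, "k must be positive");
        LazyTopK { k, heap: BinaryHeap::with_capacity(k), exact_evaluations: 0 }
    }

    /// Current threshold: the lowest score in a full heap, if any.
    fn threshold(&self) -> Option<u64> {
        if self.heap.len() < self.k {
            None
        } else {
            self.heap.peek().map(|Reverse((score, _))| *score)
        }
    }

    pub fn collect(&mut self, doc: u32, upper_bound: u64, exact: impl FnOnce() -> u64) {
        if let Some(threshold) = self.threshold() {
            if upper_bound <= threshold {
                // The doc cannot beat the threshold: skip the evaluation.
                return;
            }
        }
        self.exact_evaluations += 1;
        let score = exact();
        match self.threshold() {
            Some(threshold) if score <= threshold => {}
            Some(_) => {
                self.heap.pop();
                self.heap.push(Reverse((score, doc)));
            }
            None => self.heap.push(Reverse((score, doc))),
        }
    }

    /// Returns the retained (score, doc) pairs, best first.
    pub fn into_sorted(self) -> Vec<(u64, u32)> {
        let mut hits: Vec<(u64, u32)> = self.heap.into_iter().map(|Reverse(hit)| hit).collect();
        hits.sort_by(|a, b| b.cmp(a));
        hits
    }
}
```

With `k = 2`, a doc whose upper bound is below the current second-best score is dropped without ever calling its `exact` closure.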

This PR breaks public API, but fixing code is straightforward.

* Bumping tantivy version

---------

Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2025-12-01 15:38:57 +01:00
Moe
70e591e230 feat: added filter aggregation (#2711)
* Initial impl

* Added `Filter` impl in `build_single_agg_segment_collector_with_reader` + Added tests

* Added `Filter(FilterBucketResult)` + Made tests work.

* Fixed type issues.

* Fixed a test.

* 8a7a73a: Pass `segment_reader`

* Added more tests.

* Improved parsing + tests

* refactoring

* Added more tests.

* refactoring: moved parsing code under QueryParser

* Use Tantivy syntax instead of ES

* Added a sanity check test.

* Simplified impl + tests

* Added back tests in a more maintainable way

* nitz.

* nitz

* implemented very simple fast-path

* improved a comment

* implemented fast field support

* Used `BoundsRange`

* Improved fast field impl + tests

* Simplified execution.

* Fixed exports + nitz

* Improved the tests to check the expected result.

* Improved test by checking the whole result JSON

* Removed brittle perf checks.

* Added efficiency verification tests.

* Added one more efficiency check test.

* Improved the efficiency tests.

* Removed unnecessary parsing code + added direct Query obj

* Fixed tests.

* Improved tests

* Fixed code structure

* Fixed lint issues

* nitz.

* nitz

* nitz.

* nitz.

* nitz.

* Added an example

* Fixed PR comments.

* Applied PR comments + nitz

* nitz.

* Improved the code.

* Fixed a perf issue.

* Added batch processing.

* Made the example more interesting

* Fixed bucket count

* Renamed Direct to CustomQuery

* Fixed lint issues.

* No need for scorer to be an `Option`

* nitz

* Used BitSet

* Added an optimization for AllQuery

* Fixed merge issues.

* Fixed lint issues.

* Added benchmark for FILTER

* Removed the Option wrapper.

* nitz.

* Applied PR comments.

* Fixed the AllQuery optimization

* Applied PR comments.

* feat: used `erased_serde` to allow filter query to be serialized

* further improved a comment

* Added back tests.

* removed an unused method

* removed an unused method

* Added documentation

* nitz.

* Added query builder.

* Fixed a comment.

* Applied PR comments.

* Fixed doctest issues.

* Added ser/de

* Removed bench in test

* Fixed a lint issue.
2025-11-18 20:54:31 +01:00
PSeitz
e1e131a804 add and/or queries benchmark (#2701) 2025-09-22 16:32:49 +02:00
PSeitz-dd
203751f2fe Optimize ExistsQuery for a high number of dynamic columns (#2694)
* Optimize ExistsQuery for a high number of dynamic columns

The previous algorithm checked _each_ doc in _each_ column for
existence. This causes huge cost on JSON fields with e.g. 100k columns.
Compute a bitset instead if we have more than one column.

add `iter_docs` to the multivalued_index
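
A rough sketch of the bitset idea, for illustration only: `DocBitSet` and `exists_union` are hypothetical names, and each column is modelled as a plain list of the doc ids that have a value (roughly what an `iter_docs`-style iterator would yield). The real implementation works on columnar data, not on `Vec<u32>`.

```rust
/// Minimal fixed-size bitset over doc ids `0..max_doc` (hypothetical helper).
pub struct DocBitSet {
    words: Vec<u64>,
}

impl DocBitSet {
    pub fn with_max_doc(max_doc: u32) -> Self {
        DocBitSet { words: vec![0u64; (max_doc as usize + 63) / 64] }
    }

    /// Marks `doc` as existing. Panics if `doc` is outside the allocated words.
    pub fn insert(&mut self, doc: u32) {
        self.words[(doc / 64) as usize] |= 1u64 << (doc % 64);
    }

    pub fn contains(&self, doc: u32) -> bool {
        self.words
            .get((doc / 64) as usize)
            .map_or(false, |word| word & (1u64 << (doc % 64)) != 0)
    }

    /// Number of docs present in the set.
    pub fn count(&self) -> usize {
        self.words.iter().map(|word| word.count_ones() as usize).sum()
    }
}

/// Visits each column once and ORs its docs into a single bitset,
/// instead of probing every doc against every column.
pub fn exists_union(max_doc: u32, columns: &[Vec<u32>]) -> DocBitSet {
    let mut bitset = DocBitSet::with_max_doc(max_doc);
    for column_docs in columns {
        for &doc in column_docs {
            bitset.insert(doc);
        }
    }
    bitset
}
```

The cost is proportional to the number of values actually present across the columns, plus one bitset of `max_doc` bits, rather than `docs × columns` existence checks.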

* add benchmark

subfields=1
exists_json_union    Memory: 89.3 KB (+2.01%)    Avg: 0.4865ms (-26.03%)    Median: 0.4865ms (-26.03%)    [0.4865ms .. 0.4865ms]
subfields=2
exists_json_union    Memory: 68.1 KB     Avg: 1.7048ms (-0.46%)    Median: 1.7048ms (-0.46%)    [1.7048ms .. 1.7048ms]
subfields=3
exists_json_union    Memory: 61.8 KB     Avg: 2.0742ms (-2.22%)    Median: 2.0742ms (-2.22%)    [2.0742ms .. 2.0742ms]
subfields=4
exists_json_union    Memory: 119.8 KB (+103.44%)    Avg: 3.9500ms (+42.62%)    Median: 3.9500ms (+42.62%)    [3.9500ms .. 3.9500ms]
subfields=5
exists_json_union    Memory: 120.4 KB (+107.65%)    Avg: 3.9610ms (+20.65%)    Median: 3.9610ms (+20.65%)    [3.9610ms .. 3.9610ms]
subfields=6
exists_json_union    Memory: 120.6 KB (+107.49%)    Avg: 3.8903ms (+3.11%)    Median: 3.8903ms (+3.11%)    [3.8903ms .. 3.8903ms]
subfields=7
exists_json_union    Memory: 120.9 KB (+106.93%)    Avg: 3.6220ms (-16.22%)    Median: 3.6220ms (-16.22%)    [3.6220ms .. 3.6220ms]
subfields=8
exists_json_union    Memory: 121.3 KB (+106.23%)    Avg: 4.0981ms (-15.97%)    Median: 4.0981ms (-15.97%)    [4.0981ms .. 4.0981ms]
subfields=16
exists_json_union    Memory: 123.1 KB (+103.09%)    Avg: 4.3483ms (-92.26%)    Median: 4.3483ms (-92.26%)    [4.3483ms .. 4.3483ms]
subfields=256
exists_json_union    Memory: 204.6 KB (+19.85%)    Avg: 3.8874ms (-99.01%)    Median: 3.8874ms (-99.01%)    [3.8874ms .. 3.8874ms]
subfields=4096
exists_json_union    Memory: 2.0 MB     Avg: 3.5571ms (-99.90%)    Median: 3.5571ms (-99.90%)    [3.5571ms .. 3.5571ms]
subfields=65536
exists_json_union    Memory: 28.3 MB     Avg: 14.4417ms (-99.97%)    Median: 14.4417ms (-99.97%)    [14.4417ms .. 14.4417ms]
subfields=262144
exists_json_union    Memory: 113.3 MB     Avg: 66.2860ms (-99.95%)    Median: 66.2860ms (-99.95%)    [66.2860ms .. 66.2860ms]

* rename methods
2025-09-16 18:21:03 +02:00
PSeitz
33794a114c chore: Release (#2686)
Co-authored-by: Pascal Seitz <pascal.seitz@datadoghq.com>
2025-08-20 18:29:37 +08:00
PSeitz-dd
8676a1f57b prepare release: update Changelog (#2685) 2025-08-20 16:07:53 +08:00
Paul Masurel
39e027667b per field size details (#2679)
* Added per-field size details.

This also does a bunch of refactoring.

merging field metadata does not silently assert that arguments should be sorted.
merging does not set `stored`.

We do not rely on a hashmap to group fields, but instead rely on the fact that
the term dictionary is sorted.
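
As an illustration of grouping without a hashmap, here is a minimal sketch. It assumes the entries arrive sorted by field, as they would when read in term dictionary order; `sum_sizes_per_field` and the `(field_id, size)` tuple layout are hypothetical and not the types used in this PR.

```rust
/// Sums sizes per field for entries that are already sorted by field id.
/// Because equal fields are adjacent, only the last output entry needs checking.
pub fn sum_sizes_per_field(sorted_entries: &[(u32, u64)]) -> Vec<(u32, u64)> {
    debug_assert!(sorted_entries.windows(2).all(|pair| pair[0].0 <= pair[1].0));
    let mut totals: Vec<(u32, u64)> = Vec::new();
    for &(field, size) in sorted_entries {
        if let Some((last_field, total)) = totals.last_mut() {
            if *last_field == field {
                // Same field as the previous entry: extend the current group.
                *total += size;
                continue;
            }
        }
        // First entry of a new field: open a new group.
        totals.push((field, size));
    }
    totals
}
```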

The inverted level method that exposes field metadata is not exposed
as public anymore.

* CR comment

---------

Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2025-08-13 13:12:22 +02:00
Paul M.
f2c77f06c5 Update fs4 to latest (0.13.1) (#2654)
- One change was needed to handle the `Result<bool>` now returned by `try_lock_exclusive`

Co-authored-by: Paul M. <prov223@tutanota.com>
2025-07-14 11:26:19 +08:00
PSeitz
4a6123d3ff release tantivy: bump versions (#2625)
* chore: Release

* chore: Release

---------

Co-authored-by: Pascal Seitz <pascal.seitz@datadoghq.com>
2025-06-10 15:34:39 +02:00
Parth
5a2fe42c24 make zstd optional in sstable (#2633)
* make zstd truly optional

* changelog notes

* make sure we write

* resolve comments

* make this a default feature

* remove changelog notes
2025-05-14 17:16:41 +02:00
PSeitz
5379c99ea2 update edition to 2024 (#2620)
* update common to edition 2024

* update bitpacker to edition 2024

* update stacker to edition 2024

* update query-grammar to edition 2024

* update sstable to edition 2024 + fmt

* fmt

* update columnar to edition 2024

* cargo fmt

* use None instead of _
2025-04-18 04:56:31 +02:00
Pascal Seitz
6ab4102253 fix tantivy-query-grammar version 2025-04-09 14:35:23 +08:00
PSeitz
11c6329ca5 temp unbump version (#2501)
temp unbump to 0.22 for easier release with `cargo release`
2025-04-09 08:09:41 +02:00
Kat Lim Ruiz
18ae3ffe94 uniformize root cargo.toml 2025-03-30 21:55:51 -05:00
Kat Lim Ruiz
feced4762f update root cargo.toml 2025-03-30 11:01:22 -05:00
dependabot[bot]
4aa8cd2470 Update downcast-rs requirement from 1.2.1 to 2.0.1 (#2566)
Updates the requirements on [downcast-rs](https://github.com/marcianx/downcast-rs) to permit the latest version.
- [Changelog](https://github.com/marcianx/downcast-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/marcianx/downcast-rs/compare/v1.2.1...v2.0.1)

---
updated-dependencies:
- dependency-name: downcast-rs
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-22 10:32:24 +01:00
dependabot[bot]
43c89b4360 Update itertools requirement from 0.13.0 to 0.14.0 (#2563)
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.13.0...v0.14.0)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-08 17:11:46 +01:00
trinity-1686a
d281ca3e65 Merge pull request #2559 from quickwit-oss/trinity/sstable-partial-automaton
allow warming partially an sstable for an automaton
2025-01-08 16:35:35 +01:00
trinity Pointard
175a529c41 use executor for cpu-heavy sstable decompression for automaton 2025-01-03 19:14:07 +01:00
Harrison Burt
148594f0f9 Improve IndexWriter customisation via builder (#2562)
* Improve `IndexWriter` customisation via builder

* Remove change noise from PR

* Correct documentation

* Resolve comments and add test
2025-01-02 09:43:22 +01:00
dependabot[bot]
0f99d4f420 Update measure_time requirement from 0.8.2 to 0.9.0 (#2557)
---
updated-dependencies:
- dependency-name: measure_time
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-09 21:39:01 +01:00
dependabot[bot]
c71ea7b2ef Update thiserror requirement from 1.0.30 to 2.0.1 (#2542)
Updates the requirements on [thiserror](https://github.com/dtolnay/thiserror) to permit the latest version.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.30...2.0.1)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-11-09 08:08:34 +08:00
Paul Masurel
c35a782747 Updating rustc-hash and clippy fixes (#2532)
* Updating rustc-hash and clippy fixes

* fix terms_aggregation_min_doc_count_special_case

---------

Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-11-01 13:46:26 +08:00
dependabot[bot]
c66af2c0a9 Update binggan requirement from 0.12.0 to 0.14.0 (#2530)
* Update binggan requirement from 0.12.0 to 0.14.0

---
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix build

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-10-24 09:41:35 +08:00
dependabot[bot]
99be20cedd Update binggan requirement from 0.10.0 to 0.12.0 (#2519)
* Update binggan requirement from 0.10.0 to 0.12.0

---
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix build

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-10-16 11:36:04 +08:00
Bruce Mitchener
5f026901b8 Update MSRV to 1.75 (#2515)
This is required by the `fs4` dependency. There are other
things that need something later than 1.66.

Both quickwit and the Python binding already require something
newer.
2024-10-16 10:32:16 +08:00
PSeitz
2f5a269c70 update packages (#2500)
fixes some warnings
2024-09-25 17:46:18 +08:00
dependabot[bot]
56fc56c5b9 Update binggan requirement from 0.8.0 to 0.10.0 (#2493)
* Update binggan requirement from 0.8.0 to 0.10.0

---
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* update PR

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-09-10 14:26:06 +08:00
PSeitz
56d79cb203 fix cardinality aggregation performance (#2446)
* fix cardinality aggregation performance

fix cardinality performance by fetching multiple terms at once. This
avoids decompressing the same block and keeps the buffer state between
terms.
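
To make the batching idea concrete, here is a minimal sketch under simplifying assumptions: the dictionary is modelled as fixed-size blocks of terms, "decompressing" a block is simulated by copying it, and `fetch_sorted_ords` is a hypothetical function, not the sstable API.

```rust
/// Resolves sorted term ordinals to terms, decoding each block at most once.
/// Returns the terms and the number of block decodes that were needed.
pub fn fetch_sorted_ords(
    blocks: &[Vec<&str>],
    block_len: usize,
    sorted_ords: &[u64],
) -> (Vec<String>, usize) {
    // Sorted input guarantees that all ordinals of one block are adjacent.
    assert!(sorted_ords.windows(2).all(|pair| pair[0] <= pair[1]));
    let mut terms = Vec::with_capacity(sorted_ords.len());
    let mut decoded: Option<(usize, Vec<String>)> = None;
    let mut decodes = 0usize;
    for &ord in sorted_ords {
        let block_id = ord as usize / block_len;
        if decoded.as_ref().map(|(id, _)| *id) != Some(block_id) {
            // Stand-in for the expensive decompression step.
            let block: Vec<String> = blocks[block_id].iter().map(|t| t.to_string()).collect();
            decoded = Some((block_id, block));
            decodes += 1;
        }
        let (_, block) = decoded.as_ref().unwrap();
        terms.push(block[ord as usize % block_len].clone());
    }
    (terms, decodes)
}
```

Looking up the same ordinals one at a time would decode a block per lookup; keeping the decoded block between terms brings that down to one decode per distinct block.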

add cardinality aggregation benchmark

bump rust version to 1.66

Performance comparison to before (AllQuery)
```
full
cardinality_agg                   Memory: 3.5 MB (-0.00%)    Avg: 21.2256ms (-97.78%)    Median: 21.0042ms (-97.82%)    [20.4717ms .. 23.6206ms]
terms_few_with_cardinality_agg    Memory: 10.6 MB            Avg: 81.9293ms (-97.37%)    Median: 81.5526ms (-97.38%)    [79.7564ms .. 88.0374ms]
dense
cardinality_agg                   Memory: 3.6 MB (-0.00%)    Avg: 25.9372ms (-97.24%)    Median: 25.7744ms (-97.25%)    [24.7241ms .. 27.8793ms]
terms_few_with_cardinality_agg    Memory: 10.6 MB            Avg: 93.9897ms (-96.91%)    Median: 92.7821ms (-96.94%)    [90.3312ms .. 117.4076ms]
sparse
cardinality_agg                   Memory: 895.4 KB (-0.00%)    Avg: 22.5113ms (-95.01%)    Median: 22.5629ms (-94.99%)    [22.1628ms .. 22.9436ms]
terms_few_with_cardinality_agg    Memory: 680.2 KB             Avg: 26.4250ms (-94.85%)    Median: 26.4135ms (-94.86%)    [26.3210ms .. 26.6774ms]
```

* clippy

* assert for sorted ordinals
2024-07-02 15:29:00 +08:00
Raphael Coeffic
d9db5302d9 feat: cardinality aggregation (#2337)
* WiP: cardinality aggregation

* Collect unique entries first, then insert into HyperLogLog

* Handle `missing`

* Hybrid approach

* Review changes

- insert `missing` value at most once
- `term_id` -> `term_ord`
- iterate directly over entries without collecting first

* Use salted hasher to include column type

* fix: formatting

* More review fixes

* Add cardinality to test_aggregation_flushing

* Formatting
2024-07-01 07:49:42 +08:00
dependabot[bot]
b960e40bc8 Update sketches-ddsketch requirement from 0.2.1 to 0.3.0 (#2423)
Updates the requirements on [sketches-ddsketch](https://github.com/mheffner/rust-sketches-ddsketch) to permit the latest version.
- [Release notes](https://github.com/mheffner/rust-sketches-ddsketch/releases)
- [Commits](https://github.com/mheffner/rust-sketches-ddsketch/compare/v0.2.1...v0.3.0)

---
updated-dependencies:
- dependency-name: sketches-ddsketch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-06-04 15:50:23 +08:00
PSeitz
c0686515a9 update one_shot (#2420) 2024-05-31 11:07:35 +08:00
Meng Zhang
4143d31865 chore: fix build as the rev is gone (#2417) 2024-05-29 09:49:16 +08:00
dependabot[bot]
5a80420b10 --- (#2406)
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-21 04:36:32 +02:00
dependabot[bot]
aa26ff5029 Update binggan requirement from 0.6.2 to 0.7.0 (#2401)
---
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-17 02:53:25 +02:00
dependabot[bot]
e197b59258 Update itertools requirement from 0.12.0 to 0.13.0 (#2400)
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-17 02:53:02 +02:00
dependabot[bot]
a79590477e Update binggan requirement from 0.5.2 to 0.6.2 (#2399)
---
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-15 05:40:37 +02:00
Paul Masurel
6181c1eb5e Small changes in the Executor API. (#2391)
Warning, this change is mildly not backward compatible
so I bumped tantivy's version.
2024-05-10 17:19:12 +09:00
Paul Masurel
2b76335a95 Removed usage of num_cpus (#2387)
* Removed usage of num_cpus
* handling error
2024-05-08 13:32:52 +09:00
PSeitz
c6b213d8f0 use binggan for agg benchmark (#2378)
* use binggan for agg benchmark

use binggan for agg benchmark, which includes memory consumption

Output:
```
full
histogram                     Memory: 15.8 KB              Avg: 10.9322ms  (+5.44%)    Median: 10.8790ms  (+9.28%)     Min: 10.7470ms    Max: 11.3263ms
histogram_hard_bounds         Memory: 15.5 KB              Avg: 5.1939ms  (+6.61%)     Median: 5.1722ms  (+10.98%)     Min: 5.0432ms     Max: 5.3910ms
histogram_with_avg_sub_agg    Memory: 48.7 KB              Avg: 23.8165ms  (+4.57%)    Median: 23.7264ms  (+10.06%)    Min: 23.4995ms    Max: 24.8107ms
dense
histogram                     Memory: 17.3 KB              Avg: 15.6810ms  (-8.54%)    Median: 15.6174ms  (-8.89%)    Min: 15.4953ms    Max: 16.0702ms
histogram_hard_bounds         Memory: 15.4 KB              Avg: 10.0720ms  (-7.33%)    Median: 10.0572ms  (-7.06%)    Min: 9.8500ms     Max: 10.4819ms
histogram_with_avg_sub_agg    Memory: 50.1 KB              Avg: 33.0993ms  (-7.04%)    Median: 32.9499ms  (-6.86%)    Min: 32.8284ms    Max: 34.0529ms
sparse
histogram                     Memory: 16.3 KB              Avg: 19.2325ms  (-0.44%)    Median: 19.1211ms  (-1.26%)    Min: 19.0348ms    Max: 19.7902ms
histogram_hard_bounds         Memory: 16.1 KB              Avg: 18.5179ms  (-0.61%)    Median: 18.4552ms  (-0.90%)    Min: 18.3799ms    Max: 19.0535ms
histogram_with_avg_sub_agg    Memory: 34.7 KB              Avg: 21.2589ms  (-0.69%)    Median: 21.1867ms  (-1.05%)    Min: 21.0342ms    Max: 21.9900ms
```

* add more bench with term as sub agg
2024-05-07 11:29:49 +02:00
PSeitz
17d5869ad6 update CHANGELOG, use github API in cliff (#2354)
* update CHANGELOG, use github API in cliff

* reset version to 0.21.1, before release

* chore: Release

* remove unreleased from CHANGELOG
2024-04-15 10:07:20 +02:00
PSeitz
74940e9345 clippy (#2349)
* fix clippy

* fix clippy

* fix duplicate imports
2024-04-09 07:54:44 +02:00
PSeitz
92c32979d2 fix postcard compatibility for top_hits, add postcard test (#2346)
* fix postcard compatibility for top_hits, add postcard test

* fix top_hits naming, delay data fetch

closes #2347

* fix import
2024-04-09 06:17:25 +02:00
dependabot[bot]
0cffe5fb09 Update base64 requirement from 0.21.0 to 0.22.0 (#2324)
Updates the requirements on [base64](https://github.com/marshallpierce/rust-base64) to permit the latest version.
- [Changelog](https://github.com/marshallpierce/rust-base64/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/marshallpierce/rust-base64/compare/v0.21.0...v0.22.0)

---
updated-dependencies:
- dependency-name: base64
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-15 15:50:34 +09:00
dependabot[bot]
2650317622 Update fs4 requirement from 0.7.0 to 0.8.0 (#2321)
Updates the requirements on [fs4](https://github.com/al8n/fs4-rs) to permit the latest version.
- [Release notes](https://github.com/al8n/fs4-rs/releases)
- [Commits](https://github.com/al8n/fs4-rs/commits)

---
updated-dependencies:
- dependency-name: fs4
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-27 03:38:04 +01:00
Tushar
0e04ec3136 feat(aggregators/metric): Add a top_hits aggregator (#2198)
* feat(aggregators/metric): Implement a top_hits aggregator

* fix: Expose get_fields

* fix: Serializer for top_hits request

Also removes the extraneous third-party
serialization helper.

* chore: Avert panic on parsing invalid top_hits query

* refactor: Allow multiple field names from aggregations

* perf: Replace binary heap with TopNComputer

* fix: Avoid comparator inversion by ComparableDoc

* fix: Rank missing field values lower than present values

* refactor: Make KeyOrder a struct

* feat: Rough attempt at docvalue_fields

* feat: Complete stab at docvalue_fields

- Rename "SearchResult*" => "Retrieval*"
- Revert Vec => HashMap for aggregation accessors.
- Split accessors for core aggregation and field retrieval.
- Resolve globbed field names in docvalue_fields retrieval.
- Handle strings/bytes and other column types with DynamicColumn

* test(unit): Add tests for top_hits aggregator

* fix: docfield_value field globbing

* test(unit): Include dynamic fields

* fix: Value -> OwnedValue

* fix: Use OwnedValue's native Null variant

* chore: Improve readability of test asserts

* chore: Remove DocAddress from top_hits result

* docs: Update aggregator doc

* revert: accidental doc test

* chore: enable time macros only for tests

* chore: Apply suggestions from review

* chore: Apply suggestions from review

* fix: Retrieve all values for fields

* test(unit): Update for multi-value retrieval

* chore: Assert term existence

* feat: Include all columns for a column name

Since a (name, type) constitutes a unique column.

* fix: Resolve json fields

Introduces a translation step to bridge the difference between
ColumnarReader's null (`\0`) separated json field keys and the common
`.` separated keys used by SegmentReader. This should probably
be the default behavior for ColumnarReader's public API.
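
A minimal sketch of such a translation step, assuming the separator is the `\0` character as stated above. `columnar_key_to_dotted` is a hypothetical name, and the sketch does not escape literal dots that may appear inside a key segment.

```rust
/// Converts a `\0`-separated json field key into a `.`-separated one.
/// Assumption: segments contain no literal `.` that would need escaping.
pub fn columnar_key_to_dotted(key: &str) -> String {
    key.replace('\0', ".")
}
```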

* chore: Address review on mutability

* chore: s/segment_id/segment_ordinal instances of SegmentOrdinal

* chore: Revert erroneous grammar change
2024-01-26 16:46:41 +01:00
Paul Masurel
9b7f3a55cf Bumped census version 2024-01-26 19:32:02 +09:00
PSeitz
0b56c88e69 Revert "Preparing for 0.21.2 release." (#2258)
* Revert "Preparing for 0.21.2 release. (#2256)"

This reverts commit 9caab45136.

* bump version to 0.21.1

* set version to 0.22.0-dev
2023-12-01 13:46:12 +01:00
PSeitz
24841f0b2a update bitpacker dep (#2269) 2023-12-01 13:45:52 +01:00
PSeitz
07573a7f19 update fst (#2267)
update fst to 0.5 (deduplicates regex-syntax in the dep tree)
deps cleanup
2023-11-21 16:06:57 +01:00