Mirror of https://github.com/quickwit-oss/tantivy.git (synced 2026-01-06 17:22:54 +00:00)
Compare commits: binggan-0. ... stuhood.la (262 commits)
29  .github/workflows/coverage.yml (vendored)

@@ -1,29 +0,0 @@
name: Coverage

on:
  push:
    branches: [main]

# Ensures that we cancel running jobs for the same PR / same workflow.
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Rust
        run: rustup toolchain install nightly-2024-07-01 --profile minimal --component llvm-tools-preview
      - uses: Swatinem/rust-cache@v2
      - uses: taiki-e/install-action@cargo-llvm-cov
      - name: Generate code coverage
        run: cargo +nightly-2024-07-01 llvm-cov --all-features --workspace --doctests --lcov --output-path lcov.info
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        continue-on-error: true
        with:
          token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos
          files: lcov.info
          fail_ci_if_error: true
4  .github/workflows/test.yml (vendored)

@@ -76,7 +76,9 @@ jobs:
          profile: minimal
          override: true

      - uses: taiki-e/install-action@nextest
      - uses: taiki-e/install-action@v2
        with:
          tool: 'nextest'
      - uses: Swatinem/rust-cache@v2

      - name: Run tests
5  .gitignore (vendored)

@@ -6,7 +6,6 @@ target
target/debug
.vscode
target/release
Cargo.lock
benchmark
.DS_Store
*.bk
@@ -15,3 +14,7 @@ trace.dat
cargo-timing*
control
variable

# for `sample record -p`
profile.json
profile.json.gz
@@ -46,7 +46,7 @@ The file of a segment has the format

```segment-id . ext```

The extension signals which data structure (or [`SegmentComponent`](src/core/segment_component.rs)) is stored in the file.
The extension signals which data structure (or [`SegmentComponent`](src/index/segment_component.rs)) is stored in the file.

A small `meta.json` file is in charge of keeping track of the list of segments, as well as the schema.

@@ -102,7 +102,7 @@ but users can extend tantivy with their own implementation.

Tantivy's document follows a very strict schema, decided before building any index.

The schema defines all of the fields that the indexes [`Document`](src/schema/document.rs) may and should contain, their types (`text`, `i64`, `u64`, `Date`, ...) as well as how it should be indexed / represented in tantivy.
The schema defines all of the fields that the indexes [`Document`](src/schema/document/mod.rs) may and should contain, their types (`text`, `i64`, `u64`, `Date`, ...) as well as how it should be indexed / represented in tantivy.

Depending on the type of the field, you can decide to
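As a concrete illustration of the schema step described above, here is a minimal sketch of declaring a schema before building an index (the field names are illustrative, not taken from the repository):

```rust
use tantivy::schema::{Schema, FAST, STORED, STRING, TEXT};

fn build_schema() -> Schema {
    // Every field, its type, and how it is indexed / represented is declared up front.
    let mut schema_builder = Schema::builder();
    schema_builder.add_text_field("title", TEXT | STORED); // tokenized and retrievable
    schema_builder.add_text_field("tag", STRING | FAST);   // raw string term, also a fast field
    schema_builder.add_u64_field("score", FAST);            // columnar u64 fast field
    schema_builder.build()
}
```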
66  CHANGELOG.md

@@ -1,11 +1,42 @@
Tantivy 0.23 - Unreleased
Tantivy 0.25
================================
Tantivy 0.23 will be backwards compatible with indices created with v0.22 and v0.21.

## Bugfixes
- fix union performance regression in tantivy 0.24 [#2663](https://github.com/quickwit-oss/tantivy/pull/2663)(@PSeitz)
- make zstd optional in sstable [#2633](https://github.com/quickwit-oss/tantivy/pull/2633)(@Parth)
- Fix TopDocs::order_by_string_fast_field for asc order [#2672](https://github.com/quickwit-oss/tantivy/pull/2672)(@stuhood @PSeitz)

## Features/Improvements
- add docs/example and Vec<u32> values to sstable [#2660](https://github.com/quickwit-oss/tantivy/pull/2660)(@PSeitz)
- Add string fast field support to `TopDocs`. [#2642](https://github.com/quickwit-oss/tantivy/pull/2642)(@stuhood)
- update edition to 2024 [#2620](https://github.com/quickwit-oss/tantivy/pull/2620)(@PSeitz)
- Allow optional spaces between the field name and the value in the query parser [#2678](https://github.com/quickwit-oss/tantivy/pull/2678)(@Darkheir)
- Support mixed field types in query parser [#2676](https://github.com/quickwit-oss/tantivy/pull/2676)(@trinity-1686a)
- Add per-field size details [#2679](https://github.com/quickwit-oss/tantivy/pull/2679)(@fulmicoton)

Tantivy 0.24.2
================================
- Fix TopNComputer for reverse order. [#2672](https://github.com/quickwit-oss/tantivy/pull/2672)(@stuhood @PSeitz)

Affected queries are [order_by_fast_field](https://docs.rs/tantivy/latest/tantivy/collector/struct.TopDocs.html#method.order_by_fast_field) and
[order_by_u64_field](https://docs.rs/tantivy/latest/tantivy/collector/struct.TopDocs.html#method.order_by_u64_field)
for `Order::Asc`

Tantivy 0.24.1
================================
- Fix: bump required rust version to 1.81

Tantivy 0.24
================================
Tantivy 0.24 will be backwards compatible with indices created with v0.22 and v0.21. The new minimum rust version will be 1.75. Tantivy 0.23 will be skipped.

#### Bugfixes
- fix potential endless loop in merge [#2457](https://github.com/quickwit-oss/tantivy/pull/2457)(@PSeitz)
- fix bug that causes out-of-order sstable key. [#2445](https://github.com/quickwit-oss/tantivy/pull/2445)(@fulmicoton)
- fix ReferenceValue API flaw [#2372](https://github.com/quickwit-oss/tantivy/pull/2372)(@PSeitz)
- fix `OwnedBytes` debug panic [#2512](https://github.com/quickwit-oss/tantivy/pull/2512)(@b41sh)
- catch panics during merges [#2582](https://github.com/quickwit-oss/tantivy/pull/2582)(@rdettai)
- switch from u32 to usize in bitpacker. This enables multivalued columns larger than 4GB, which crashed during merge before. [#2581](https://github.com/quickwit-oss/tantivy/pull/2581) [#2586](https://github.com/quickwit-oss/tantivy/pull/2586)(@fulmicoton-dd @PSeitz)

#### Breaking API Changes
- remove index sorting [#2434](https://github.com/quickwit-oss/tantivy/pull/2434)(@PSeitz)
@@ -23,6 +54,7 @@ Tantivy 0.23 will be backwards compatible with indices created with v0.22 and v0
- reduce top hits memory consumption [#2426](https://github.com/quickwit-oss/tantivy/pull/2426)(@PSeitz)
- check unsupported parameters top_hits [#2351](https://github.com/quickwit-oss/tantivy/pull/2351)(@PSeitz)
- Change AggregationLimits to AggregationLimitsGuard [#2495](https://github.com/quickwit-oss/tantivy/pull/2495)(@PSeitz)
- add support for counting non integer in aggregation [#2547](https://github.com/quickwit-oss/tantivy/pull/2547)(@trinity-1686a)
- **Range Queries**
- Support fast field range queries on json fields [#2456](https://github.com/quickwit-oss/tantivy/pull/2456)(@PSeitz)
- Add support for str fast field range query [#2460](https://github.com/quickwit-oss/tantivy/pull/2460) [#2452](https://github.com/quickwit-oss/tantivy/pull/2452) [#2453](https://github.com/quickwit-oss/tantivy/pull/2453)(@PSeitz)
@@ -33,11 +65,20 @@ Tantivy 0.23 will be backwards compatible with indices created with v0.22 and v0
- add columnar format compatibility tests [#2433](https://github.com/quickwit-oss/tantivy/pull/2433)(@PSeitz)
- Improved snippet ranges algorithm [#2474](https://github.com/quickwit-oss/tantivy/pull/2474)(@gezihuzi)
- make find_field_with_default return json fields without path [#2476](https://github.com/quickwit-oss/tantivy/pull/2476)(@trinity-1686a)
- feat(query): Make `BooleanQuery` support `minimum_number_should_match` [#2405](https://github.com/quickwit-oss/tantivy/pull/2405)(@LebranceBW)
- Make `BooleanQuery` support `minimum_number_should_match` [#2405](https://github.com/quickwit-oss/tantivy/pull/2405)(@LebranceBW)
- Make `NUM_MERGE_THREADS` configurable [#2535](https://github.com/quickwit-oss/tantivy/pull/2535)(@Barre)

- **Optional Index in Multivalue Columnar Index** For mostly empty multivalued indices there was a large overhead during creation when iterating all docids (merge case). This is alleviated by placing an optional index in the multivalued index to mark documents that have values. This will slightly increase space and access time. [#2439](https://github.com/quickwit-oss/tantivy/pull/2439)(@PSeitz)
- **RegexPhraseQuery**
  `RegexPhraseQuery` supports phrase queries with regex. E.g. query "b.* b.* wolf" matches "big bad wolf". Slop is supported as well: "b.* wolf"~2 matches "big bad wolf" [#2516](https://github.com/quickwit-oss/tantivy/pull/2516)(@PSeitz)

- **Performace/Memory**
- **Optional Index in Multivalue Columnar Index**
  For mostly empty multivalued indices there was a large overhead during creation when iterating all docids (merge case).
  This is alleviated by placing an optional index in the multivalued index to mark documents that have values.
  This will slightly increase space and access time. [#2439](https://github.com/quickwit-oss/tantivy/pull/2439)(@PSeitz)

- **Store DateTime as nanoseconds in doc store** DateTime in the doc store was truncated to microseconds previously. This removes this truncation, while still keeping backwards compatibility. [#2486](https://github.com/quickwit-oss/tantivy/pull/2486)(@PSeitz)

- **Performance/Memory**
- lift clauses in LogicalAst for optimized ast during execution [#2449](https://github.com/quickwit-oss/tantivy/pull/2449)(@PSeitz)
- Use Vec instead of BTreeMap to back OwnedValue object [#2364](https://github.com/quickwit-oss/tantivy/pull/2364)(@fulmicoton)
- Replace TantivyDocument with CompactDoc. CompactDoc is much smaller and provides similar performance. [#2402](https://github.com/quickwit-oss/tantivy/pull/2402)(@PSeitz)
@@ -51,18 +92,29 @@ Tantivy 0.23 will be backwards compatible with indices created with v0.22 and v0
- fix de-escaping too much in query parser [#2427](https://github.com/quickwit-oss/tantivy/pull/2427)(@trinity-1686a)
- improve query parser [#2416](https://github.com/quickwit-oss/tantivy/pull/2416)(@trinity-1686a)
- Support field grouping `title:(return AND "pink panther")` [#2333](https://github.com/quickwit-oss/tantivy/pull/2333)(@trinity-1686a)
- allow term starting with wildcard [#2568](https://github.com/quickwit-oss/tantivy/pull/2568)(@trinity-1686a)

- Exist queries match subpath fields [#2558](https://github.com/quickwit-oss/tantivy/pull/2558)(@rdettai)
- add access benchmark for columnar [#2432](https://github.com/quickwit-oss/tantivy/pull/2432)(@PSeitz)
- extend indexwriter proptests [#2342](https://github.com/quickwit-oss/tantivy/pull/2342)(@PSeitz)
- add bench & test for columnar merging [#2428](https://github.com/quickwit-oss/tantivy/pull/2428)(@PSeitz)
- Change in Executor API [#2391](https://github.com/quickwit-oss/tantivy/pull/2391)(@fulmicoton)
- Removed usage of num_cpus [#2387](https://github.com/quickwit-oss/tantivy/pull/2387)(@fulmicoton)
- use bingang for agg benchmark [#2378](https://github.com/quickwit-oss/tantivy/pull/2378)(@PSeitz)
- use bingang for agg and stacker benchmark [#2378](https://github.com/quickwit-oss/tantivy/pull/2378)[#2492](https://github.com/quickwit-oss/tantivy/pull/2492)(@PSeitz)
- cleanup top level exports [#2382](https://github.com/quickwit-oss/tantivy/pull/2382)(@PSeitz)
- make convert_to_fast_value_and_append_to_json_term pub [#2370](https://github.com/quickwit-oss/tantivy/pull/2370)(@PSeitz)
- remove JsonTermWriter [#2238](https://github.com/quickwit-oss/tantivy/pull/2238)(@PSeitz)
- validate sort by field type [#2336](https://github.com/quickwit-oss/tantivy/pull/2336)(@PSeitz)
- Fix trait bound of StoreReader::iter [#2360](https://github.com/quickwit-oss/tantivy/pull/2360)(@adamreichold)
- remove read_postings_no_deletes [#2526](https://github.com/quickwit-oss/tantivy/pull/2526)(@PSeitz)

Tantivy 0.22.1
================================
- Fix TopNComputer for reverse order. [#2672](https://github.com/quickwit-oss/tantivy/pull/2672)(@stuhood @PSeitz)

Affected queries are [order_by_fast_field](https://docs.rs/tantivy/latest/tantivy/collector/struct.TopDocs.html#method.order_by_fast_field) and
[order_by_u64_field](https://docs.rs/tantivy/latest/tantivy/collector/struct.TopDocs.html#method.order_by_u64_field)
for `Order::Asc`

Tantivy 0.22
================================
@@ -717,7 +769,7 @@ Tantivy 0.4.0
- Raise the limit of number of fields (previously 256 fields) (@fulmicoton)
- Removed u32 fields. They are replaced by u64 and i64 fields (#65) (@fulmicoton)
- Optimized skip in SegmentPostings (#130) (@lnicola)
- Replacing rustc_serialize by serde. Kudos to @KodrAus and @lnicola
- Replacing rustc_serialize by serde. Kudos to benchmark@KodrAus and @lnicola
- Using error-chain (@KodrAus)
- QueryParser: (@fulmicoton)
- Explicit error returned when searched for a term that is not indexed
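For context on the TopNComputer fix noted above, the affected collector calls look roughly like the following sketch (the field name is illustrative); with `Order::Asc` these are the calls that could previously return the wrong top entries:

```rust
use tantivy::collector::TopDocs;
use tantivy::query::Query;
use tantivy::{DocAddress, Order, Searcher};

// Collect the ten documents with the smallest value of the `score` fast field.
fn top10_by_score_asc(
    searcher: &Searcher,
    query: &dyn Query,
) -> tantivy::Result<Vec<(u64, DocAddress)>> {
    let collector = TopDocs::with_limit(10).order_by_fast_field::<u64>("score", Order::Asc);
    searcher.search(query, &collector)
}
```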
10  CITATION.cff (new file)

@@ -0,0 +1,10 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - alias: Quickwit Inc.
    website: "https://quickwit.io"
title: "tantivy"
version: 0.22.0
doi: 10.5281/zenodo.13942948
date-released: 2024-10-17
url: "https://github.com/quickwit-oss/tantivy"
2361  Cargo.lock (generated, new file)

File diff suppressed because it is too large.
84  Cargo.toml

@@ -1,6 +1,6 @@
[package]
name = "tantivy"
version = "0.23.0"
version = "0.26.0"
authors = ["Paul Masurel <paul.masurel@gmail.com>"]
license = "MIT"
categories = ["database-implementations", "data-structures"]
@@ -11,7 +11,7 @@ repository = "https://github.com/quickwit-oss/tantivy"
readme = "README.md"
keywords = ["search", "information", "retrieval"]
edition = "2021"
rust-version = "1.66"
rust-version = "1.85"
exclude = ["benches/*.json", "benches/*.txt"]

[dependencies]
@@ -21,58 +21,67 @@ byteorder = "1.4.3"
crc32fast = "1.3.2"
once_cell = "1.10.0"
regex = { version = "1.5.5", default-features = false, features = [
  "std",
  "unicode",
  "std",
  "unicode",
] }
aho-corasick = "1.0"
tantivy-fst = "0.5"
tantivy-fst = { git = "https://github.com/paradedb/fst.git" }
memmap2 = { version = "0.9.0", optional = true }
lz4_flex = { version = "0.11", default-features = false, optional = true }
zstd = { version = "0.13", optional = true, default-features = false }
tempfile = { version = "3.12.0", optional = true }
log = "0.4.16"
serde = { version = "1.0.136", features = ["derive"] }
serde_json = "1.0.79"
fs4 = { version = "0.8.0", optional = true }
serde = { version = "1.0.219", features = ["derive"] }
serde_json = "1.0.140"
fs4 = { version = "0.13.1", optional = true }
levenshtein_automata = "0.2.1"
uuid = { version = "1.0.0", features = ["v4", "serde"] }
crossbeam-channel = "0.5.4"
rust-stemmers = "1.2.0"
downcast-rs = "1.2.1"
tantivy-stemmers = { version = "0.4.0", default-features = false, features = ["polish_yarovoy"] }
downcast-rs = "2.0.1"
bitpacking = { version = "0.9.2", default-features = false, features = [
  "bitpacker4x",
  "bitpacker4x",
] }
census = "0.4.2"
rustc-hash = "1.1.0"
thiserror = "1.0.30"
rustc-hash = "2.0.0"
thiserror = "2.0.1"
htmlescape = "0.3.1"
fail = { version = "0.5.0", optional = true }
time = { version = "0.3.35", features = ["serde-well-known"] }
# TODO: We have integer wrappers with PartialOrd, and a misfeature of
# `deranged` causes inference to fail in a bunch of cases. See
# https://github.com/jhpratt/deranged/issues/18#issuecomment-2746844093
deranged = "=0.4.0"
smallvec = "1.8.0"
rayon = "1.5.2"
lru = "0.12.0"
fastdivide = "0.4.0"
itertools = "0.13.0"
measure_time = "0.8.2"
itertools = "0.14.0"
measure_time = "0.9.0"
arc-swap = "1.5.0"
bon = "3.3.1"

columnar = { version = "0.3", path = "./columnar", package = "tantivy-columnar" }
sstable = { version = "0.3", path = "./sstable", package = "tantivy-sstable", optional = true }
stacker = { version = "0.3", path = "./stacker", package = "tantivy-stacker" }
query-grammar = { version = "0.22.0", path = "./query-grammar", package = "tantivy-query-grammar" }
tantivy-bitpacker = { version = "0.6", path = "./bitpacker" }
common = { version = "0.7", path = "./common/", package = "tantivy-common" }
tokenizer-api = { version = "0.3", path = "./tokenizer-api", package = "tantivy-tokenizer-api" }
columnar = { version = "0.6", path = "./columnar", package = "tantivy-columnar" }
sstable = { version = "0.6", path = "./sstable", package = "tantivy-sstable", optional = true }
stacker = { version = "0.6", path = "./stacker", package = "tantivy-stacker" }
query-grammar = { version = "0.25.0", path = "./query-grammar", package = "tantivy-query-grammar" }
tantivy-bitpacker = { version = "0.9", path = "./bitpacker" }
common = { version = "0.10", path = "./common/", package = "tantivy-common" }
tokenizer-api = { version = "0.6", path = "./tokenizer-api", package = "tantivy-tokenizer-api" }
sketches-ddsketch = { version = "0.3.0", features = ["use_serde"] }
hyperloglogplus = { version = "0.4.1", features = ["const-loop"] }
futures-util = { version = "0.3.28", optional = true }
futures-channel = { version = "0.3.28", optional = true }
fnv = "1.0.7"
parking_lot = "0.12.4"
typetag = "0.2.21"

[target.'cfg(windows)'.dependencies]
winapi = "0.3.9"

[dev-dependencies]
binggan = "0.10.0"
binggan = "0.14.0"
rand = "0.8.5"
maplit = "1.0.2"
matches = "0.1.9"
@@ -110,17 +119,20 @@ debug-assertions = true
overflow-checks = true

[features]
default = ["mmap", "stopwords", "lz4-compression"]
default = ["mmap", "stopwords", "lz4-compression", "columnar-zstd-compression"]
mmap = ["fs4", "tempfile", "memmap2"]
stopwords = []

lz4-compression = ["lz4_flex"]
zstd-compression = ["zstd"]

# enable zstd-compression in columnar (and sstable)
columnar-zstd-compression = ["columnar/zstd-compression"]

failpoints = ["fail", "fail/failpoints"]
unstable = [] # useful for benches.

quickwit = ["sstable", "futures-util"]
quickwit = ["sstable", "futures-util", "futures-channel"]

# Compares only the hash of a string when indexing data.
# Increases indexing speed, but may lead to extremely rare missing terms, when there's a hash collision.
@@ -129,14 +141,14 @@ compare_hash_only = ["stacker/compare_hash_only"]

[workspace]
members = [
  "query-grammar",
  "bitpacker",
  "common",
  "ownedbytes",
  "stacker",
  "sstable",
  "tokenizer-api",
  "columnar",
  "query-grammar",
  "bitpacker",
  "common",
  "ownedbytes",
  "stacker",
  "sstable",
  "tokenizer-api",
  "columnar",
]

# Following the "fail" crate best practises, we isolate
@@ -162,3 +174,11 @@ harness = false
[[bench]]
name = "agg_bench"
harness = false

[[bench]]
name = "exists_json"
harness = false

[[bench]]
name = "and_or_queries"
harness = false
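The `[features]` block above wires the optional dependencies together (`mmap` pulls in fs4, tempfile and memmap2; the compression features gate lz4_flex and zstd). As a small hedged sketch of what the default `mmap` feature enables for a user of the crate (directory path and field name are illustrative):

```rust
use tantivy::schema::{Schema, STORED, TEXT};
use tantivy::{doc, Index};

fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();
    let body = schema_builder.add_text_field("body", TEXT | STORED);
    let schema = schema_builder.build();

    // `create_in_dir` uses the memory-mapped directory backend provided by the
    // `mmap` feature; the target directory must already exist and be empty.
    let index = Index::create_in_dir("./tantivy-demo-index", schema)?;
    let mut writer = index.writer_with_num_threads(1, 50_000_000)?;
    writer.add_document(doc!(body => "hello tantivy"))?;
    writer.commit()?;
    Ok(())
}
```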
@@ -23,8 +23,6 @@ performance for different types of queries/collections.

Your mileage WILL vary depending on the nature of queries and their load.

<img src="doc/assets/images/searchbenchmark.png">

Details about the benchmark can be found at this [repository](https://github.com/quickwit-oss/search-benchmark-game).

## Features
@@ -125,6 +123,7 @@ You can also find other bindings on [GitHub](https://github.com/search?q=tantivy
- [seshat](https://github.com/matrix-org/seshat/): A matrix message database/indexer
- [tantiny](https://github.com/baygeldin/tantiny): Tiny full-text search for Ruby
- [lnx](https://github.com/lnx-search/lnx): adaptable, typo tolerant search engine with a REST API
- [Bichon](https://github.com/rustmailer/bichon): A lightweight, high-performance Rust email archiver with WebUI
- and [more](https://github.com/search?q=tantivy)!

### On average, how much faster is Tantivy compared to Lucene?
27  RELEASE.md

@@ -1,4 +1,4 @@
# Release a new Tantivy Version
# Releasing a new Tantivy Version

## Steps

@@ -10,12 +10,29 @@
6. Set git tag with new version


In conjucation with `cargo-release` Steps 1-4 (I'm not sure if the change detection works):
Set new packages to version 0.0.0
[`cargo-release`](https://github.com/crate-ci/cargo-release) will help us with steps 1-5:

Replace prev-tag-name
```bash
cargo release --workspace --no-publish -v --prev-tag-name 0.19 --push-remote origin minor --no-tag --execute
cargo release --workspace --no-publish -v --prev-tag-name 0.24 --push-remote origin minor --no-tag
```

no-tag or it will create tags for all the subpackages
`no-tag` or it will create tags for all the subpackages

cargo release will _not_ ignore unchanged packages, but it will print warnings for them.
e.g. "warning: updating ownedbytes to 0.10.0 despite no changes made since tag 0.24"

We need to manually ignore these unchanged packages
```bash
cargo release --workspace --no-publish -v --prev-tag-name 0.24 --push-remote origin minor --no-tag --exclude tokenizer-api
```

Add `--execute` to actually publish the packages, otherwise it will only print the commands that would be run.

### Tag Version
```bash
git tag 0.25.0
git push upstream tag 0.25.0
```
2  TODO.txt

@@ -10,7 +10,7 @@ rename FastFieldReaders::open to load
remove fast field reader

find a way to unify the two DateTime.
readd type check in the filter wrapper
re-add type check in the filter wrapper

add unit test on columnar list columns.
@@ -1,4 +1,6 @@
use binggan::plugins::PeakMemAllocPlugin;
use binggan::{black_box, InputGroup, PeakMemAlloc, INSTRUMENTED_SYSTEM};
use rand::distributions::WeightedIndex;
use rand::prelude::SliceRandom;
use rand::rngs::StdRng;
use rand::{Rng, SeedableRng};
@@ -19,7 +21,6 @@ macro_rules! register {
    ($runner:expr, $func:ident) => {
        $runner.register(stringify!($func), move |index| {
            $func(index);
            None
        })
    };
}
@@ -45,7 +46,8 @@ fn main() {
}

fn bench_agg(mut group: InputGroup<Index>) {
    group.set_alloc(GLOBAL); // Set the peak mem allocator. This will enable peak memory reporting.
    group.add_plugin(PeakMemAllocPlugin::new(GLOBAL));

    register!(group, average_u64);
    register!(group, average_f64);
    register!(group, average_f64_u64);
@@ -53,11 +55,19 @@ fn bench_agg(mut group: InputGroup<Index>) {
    register!(group, extendedstats_f64);
    register!(group, percentiles_f64);
    register!(group, terms_few);
    register!(group, terms_all_unique);
    register!(group, terms_many);
    register!(group, terms_many_top_1000);
    register!(group, terms_many_order_by_term);
    register!(group, terms_many_with_top_hits);
    register!(group, terms_all_unique_with_avg_sub_agg);
    register!(group, terms_many_with_avg_sub_agg);
    register!(group, terms_few_with_avg_sub_agg);
    register!(group, terms_status_with_avg_sub_agg);
    register!(group, terms_status);
    register!(group, terms_few_with_histogram);
    register!(group, terms_status_with_histogram);

    register!(group, terms_many_json_mixed_type_with_avg_sub_agg);

    register!(group, cardinality_agg);
@@ -70,8 +80,15 @@ fn bench_agg(mut group: InputGroup<Index>) {
    register!(group, histogram);
    register!(group, histogram_hard_bounds);
    register!(group, histogram_with_avg_sub_agg);
    register!(group, histogram_with_term_agg_few);
    register!(group, avg_and_range_with_avg_sub_agg);

    // Filter aggregation benchmarks
    register!(group, filter_agg_all_query_count_agg);
    register!(group, filter_agg_term_query_count_agg);
    register!(group, filter_agg_all_query_with_sub_aggs);
    register!(group, filter_agg_term_query_with_sub_aggs);

    group.run();
}

@@ -122,12 +139,12 @@ fn extendedstats_f64(index: &Index) {
}
fn percentiles_f64(index: &Index) {
    let agg_req = json!({
        "mypercentiles": {
            "percentiles": {
                "field": "score_f64",
                "percents": [ 95, 99, 99.9 ]
        "mypercentiles": {
            "percentiles": {
                "field": "score_f64",
                "percents": [ 95, 99, 99.9 ]
            }
        }
    }
    });
    execute_agg(index, agg_req);
}
@@ -164,6 +181,19 @@ fn terms_few(index: &Index) {
    });
    execute_agg(index, agg_req);
}
fn terms_status(index: &Index) {
    let agg_req = json!({
        "my_texts": { "terms": { "field": "text_few_terms_status" } },
    });
    execute_agg(index, agg_req);
}
fn terms_all_unique(index: &Index) {
    let agg_req = json!({
        "my_texts": { "terms": { "field": "text_all_unique_terms" } },
    });
    execute_agg(index, agg_req);
}

fn terms_many(index: &Index) {
    let agg_req = json!({
        "my_texts": { "terms": { "field": "text_many_terms" } },
@@ -212,6 +242,63 @@ fn terms_many_with_avg_sub_agg(index: &Index) {
    });
    execute_agg(index, agg_req);
}
fn terms_all_unique_with_avg_sub_agg(index: &Index) {
    let agg_req = json!({
        "my_texts": {
            "terms": { "field": "text_all_unique_terms" },
            "aggs": {
                "average_f64": { "avg": { "field": "score_f64" } }
            }
        },
    });
    execute_agg(index, agg_req);
}
fn terms_few_with_histogram(index: &Index) {
    let agg_req = json!({
        "my_texts": {
            "terms": { "field": "text_few_terms" },
            "aggs": {
                "histo": {"histogram": { "field": "score_f64", "interval": 10 }}
            }
        }
    });
    execute_agg(index, agg_req);
}
fn terms_status_with_histogram(index: &Index) {
    let agg_req = json!({
        "my_texts": {
            "terms": { "field": "text_few_terms_status" },
            "aggs": {
                "histo": {"histogram": { "field": "score_f64", "interval": 10 }}
            }
        }
    });
    execute_agg(index, agg_req);
}

fn terms_few_with_avg_sub_agg(index: &Index) {
    let agg_req = json!({
        "my_texts": {
            "terms": { "field": "text_few_terms" },
            "aggs": {
                "average_f64": { "avg": { "field": "score_f64" } }
            }
        },
    });
    execute_agg(index, agg_req);
}
fn terms_status_with_avg_sub_agg(index: &Index) {
    let agg_req = json!({
        "my_texts": {
            "terms": { "field": "text_few_terms_status" },
            "aggs": {
                "average_f64": { "avg": { "field": "score_f64" } }
            }
        },
    });
    execute_agg(index, agg_req);
}

fn terms_many_json_mixed_type_with_avg_sub_agg(index: &Index) {
    let agg_req = json!({
        "my_texts": {
@@ -338,6 +425,17 @@ fn histogram_with_avg_sub_agg(index: &Index) {
    });
    execute_agg(index, agg_req);
}
fn histogram_with_term_agg_few(index: &Index) {
    let agg_req = json!({
        "rangef64": {
            "histogram": { "field": "score_f64", "interval": 10 },
            "aggs": {
                "my_texts": { "terms": { "field": "text_few_terms" } }
            }
        }
    });
    execute_agg(index, agg_req);
}
fn avg_and_range_with_avg_sub_agg(index: &Index) {
    let agg_req = json!({
        "rangef64": {
@@ -385,14 +483,21 @@ fn get_test_index_bench(cardinality: Cardinality) -> tantivy::Result<Index> {
        .set_stored();
    let text_field = schema_builder.add_text_field("text", text_fieldtype);
    let json_field = schema_builder.add_json_field("json", FAST);
    let text_field_all_unique_terms =
        schema_builder.add_text_field("text_all_unique_terms", STRING | FAST);
    let text_field_many_terms = schema_builder.add_text_field("text_many_terms", STRING | FAST);
    let text_field_many_terms = schema_builder.add_text_field("text_many_terms", STRING | FAST);
    let text_field_few_terms = schema_builder.add_text_field("text_few_terms", STRING | FAST);
    let text_field_few_terms_status =
        schema_builder.add_text_field("text_few_terms_status", STRING | FAST);
    let score_fieldtype = tantivy::schema::NumericOptions::default().set_fast();
    let score_field = schema_builder.add_u64_field("score", score_fieldtype.clone());
    let score_field_f64 = schema_builder.add_f64_field("score_f64", score_fieldtype.clone());
    let score_field_i64 = schema_builder.add_i64_field("score_i64", score_fieldtype);
    let index = Index::create_from_tempdir(schema_builder.build())?;
    let few_terms_data = ["INFO", "ERROR", "WARN", "DEBUG"];
    // Approximate production log proportions: INFO dominant, WARN and DEBUG occasional, ERROR rare.
    let log_level_distribution = WeightedIndex::new([80u32, 3, 12, 5]).unwrap();

    let lg_norm = rand_distr::LogNormal::new(2.996f64, 0.979f64).unwrap();

@@ -408,15 +513,21 @@ fn get_test_index_bench(cardinality: Cardinality) -> tantivy::Result<Index> {
            index_writer.add_document(doc!())?;
        }
        if cardinality == Cardinality::Multivalued {
            let log_level_sample_a = few_terms_data[log_level_distribution.sample(&mut rng)];
            let log_level_sample_b = few_terms_data[log_level_distribution.sample(&mut rng)];
            index_writer.add_document(doc!(
                json_field => json!({"mixed_type": 10.0}),
                json_field => json!({"mixed_type": 10.0}),
                text_field => "cool",
                text_field => "cool",
                text_field_all_unique_terms => "cool",
                text_field_all_unique_terms => "coolo",
                text_field_many_terms => "cool",
                text_field_many_terms => "cool",
                text_field_few_terms => "cool",
                text_field_few_terms => "cool",
                text_field_few_terms_status => log_level_sample_a,
                text_field_few_terms_status => log_level_sample_b,
                score_field => 1u64,
                score_field => 1u64,
                score_field_f64 => lg_norm.sample(&mut rng),
@@ -441,8 +552,10 @@ fn get_test_index_bench(cardinality: Cardinality) -> tantivy::Result<Index> {
            index_writer.add_document(doc!(
                text_field => "cool",
                json_field => json,
                text_field_all_unique_terms => format!("unique_term_{}", rng.gen::<u64>()),
                text_field_many_terms => many_terms_data.choose(&mut rng).unwrap().to_string(),
                text_field_few_terms => few_terms_data.choose(&mut rng).unwrap().to_string(),
                text_field_few_terms_status => few_terms_data[log_level_distribution.sample(&mut rng)],
                score_field => val as u64,
                score_field_f64 => lg_norm.sample(&mut rng),
                score_field_i64 => val as i64,
@@ -459,3 +572,61 @@ fn get_test_index_bench(cardinality: Cardinality) -> tantivy::Result<Index> {

    Ok(index)
}

// Filter aggregation benchmarks

fn filter_agg_all_query_count_agg(index: &Index) {
    let agg_req = json!({
        "filtered": {
            "filter": "*",
            "aggs": {
                "count": { "value_count": { "field": "score" } }
            }
        }
    });
    execute_agg(index, agg_req);
}

fn filter_agg_term_query_count_agg(index: &Index) {
    let agg_req = json!({
        "filtered": {
            "filter": "text:cool",
            "aggs": {
                "count": { "value_count": { "field": "score" } }
            }
        }
    });
    execute_agg(index, agg_req);
}

fn filter_agg_all_query_with_sub_aggs(index: &Index) {
    let agg_req = json!({
        "filtered": {
            "filter": "*",
            "aggs": {
                "avg_score": { "avg": { "field": "score" } },
                "stats_score": { "stats": { "field": "score_f64" } },
                "terms_text": {
                    "terms": { "field": "text_few_terms" }
                }
            }
        }
    });
    execute_agg(index, agg_req);
}

fn filter_agg_term_query_with_sub_aggs(index: &Index) {
    let agg_req = json!({
        "filtered": {
            "filter": "text:cool",
            "aggs": {
                "avg_score": { "avg": { "field": "score" } },
                "stats_score": { "stats": { "field": "score_f64" } },
                "terms_text": {
                    "terms": { "field": "text_few_terms" }
                }
            }
        }
    });
    execute_agg(index, agg_req);
}
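The `execute_agg` helper used throughout the benchmark above is not part of this diff. A plausible sketch of how such a JSON request is run with tantivy's aggregation module follows; the `AggregationCollector` constructor and the default limits argument are assumptions about the public aggregation API and may differ in detail from the actual helper:

```rust
use serde_json::Value;
use tantivy::aggregation::agg_req::Aggregations;
use tantivy::aggregation::AggregationCollector;
use tantivy::query::AllQuery;
use tantivy::Index;

// Deserialize the JSON request, run it over all documents, and return the JSON result.
fn execute_agg(index: &Index, agg_req: Value) -> Value {
    let aggs: Aggregations = serde_json::from_value(agg_req).expect("valid aggregation request");
    let collector = AggregationCollector::from_aggs(aggs, Default::default());
    let reader = index.reader().expect("reader");
    let searcher = reader.searcher();
    let agg_res = searcher.search(&AllQuery, &collector).expect("aggregation search");
    serde_json::to_value(agg_res).expect("serializable result")
}
```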
218  benches/and_or_queries.rs (new file)

@@ -0,0 +1,218 @@
// Benchmarks boolean conjunction queries using binggan.
//
// What’s measured:
// - Or and And queries with varying selectivity (only `Term` queries for now on leafs)
// - Nested AND/OR combinations (on multiple fields)
// - No-scoring path using the Count collector (focus on iterator/skip performance)
// - Top-K retrieval (k=10) using the TopDocs collector
//
// Corpus model:
// - Synthetic docs; each token a/b/c is independently included per doc
// - If none of a/b/c are included, emit a neutral filler token to keep doc length similar
//
// Notes:
// - After optimization, when scoring is disabled Tantivy reads doc-only postings
//   (IndexRecordOption::Basic), avoiding frequency decoding overhead.
// - This bench isolates boolean iteration speed and intersection/union cost.
// - Use `cargo bench --bench boolean_conjunction` to run.

use binggan::{black_box, BenchGroup, BenchRunner};
use rand::prelude::*;
use rand::rngs::StdRng;
use rand::SeedableRng;
use tantivy::collector::sort_key::SortByStaticFastValue;
use tantivy::collector::{Collector, Count, TopDocs};
use tantivy::query::{Query, QueryParser};
use tantivy::schema::{Schema, FAST, TEXT};
use tantivy::{doc, Index, Order, ReloadPolicy, Searcher};

#[derive(Clone)]
struct BenchIndex {
    #[allow(dead_code)]
    index: Index,
    searcher: Searcher,
    query_parser: QueryParser,
}

/// Build a single index containing both fields (title, body) and
/// return two BenchIndex views:
/// - single_field: QueryParser defaults to only "body"
/// - multi_field: QueryParser defaults to ["title", "body"]
fn build_shared_indices(num_docs: usize, p_a: f32, p_b: f32, p_c: f32) -> (BenchIndex, BenchIndex) {
    // Unified schema (two text fields)
    let mut schema_builder = Schema::builder();
    let f_title = schema_builder.add_text_field("title", TEXT);
    let f_body = schema_builder.add_text_field("body", TEXT);
    let f_score = schema_builder.add_u64_field("score", FAST);
    let f_score2 = schema_builder.add_u64_field("score2", FAST);
    let schema = schema_builder.build();
    let index = Index::create_in_ram(schema.clone());

    // Populate index with stable RNG for reproducibility.
    let mut rng = StdRng::from_seed([7u8; 32]);

    // Populate: spread each present token 90/10 to body/title
    {
        let mut writer = index.writer_with_num_threads(1, 500_000_000).unwrap();
        for _ in 0..num_docs {
            let has_a = rng.gen_bool(p_a as f64);
            let has_b = rng.gen_bool(p_b as f64);
            let has_c = rng.gen_bool(p_c as f64);
            let score = rng.gen_range(0u64..100u64);
            let score2 = rng.gen_range(0u64..100_000u64);
            let mut title_tokens: Vec<&str> = Vec::new();
            let mut body_tokens: Vec<&str> = Vec::new();
            if has_a {
                if rng.gen_bool(0.1) {
                    title_tokens.push("a");
                } else {
                    body_tokens.push("a");
                }
            }
            if has_b {
                if rng.gen_bool(0.1) {
                    title_tokens.push("b");
                } else {
                    body_tokens.push("b");
                }
            }
            if has_c {
                if rng.gen_bool(0.1) {
                    title_tokens.push("c");
                } else {
                    body_tokens.push("c");
                }
            }
            if title_tokens.is_empty() && body_tokens.is_empty() {
                body_tokens.push("z");
            }
            writer
                .add_document(doc!(
                    f_title=>title_tokens.join(" "),
                    f_body=>body_tokens.join(" "),
                    f_score=>score,
                    f_score2=>score2,
                ))
                .unwrap();
        }
        writer.commit().unwrap();
    }

    // Prepare reader/searcher once.
    let reader = index
        .reader_builder()
        .reload_policy(ReloadPolicy::Manual)
        .try_into()
        .unwrap();
    let searcher = reader.searcher();

    // Build two query parsers with different default fields.
    let qp_single = QueryParser::for_index(&index, vec![f_body]);
    let qp_multi = QueryParser::for_index(&index, vec![f_title, f_body]);

    let single_view = BenchIndex {
        index: index.clone(),
        searcher: searcher.clone(),
        query_parser: qp_single,
    };
    let multi_view = BenchIndex {
        index,
        searcher,
        query_parser: qp_multi,
    };
    (single_view, multi_view)
}

fn main() {
    // Prepare corpora with varying selectivity. Build one index per corpus
    // and derive two views (single-field vs multi-field) from it.
    let scenarios = vec![
        (
            "N=1M, p(a)=5%, p(b)=1%, p(c)=15%".to_string(),
            1_000_000,
            0.05,
            0.01,
            0.15,
        ),
        (
            "N=1M, p(a)=1%, p(b)=1%, p(c)=15%".to_string(),
            1_000_000,
            0.01,
            0.01,
            0.15,
        ),
    ];

    let queries = &["a", "+a +b", "+a +b +c", "a OR b", "a OR b OR c"];

    let mut runner = BenchRunner::new();
    for (label, n, pa, pb, pc) in scenarios {
        let (single_view, multi_view) = build_shared_indices(n, pa, pb, pc);

        for (view_name, bench_index) in [("single_field", single_view), ("multi_field", multi_view)]
        {
            // Single-field group: default field is body only
            let mut group = runner.new_group();
            group.set_name(format!("{} — {}", view_name, label));
            for query_str in queries {
                add_bench_task(&mut group, &bench_index, query_str, Count, "count");
                add_bench_task(
                    &mut group,
                    &bench_index,
                    query_str,
                    TopDocs::with_limit(10).order_by_score(),
                    "top10",
                );
                add_bench_task(
                    &mut group,
                    &bench_index,
                    query_str,
                    TopDocs::with_limit(10).order_by_fast_field::<u64>("score", Order::Asc),
                    "top10_by_ff",
                );
                add_bench_task(
                    &mut group,
                    &bench_index,
                    query_str,
                    TopDocs::with_limit(10).order_by((
                        SortByStaticFastValue::<u64>::for_field("score"),
                        SortByStaticFastValue::<u64>::for_field("score2"),
                    )),
                    "top10_by_2ff",
                );
            }
            group.run();
        }
    }
}

fn add_bench_task<C: Collector + 'static>(
    bench_group: &mut BenchGroup,
    bench_index: &BenchIndex,
    query_str: &str,
    collector: C,
    collector_name: &str,
) {
    let task_name = format!("{}_{}", query_str.replace(" ", "_"), collector_name);
    let query = bench_index.query_parser.parse_query(query_str).unwrap();
    let search_task = SearchTask {
        searcher: bench_index.searcher.clone(),
        collector,
        query,
    };
    bench_group.register(task_name, move |_| black_box(search_task.run()));
}

struct SearchTask<C: Collector> {
    searcher: Searcher,
    collector: C,
    query: Box<dyn Query>,
}

impl<C: Collector> SearchTask<C> {
    #[inline(never)]
    pub fn run(&self) -> usize {
        self.searcher.search(&self.query, &self.collector).unwrap();
        1
    }
}
69  benches/exists_json.rs (new file)

@@ -0,0 +1,69 @@
use binggan::plugins::PeakMemAllocPlugin;
use binggan::{black_box, InputGroup, PeakMemAlloc, INSTRUMENTED_SYSTEM};
use serde_json::json;
use tantivy::collector::Count;
use tantivy::query::ExistsQuery;
use tantivy::schema::{Schema, FAST, TEXT};
use tantivy::{doc, Index};

#[global_allocator]
pub static GLOBAL: &PeakMemAlloc<std::alloc::System> = &INSTRUMENTED_SYSTEM;

fn main() {
    let doc_count: usize = 500_000;
    let subfield_counts: &[usize] = &[1, 2, 3, 4, 5, 6, 7, 8, 16, 256, 4096, 65536, 262144];

    let indices: Vec<(String, Index)> = subfield_counts
        .iter()
        .map(|&sub_fields| {
            (
                format!("subfields={sub_fields}"),
                build_index_with_json_subfields(doc_count, sub_fields),
            )
        })
        .collect();

    let mut group = InputGroup::new_with_inputs(indices);
    group.add_plugin(PeakMemAllocPlugin::new(GLOBAL));

    group.config().num_iter_group = Some(1);
    group.config().num_iter_bench = Some(1);
    group.register("exists_json", exists_json_union);

    group.run();
}

fn exists_json_union(index: &Index) {
    let reader = index.reader().expect("reader");
    let searcher = reader.searcher();
    let query = ExistsQuery::new("json".to_string(), true);
    let count = searcher.search(&query, &Count).expect("exists search");
    // Prevents optimizer from eliding the search
    black_box(count);
}

fn build_index_with_json_subfields(num_docs: usize, num_subfields: usize) -> Index {
    // Schema: single JSON field stored as FAST to support ExistsQuery.
    let mut schema_builder = Schema::builder();
    let json_field = schema_builder.add_json_field("json", TEXT | FAST);
    let schema = schema_builder.build();

    let index = Index::create_from_tempdir(schema).expect("create index");
    {
        let mut index_writer = index
            .writer_with_num_threads(1, 200_000_000)
            .expect("writer");
        for i in 0..num_docs {
            let sub = i % num_subfields;
            // Only one subpath set per document; rotate subpaths so that
            // no single subpath is full, but the union covers all docs.
            let v = json!({ format!("field_{sub}"): i as u64 });
            index_writer
                .add_document(doc!(json_field => v))
                .expect("add_document");
        }
        index_writer.commit().expect("commit");
    }

    index
}
@@ -1,7 +1,7 @@
[package]
name = "tantivy-bitpacker"
version = "0.6.0"
edition = "2021"
version = "0.9.0"
edition = "2024"
authors = ["Paul Masurel <paul.masurel@gmail.com>"]
license = "MIT"
categories = []
@@ -11,9 +11,6 @@ keywords = []
documentation = "https://docs.rs/tantivy-bitpacker/latest/tantivy_bitpacker"
homepage = "https://github.com/quickwit-oss/tantivy"


# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
bitpacking = { version = "0.9.2", default-features = false, features = ["bitpacker1x"] }
@@ -69,6 +69,12 @@ pub struct BitUnpacker {
|
||||
mask: u64,
|
||||
}
|
||||
|
||||
pub type BlockNumber = usize;
|
||||
|
||||
// 16k
|
||||
const BLOCK_SIZE_MIN_POW: u8 = 14;
|
||||
const BLOCK_SIZE_MIN: usize = 2 << BLOCK_SIZE_MIN_POW;
|
||||
|
||||
impl BitUnpacker {
|
||||
/// Creates a bit unpacker, that assumes the same bitwidth for all values.
|
||||
///
|
||||
@@ -82,6 +88,7 @@ impl BitUnpacker {
|
||||
} else {
|
||||
(1u64 << num_bits) - 1u64
|
||||
};
|
||||
|
||||
BitUnpacker {
|
||||
num_bits: u32::from(num_bits),
|
||||
mask,
|
||||
@@ -92,10 +99,63 @@ impl BitUnpacker {
|
||||
self.num_bits as u8
|
||||
}
|
||||
|
||||
/// Calculates a block number for the given `idx`.
|
||||
#[inline]
|
||||
pub fn block_num(&self, idx: u32) -> BlockNumber {
|
||||
// Find the address in bits of the index.
|
||||
let addr_in_bits = (idx * self.num_bits) as usize;
|
||||
|
||||
// Then round down to the nearest byte.
|
||||
let addr_in_bytes = addr_in_bits >> 3;
|
||||
|
||||
// And compute the containing BlockNumber.
|
||||
addr_in_bytes >> (BLOCK_SIZE_MIN_POW + 1)
|
||||
}
|
||||
|
||||
/// Given a block number and dataset length, calculates a data Range for the block.
|
||||
pub fn block(&self, block: BlockNumber, data_len: usize) -> Range<usize> {
|
||||
let block_addr = block << (BLOCK_SIZE_MIN_POW + 1);
|
||||
// We extend the end of the block by a constant factor, so that it overlaps the next
|
||||
// block. That ensures that we never need to read on a block boundary.
|
||||
block_addr..(std::cmp::min(block_addr + BLOCK_SIZE_MIN + 8, data_len))
|
||||
}
|
||||
|
||||
/// Calculates the number of blocks for the given data_len.
|
||||
///
|
||||
/// Usually only called at startup to pre-allocate structures.
|
||||
pub fn block_count(&self, data_len: usize) -> usize {
|
||||
let block_count = data_len / (BLOCK_SIZE_MIN as usize);
|
||||
if data_len % (BLOCK_SIZE_MIN as usize) == 0 {
|
||||
block_count
|
||||
} else {
|
||||
block_count + 1
|
||||
}
|
||||
}
|
||||
|
||||
/// Returns a range within the data which covers the given id_range.
|
||||
///
|
||||
/// NOTE: This method is used for batch reads which bypass blocks to avoid dealing with block
|
||||
/// boundaries.
|
||||
#[inline]
|
||||
pub fn block_oblivious_range(&self, id_range: Range<u32>, data_len: usize) -> Range<usize> {
|
||||
let start_in_bits = id_range.start * self.num_bits;
|
||||
let start = (start_in_bits >> 3) as usize;
|
||||
let end_in_bits = id_range.end * self.num_bits;
|
||||
let end = (end_in_bits >> 3) as usize;
|
||||
// TODO: We fetch more than we need and then truncate.
|
||||
start..(std::cmp::min(end + 8, data_len))
|
||||
}
|
||||
|
||||
#[inline]
|
||||
pub fn get(&self, idx: u32, data: &[u8]) -> u64 {
|
||||
self.get_from_subset(idx, 0, data)
|
||||
}
|
||||
|
||||
/// Get the value at the given idx, which must exist within the given subset of the data.
|
||||
#[inline]
|
||||
pub fn get_from_subset(&self, idx: u32, data_offset: usize, data: &[u8]) -> u64 {
|
||||
let addr_in_bits = idx * self.num_bits;
|
||||
let addr = (addr_in_bits >> 3) as usize;
|
||||
let addr = (addr_in_bits >> 3) as usize - data_offset;
|
||||
if addr + 8 > data.len() {
|
||||
if self.num_bits == 0 {
|
||||
return 0;
|
||||
@@ -113,6 +173,7 @@ impl BitUnpacker {
|
||||
#[inline(never)]
|
||||
fn get_slow_path(&self, addr: usize, bit_shift: u32, data: &[u8]) -> u64 {
|
||||
let mut bytes: [u8; 8] = [0u8; 8];
|
||||
|
||||
let available_bytes = data.len() - addr;
|
||||
// This function is meant to only be called if we did not have 8 bytes to load.
|
||||
debug_assert!(available_bytes < 8);
|
||||
@@ -128,7 +189,7 @@ impl BitUnpacker {
|
||||
// #Panics
|
||||
//
|
||||
// This methods panics if `num_bits` is > 32.
|
||||
fn get_batch_u32s(&self, start_idx: u32, data: &[u8], output: &mut [u32]) {
|
||||
fn get_batch_u32s(&self, start_idx: u32, data_offset: usize, data: &[u8], output: &mut [u32]) {
|
||||
assert!(
|
||||
self.bit_width() <= 32,
|
||||
"Bitwidth must be <= 32 to use this method."
|
||||
@@ -139,14 +200,14 @@ impl BitUnpacker {
|
||||
let end_bit_read = end_idx * self.num_bits;
|
||||
let end_byte_read = (end_bit_read + 7) / 8;
|
||||
assert!(
|
||||
end_byte_read as usize <= data.len(),
|
||||
end_byte_read as usize <= data_offset + data.len(),
|
||||
"Requested index is out of bounds."
|
||||
);
|
||||
|
||||
// Simple slow implementation of get_batch_u32s, to deal with our ramps.
|
||||
let get_batch_ramp = |start_idx: u32, output: &mut [u32]| {
|
||||
for (out, idx) in output.iter_mut().zip(start_idx..) {
|
||||
*out = self.get(idx, data) as u32;
|
||||
*out = self.get_from_subset(idx, data_offset, data) as u32;
|
||||
}
|
||||
};
|
||||
|
||||
@@ -176,7 +237,7 @@ impl BitUnpacker {
|
||||
get_batch_ramp(start_idx, &mut output[..entrance_ramp_len as usize]);
|
||||
|
||||
// Highway
|
||||
let mut offset = (highway_start * self.num_bits) as usize / 8;
|
||||
let mut offset = ((highway_start * self.num_bits) as usize / 8) - data_offset;
|
||||
let mut output_cursor = (highway_start - start_idx) as usize;
|
||||
for _ in 0..num_blocks {
|
||||
offset += BitPacker1x.decompress(
|
||||
@@ -198,16 +259,27 @@ impl BitUnpacker {
|
||||
id_range: Range<u32>,
|
||||
data: &[u8],
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
self.get_ids_for_value_range_from_subset(range, id_range, 0, data, positions)
|
||||
}
|
||||
|
||||
pub fn get_ids_for_value_range_from_subset(
|
||||
&self,
|
||||
range: RangeInclusive<u64>,
|
||||
id_range: Range<u32>,
|
||||
data_offset: usize,
|
||||
data: &[u8],
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
if self.bit_width() > 32 {
|
||||
self.get_ids_for_value_range_slow(range, id_range, data, positions)
|
||||
self.get_ids_for_value_range_slow(range, id_range, data_offset, data, positions)
|
||||
} else {
|
||||
if *range.start() > u32::MAX as u64 {
|
||||
positions.clear();
|
||||
return;
|
||||
}
|
||||
let range_u32 = (*range.start() as u32)..=(*range.end()).min(u32::MAX as u64) as u32;
|
||||
self.get_ids_for_value_range_fast(range_u32, id_range, data, positions)
|
||||
self.get_ids_for_value_range_fast(range_u32, id_range, data_offset, data, positions)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -215,6 +287,7 @@ impl BitUnpacker {
|
||||
&self,
|
||||
range: RangeInclusive<u64>,
|
||||
id_range: Range<u32>,
|
||||
data_offset: usize,
|
||||
data: &[u8],
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
@@ -222,7 +295,7 @@ impl BitUnpacker {
|
||||
for i in id_range {
|
||||
// If we cared we could make this branchless, but the slow implementation should rarely
|
||||
// kick in.
|
||||
let val = self.get(i, data);
|
||||
let val = self.get_from_subset(i, data_offset, data);
|
||||
if range.contains(&val) {
|
||||
positions.push(i);
|
||||
}
|
||||
@@ -233,11 +306,12 @@ impl BitUnpacker {
|
||||
&self,
|
||||
value_range: RangeInclusive<u32>,
|
||||
id_range: Range<u32>,
|
||||
data_offset: usize,
|
||||
data: &[u8],
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
positions.resize(id_range.len(), 0u32);
|
||||
self.get_batch_u32s(id_range.start, data, positions);
|
||||
self.get_batch_u32s(id_range.start, data_offset, data, positions);
|
||||
crate::filter_vec::filter_vec_in_place(value_range, id_range.start, positions)
|
||||
}
|
||||
}
|
||||
@@ -257,7 +331,7 @@ mod test {
            bitpacker.write(val, num_bits, &mut data).unwrap();
        }
        bitpacker.close(&mut data).unwrap();
        assert_eq!(data.len(), ((num_bits as usize) * len + 7) / 8);
        assert_eq!(data.len(), ((num_bits as usize) * len).div_ceil(8));
        let bitunpacker = BitUnpacker::new(num_bits);
        (bitunpacker, vals, data)
    }
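The assertion now uses `div_ceil`, which computes the same rounded-up byte count as the older `(x + 7) / 8` expression. A quick equivalence check (illustrative only, assumes a toolchain with stable `div_ceil`):

fn main() {
    for num_bits in 0usize..=64 {
        for len in 0usize..100 {
            let bits = num_bits * len;
            assert_eq!((bits + 7) / 8, bits.div_ceil(8));
        }
    }
}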
@@ -303,7 +377,7 @@ mod test {
|
||||
bitpacker.write(val, num_bits, &mut buffer).unwrap();
|
||||
}
|
||||
bitpacker.flush(&mut buffer).unwrap();
|
||||
assert_eq!(buffer.len(), (vals.len() * num_bits as usize + 7) / 8);
|
||||
assert_eq!(buffer.len(), (vals.len() * num_bits as usize).div_ceil(8));
|
||||
let bitunpacker = BitUnpacker::new(num_bits);
|
||||
let max_val = if num_bits == 64 {
|
||||
u64::MAX
|
||||
@@ -328,14 +402,14 @@ mod test {
|
||||
fn test_get_batch_panics_over_32_bits() {
|
||||
let bitunpacker = BitUnpacker::new(33);
|
||||
let mut output: [u32; 1] = [0u32];
|
||||
bitunpacker.get_batch_u32s(0, &[0, 0, 0, 0, 0, 0, 0, 0], &mut output[..]);
|
||||
bitunpacker.get_batch_u32s(0, 0, &[0, 0, 0, 0, 0, 0, 0, 0], &mut output[..]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_get_batch_limit() {
|
||||
let bitunpacker = BitUnpacker::new(1);
|
||||
let mut output: [u32; 3] = [0u32, 0u32, 0u32];
|
||||
bitunpacker.get_batch_u32s(8 * 4 - 3, &[0u8, 0u8, 0u8, 0u8], &mut output[..]);
|
||||
bitunpacker.get_batch_u32s(8 * 4 - 3, 0, &[0u8, 0u8, 0u8, 0u8], &mut output[..]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -344,7 +418,7 @@ mod test {
|
||||
let bitunpacker = BitUnpacker::new(1);
|
||||
let mut output: [u32; 3] = [0u32, 0u32, 0u32];
|
||||
// We are missing exactly one bit.
|
||||
bitunpacker.get_batch_u32s(8 * 4 - 2, &[0u8, 0u8, 0u8, 0u8], &mut output[..]);
|
||||
bitunpacker.get_batch_u32s(8 * 4 - 2, 0, &[0u8, 0u8, 0u8, 0u8], &mut output[..]);
|
||||
}
|
||||
|
||||
proptest::proptest! {
|
||||
@@ -367,7 +441,7 @@ mod test {
|
||||
for len in [0, 1, 2, 32, 33, 34, 64] {
|
||||
for start_idx in 0u32..32u32 {
|
||||
output.resize(len, 0);
|
||||
bitunpacker.get_batch_u32s(start_idx, &buffer, &mut output);
|
||||
bitunpacker.get_batch_u32s(start_idx, 0, &buffer, &mut output);
|
||||
for (i, output_byte) in output.iter().enumerate() {
|
||||
let expected = (start_idx + i as u32) & mask;
|
||||
assert_eq!(*output_byte, expected);
|
||||
|
||||
@@ -1,6 +1,6 @@
use super::bitpacker::BitPacker;
use super::compute_num_bits;
use crate::{minmax, BitUnpacker};
use crate::{BitUnpacker, minmax};

const BLOCK_SIZE: usize = 128;

@@ -34,7 +34,7 @@ struct BlockedBitpackerEntryMetaData {

impl BlockedBitpackerEntryMetaData {
    fn new(offset: u64, num_bits: u8, base_value: u64) -> Self {
        let encoded = offset | (num_bits as u64) << (64 - 8);
        let encoded = offset | (u64::from(num_bits) << (64 - 8));
        Self {
            encoded,
            base_value,
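The entry metadata above packs a byte offset and a bit width into one u64: the offset occupies the low 56 bits and `num_bits` the top 8. A hedged round-trip sketch of that packing (helper names are illustrative, not part of the diff):

fn encode(offset: u64, num_bits: u8) -> u64 {
    // The offset must fit in the low 56 bits for the packing to be lossless.
    debug_assert!(offset < 1 << 56);
    offset | (u64::from(num_bits) << 56)
}

fn decode(encoded: u64) -> (u64, u8) {
    (encoded & ((1 << 56) - 1), (encoded >> 56) as u8)
}

fn main() {
    let packed = encode(1_024, 13);
    assert_eq!(decode(packed), (1_024, 13));
}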
@@ -140,10 +140,10 @@ impl BlockedBitpacker {
|
||||
pub fn iter(&self) -> impl Iterator<Item = u64> + '_ {
|
||||
// todo performance: we could decompress a whole block and cache it instead
|
||||
let bitpacked_elems = self.offset_and_bits.len() * BLOCK_SIZE;
|
||||
let iter = (0..bitpacked_elems)
|
||||
|
||||
(0..bitpacked_elems)
|
||||
.map(move |idx| self.get(idx))
|
||||
.chain(self.buffer.iter().cloned());
|
||||
iter
|
||||
.chain(self.buffer.iter().cloned())
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -19,7 +19,7 @@ fn u32_to_i32(val: u32) -> i32 {
|
||||
#[inline]
|
||||
unsafe fn u32_to_i32_avx2(vals_u32x8s: DataType) -> DataType {
|
||||
const HIGHEST_BIT_MASK: DataType = from_u32x8([HIGHEST_BIT; NUM_LANES]);
|
||||
op_xor(vals_u32x8s, HIGHEST_BIT_MASK)
|
||||
unsafe { op_xor(vals_u32x8s, HIGHEST_BIT_MASK) }
|
||||
}
|
||||
|
||||
pub fn filter_vec_in_place(range: RangeInclusive<u32>, offset: u32, output: &mut Vec<u32>) {
|
||||
@@ -66,17 +66,19 @@ unsafe fn filter_vec_avx2_aux(
|
||||
]);
|
||||
const SHIFT: __m256i = from_u32x8([NUM_LANES as u32; NUM_LANES]);
|
||||
for _ in 0..num_words {
|
||||
let word = load_unaligned(input);
|
||||
let word = u32_to_i32_avx2(word);
|
||||
let keeper_bitset = compute_filter_bitset(word, range_simd.clone());
|
||||
let added_len = keeper_bitset.count_ones();
|
||||
let filtered_doc_ids = compact(ids, keeper_bitset);
|
||||
store_unaligned(output_tail as *mut __m256i, filtered_doc_ids);
|
||||
output_tail = output_tail.offset(added_len as isize);
|
||||
ids = op_add(ids, SHIFT);
|
||||
input = input.offset(1);
|
||||
unsafe {
|
||||
let word = load_unaligned(input);
|
||||
let word = u32_to_i32_avx2(word);
|
||||
let keeper_bitset = compute_filter_bitset(word, range_simd.clone());
|
||||
let added_len = keeper_bitset.count_ones();
|
||||
let filtered_doc_ids = compact(ids, keeper_bitset);
|
||||
store_unaligned(output_tail as *mut __m256i, filtered_doc_ids);
|
||||
output_tail = output_tail.offset(added_len as isize);
|
||||
ids = op_add(ids, SHIFT);
|
||||
input = input.offset(1);
|
||||
}
|
||||
}
|
||||
output_tail.offset_from(output) as usize
|
||||
unsafe { output_tail.offset_from(output) as usize }
|
||||
}
|
||||
|
||||
#[inline]
@@ -92,8 +94,7 @@ unsafe fn compute_filter_bitset(val: __m256i, range: std::ops::RangeInclusive<__
    let too_low = op_greater(*range.start(), val);
    let too_high = op_greater(val, *range.end());
    let inside = op_or(too_low, too_high);
    255 - std::arch::x86_64::_mm256_movemask_ps(std::mem::transmute::<DataType, __m256>(inside))
        as u8
    255 - std::arch::x86_64::_mm256_movemask_ps(_mm256_castsi256_ps(inside)) as u8
}
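The AVX2 hunks above build a lane mask of values inside the requested range and then compact the matching doc ids. A scalar sketch with the same in-place contract, assuming values go in and surviving doc ids (offset plus index) come out, as the `filter_vec_in_place` signature suggests; names are illustrative:

use std::ops::RangeInclusive;

fn filter_in_place(range: RangeInclusive<u32>, offset: u32, values: &mut Vec<u32>) {
    let mut write = 0;
    for read in 0..values.len() {
        let val = values[read];
        // Speculatively write the doc id; only advance the write cursor on a match.
        values[write] = offset + read as u32;
        write += usize::from(range.contains(&val));
    }
    values.truncate(write);
}

fn main() {
    let mut vals = vec![3, 10, 7, 42];
    filter_in_place(5..=40, 100, &mut vals);
    assert_eq!(vals, vec![101, 102]);
}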
|
||||
union U8x32 {
|
||||
|
||||
@@ -35,8 +35,8 @@ const IMPLS: [FilterImplPerInstructionSet; 2] = [
|
||||
const IMPLS: [FilterImplPerInstructionSet; 1] = [FilterImplPerInstructionSet::Scalar];
|
||||
|
||||
impl FilterImplPerInstructionSet {
|
||||
#[allow(unused_variables)]
|
||||
#[inline]
|
||||
#[allow(unused_variables)] // on non-x86_64, code is unused.
|
||||
fn from(code: u8) -> FilterImplPerInstructionSet {
|
||||
#[cfg(target_arch = "x86_64")]
|
||||
if code == FilterImplPerInstructionSet::AVX2 as u8 {
|
||||
|
||||
@@ -33,11 +33,7 @@ pub use crate::blocked_bitpacker::BlockedBitpacker;
/// number of bits.
pub fn compute_num_bits(n: u64) -> u8 {
    let amplitude = (64u32 - n.leading_zeros()) as u8;
    if amplitude <= 64 - 8 {
        amplitude
    } else {
        64
    }
    if amplitude <= 64 - 8 { amplitude } else { 64 }
}

/// Computes the (min, max) of an iterator of `PartialOrd` values.

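`compute_num_bits` returns the number of significant bits in `n`, except that anything wider than 56 bits is promoted to a full 64. A small runnable restatement of the same logic as the hunk above, shown standalone:

fn compute_num_bits(n: u64) -> u8 {
    // Number of bits needed to represent `n`.
    let amplitude = (64u32 - n.leading_zeros()) as u8;
    // Values wider than 56 bits are stored with the full 64 bits.
    if amplitude <= 64 - 8 { amplitude } else { 64 }
}

fn main() {
    assert_eq!(compute_num_bits(0), 0);
    assert_eq!(compute_num_bits(1), 1);
    assert_eq!(compute_num_bits(255), 8);
    assert_eq!(compute_num_bits(u64::MAX), 64);
}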
@@ -16,14 +16,14 @@ body = """
|
||||
|
||||
{%- if version %} in {{ version }}{%- endif -%}
|
||||
{% for commit in commits %}
|
||||
{% if commit.github.pr_title -%}
|
||||
{%- set commit_message = commit.github.pr_title -%}
|
||||
{% if commit.remote.pr_title -%}
|
||||
{%- set commit_message = commit.remote.pr_title -%}
|
||||
{%- else -%}
|
||||
{%- set commit_message = commit.message -%}
|
||||
{%- endif -%}
|
||||
- {{ commit_message | split(pat="\n") | first | trim }}\
|
||||
{% if commit.github.pr_number %} \
|
||||
[#{{ commit.github.pr_number }}]({{ self::remote_url() }}/pull/{{ commit.github.pr_number }}){% if commit.github.username %}(@{{ commit.github.username }}){%- endif -%} \
|
||||
{% if commit.remote.pr_number %} \
|
||||
[#{{ commit.remote.pr_number }}]({{ self::remote_url() }}/pull/{{ commit.remote.pr_number }}){% if commit.remote.username %}(@{{ commit.remote.username }}){%- endif -%} \
|
||||
{%- endif %}
|
||||
{%- endfor -%}
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
[package]
|
||||
name = "tantivy-columnar"
|
||||
version = "0.3.0"
|
||||
edition = "2021"
|
||||
version = "0.6.0"
|
||||
edition = "2024"
|
||||
license = "MIT"
|
||||
homepage = "https://github.com/quickwit-oss/tantivy"
|
||||
repository = "https://github.com/quickwit-oss/tantivy"
|
||||
@@ -9,21 +9,21 @@ description = "column oriented storage for tantivy"
|
||||
categories = ["database-implementations", "data-structures", "compression"]
|
||||
|
||||
[dependencies]
|
||||
itertools = "0.13.0"
|
||||
itertools = "0.14.0"
|
||||
fastdivide = "0.4.0"
|
||||
|
||||
stacker = { version= "0.3", path = "../stacker", package="tantivy-stacker"}
|
||||
sstable = { version= "0.3", path = "../sstable", package = "tantivy-sstable" }
|
||||
common = { version= "0.7", path = "../common", package = "tantivy-common" }
|
||||
tantivy-bitpacker = { version= "0.6", path = "../bitpacker/" }
|
||||
serde = "1.0.152"
|
||||
downcast-rs = "1.2.0"
|
||||
stacker = { version= "0.6", path = "../stacker", package="tantivy-stacker"}
|
||||
sstable = { version= "0.6", path = "../sstable", package = "tantivy-sstable" }
|
||||
common = { version= "0.10", path = "../common", package = "tantivy-common" }
|
||||
tantivy-bitpacker = { version= "0.9", path = "../bitpacker/" }
|
||||
serde = { version = "1.0.152", features = ["derive"] }
|
||||
downcast-rs = "2.0.1"
|
||||
|
||||
[dev-dependencies]
|
||||
proptest = "1"
|
||||
more-asserts = "0.3.1"
|
||||
rand = "0.8"
|
||||
binggan = "0.10.0"
|
||||
binggan = "0.14.0"
|
||||
|
||||
[[bench]]
|
||||
name = "bench_merge"
|
||||
@@ -33,6 +33,29 @@ harness = false
|
||||
name = "bench_access"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "bench_first_vals"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "bench_values_u64"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "bench_values_u128"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "bench_create_column_values"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "bench_column_values_get"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "bench_optional_index"
|
||||
harness = false
|
||||
|
||||
[features]
|
||||
unstable = []
|
||||
zstd-compression = ["sstable/zstd-compression"]
|
||||
|
||||
@@ -73,7 +73,7 @@ The crate introduces the following concepts.
`Columnar` is an equivalent of a dataframe.
It maps `column_key` to `Column`.

A `Column<T>` asssociates a `RowId` (u32) to any
A `Column<T>` associates a `RowId` (u32) to any
number of values.

This is made possible by wrapping a `ColumnIndex` and a `ColumnValue` object.

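The split described above puts the row-to-value bookkeeping in a `ColumnIndex` and the packed values in a `ColumnValue`. A toy sketch of the multivalued case with purely hypothetical names (this is not the crate's API):

struct ToyColumn<T> {
    // Start offset into `values` for each row; row r owns values[starts[r]..starts[r + 1]].
    // This plays the role of the column index.
    starts: Vec<u32>,
    // This plays the role of the column values.
    values: Vec<T>,
}

impl<T> ToyColumn<T> {
    fn values_for_row(&self, row: u32) -> &[T] {
        let lo = self.starts[row as usize] as usize;
        let hi = self.starts[row as usize + 1] as usize;
        &self.values[lo..hi]
    }
}

fn main() {
    let col = ToyColumn { starts: vec![0, 2, 2, 3], values: vec![10, 11, 42] };
    assert_eq!(col.values_for_row(0), &[10, 11]);
    assert!(col.values_for_row(1).is_empty());
    assert_eq!(col.values_for_row(2), &[42]);
}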
@@ -1,6 +1,6 @@
|
||||
use binggan::{black_box, InputGroup};
|
||||
use binggan::{InputGroup, black_box};
|
||||
use common::*;
|
||||
use tantivy_columnar::Column;
|
||||
use tantivy_columnar::{Column, ValueRange};
|
||||
|
||||
pub mod common;
|
||||
|
||||
@@ -19,7 +19,7 @@ fn main() {
|
||||
|
||||
let mut add_card = |card1: Card| {
|
||||
inputs.push((
|
||||
format!("{card1}"),
|
||||
card1.to_string(),
|
||||
generate_columnar_and_open(card1, NUM_DOCS),
|
||||
));
|
||||
};
|
||||
@@ -42,20 +42,20 @@ fn bench_group(mut runner: InputGroup<Column>) {
|
||||
}
|
||||
}
|
||||
black_box(sum);
|
||||
None
|
||||
});
|
||||
runner.register("access_first_vals", |column| {
|
||||
let mut sum = 0;
|
||||
const BLOCK_SIZE: usize = 32;
|
||||
let mut docs = vec![0; BLOCK_SIZE];
|
||||
let mut buffer = vec![None; BLOCK_SIZE];
|
||||
let mut docs = Vec::with_capacity(BLOCK_SIZE);
|
||||
let mut buffer = Vec::with_capacity(BLOCK_SIZE);
|
||||
for i in (0..NUM_DOCS).step_by(BLOCK_SIZE) {
|
||||
// fill docs
|
||||
docs.clear();
|
||||
for idx in 0..BLOCK_SIZE {
|
||||
docs[idx] = idx as u32 + i;
|
||||
docs.push(idx as u32 + i);
|
||||
}
|
||||
|
||||
column.first_vals(&docs, &mut buffer);
|
||||
buffer.clear();
|
||||
column.first_vals_in_value_range(&mut docs, &mut buffer, ValueRange::All);
|
||||
for val in buffer.iter() {
|
||||
let Some(val) = val else { continue };
|
||||
sum += *val;
|
||||
@@ -63,7 +63,6 @@ fn bench_group(mut runner: InputGroup<Column>) {
|
||||
}
|
||||
|
||||
black_box(sum);
|
||||
None
|
||||
});
|
||||
runner.run();
|
||||
}
|
||||
|
||||
61
columnar/benches/bench_column_values_get.rs
Normal file
@@ -0,0 +1,61 @@
|
||||
use std::sync::Arc;
|
||||
|
||||
use binggan::{InputGroup, black_box};
|
||||
use rand::rngs::StdRng;
|
||||
use rand::{Rng, SeedableRng};
|
||||
use tantivy_columnar::ColumnValues;
|
||||
use tantivy_columnar::column_values::{CodecType, serialize_and_load_u64_based_column_values};
|
||||
|
||||
fn get_data() -> Vec<u64> {
|
||||
let mut rng = StdRng::seed_from_u64(2u64);
|
||||
let mut data: Vec<_> = (100..55_000_u64)
|
||||
.map(|num| num + rng.r#gen::<u8>() as u64)
|
||||
.collect();
|
||||
data.push(99_000);
|
||||
data.insert(1000, 2000);
|
||||
data.insert(2000, 100);
|
||||
data.insert(3000, 4100);
|
||||
data.insert(4000, 100);
|
||||
data.insert(5000, 800);
|
||||
data
|
||||
}
|
||||
|
||||
#[inline(never)]
|
||||
fn value_iter() -> impl Iterator<Item = u64> {
|
||||
0..20_000
|
||||
}
|
||||
|
||||
type Col = Arc<dyn ColumnValues<u64>>;
|
||||
|
||||
fn main() {
|
||||
let data = get_data();
|
||||
let inputs: Vec<(String, Col)> = vec![
|
||||
(
|
||||
"bitpacked".to_string(),
|
||||
serialize_and_load_u64_based_column_values(&data.as_slice(), &[CodecType::Bitpacked]),
|
||||
),
|
||||
(
|
||||
"linear".to_string(),
|
||||
serialize_and_load_u64_based_column_values(&data.as_slice(), &[CodecType::Linear]),
|
||||
),
|
||||
(
|
||||
"blockwise_linear".to_string(),
|
||||
serialize_and_load_u64_based_column_values(
|
||||
&data.as_slice(),
|
||||
&[CodecType::BlockwiseLinear],
|
||||
),
|
||||
),
|
||||
];
|
||||
|
||||
let mut group: InputGroup<Col> = InputGroup::new_with_inputs(inputs);
|
||||
|
||||
group.register("fastfield_get", |col: &Col| {
|
||||
let mut sum = 0u64;
|
||||
for pos in value_iter() {
|
||||
sum = sum.wrapping_add(col.get_val(pos as u32));
|
||||
}
|
||||
black_box(sum);
|
||||
});
|
||||
|
||||
group.run();
|
||||
}
|
||||
44
columnar/benches/bench_create_column_values.rs
Normal file
@@ -0,0 +1,44 @@
|
||||
use binggan::{InputGroup, black_box};
|
||||
use rand::rngs::StdRng;
|
||||
use rand::{Rng, SeedableRng};
|
||||
use tantivy_columnar::column_values::{CodecType, serialize_u64_based_column_values};
|
||||
|
||||
fn get_data() -> Vec<u64> {
|
||||
let mut rng = StdRng::seed_from_u64(2u64);
|
||||
let mut data: Vec<_> = (100..55_000_u64)
|
||||
.map(|num| num + rng.r#gen::<u8>() as u64)
|
||||
.collect();
|
||||
data.push(99_000);
|
||||
data.insert(1000, 2000);
|
||||
data.insert(2000, 100);
|
||||
data.insert(3000, 4100);
|
||||
data.insert(4000, 100);
|
||||
data.insert(5000, 800);
|
||||
data
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let data = get_data();
|
||||
let mut group: InputGroup<(CodecType, Vec<u64>)> = InputGroup::new_with_inputs(vec![
|
||||
(
|
||||
"bitpacked codec".to_string(),
|
||||
(CodecType::Bitpacked, data.clone()),
|
||||
),
|
||||
(
|
||||
"linear codec".to_string(),
|
||||
(CodecType::Linear, data.clone()),
|
||||
),
|
||||
(
|
||||
"blockwise linear codec".to_string(),
|
||||
(CodecType::BlockwiseLinear, data.clone()),
|
||||
),
|
||||
]);
|
||||
|
||||
group.register("serialize column_values", |data| {
|
||||
let mut buffer = Vec::new();
|
||||
serialize_u64_based_column_values(&data.1.as_slice(), &[data.0], &mut buffer).unwrap();
|
||||
black_box(buffer.len());
|
||||
});
|
||||
|
||||
group.run();
|
||||
}
|
||||
@@ -1,12 +1,9 @@
|
||||
#![feature(test)]
|
||||
extern crate test;
|
||||
|
||||
use std::sync::Arc;
|
||||
|
||||
use binggan::{InputGroup, black_box};
|
||||
use rand::prelude::*;
|
||||
use tantivy_columnar::column_values::{serialize_and_load_u64_based_column_values, CodecType};
|
||||
use tantivy_columnar::column_values::{CodecType, serialize_and_load_u64_based_column_values};
|
||||
use tantivy_columnar::*;
|
||||
use test::{black_box, Bencher};
|
||||
|
||||
struct Columns {
|
||||
pub optional: Column,
|
||||
@@ -68,88 +65,38 @@ pub fn serialize_and_load(column: &[u64], codec_type: CodecType) -> Arc<dyn Colu
|
||||
serialize_and_load_u64_based_column_values(&column, &[codec_type])
|
||||
}
|
||||
|
||||
fn run_bench_on_column_full_scan(b: &mut Bencher, column: Column) {
|
||||
let num_iter = black_box(NUM_VALUES);
|
||||
b.iter(|| {
|
||||
fn main() {
|
||||
let Columns {
|
||||
optional,
|
||||
full,
|
||||
multi,
|
||||
} = get_test_columns();
|
||||
|
||||
let inputs = vec![
|
||||
("full".to_string(), full),
|
||||
("optional".to_string(), optional),
|
||||
("multi".to_string(), multi),
|
||||
];
|
||||
|
||||
let mut group = InputGroup::new_with_inputs(inputs);
|
||||
|
||||
group.register("first_full_scan", |column| {
|
||||
let mut sum = 0u64;
|
||||
for i in 0..num_iter as u32 {
|
||||
for i in 0..NUM_VALUES as u32 {
|
||||
let val = column.first(i);
|
||||
sum += val.unwrap_or(0);
|
||||
}
|
||||
sum
|
||||
black_box(sum);
|
||||
});
|
||||
}
|
||||
fn run_bench_on_column_block_fetch(b: &mut Bencher, column: Column) {
|
||||
let mut block: Vec<Option<u64>> = vec![None; 64];
|
||||
let fetch_docids = (0..64).collect::<Vec<_>>();
|
||||
b.iter(move || {
|
||||
column.first_vals(&fetch_docids, &mut block);
|
||||
block[0]
|
||||
});
|
||||
}
|
||||
fn run_bench_on_column_block_single_calls(b: &mut Bencher, column: Column) {
|
||||
let mut block: Vec<Option<u64>> = vec![None; 64];
|
||||
let fetch_docids = (0..64).collect::<Vec<_>>();
|
||||
b.iter(move || {
|
||||
|
||||
group.register("first_block_single_calls", |column| {
|
||||
let mut block: Vec<Option<u64>> = vec![None; 64];
|
||||
let fetch_docids = (0..64).collect::<Vec<_>>();
|
||||
for i in 0..fetch_docids.len() {
|
||||
block[i] = column.first(fetch_docids[i]);
|
||||
}
|
||||
block[0]
|
||||
black_box(block[0]);
|
||||
});
|
||||
}
|
||||
|
||||
/// Column first method
|
||||
#[bench]
|
||||
fn bench_get_first_on_full_column_full_scan(b: &mut Bencher) {
|
||||
let column = get_test_columns().full;
|
||||
run_bench_on_column_full_scan(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_first_on_optional_column_full_scan(b: &mut Bencher) {
|
||||
let column = get_test_columns().optional;
|
||||
run_bench_on_column_full_scan(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_first_on_multi_column_full_scan(b: &mut Bencher) {
|
||||
let column = get_test_columns().multi;
|
||||
run_bench_on_column_full_scan(b, column);
|
||||
}
|
||||
|
||||
/// Block fetch column accessor
|
||||
#[bench]
|
||||
fn bench_get_block_first_on_optional_column(b: &mut Bencher) {
|
||||
let column = get_test_columns().optional;
|
||||
run_bench_on_column_block_fetch(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_block_first_on_multi_column(b: &mut Bencher) {
|
||||
let column = get_test_columns().multi;
|
||||
run_bench_on_column_block_fetch(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_block_first_on_full_column(b: &mut Bencher) {
|
||||
let column = get_test_columns().full;
|
||||
run_bench_on_column_block_fetch(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_block_first_on_optional_column_single_calls(b: &mut Bencher) {
|
||||
let column = get_test_columns().optional;
|
||||
run_bench_on_column_block_single_calls(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_block_first_on_multi_column_single_calls(b: &mut Bencher) {
|
||||
let column = get_test_columns().multi;
|
||||
run_bench_on_column_block_single_calls(b, column);
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_get_block_first_on_full_column_single_calls(b: &mut Bencher) {
|
||||
let column = get_test_columns().full;
|
||||
run_bench_on_column_block_single_calls(b, column);
|
||||
group.run();
|
||||
}
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
pub mod common;
|
||||
|
||||
use binggan::{black_box, BenchRunner};
|
||||
use common::{generate_columnar_with_name, Card};
|
||||
use binggan::BenchRunner;
|
||||
use common::{Card, generate_columnar_with_name};
|
||||
use tantivy_columnar::*;
|
||||
|
||||
const NUM_DOCS: u32 = 100_000;
|
||||
@@ -29,7 +29,7 @@ fn main() {
|
||||
add_combo(Card::Multi, Card::Dense);
|
||||
add_combo(Card::Multi, Card::Sparse);
|
||||
|
||||
let runner: BenchRunner = BenchRunner::new();
|
||||
let mut runner: BenchRunner = BenchRunner::new();
|
||||
let mut group = runner.new_group();
|
||||
for (input_name, columnar_readers) in inputs.iter() {
|
||||
group.register_with_input(
|
||||
@@ -40,7 +40,14 @@ fn main() {
|
||||
let columnar_readers = columnar_readers.iter().collect::<Vec<_>>();
|
||||
let merge_row_order = StackMergeOrder::stack(&columnar_readers[..]);
|
||||
|
||||
merge_columnar(&columnar_readers, &[], merge_row_order.into(), &mut out).unwrap();
|
||||
merge_columnar(
|
||||
&columnar_readers,
|
||||
&[],
|
||||
merge_row_order.into(),
|
||||
&mut out,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
Some(out.len() as u64)
|
||||
},
|
||||
);
|
||||
|
||||
106
columnar/benches/bench_optional_index.rs
Normal file
@@ -0,0 +1,106 @@
|
||||
use binggan::{InputGroup, black_box};
|
||||
use rand::rngs::StdRng;
|
||||
use rand::{Rng, SeedableRng};
|
||||
use tantivy_columnar::column_index::{OptionalIndex, Set};
|
||||
|
||||
const TOTAL_NUM_VALUES: u32 = 1_000_000;
|
||||
|
||||
fn gen_optional_index(fill_ratio: f64) -> OptionalIndex {
|
||||
let mut rng: StdRng = StdRng::from_seed([1u8; 32]);
|
||||
let vals: Vec<u32> = (0..TOTAL_NUM_VALUES)
|
||||
.map(|_| rng.gen_bool(fill_ratio))
|
||||
.enumerate()
|
||||
.filter(|(_pos, val)| *val)
|
||||
.map(|(pos, _)| pos as u32)
|
||||
.collect();
|
||||
OptionalIndex::for_test(TOTAL_NUM_VALUES, &vals)
|
||||
}
|
||||
|
||||
fn random_range_iterator(
|
||||
start: u32,
|
||||
end: u32,
|
||||
avg_step_size: u32,
|
||||
avg_deviation: u32,
|
||||
) -> impl Iterator<Item = u32> {
|
||||
let mut rng: StdRng = StdRng::from_seed([1u8; 32]);
|
||||
let mut current = start;
|
||||
std::iter::from_fn(move || {
|
||||
current += rng.gen_range(avg_step_size - avg_deviation..=avg_step_size + avg_deviation);
|
||||
if current >= end { None } else { Some(current) }
|
||||
})
|
||||
}
|
||||
|
||||
fn n_percent_step_iterator(percent: f32, num_values: u32) -> impl Iterator<Item = u32> {
|
||||
let ratio = percent / 100.0;
|
||||
let step_size = (1f32 / ratio) as u32;
|
||||
let deviation = step_size - 1;
|
||||
random_range_iterator(0, num_values, step_size, deviation)
|
||||
}
|
||||
|
||||
fn walk_over_data(codec: &OptionalIndex, avg_step_size: u32) -> Option<u32> {
|
||||
walk_over_data_from_positions(
|
||||
codec,
|
||||
random_range_iterator(0, TOTAL_NUM_VALUES, avg_step_size, 0),
|
||||
)
|
||||
}
|
||||
|
||||
fn walk_over_data_from_positions(
|
||||
codec: &OptionalIndex,
|
||||
positions: impl Iterator<Item = u32>,
|
||||
) -> Option<u32> {
|
||||
let mut dense_idx: Option<u32> = None;
|
||||
for idx in positions {
|
||||
dense_idx = dense_idx.or(codec.rank_if_exists(idx));
|
||||
}
|
||||
dense_idx
|
||||
}
|
||||
|
||||
fn main() {
|
||||
// Build separate inputs for each fill ratio.
|
||||
let inputs: Vec<(String, OptionalIndex)> = vec![
|
||||
("fill=1%".to_string(), gen_optional_index(0.01)),
|
||||
("fill=5%".to_string(), gen_optional_index(0.05)),
|
||||
("fill=10%".to_string(), gen_optional_index(0.10)),
|
||||
("fill=50%".to_string(), gen_optional_index(0.50)),
|
||||
("fill=90%".to_string(), gen_optional_index(0.90)),
|
||||
];
|
||||
|
||||
let mut group: InputGroup<OptionalIndex> = InputGroup::new_with_inputs(inputs);
|
||||
|
||||
// Translate orig->codec (rank_if_exists) with sampling
|
||||
group.register("orig_to_codec_10pct_hit", |codec: &OptionalIndex| {
|
||||
black_box(walk_over_data(codec, 100));
|
||||
});
|
||||
group.register("orig_to_codec_1pct_hit", |codec: &OptionalIndex| {
|
||||
black_box(walk_over_data(codec, 1000));
|
||||
});
|
||||
group.register("orig_to_codec_full_scan", |codec: &OptionalIndex| {
|
||||
black_box(walk_over_data_from_positions(codec, 0..TOTAL_NUM_VALUES));
|
||||
});
|
||||
|
||||
// Translate codec->orig (select/select_batch) on sampled ranks
|
||||
fn bench_translate_codec_to_orig_util(codec: &OptionalIndex, percent_hit: f32) {
|
||||
let num_non_nulls = codec.num_non_nulls();
|
||||
let idxs: Vec<u32> = if percent_hit == 100.0f32 {
|
||||
(0..num_non_nulls).collect()
|
||||
} else {
|
||||
n_percent_step_iterator(percent_hit, num_non_nulls).collect()
|
||||
};
|
||||
let mut output = vec![0u32; idxs.len()];
|
||||
output.copy_from_slice(&idxs[..]);
|
||||
codec.select_batch(&mut output);
|
||||
black_box(output);
|
||||
}
|
||||
|
||||
group.register("codec_to_orig_0.005pct_hit", |codec: &OptionalIndex| {
|
||||
bench_translate_codec_to_orig_util(codec, 0.005);
|
||||
});
|
||||
group.register("codec_to_orig_10pct_hit", |codec: &OptionalIndex| {
|
||||
bench_translate_codec_to_orig_util(codec, 10.0);
|
||||
});
|
||||
group.register("codec_to_orig_full_scan", |codec: &OptionalIndex| {
|
||||
bench_translate_codec_to_orig_util(codec, 100.0);
|
||||
});
|
||||
|
||||
group.run();
|
||||
}
|
||||
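The new benchmark above exercises `rank_if_exists` (original position to dense rank) and `select_batch` (dense rank back to original positions). A toy illustration of that duality, with hypothetical names standing in for the crate's `OptionalIndex`:

struct ToyOptionalIndex {
    // Sorted positions of the non-null documents.
    non_null_positions: Vec<u32>,
}

impl ToyOptionalIndex {
    // Original position -> dense rank, if the position holds a value.
    fn rank_if_exists(&self, pos: u32) -> Option<u32> {
        self.non_null_positions
            .binary_search(&pos)
            .ok()
            .map(|rank| rank as u32)
    }

    // Dense rank -> original position.
    fn select(&self, rank: u32) -> u32 {
        self.non_null_positions[rank as usize]
    }
}

fn main() {
    let idx = ToyOptionalIndex { non_null_positions: vec![2, 5, 9] };
    assert_eq!(idx.rank_if_exists(5), Some(1));
    assert_eq!(idx.rank_if_exists(3), None);
    assert_eq!(idx.select(2), 9);
}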
@@ -1,15 +1,12 @@
|
||||
#![feature(test)]
|
||||
|
||||
use std::ops::RangeInclusive;
|
||||
use std::sync::Arc;
|
||||
|
||||
use binggan::{InputGroup, black_box};
|
||||
use common::OwnedBytes;
|
||||
use rand::rngs::StdRng;
|
||||
use rand::seq::SliceRandom;
|
||||
use rand::{random, Rng, SeedableRng};
|
||||
use rand::{Rng, SeedableRng, random};
|
||||
use tantivy_columnar::ColumnValues;
|
||||
use test::Bencher;
|
||||
extern crate test;
|
||||
|
||||
// TODO does this make sense for IPv6 ?
|
||||
fn generate_random() -> Vec<u64> {
|
||||
@@ -47,78 +44,77 @@ fn get_data_50percent_item() -> Vec<u128> {
|
||||
}
|
||||
data.push(SINGLE_ITEM);
|
||||
data.shuffle(&mut rng);
|
||||
let data = data.iter().map(|el| *el as u128).collect::<Vec<_>>();
|
||||
data
|
||||
data.iter().map(|el| *el as u128).collect::<Vec<_>>()
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u128_50percent_hit(b: &mut Bencher) {
|
||||
fn main() {
|
||||
let data = get_data_50percent_item();
|
||||
let column = get_u128_column_from_data(&data);
|
||||
let column_range = get_u128_column_from_data(&data);
|
||||
let column_random = get_u128_column_random();
|
||||
|
||||
b.iter(|| {
|
||||
struct Inputs {
|
||||
data: Vec<u128>,
|
||||
column_range: Arc<dyn ColumnValues<u128>>,
|
||||
column_random: Arc<dyn ColumnValues<u128>>,
|
||||
}
|
||||
|
||||
let inputs = Inputs {
|
||||
data,
|
||||
column_range,
|
||||
column_random,
|
||||
};
|
||||
let mut group: InputGroup<Inputs> =
|
||||
InputGroup::new_with_inputs(vec![("u128 benches".to_string(), inputs)]);
|
||||
|
||||
group.register(
|
||||
"intfastfield_getrange_u128_50percent_hit",
|
||||
|inp: &Inputs| {
|
||||
let mut positions = Vec::new();
|
||||
inp.column_range.get_row_ids_for_value_range(
|
||||
*FIFTY_PERCENT_RANGE.start() as u128..=*FIFTY_PERCENT_RANGE.end() as u128,
|
||||
0..inp.data.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
black_box(positions.len());
|
||||
},
|
||||
);
|
||||
|
||||
group.register("intfastfield_getrange_u128_single_hit", |inp: &Inputs| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(
|
||||
*FIFTY_PERCENT_RANGE.start() as u128..=*FIFTY_PERCENT_RANGE.end() as u128,
|
||||
0..data.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
positions
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u128_single_hit(b: &mut Bencher) {
|
||||
let data = get_data_50percent_item();
|
||||
let column = get_u128_column_from_data(&data);
|
||||
|
||||
b.iter(|| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(
|
||||
inp.column_range.get_row_ids_for_value_range(
|
||||
*SINGLE_ITEM_RANGE.start() as u128..=*SINGLE_ITEM_RANGE.end() as u128,
|
||||
0..data.len() as u32,
|
||||
0..inp.data.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
positions
|
||||
black_box(positions.len());
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u128_hit_all(b: &mut Bencher) {
|
||||
let data = get_data_50percent_item();
|
||||
let column = get_u128_column_from_data(&data);
|
||||
|
||||
b.iter(|| {
|
||||
group.register("intfastfield_getrange_u128_hit_all", |inp: &Inputs| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(0..=u128::MAX, 0..data.len() as u32, &mut positions);
|
||||
positions
|
||||
inp.column_range.get_row_ids_for_value_range(
|
||||
0..=u128::MAX,
|
||||
0..inp.data.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
black_box(positions.len());
|
||||
});
|
||||
}
|
||||
// U128 RANGE END
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_scan_all_fflookup_u128(b: &mut Bencher) {
|
||||
let column = get_u128_column_random();
|
||||
|
||||
b.iter(|| {
|
||||
group.register("intfastfield_scan_all_fflookup_u128", |inp: &Inputs| {
|
||||
let mut a = 0u128;
|
||||
for i in 0u64..column.num_vals() as u64 {
|
||||
a += column.get_val(i as u32);
|
||||
for i in 0u64..inp.column_random.num_vals() as u64 {
|
||||
a += inp.column_random.get_val(i as u32);
|
||||
}
|
||||
a
|
||||
black_box(a);
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_jumpy_stride5_u128(b: &mut Bencher) {
|
||||
let column = get_u128_column_random();
|
||||
|
||||
b.iter(|| {
|
||||
let n = column.num_vals();
|
||||
group.register("intfastfield_jumpy_stride5_u128", |inp: &Inputs| {
|
||||
let n = inp.column_random.num_vals();
|
||||
let mut a = 0u128;
|
||||
for i in (0..n / 5).map(|val| val * 5) {
|
||||
a += column.get_val(i);
|
||||
a += inp.column_random.get_val(i);
|
||||
}
|
||||
a
|
||||
black_box(a);
|
||||
});
|
||||
|
||||
group.run();
|
||||
}
|
||||
|
||||
@@ -1,13 +1,10 @@
|
||||
#![feature(test)]
|
||||
extern crate test;
|
||||
|
||||
use std::ops::RangeInclusive;
|
||||
use std::sync::Arc;
|
||||
|
||||
use binggan::{InputGroup, black_box};
|
||||
use rand::prelude::*;
|
||||
use tantivy_columnar::column_values::{serialize_and_load_u64_based_column_values, CodecType};
|
||||
use tantivy_columnar::column_values::{CodecType, serialize_and_load_u64_based_column_values};
|
||||
use tantivy_columnar::*;
|
||||
use test::Bencher;
|
||||
|
||||
// Warning: this generates the same permutation at each call
|
||||
fn generate_permutation() -> Vec<u64> {
|
||||
@@ -27,37 +24,11 @@ pub fn serialize_and_load(column: &[u64], codec_type: CodecType) -> Arc<dyn Colu
|
||||
serialize_and_load_u64_based_column_values(&column, &[codec_type])
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_jumpy_veclookup(b: &mut Bencher) {
|
||||
let permutation = generate_permutation();
|
||||
let n = permutation.len();
|
||||
b.iter(|| {
|
||||
let mut a = 0u64;
|
||||
for _ in 0..n {
|
||||
a = permutation[a as usize];
|
||||
}
|
||||
a
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_jumpy_fflookup_bitpacked(b: &mut Bencher) {
|
||||
let permutation = generate_permutation();
|
||||
let n = permutation.len();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&permutation, CodecType::Bitpacked);
|
||||
b.iter(|| {
|
||||
let mut a = 0u64;
|
||||
for _ in 0..n {
|
||||
a = column.get_val(a as u32);
|
||||
}
|
||||
a
|
||||
});
|
||||
}
|
||||
|
||||
const FIFTY_PERCENT_RANGE: RangeInclusive<u64> = 1..=50;
|
||||
const SINGLE_ITEM: u64 = 90;
|
||||
const SINGLE_ITEM_RANGE: RangeInclusive<u64> = 90..=90;
|
||||
const ONE_PERCENT_ITEM_RANGE: RangeInclusive<u64> = 49..=49;
|
||||
|
||||
fn get_data_50percent_item() -> Vec<u128> {
|
||||
let mut rng = StdRng::from_seed([1u8; 32]);
|
||||
|
||||
@@ -69,135 +40,122 @@ fn get_data_50percent_item() -> Vec<u128> {
|
||||
data.push(SINGLE_ITEM);
|
||||
|
||||
data.shuffle(&mut rng);
|
||||
let data = data.iter().map(|el| *el as u128).collect::<Vec<_>>();
|
||||
data
|
||||
data.iter().map(|el| *el as u128).collect::<Vec<_>>()
|
||||
}
|
||||
|
||||
// U64 RANGE START
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u64_50percent_hit(b: &mut Bencher) {
|
||||
let data = get_data_50percent_item();
|
||||
let data = data.iter().map(|el| *el as u64).collect::<Vec<_>>();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&data, CodecType::Bitpacked);
|
||||
b.iter(|| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(
|
||||
FIFTY_PERCENT_RANGE,
|
||||
0..data.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
positions
|
||||
});
|
||||
}
|
||||
type VecCol = (Vec<u64>, Arc<dyn ColumnValues<u64>>);
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u64_1percent_hit(b: &mut Bencher) {
|
||||
let data = get_data_50percent_item();
|
||||
let data = data.iter().map(|el| *el as u64).collect::<Vec<_>>();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&data, CodecType::Bitpacked);
|
||||
|
||||
b.iter(|| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(
|
||||
ONE_PERCENT_ITEM_RANGE,
|
||||
0..data.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
positions
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u64_single_hit(b: &mut Bencher) {
|
||||
let data = get_data_50percent_item();
|
||||
let data = data.iter().map(|el| *el as u64).collect::<Vec<_>>();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&data, CodecType::Bitpacked);
|
||||
|
||||
b.iter(|| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(SINGLE_ITEM_RANGE, 0..data.len() as u32, &mut positions);
|
||||
positions
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_getrange_u64_hit_all(b: &mut Bencher) {
|
||||
let data = get_data_50percent_item();
|
||||
let data = data.iter().map(|el| *el as u64).collect::<Vec<_>>();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&data, CodecType::Bitpacked);
|
||||
|
||||
b.iter(|| {
|
||||
let mut positions = Vec::new();
|
||||
column.get_row_ids_for_value_range(0..=u64::MAX, 0..data.len() as u32, &mut positions);
|
||||
positions
|
||||
});
|
||||
}
|
||||
// U64 RANGE END
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_stride7_vec(b: &mut Bencher) {
|
||||
fn bench_access() {
|
||||
let permutation = generate_permutation();
|
||||
let n = permutation.len();
|
||||
b.iter(|| {
|
||||
let column_perm: Arc<dyn ColumnValues<u64>> =
|
||||
serialize_and_load(&permutation, CodecType::Bitpacked);
|
||||
|
||||
let permutation_gcd = generate_permutation_gcd();
|
||||
let column_perm_gcd: Arc<dyn ColumnValues<u64>> =
|
||||
serialize_and_load(&permutation_gcd, CodecType::Bitpacked);
|
||||
|
||||
let mut group: InputGroup<VecCol> = InputGroup::new_with_inputs(vec![
|
||||
(
|
||||
"access".to_string(),
|
||||
(permutation.clone(), column_perm.clone()),
|
||||
),
|
||||
(
|
||||
"access_gcd".to_string(),
|
||||
(permutation_gcd.clone(), column_perm_gcd.clone()),
|
||||
),
|
||||
]);
|
||||
|
||||
group.register("stride7_vec", |inp: &VecCol| {
|
||||
let n = inp.0.len();
|
||||
let mut a = 0u64;
|
||||
for i in (0..n / 7).map(|val| val * 7) {
|
||||
a += permutation[i as usize];
|
||||
a += inp.0[i];
|
||||
}
|
||||
a
|
||||
black_box(a);
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_stride7_fflookup(b: &mut Bencher) {
|
||||
let permutation = generate_permutation();
|
||||
let n = permutation.len();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&permutation, CodecType::Bitpacked);
|
||||
b.iter(|| {
|
||||
let mut a = 0;
|
||||
group.register("fullscan_vec", |inp: &VecCol| {
|
||||
let mut a = 0u64;
|
||||
for i in 0..inp.0.len() {
|
||||
a += inp.0[i];
|
||||
}
|
||||
black_box(a);
|
||||
});
|
||||
|
||||
group.register("stride7_column_values", |inp: &VecCol| {
|
||||
let n = inp.1.num_vals() as usize;
|
||||
let mut a = 0u64;
|
||||
for i in (0..n / 7).map(|val| val * 7) {
|
||||
a += column.get_val(i as u32);
|
||||
a += inp.1.get_val(i as u32);
|
||||
}
|
||||
a
|
||||
black_box(a);
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_scan_all_fflookup(b: &mut Bencher) {
|
||||
let permutation = generate_permutation();
|
||||
let n = permutation.len();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&permutation, CodecType::Bitpacked);
|
||||
let column_ref = column.as_ref();
|
||||
b.iter(|| {
|
||||
let mut a = 0u64;
|
||||
for i in 0u32..n as u32 {
|
||||
a += column_ref.get_val(i);
|
||||
}
|
||||
a
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_scan_all_fflookup_gcd(b: &mut Bencher) {
|
||||
let permutation = generate_permutation_gcd();
|
||||
let n = permutation.len();
|
||||
let column: Arc<dyn ColumnValues<u64>> = serialize_and_load(&permutation, CodecType::Bitpacked);
|
||||
b.iter(|| {
|
||||
group.register("fullscan_column_values", |inp: &VecCol| {
|
||||
let mut a = 0u64;
|
||||
let n = inp.1.num_vals() as usize;
|
||||
for i in 0..n {
|
||||
a += column.get_val(i as u32);
|
||||
a += inp.1.get_val(i as u32);
|
||||
}
|
||||
a
|
||||
black_box(a);
|
||||
});
|
||||
|
||||
group.run();
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_intfastfield_scan_all_vec(b: &mut Bencher) {
|
||||
let permutation = generate_permutation();
|
||||
b.iter(|| {
|
||||
let mut a = 0u64;
|
||||
for i in 0..permutation.len() {
|
||||
a += permutation[i as usize] as u64;
|
||||
}
|
||||
a
|
||||
});
|
||||
fn bench_range() {
|
||||
let data_50 = get_data_50percent_item();
|
||||
let data_u64 = data_50.iter().map(|el| *el as u64).collect::<Vec<_>>();
|
||||
let column_data: Arc<dyn ColumnValues<u64>> =
|
||||
serialize_and_load(&data_u64, CodecType::Bitpacked);
|
||||
|
||||
let mut group: InputGroup<Arc<dyn ColumnValues<u64>>> =
|
||||
InputGroup::new_with_inputs(vec![("dist_50pct_item".to_string(), column_data.clone())]);
|
||||
|
||||
group.register(
|
||||
"fastfield_getrange_u64_50percent_hit",
|
||||
|col: &Arc<dyn ColumnValues<u64>>| {
|
||||
let mut positions = Vec::new();
|
||||
col.get_row_ids_for_value_range(FIFTY_PERCENT_RANGE, 0..col.num_vals(), &mut positions);
|
||||
black_box(positions.len());
|
||||
},
|
||||
);
|
||||
|
||||
group.register(
|
||||
"fastfield_getrange_u64_1percent_hit",
|
||||
|col: &Arc<dyn ColumnValues<u64>>| {
|
||||
let mut positions = Vec::new();
|
||||
col.get_row_ids_for_value_range(
|
||||
ONE_PERCENT_ITEM_RANGE,
|
||||
0..col.num_vals(),
|
||||
&mut positions,
|
||||
);
|
||||
black_box(positions.len());
|
||||
},
|
||||
);
|
||||
|
||||
group.register(
|
||||
"fastfield_getrange_u64_single_hit",
|
||||
|col: &Arc<dyn ColumnValues<u64>>| {
|
||||
let mut positions = Vec::new();
|
||||
col.get_row_ids_for_value_range(SINGLE_ITEM_RANGE, 0..col.num_vals(), &mut positions);
|
||||
black_box(positions.len());
|
||||
},
|
||||
);
|
||||
|
||||
group.register(
|
||||
"fastfield_getrange_u64_hit_all",
|
||||
|col: &Arc<dyn ColumnValues<u64>>| {
|
||||
let mut positions = Vec::new();
|
||||
col.get_row_ids_for_value_range(0..=u64::MAX, 0..col.num_vals(), &mut positions);
|
||||
black_box(positions.len());
|
||||
},
|
||||
);
|
||||
|
||||
group.run();
|
||||
}
|
||||
|
||||
fn main() {
|
||||
bench_access();
|
||||
bench_range();
|
||||
}
|
||||
|
||||
18
columnar/columnar-cli-inspect/Cargo.toml
Normal file
@@ -0,0 +1,18 @@
|
||||
[package]
|
||||
name = "tantivy-columnar-inspect"
|
||||
version = "0.1.0"
|
||||
edition = "2021"
|
||||
license = "MIT"
|
||||
|
||||
[dependencies]
|
||||
tantivy = {path="../..", package="tantivy"}
|
||||
columnar = {path="../", package="tantivy-columnar"}
|
||||
common = {path="../../common", package="tantivy-common"}
|
||||
|
||||
[workspace]
|
||||
members = []
|
||||
|
||||
[profile.release]
|
||||
debug = true
|
||||
#debug-assertions = true
|
||||
#overflow-checks = true
|
||||
54
columnar/columnar-cli-inspect/src/main.rs
Normal file
@@ -0,0 +1,54 @@
|
||||
use columnar::ColumnarReader;
|
||||
use common::file_slice::{FileSlice, WrapFile};
|
||||
use std::io;
|
||||
use std::path::Path;
|
||||
use tantivy::directory::footer::Footer;
|
||||
|
||||
fn main() -> io::Result<()> {
|
||||
println!("Opens a columnar file written by tantivy and validates it.");
|
||||
let path = std::env::args().nth(1).unwrap();
|
||||
|
||||
let path = Path::new(&path);
|
||||
println!("Reading {:?}", path);
|
||||
let _reader = open_and_validate_columnar(path.to_str().unwrap())?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
pub fn validate_columnar_reader(reader: &ColumnarReader) {
|
||||
let num_rows = reader.num_rows();
|
||||
println!("num_rows: {}", num_rows);
|
||||
let columns = reader.list_columns().unwrap();
|
||||
println!("num columns: {:?}", columns.len());
|
||||
for (col_name, dynamic_column_handle) in columns {
|
||||
let col = dynamic_column_handle.open().unwrap();
|
||||
match col {
|
||||
columnar::DynamicColumn::Bool(_)
|
||||
| columnar::DynamicColumn::I64(_)
|
||||
| columnar::DynamicColumn::U64(_)
|
||||
| columnar::DynamicColumn::F64(_)
|
||||
| columnar::DynamicColumn::IpAddr(_)
|
||||
| columnar::DynamicColumn::DateTime(_)
|
||||
| columnar::DynamicColumn::Bytes(_) => {}
|
||||
columnar::DynamicColumn::Str(str_column) => {
|
||||
let num_vals = str_column.ords().values.num_vals();
|
||||
let num_terms_dict = str_column.num_terms() as u64;
|
||||
let max_ord = str_column.ords().values.iter().max().unwrap_or_default();
|
||||
println!("{col_name:35} num_vals {num_vals:10} \t num_terms_dict {num_terms_dict:8} max_ord: {max_ord:8}",);
|
||||
for ord in str_column.ords().values.iter() {
|
||||
assert!(ord < num_terms_dict);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Opens a columnar file that was written by tantivy and validates it.
|
||||
pub fn open_and_validate_columnar(path: &str) -> io::Result<ColumnarReader> {
|
||||
let wrap_file = WrapFile::new(std::fs::File::open(path)?)?;
|
||||
let slice = FileSlice::new(std::sync::Arc::new(wrap_file));
|
||||
let (_footer, slice) = Footer::extract_footer(slice.clone()).unwrap();
|
||||
let reader = ColumnarReader::open(slice).unwrap();
|
||||
validate_columnar_reader(&reader);
|
||||
Ok(reader)
|
||||
}
|
||||
@@ -66,7 +66,7 @@ impl<T: PartialOrd + Copy + std::fmt::Debug + Send + Sync + 'static + Default>
|
||||
&'a self,
|
||||
docs: &'a [u32],
|
||||
accessor: &Column<T>,
|
||||
) -> impl Iterator<Item = (DocId, T)> + '_ {
|
||||
) -> impl Iterator<Item = (DocId, T)> + 'a + use<'a, T> {
|
||||
if accessor.index.get_cardinality().is_full() {
|
||||
docs.iter().cloned().zip(self.val_cache.iter().cloned())
|
||||
} else {
|
||||
@@ -139,7 +139,7 @@ mod tests {
|
||||
missing_docs.push(missing_doc);
|
||||
});
|
||||
|
||||
assert_eq!(missing_docs, vec![]);
|
||||
assert_eq!(missing_docs, Vec::<u32>::new());
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
||||
@@ -4,8 +4,8 @@ use std::{fmt, io};
|
||||
|
||||
use sstable::{Dictionary, VoidSSTable};
|
||||
|
||||
use crate::column::Column;
|
||||
use crate::RowId;
|
||||
use crate::column::Column;
|
||||
|
||||
/// Dictionary encoded column.
|
||||
///
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
mod dictionary_encoded;
|
||||
mod serialize;
|
||||
|
||||
use std::cell::RefCell;
|
||||
use std::fmt::{self, Debug};
|
||||
use std::io::Write;
|
||||
use std::ops::{Range, RangeInclusive};
|
||||
@@ -9,15 +10,21 @@ use std::sync::Arc;
|
||||
use common::BinarySerializable;
|
||||
pub use dictionary_encoded::{BytesColumn, StrColumn};
|
||||
pub use serialize::{
|
||||
open_column_bytes, open_column_str, open_column_u128, open_column_u128_as_compact_u64,
|
||||
open_column_u64, serialize_column_mappable_to_u128, serialize_column_mappable_to_u64,
|
||||
open_column_bytes, open_column_str, open_column_u64, open_column_u128,
|
||||
open_column_u128_as_compact_u64, serialize_column_mappable_to_u64,
|
||||
serialize_column_mappable_to_u128,
|
||||
};
|
||||
|
||||
use crate::column_index::{ColumnIndex, Set};
|
||||
use crate::column_values::monotonic_mapping::StrictlyMonotonicMappingToInternal;
|
||||
use crate::column_values::{monotonic_map_column, ColumnValues};
|
||||
use crate::column_values::{ColumnValues, monotonic_map_column};
|
||||
use crate::{Cardinality, DocId, EmptyColumnValues, MonotonicallyMappableToU64, RowId};
|
||||
|
||||
thread_local! {
|
||||
static ROWS: RefCell<Vec<RowId>> = const { RefCell::new(Vec::new()) };
|
||||
static DOCS: RefCell<Vec<DocId>> = const { RefCell::new(Vec::new()) };
|
||||
}
|
||||
|
||||
#[derive(Clone)]
|
||||
pub struct Column<T = u64> {
|
||||
pub index: ColumnIndex,
|
||||
@@ -88,32 +95,7 @@ impl<T: PartialOrd + Copy + Debug + Send + Sync + 'static> Column<T> {
|
||||
self.values_for_doc(row_id).next()
|
||||
}
|
||||
|
||||
/// Load the first value for each docid in the provided slice.
|
||||
#[inline]
|
||||
pub fn first_vals(&self, docids: &[DocId], output: &mut [Option<T>]) {
|
||||
match &self.index {
|
||||
ColumnIndex::Empty { .. } => {}
|
||||
ColumnIndex::Full => self.values.get_vals_opt(docids, output),
|
||||
ColumnIndex::Optional(optional_index) => {
|
||||
for (i, docid) in docids.iter().enumerate() {
|
||||
output[i] = optional_index
|
||||
.rank_if_exists(*docid)
|
||||
.map(|rowid| self.values.get_val(rowid));
|
||||
}
|
||||
}
|
||||
ColumnIndex::Multivalued(multivalued_index) => {
|
||||
for (i, docid) in docids.iter().enumerate() {
|
||||
let range = multivalued_index.range(*docid);
|
||||
let is_empty = range.start == range.end;
|
||||
if !is_empty {
|
||||
output[i] = Some(self.values.get_val(range.start));
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Translates a block of docis to row_ids.
|
||||
/// Translates a block of docids to row_ids.
|
||||
///
|
||||
/// returns the row_ids and the matching docids on the same index
|
||||
/// e.g.
|
||||
@@ -130,6 +112,8 @@ impl<T: PartialOrd + Copy + Debug + Send + Sync + 'static> Column<T> {
|
||||
self.index.docids_to_rowids(doc_ids, doc_ids_out, row_ids)
|
||||
}
|
||||
|
||||
/// Get an iterator over the values for the provided docid.
|
||||
#[inline]
|
||||
pub fn values_for_doc(&self, doc_id: DocId) -> impl Iterator<Item = T> + '_ {
|
||||
self.index
|
||||
.value_row_ids(doc_id)
|
||||
@@ -140,7 +124,7 @@ impl<T: PartialOrd + Copy + Debug + Send + Sync + 'static> Column<T> {
|
||||
#[inline]
|
||||
pub fn get_docids_for_value_range(
|
||||
&self,
|
||||
value_range: RangeInclusive<T>,
|
||||
value_range: ValueRange<T>,
|
||||
selected_docid_range: Range<u32>,
|
||||
doc_ids: &mut Vec<u32>,
|
||||
) {
|
||||
@@ -157,15 +141,6 @@ impl<T: PartialOrd + Copy + Debug + Send + Sync + 'static> Column<T> {
|
||||
.select_batch_in_place(selected_docid_range.start, doc_ids);
|
||||
}
|
||||
|
||||
/// Fills the output vector with the (possibly multiple values that are associated_with
|
||||
/// `row_id`.
|
||||
///
|
||||
/// This method clears the `output` vector.
|
||||
pub fn fill_vals(&self, row_id: RowId, output: &mut Vec<T>) {
|
||||
output.clear();
|
||||
output.extend(self.values_for_doc(row_id));
|
||||
}
|
||||
|
||||
pub fn first_or_default_col(self, default_value: T) -> Arc<dyn ColumnValues<T>> {
|
||||
Arc::new(FirstValueWithDefault {
|
||||
column: self,
|
||||
@@ -174,6 +149,181 @@ impl<T: PartialOrd + Copy + Debug + Send + Sync + 'static> Column<T> {
|
||||
}
|
||||
}
|
||||
|
||||
// Separate impl block for methods requiring `Default` for `T`.
|
||||
impl<T: PartialOrd + Copy + Debug + Send + Sync + 'static + Default> Column<T> {
|
||||
/// Load the first value for each docid in the provided slice.
|
||||
///
|
||||
/// The `docids` vector is mutated: documents that do not match the `value_range` are removed.
|
||||
/// The `values` vector is populated with the values of the remaining documents.
|
||||
#[inline]
|
||||
pub fn first_vals_in_value_range(
|
||||
&self,
|
||||
input_docs: &[DocId],
|
||||
output: &mut Vec<crate::ComparableDoc<Option<T>, DocId>>,
|
||||
value_range: ValueRange<T>,
|
||||
) {
|
||||
match (&self.index, value_range) {
|
||||
(ColumnIndex::Empty { .. }, value_range) => {
|
||||
let nulls_match = match &value_range {
|
||||
ValueRange::All => true,
|
||||
ValueRange::Inclusive(_) => false,
|
||||
ValueRange::GreaterThan(_, nulls_match) => *nulls_match,
|
||||
ValueRange::GreaterThanOrEqual(_, nulls_match) => *nulls_match,
|
||||
ValueRange::LessThan(_, nulls_match) => *nulls_match,
|
||||
ValueRange::LessThanOrEqual(_, nulls_match) => *nulls_match,
|
||||
};
|
||||
if nulls_match {
|
||||
for &doc in input_docs {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: None,
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
(ColumnIndex::Full, value_range) => {
|
||||
self.values
|
||||
.get_vals_in_value_range(input_docs, input_docs, output, value_range);
|
||||
}
|
||||
(ColumnIndex::Optional(optional_index), value_range) => {
|
||||
let nulls_match = match &value_range {
|
||||
ValueRange::All => true,
|
||||
ValueRange::Inclusive(_) => false,
|
||||
ValueRange::GreaterThan(_, nulls_match) => *nulls_match,
|
||||
ValueRange::GreaterThanOrEqual(_, nulls_match) => *nulls_match,
|
||||
ValueRange::LessThan(_, nulls_match) => *nulls_match,
|
||||
ValueRange::LessThanOrEqual(_, nulls_match) => *nulls_match,
|
||||
};
|
||||
|
||||
let fallback_needed = ROWS.with(|rows_cell| {
|
||||
DOCS.with(|docs_cell| {
|
||||
let mut rows = rows_cell.borrow_mut();
|
||||
let mut docs = docs_cell.borrow_mut();
|
||||
rows.clear();
|
||||
docs.clear();
|
||||
|
||||
let mut has_nulls = false;
|
||||
|
||||
for &doc_id in input_docs {
|
||||
if let Some(row_id) = optional_index.rank_if_exists(doc_id) {
|
||||
rows.push(row_id);
|
||||
docs.push(doc_id);
|
||||
} else {
|
||||
has_nulls = true;
|
||||
if nulls_match {
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if !has_nulls || !nulls_match {
|
||||
self.values.get_vals_in_value_range(
|
||||
&rows,
|
||||
&docs,
|
||||
output,
|
||||
value_range.clone(),
|
||||
);
|
||||
return false;
|
||||
}
|
||||
true
|
||||
})
|
||||
});
|
||||
|
||||
if fallback_needed {
|
||||
for &doc_id in input_docs {
|
||||
if let Some(row_id) = optional_index.rank_if_exists(doc_id) {
|
||||
let val = self.values.get_val(row_id);
|
||||
let value_matches = match &value_range {
|
||||
ValueRange::All => true,
|
||||
ValueRange::Inclusive(r) => r.contains(&val),
|
||||
ValueRange::GreaterThan(t, _) => val > *t,
|
||||
ValueRange::GreaterThanOrEqual(t, _) => val >= *t,
|
||||
ValueRange::LessThan(t, _) => val < *t,
|
||||
ValueRange::LessThanOrEqual(t, _) => val <= *t,
|
||||
};
|
||||
|
||||
if value_matches {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc_id,
|
||||
sort_key: Some(val),
|
||||
});
|
||||
}
|
||||
} else if nulls_match {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc_id,
|
||||
sort_key: None,
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
(ColumnIndex::Multivalued(multivalued_index), value_range) => {
|
||||
let nulls_match = match &value_range {
|
||||
ValueRange::All => true,
|
||||
ValueRange::Inclusive(_) => false,
|
||||
ValueRange::GreaterThan(_, nulls_match) => *nulls_match,
|
||||
ValueRange::GreaterThanOrEqual(_, nulls_match) => *nulls_match,
|
||||
ValueRange::LessThan(_, nulls_match) => *nulls_match,
|
||||
ValueRange::LessThanOrEqual(_, nulls_match) => *nulls_match,
|
||||
};
|
||||
for i in 0..input_docs.len() {
|
||||
let docid = input_docs[i];
|
||||
let row_range = multivalued_index.range(docid);
|
||||
let is_empty = row_range.start == row_range.end;
|
||||
if !is_empty {
|
||||
let val = self.values.get_val(row_range.start);
|
||||
let matches = match &value_range {
|
||||
ValueRange::All => true,
|
||||
ValueRange::Inclusive(r) => r.contains(&val),
|
||||
ValueRange::GreaterThan(t, _) => val > *t,
|
||||
ValueRange::GreaterThanOrEqual(t, _) => val >= *t,
|
||||
ValueRange::LessThan(t, _) => val < *t,
|
||||
ValueRange::LessThanOrEqual(t, _) => val <= *t,
|
||||
};
|
||||
if matches {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: docid,
|
||||
sort_key: Some(val),
|
||||
});
|
||||
}
|
||||
} else if nulls_match {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: docid,
|
||||
sort_key: None,
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// A range of values.
///
/// This type is intended to be used in batch APIs, where the cost of unpacking the enum
/// is outweighed by the time spent processing a batch.
///
/// Implementers should pattern match on the variants to use optimized loops for each case.
#[derive(Clone, Debug)]
pub enum ValueRange<T> {
    /// A range that includes both start and end.
    Inclusive(RangeInclusive<T>),
    /// A range that matches all values.
    All,
    /// A range that matches all values greater than the threshold.
    /// The boolean flag indicates if null values should be included.
    GreaterThan(T, bool),
    /// A range that matches all values greater than or equal to the threshold.
    /// The boolean flag indicates if null values should be included.
    GreaterThanOrEqual(T, bool),
    /// A range that matches all values less than the threshold.
    /// The boolean flag indicates if null values should be included.
    LessThan(T, bool),
    /// A range that matches all values less than or equal to the threshold.
    /// The boolean flag indicates if null values should be included.
    LessThanOrEqual(T, bool),
}
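Callers evaluate a value against whichever variant they were handed; the boolean flag only matters for documents with no value at all. A self-contained sketch of that evaluation over a local copy of the enum (for illustration only, not the crate's type):

use std::ops::RangeInclusive;

#[allow(dead_code)]
enum ValueRange<T> {
    Inclusive(RangeInclusive<T>),
    All,
    GreaterThan(T, bool),
    GreaterThanOrEqual(T, bool),
    LessThan(T, bool),
    LessThanOrEqual(T, bool),
}

fn matches<T: PartialOrd>(range: &ValueRange<T>, val: &T) -> bool {
    match range {
        ValueRange::All => true,
        ValueRange::Inclusive(r) => r.contains(val),
        ValueRange::GreaterThan(t, _) => val > t,
        ValueRange::GreaterThanOrEqual(t, _) => val >= t,
        ValueRange::LessThan(t, _) => val < t,
        ValueRange::LessThanOrEqual(t, _) => val <= t,
    }
}

fn main() {
    assert!(matches(&ValueRange::GreaterThan(10u64, false), &11));
    assert!(!matches(&ValueRange::Inclusive(1u64..=5), &7));
}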
|
||||
impl BinarySerializable for Cardinality {
|
||||
fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> std::io::Result<()> {
|
||||
self.to_code().serialize(writer)
|
||||
|
||||
@@ -2,14 +2,14 @@ use std::io;
|
||||
use std::io::Write;
|
||||
use std::sync::Arc;
|
||||
|
||||
use common::OwnedBytes;
|
||||
use common::file_slice::FileSlice;
|
||||
use sstable::Dictionary;
|
||||
|
||||
use crate::column::{BytesColumn, Column};
|
||||
use crate::column_index::{serialize_column_index, SerializableColumnIndex};
|
||||
use crate::column_index::{SerializableColumnIndex, serialize_column_index};
|
||||
use crate::column_values::{
|
||||
CodecType, MonotonicallyMappableToU64, MonotonicallyMappableToU128,
|
||||
load_u64_based_column_values, serialize_column_values_u128, serialize_u64_based_column_values,
|
||||
CodecType, MonotonicallyMappableToU128, MonotonicallyMappableToU64,
|
||||
};
|
||||
use crate::iterable::Iterable;
|
||||
use crate::{StrColumn, Version};
|
||||
@@ -41,12 +41,13 @@ pub fn serialize_column_mappable_to_u64<T: MonotonicallyMappableToU64>(
}

pub fn open_column_u64<T: MonotonicallyMappableToU64>(
    bytes: OwnedBytes,
    file_slice: FileSlice,
    format_version: Version,
) -> io::Result<Column<T>> {
    let (body, column_index_num_bytes_payload) = bytes.rsplit(4);
    let (body, column_index_num_bytes_payload) = file_slice.split_from_end(4);
    let column_index_num_bytes = u32::from_le_bytes(
        column_index_num_bytes_payload
            .read_bytes()?
            .as_slice()
            .try_into()
            .unwrap(),
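The column footer pattern above stores, in the last 4 little-endian bytes, the size of the serialized column index. A hedged sketch of that split, assuming the index precedes the values in the body (illustrative helper, not the crate's API):

fn split_index_and_values(column_bytes: &[u8]) -> (&[u8], &[u8]) {
    // Peel off the 4-byte footer and decode the index length it records.
    let (body, len_bytes) = column_bytes.split_at(column_bytes.len() - 4);
    let index_len = u32::from_le_bytes(len_bytes.try_into().unwrap()) as usize;
    // The body starts with the column index, followed by the packed values.
    body.split_at(index_len)
}

fn main() {
    // 3-byte index, 2-byte values, then the footer recording "3".
    let mut column = vec![1u8, 2, 3, 10, 20];
    column.extend_from_slice(&3u32.to_le_bytes());
    let (index, values) = split_index_and_values(&column);
    assert_eq!(index, &[1, 2, 3]);
    assert_eq!(values, &[10, 20]);
}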
@@ -61,12 +62,13 @@ pub fn open_column_u64<T: MonotonicallyMappableToU64>(
|
||||
}
|
||||
|
||||
pub fn open_column_u128<T: MonotonicallyMappableToU128>(
|
||||
bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
format_version: Version,
|
||||
) -> io::Result<Column<T>> {
|
||||
let (body, column_index_num_bytes_payload) = bytes.rsplit(4);
|
||||
let (body, column_index_num_bytes_payload) = file_slice.split_from_end(4);
|
||||
let column_index_num_bytes = u32::from_le_bytes(
|
||||
column_index_num_bytes_payload
|
||||
.read_bytes()?
|
||||
.as_slice()
|
||||
.try_into()
|
||||
.unwrap(),
|
||||
@@ -84,12 +86,13 @@ pub fn open_column_u128<T: MonotonicallyMappableToU128>(
|
||||
///
|
||||
/// See [`open_u128_as_compact_u64`] for more details.
|
||||
pub fn open_column_u128_as_compact_u64(
|
||||
bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
format_version: Version,
|
||||
) -> io::Result<Column<u64>> {
|
||||
let (body, column_index_num_bytes_payload) = bytes.rsplit(4);
|
||||
let (body, column_index_num_bytes_payload) = file_slice.split_from_end(4);
|
||||
let column_index_num_bytes = u32::from_le_bytes(
|
||||
column_index_num_bytes_payload
|
||||
.read_bytes()?
|
||||
.as_slice()
|
||||
.try_into()
|
||||
.unwrap(),
|
||||
@@ -103,11 +106,21 @@ pub fn open_column_u128_as_compact_u64(
|
||||
})
|
||||
}
|
||||
|
||||
pub fn open_column_bytes(data: OwnedBytes, format_version: Version) -> io::Result<BytesColumn> {
|
||||
let (body, dictionary_len_bytes) = data.rsplit(4);
|
||||
let dictionary_len = u32::from_le_bytes(dictionary_len_bytes.as_slice().try_into().unwrap());
|
||||
pub fn open_column_bytes(
|
||||
file_slice: FileSlice,
|
||||
format_version: Version,
|
||||
) -> io::Result<BytesColumn> {
|
||||
let (body, dictionary_len_bytes) = file_slice.split_from_end(4);
|
||||
let dictionary_len = u32::from_le_bytes(
|
||||
dictionary_len_bytes
|
||||
.read_bytes()?
|
||||
.as_slice()
|
||||
.try_into()
|
||||
.unwrap(),
|
||||
);
|
||||
let (dictionary_bytes, column_bytes) = body.split(dictionary_len as usize);
|
||||
let dictionary = Arc::new(Dictionary::from_bytes(dictionary_bytes)?);
|
||||
|
||||
let dictionary = Arc::new(Dictionary::open(dictionary_bytes)?);
|
||||
let term_ord_column = crate::column::open_column_u64::<u64>(column_bytes, format_version)?;
|
||||
Ok(BytesColumn {
|
||||
dictionary,
|
||||
@@ -115,7 +128,7 @@ pub fn open_column_bytes(data: OwnedBytes, format_version: Version) -> io::Resul
|
||||
})
|
||||
}
|
||||
|
||||
pub fn open_column_str(data: OwnedBytes, format_version: Version) -> io::Result<StrColumn> {
|
||||
let bytes_column = open_column_bytes(data, format_version)?;
|
||||
pub fn open_column_str(file_slice: FileSlice, format_version: Version) -> io::Result<StrColumn> {
|
||||
let bytes_column = open_column_bytes(file_slice, format_version)?;
|
||||
Ok(StrColumn::wrap(bytes_column))
|
||||
}
|
||||
|
||||
@@ -95,13 +95,13 @@ pub fn merge_column_index<'a>(
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use common::OwnedBytes;
|
||||
use common::file_slice::FileSlice;
|
||||
|
||||
use crate::column_index::merge::detect_cardinality;
|
||||
use crate::column_index::multivalued_index::{
|
||||
open_multivalued_index, serialize_multivalued_index, MultiValueIndex,
|
||||
MultiValueIndex, open_multivalued_index, serialize_multivalued_index,
|
||||
};
|
||||
use crate::column_index::{merge_column_index, OptionalIndex, SerializableColumnIndex};
|
||||
use crate::column_index::{OptionalIndex, SerializableColumnIndex, merge_column_index};
|
||||
use crate::{
|
||||
Cardinality, ColumnIndex, MergeRowOrder, RowAddr, RowId, ShuffleMergeOrder, StackMergeOrder,
|
||||
};
|
||||
@@ -178,7 +178,7 @@ mod tests {
|
||||
let mut output = Vec::new();
|
||||
serialize_multivalued_index(&start_index_iterable, &mut output).unwrap();
|
||||
let multivalue =
|
||||
open_multivalued_index(OwnedBytes::new(output), crate::Version::V2).unwrap();
|
||||
open_multivalued_index(FileSlice::from(output), crate::Version::V2).unwrap();
|
||||
let start_indexes: Vec<RowId> = multivalue.get_start_index_column().iter().collect();
|
||||
assert_eq!(&start_indexes, &[0, 3, 5]);
|
||||
}
|
||||
@@ -216,7 +216,7 @@ mod tests {
|
||||
let mut output = Vec::new();
|
||||
serialize_multivalued_index(&start_index_iterable, &mut output).unwrap();
|
||||
let multivalue =
|
||||
open_multivalued_index(OwnedBytes::new(output), crate::Version::V2).unwrap();
|
||||
open_multivalued_index(FileSlice::from(output), crate::Version::V2).unwrap();
|
||||
let start_indexes: Vec<RowId> = multivalue.get_start_index_column().iter().collect();
|
||||
assert_eq!(&start_indexes, &[0, 3, 5, 6]);
|
||||
}
|
||||
|
||||
@@ -58,7 +58,7 @@ struct ShuffledIndex<'a> {
|
||||
merge_order: &'a ShuffleMergeOrder,
|
||||
}
|
||||
|
||||
impl<'a> Iterable<u32> for ShuffledIndex<'a> {
|
||||
impl Iterable<u32> for ShuffledIndex<'_> {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = u32> + '_> {
|
||||
Box::new(
|
||||
self.merge_order
|
||||
@@ -127,7 +127,7 @@ fn integrate_num_vals(num_vals: impl Iterator<Item = u32>) -> impl Iterator<Item
|
||||
)
|
||||
}
|
||||
|
||||
impl<'a> Iterable<u32> for ShuffledMultivaluedIndex<'a> {
|
||||
impl Iterable<u32> for ShuffledMultivaluedIndex<'_> {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = u32> + '_> {
|
||||
let num_vals_per_row = iter_num_values(self.column_indexes, self.merge_order);
|
||||
Box::new(integrate_num_vals(num_vals_per_row))
|
||||
@@ -137,8 +137,8 @@ impl<'a> Iterable<u32> for ShuffledMultivaluedIndex<'a> {
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::column_index::OptionalIndex;
|
||||
use crate::RowAddr;
|
||||
use crate::column_index::OptionalIndex;
|
||||
|
||||
#[test]
|
||||
fn test_integrate_num_vals_empty() {
|
||||
|
||||
@@ -1,8 +1,8 @@
|
||||
use std::ops::Range;
|
||||
|
||||
use crate::column_index::SerializableColumnIndex;
|
||||
use crate::column_index::multivalued_index::{MultiValueIndex, SerializableMultivalueIndex};
|
||||
use crate::column_index::serialize::SerializableOptionalIndex;
|
||||
use crate::column_index::SerializableColumnIndex;
|
||||
use crate::iterable::Iterable;
|
||||
use crate::{Cardinality, ColumnIndex, RowId, StackMergeOrder};
|
||||
|
||||
@@ -56,7 +56,7 @@ fn get_doc_ids_with_values<'a>(
|
||||
ColumnIndex::Full => Box::new(doc_range),
|
||||
ColumnIndex::Optional(optional_index) => Box::new(
|
||||
optional_index
|
||||
.iter_rows()
|
||||
.iter_non_null_docs()
|
||||
.map(move |row| row + doc_range.start),
|
||||
),
|
||||
ColumnIndex::Multivalued(multivalued_index) => match multivalued_index {
|
||||
@@ -73,7 +73,7 @@ fn get_doc_ids_with_values<'a>(
|
||||
MultiValueIndex::MultiValueIndexV2(multivalued_index) => Box::new(
|
||||
multivalued_index
|
||||
.optional_index
|
||||
.iter_rows()
|
||||
.iter_non_null_docs()
|
||||
.map(move |row| row + doc_range.start),
|
||||
),
|
||||
},
|
||||
@@ -105,10 +105,11 @@ fn get_num_values_iterator<'a>(
|
||||
) -> Box<dyn Iterator<Item = u32> + 'a> {
|
||||
match column_index {
|
||||
ColumnIndex::Empty { .. } => Box::new(std::iter::empty()),
|
||||
ColumnIndex::Full => Box::new(std::iter::repeat(1u32).take(num_docs as usize)),
|
||||
ColumnIndex::Optional(optional_index) => {
|
||||
Box::new(std::iter::repeat(1u32).take(optional_index.num_non_nulls() as usize))
|
||||
}
|
||||
ColumnIndex::Full => Box::new(std::iter::repeat_n(1u32, num_docs as usize)),
|
||||
ColumnIndex::Optional(optional_index) => Box::new(std::iter::repeat_n(
|
||||
1u32,
|
||||
optional_index.num_non_nulls() as usize,
|
||||
)),
|
||||
ColumnIndex::Multivalued(multivalued_index) => Box::new(
|
||||
multivalued_index
|
||||
.get_start_index_column()
|
||||
@@ -123,7 +124,7 @@ fn get_num_values_iterator<'a>(
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Iterable<u32> for StackedStartOffsets<'a> {
|
||||
impl Iterable<u32> for StackedStartOffsets<'_> {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = u32> + '_> {
|
||||
let num_values_it = (0..self.column_indexes.len()).flat_map(|columnar_id| {
|
||||
let num_docs = self.stack_merge_order.columnar_range(columnar_id).len() as u32;
|
||||
@@ -177,7 +178,7 @@ impl<'a> Iterable<RowId> for StackedOptionalIndex<'a> {
|
||||
ColumnIndex::Full => Box::new(columnar_row_range),
|
||||
ColumnIndex::Optional(optional_index) => Box::new(
|
||||
optional_index
|
||||
.iter_rows()
|
||||
.iter_non_null_docs()
|
||||
.map(move |row_id: RowId| columnar_row_range.start + row_id),
|
||||
),
|
||||
ColumnIndex::Multivalued(_) => {
|
||||
|
||||
@@ -14,7 +14,7 @@ pub use merge::merge_column_index;
|
||||
pub(crate) use multivalued_index::SerializableMultivalueIndex;
|
||||
pub use optional_index::{OptionalIndex, Set};
|
||||
pub use serialize::{
|
||||
open_column_index, serialize_column_index, SerializableColumnIndex, SerializableOptionalIndex,
|
||||
SerializableColumnIndex, SerializableOptionalIndex, open_column_index, serialize_column_index,
|
||||
};
|
||||
|
||||
use crate::column_index::multivalued_index::MultiValueIndex;
|
||||
|
||||
@@ -3,12 +3,13 @@ use std::io::Write;
|
||||
use std::ops::Range;
|
||||
use std::sync::Arc;
|
||||
|
||||
use common::{CountingWriter, OwnedBytes};
|
||||
use common::CountingWriter;
|
||||
use common::file_slice::FileSlice;
|
||||
|
||||
use super::optional_index::{open_optional_index, serialize_optional_index};
|
||||
use super::{OptionalIndex, SerializableOptionalIndex, Set};
|
||||
use crate::column_values::{
|
||||
load_u64_based_column_values, serialize_u64_based_column_values, CodecType, ColumnValues,
|
||||
CodecType, ColumnValues, load_u64_based_column_values, serialize_u64_based_column_values,
|
||||
};
|
||||
use crate::iterable::Iterable;
|
||||
use crate::{DocId, RowId, Version};
|
||||
@@ -44,21 +45,26 @@ pub fn serialize_multivalued_index(
|
||||
}
|
||||
|
||||
pub fn open_multivalued_index(
|
||||
bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
format_version: Version,
|
||||
) -> io::Result<MultiValueIndex> {
|
||||
match format_version {
|
||||
Version::V1 => {
|
||||
let start_index_column: Arc<dyn ColumnValues<RowId>> =
|
||||
load_u64_based_column_values(bytes)?;
|
||||
load_u64_based_column_values(file_slice)?;
|
||||
Ok(MultiValueIndex::MultiValueIndexV1(MultiValueIndexV1 {
|
||||
start_index_column,
|
||||
}))
|
||||
}
|
||||
Version::V2 => {
|
||||
let (body_bytes, optional_index_len) = bytes.rsplit(4);
|
||||
let optional_index_len =
|
||||
u32::from_le_bytes(optional_index_len.as_slice().try_into().unwrap());
|
||||
let (body_bytes, optional_index_len) = file_slice.split_from_end(4);
|
||||
let optional_index_len = u32::from_le_bytes(
|
||||
optional_index_len
|
||||
.read_bytes()?
|
||||
.as_slice()
|
||||
.try_into()
|
||||
.unwrap(),
|
||||
);
|
||||
let (optional_index_bytes, start_index_bytes) =
|
||||
body_bytes.split(optional_index_len as usize);
|
||||
let optional_index = open_optional_index(optional_index_bytes)?;
|
||||
@@ -185,8 +191,8 @@ impl MultiValueIndex {
|
||||
};
|
||||
let mut buffer = Vec::new();
|
||||
serialize_multivalued_index(&serializable_multivalued_index, &mut buffer).unwrap();
|
||||
let bytes = OwnedBytes::new(buffer);
|
||||
open_multivalued_index(bytes, Version::V2).unwrap()
|
||||
let file_slice = FileSlice::from(buffer);
|
||||
open_multivalued_index(file_slice, Version::V2).unwrap()
|
||||
}
|
||||
|
||||
pub fn get_start_index_column(&self) -> &Arc<dyn crate::ColumnValues<RowId>> {
|
||||
@@ -215,6 +221,32 @@ impl MultiValueIndex {
|
||||
}
|
||||
}
|
||||
|
||||
/// Returns an iterator over document ids that have at least one value.
|
||||
pub fn iter_non_null_docs(&self) -> Box<dyn Iterator<Item = DocId> + '_> {
|
||||
match self {
|
||||
MultiValueIndex::MultiValueIndexV1(idx) => {
|
||||
let mut doc: DocId = 0u32;
|
||||
let num_docs = idx.num_docs();
|
||||
Box::new(std::iter::from_fn(move || {
|
||||
// This is not the most efficient way to do this, but it's legacy code.
|
||||
while doc < num_docs {
|
||||
let cur = doc;
|
||||
doc += 1;
|
||||
let start = idx.start_index_column.get_val(cur);
|
||||
let end = idx.start_index_column.get_val(cur + 1);
|
||||
if end > start {
|
||||
return Some(cur);
|
||||
}
|
||||
}
|
||||
None
|
||||
}))
|
||||
}
|
||||
MultiValueIndex::MultiValueIndexV2(idx) => {
|
||||
Box::new(idx.optional_index.iter_non_null_docs())
|
||||
}
|
||||
}
|
||||
}
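    // Hypothetical usage sketch (not part of this diff): counting the documents that carry at
    // least one value, independently of which index version backs the column:
    //     let docs_with_values = multi_value_index.iter_non_null_docs().count();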
|
||||
|
||||
/// Converts a list of ranks (row ids of values) in a 1:n index to the corresponding list of
|
||||
/// docids. Positions are converted in place to docids.
|
||||
///
|
||||
@@ -307,7 +339,7 @@ mod tests {
|
||||
use std::ops::Range;
|
||||
|
||||
use super::MultiValueIndex;
|
||||
use crate::{ColumnarReader, DynamicColumn};
|
||||
use crate::{ColumnarReader, DynamicColumn, ValueRange};
|
||||
|
||||
fn index_to_pos_helper(
|
||||
index: &MultiValueIndex,
|
||||
@@ -387,7 +419,7 @@ mod tests {
|
||||
assert_eq!(row_id_range, 0..4);
|
||||
|
||||
let check = |range, expected| {
|
||||
let full_range = 0..=u64::MAX;
|
||||
let full_range = ValueRange::All;
|
||||
let mut docids = Vec::new();
|
||||
column.get_docids_for_value_range(full_range, range, &mut docids);
|
||||
assert_eq!(docids, expected);
|
||||
|
||||
@@ -1,17 +1,18 @@
|
||||
use std::io::{self, Write};
|
||||
use std::io;
|
||||
use std::sync::Arc;
|
||||
|
||||
mod set;
|
||||
mod set_block;
|
||||
|
||||
use common::file_slice::FileSlice;
|
||||
use common::{BinarySerializable, OwnedBytes, VInt};
|
||||
pub use set::{SelectCursor, Set, SetCodec};
|
||||
use set_block::{
|
||||
DenseBlock, DenseBlockCodec, SparseBlock, SparseBlockCodec, DENSE_BLOCK_NUM_BYTES,
|
||||
DENSE_BLOCK_NUM_BYTES, DenseBlock, DenseBlockCodec, SparseBlock, SparseBlockCodec,
|
||||
};
|
||||
|
||||
use crate::iterable::Iterable;
|
||||
use crate::{DocId, InvalidData, RowId};
|
||||
use crate::{DocId, RowId};
|
||||
|
||||
/// The threshold for the number of elements after which we switch to dense block encoding.
|
||||
///
|
||||
@@ -80,23 +81,23 @@ impl BlockVariant {
|
||||
/// index is the block index. For each block `byte_start` and `offset` is computed.
|
||||
#[derive(Clone)]
|
||||
pub struct OptionalIndex {
|
||||
num_rows: RowId,
|
||||
num_non_null_rows: RowId,
|
||||
num_docs: RowId,
|
||||
num_non_null_docs: RowId,
|
||||
block_data: OwnedBytes,
|
||||
block_metas: Arc<[BlockMeta]>,
|
||||
}
|
||||
|
||||
impl<'a> Iterable<u32> for &'a OptionalIndex {
|
||||
impl Iterable<u32> for &OptionalIndex {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = u32> + '_> {
|
||||
Box::new(self.iter_rows())
|
||||
Box::new(self.iter_non_null_docs())
|
||||
}
|
||||
}
|
||||
|
||||
impl std::fmt::Debug for OptionalIndex {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
|
||||
f.debug_struct("OptionalIndex")
|
||||
.field("num_rows", &self.num_rows)
|
||||
.field("num_non_null_rows", &self.num_non_null_rows)
|
||||
.field("num_docs", &self.num_docs)
|
||||
.field("num_non_null_docs", &self.num_non_null_docs)
|
||||
.finish_non_exhaustive()
|
||||
}
|
||||
}
|
||||
@@ -123,7 +124,7 @@ enum BlockSelectCursor<'a> {
|
||||
Sparse(<SparseBlock<'a> as Set<u16>>::SelectCursor<'a>),
|
||||
}
|
||||
|
||||
impl<'a> BlockSelectCursor<'a> {
|
||||
impl BlockSelectCursor<'_> {
|
||||
fn select(&mut self, rank: u16) -> u16 {
|
||||
match self {
|
||||
BlockSelectCursor::Dense(dense_select_cursor) => dense_select_cursor.select(rank),
|
||||
@@ -141,7 +142,7 @@ pub struct OptionalIndexSelectCursor<'a> {
|
||||
num_null_rows_before_block: RowId,
|
||||
}
|
||||
|
||||
impl<'a> OptionalIndexSelectCursor<'a> {
|
||||
impl OptionalIndexSelectCursor<'_> {
|
||||
fn search_and_load_block(&mut self, rank: RowId) {
|
||||
if rank < self.current_block_end_rank {
|
||||
// we are already in the right block
|
||||
@@ -165,7 +166,7 @@ impl<'a> OptionalIndexSelectCursor<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> SelectCursor<RowId> for OptionalIndexSelectCursor<'a> {
|
||||
impl SelectCursor<RowId> for OptionalIndexSelectCursor<'_> {
|
||||
fn select(&mut self, rank: RowId) -> RowId {
|
||||
self.search_and_load_block(rank);
|
||||
let index_in_block = (rank - self.num_null_rows_before_block) as u16;
|
||||
@@ -259,29 +260,32 @@ impl Set<RowId> for OptionalIndex {
|
||||
|
||||
impl OptionalIndex {
|
||||
pub fn for_test(num_rows: RowId, row_ids: &[RowId]) -> OptionalIndex {
|
||||
assert!(row_ids
|
||||
.last()
|
||||
.copied()
|
||||
.map(|last_row_id| last_row_id < num_rows)
|
||||
.unwrap_or(true));
|
||||
assert!(
|
||||
row_ids
|
||||
.last()
|
||||
.copied()
|
||||
.map(|last_row_id| last_row_id < num_rows)
|
||||
.unwrap_or(true)
|
||||
);
|
||||
let mut buffer = Vec::new();
|
||||
serialize_optional_index(&row_ids, num_rows, &mut buffer).unwrap();
|
||||
let bytes = OwnedBytes::new(buffer);
|
||||
open_optional_index(bytes).unwrap()
|
||||
let file_slice = FileSlice::from(buffer);
|
||||
open_optional_index(file_slice).unwrap()
|
||||
}
|
||||
|
||||
pub fn num_docs(&self) -> RowId {
|
||||
self.num_rows
|
||||
self.num_docs
|
||||
}
|
||||
|
||||
pub fn num_non_nulls(&self) -> RowId {
|
||||
self.num_non_null_rows
|
||||
self.num_non_null_docs
|
||||
}
|
||||
|
||||
pub fn iter_rows(&self) -> impl Iterator<Item = RowId> + '_ {
|
||||
// TODO optimize
|
||||
pub fn iter_non_null_docs(&self) -> impl Iterator<Item = RowId> + '_ {
|
||||
// TODO optimize. We could iterate over the blocks directly.
|
||||
// We use the dense value ids and retrieve the doc ids via select.
|
||||
let mut select_batch = self.select_cursor();
|
||||
(0..self.num_non_null_rows).map(move |rank| select_batch.select(rank))
|
||||
(0..self.num_non_null_docs).map(move |rank| select_batch.select(rank))
|
||||
}
|
||||
pub fn select_batch(&self, ranks: &mut [RowId]) {
|
||||
let mut select_cursor = self.select_cursor();
|
||||
@@ -332,38 +336,6 @@ enum Block<'a> {
|
||||
Sparse(SparseBlock<'a>),
|
||||
}
|
||||
|
||||
#[derive(Debug, Copy, Clone)]
|
||||
enum OptionalIndexCodec {
|
||||
Dense = 0,
|
||||
Sparse = 1,
|
||||
}
|
||||
|
||||
impl OptionalIndexCodec {
|
||||
fn to_code(self) -> u8 {
|
||||
self as u8
|
||||
}
|
||||
|
||||
fn try_from_code(code: u8) -> Result<Self, InvalidData> {
|
||||
match code {
|
||||
0 => Ok(Self::Dense),
|
||||
1 => Ok(Self::Sparse),
|
||||
_ => Err(InvalidData),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl BinarySerializable for OptionalIndexCodec {
|
||||
fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
|
||||
writer.write_all(&[self.to_code()])
|
||||
}
|
||||
|
||||
fn deserialize<R: io::Read>(reader: &mut R) -> io::Result<Self> {
|
||||
let optional_codec_code = u8::deserialize(reader)?;
|
||||
let optional_codec = Self::try_from_code(optional_codec_code)?;
|
||||
Ok(optional_codec)
|
||||
}
|
||||
}
|
||||
|
||||
fn serialize_optional_index_block(block_els: &[u16], out: &mut impl io::Write) -> io::Result<()> {
|
||||
let is_sparse = is_sparse(block_els.len() as u32);
|
||||
if is_sparse {
|
||||
@@ -505,7 +477,7 @@ fn deserialize_optional_index_block_metadatas(
|
||||
non_null_rows_before_block += num_non_null_rows;
|
||||
}
|
||||
block_metas.resize(
|
||||
((num_rows + ELEMENTS_PER_BLOCK - 1) / ELEMENTS_PER_BLOCK) as usize,
|
||||
num_rows.div_ceil(ELEMENTS_PER_BLOCK) as usize,
|
||||
BlockMeta {
|
||||
non_null_rows_before_block,
|
||||
start_byte_offset,
|
||||
@@ -515,19 +487,26 @@ fn deserialize_optional_index_block_metadatas(
|
||||
(block_metas.into_boxed_slice(), non_null_rows_before_block)
|
||||
}
|
||||
|
||||
pub fn open_optional_index(bytes: OwnedBytes) -> io::Result<OptionalIndex> {
|
||||
let (mut bytes, num_non_empty_blocks_bytes) = bytes.rsplit(2);
|
||||
let num_non_empty_block_bytes =
|
||||
u16::from_le_bytes(num_non_empty_blocks_bytes.as_slice().try_into().unwrap());
|
||||
let num_rows = VInt::deserialize_u64(&mut bytes)? as u32;
|
||||
pub fn open_optional_index(file_slice: FileSlice) -> io::Result<OptionalIndex> {
|
||||
let (bytes, num_non_empty_blocks_bytes) = file_slice.split_from_end(2);
|
||||
let num_non_empty_block_bytes = u16::from_le_bytes(
|
||||
num_non_empty_blocks_bytes
|
||||
.read_bytes()?
|
||||
.as_slice()
|
||||
.try_into()
|
||||
.unwrap(),
|
||||
);
|
||||
|
||||
let mut bytes = bytes.read_bytes()?;
|
||||
let num_docs = VInt::deserialize_u64(&mut bytes)? as u32;
|
||||
let block_metas_num_bytes =
|
||||
num_non_empty_block_bytes as usize * SERIALIZED_BLOCK_META_NUM_BYTES;
|
||||
let (block_data, block_metas) = bytes.rsplit(block_metas_num_bytes);
|
||||
let (block_metas, num_non_null_rows) =
|
||||
deserialize_optional_index_block_metadatas(block_metas.as_slice(), num_rows);
|
||||
let (block_metas, num_non_null_docs) =
|
||||
deserialize_optional_index_block_metadatas(block_metas.as_slice(), num_docs);
|
||||
let optional_index = OptionalIndex {
|
||||
num_rows,
|
||||
num_non_null_rows,
|
||||
num_docs,
|
||||
num_non_null_docs,
|
||||
block_data,
|
||||
block_metas: block_metas.into(),
|
||||
};
|
||||
|
||||
@@ -2,7 +2,7 @@ use std::io::{self, Write};
|
||||
|
||||
use common::BinarySerializable;
|
||||
|
||||
use crate::column_index::optional_index::{SelectCursor, Set, SetCodec, ELEMENTS_PER_BLOCK};
|
||||
use crate::column_index::optional_index::{ELEMENTS_PER_BLOCK, SelectCursor, Set, SetCodec};
|
||||
|
||||
#[inline(always)]
|
||||
fn get_bit_at(input: u64, n: u16) -> bool {
|
||||
@@ -23,7 +23,6 @@ fn set_bit_at(input: &mut u64, n: u16) {
|
||||
///
|
||||
/// When translating a dense index to the original index, we can use the offset to find the correct
|
||||
/// block. Direct computation is not possible, but we can employ a linear or binary search.
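// A minimal illustrative sketch (hypothetical names, not the crate's code) of the
// "offset + search" idea above: each mini block records how many set bits precede it, so
// select(rank) first locates the mini block whose recorded offset is <= rank, then scans
// only that block's bitvec.
fn find_mini_block(offsets: &[u16], rank: u16) -> usize {
    // `partition_point` returns the first index whose offset exceeds `rank`; the block that
    // contains `rank` is the one right before it.
    offsets.partition_point(|&offset| offset <= rank).saturating_sub(1)
}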
|
||||
|
||||
const ELEMENTS_PER_MINI_BLOCK: u16 = 64;
|
||||
const MINI_BLOCK_BITVEC_NUM_BYTES: usize = 8;
|
||||
const MINI_BLOCK_OFFSET_NUM_BYTES: usize = 2;
|
||||
@@ -109,7 +108,7 @@ pub struct DenseBlockSelectCursor<'a> {
|
||||
dense_block: DenseBlock<'a>,
|
||||
}
|
||||
|
||||
impl<'a> SelectCursor<u16> for DenseBlockSelectCursor<'a> {
|
||||
impl SelectCursor<u16> for DenseBlockSelectCursor<'_> {
|
||||
#[inline]
|
||||
fn select(&mut self, rank: u16) -> u16 {
|
||||
self.block_id = self
|
||||
@@ -175,7 +174,7 @@ impl<'a> Set<u16> for DenseBlock<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> DenseBlock<'a> {
|
||||
impl DenseBlock<'_> {
|
||||
#[inline]
|
||||
fn mini_block(&self, mini_block_id: u16) -> DenseMiniBlock {
|
||||
let data_start_pos = mini_block_id as usize * MINI_BLOCK_NUM_BYTES;
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
mod dense;
|
||||
mod sparse;
|
||||
|
||||
pub use dense::{DenseBlock, DenseBlockCodec, DENSE_BLOCK_NUM_BYTES};
|
||||
pub use dense::{DENSE_BLOCK_NUM_BYTES, DenseBlock, DenseBlockCodec};
|
||||
pub use sparse::{SparseBlock, SparseBlockCodec};
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@@ -31,7 +31,7 @@ impl<'a> SelectCursor<u16> for SparseBlock<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Set<u16> for SparseBlock<'a> {
|
||||
impl Set<u16> for SparseBlock<'_> {
|
||||
type SelectCursor<'b>
|
||||
= Self
|
||||
where Self: 'b;
|
||||
@@ -69,7 +69,7 @@ fn get_u16(data: &[u8], byte_position: usize) -> u16 {
|
||||
u16::from_le_bytes(bytes)
|
||||
}
|
||||
|
||||
impl<'a> SparseBlock<'a> {
|
||||
impl SparseBlock<'_> {
|
||||
#[inline(always)]
|
||||
fn value_at_idx(&self, data: &[u8], idx: u16) -> u16 {
|
||||
let start_offset: usize = idx as usize * 2;
|
||||
@@ -82,7 +82,7 @@ impl<'a> SparseBlock<'a> {
|
||||
}
|
||||
|
||||
#[inline]
|
||||
#[allow(clippy::comparison_chain)]
|
||||
#[expect(clippy::comparison_chain)]
|
||||
// Looks for the element in the block. Returns the positions if found.
|
||||
fn binary_search(&self, target: u16) -> Result<u16, u16> {
|
||||
let data = &self.0;
|
||||
|
||||
@@ -59,7 +59,7 @@ fn test_with_random_sets_simple() {
|
||||
let vals = 10..ELEMENTS_PER_BLOCK * 2;
|
||||
let mut out: Vec<u8> = Vec::new();
|
||||
serialize_optional_index(&vals, 100, &mut out).unwrap();
|
||||
let null_index = open_optional_index(OwnedBytes::new(out)).unwrap();
|
||||
let null_index = open_optional_index(FileSlice::from(out)).unwrap();
|
||||
let ranks: Vec<u32> = (65_472u32..65_473u32).collect();
|
||||
let els: Vec<u32> = ranks.iter().copied().map(|rank| rank + 10).collect();
|
||||
let mut select_cursor = null_index.select_cursor();
|
||||
@@ -102,7 +102,7 @@ impl<'a> Iterable<RowId> for &'a [bool] {
|
||||
fn test_null_index(data: &[bool]) {
|
||||
let mut out: Vec<u8> = Vec::new();
|
||||
serialize_optional_index(&data, data.len() as RowId, &mut out).unwrap();
|
||||
let null_index = open_optional_index(OwnedBytes::new(out)).unwrap();
|
||||
let null_index = open_optional_index(FileSlice::from(out)).unwrap();
|
||||
let orig_idx_with_value: Vec<u32> = data
|
||||
.iter()
|
||||
.enumerate()
|
||||
@@ -164,7 +164,11 @@ fn test_optional_index_large() {
|
||||
fn test_optional_index_iter_aux(row_ids: &[RowId], num_rows: RowId) {
|
||||
let optional_index = OptionalIndex::for_test(num_rows, row_ids);
|
||||
assert_eq!(optional_index.num_docs(), num_rows);
|
||||
assert!(optional_index.iter_rows().eq(row_ids.iter().copied()));
|
||||
assert!(
|
||||
optional_index
|
||||
.iter_non_null_docs()
|
||||
.eq(row_ids.iter().copied())
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -241,7 +245,7 @@ mod bench {
|
||||
.collect();
|
||||
serialize_optional_index(&&vals[..], TOTAL_NUM_VALUES, &mut out).unwrap();
|
||||
|
||||
open_optional_index(OwnedBytes::new(out)).unwrap()
|
||||
open_optional_index(FileSlice::from(out)).unwrap()
|
||||
}
|
||||
|
||||
fn random_range_iterator(
|
||||
@@ -254,11 +258,7 @@ mod bench {
|
||||
let mut current = start;
|
||||
std::iter::from_fn(move || {
|
||||
current += rng.gen_range(avg_step_size - avg_deviation..=avg_step_size + avg_deviation);
|
||||
if current >= end {
|
||||
None
|
||||
} else {
|
||||
Some(current)
|
||||
}
|
||||
if current >= end { None } else { Some(current) }
|
||||
})
|
||||
}
|
||||
|
||||
|
||||
@@ -1,13 +1,14 @@
|
||||
use std::io;
|
||||
use std::io::Write;
|
||||
|
||||
use common::{CountingWriter, OwnedBytes};
|
||||
use common::file_slice::FileSlice;
|
||||
use common::{CountingWriter, HasLen};
|
||||
|
||||
use super::multivalued_index::SerializableMultivalueIndex;
|
||||
use super::OptionalIndex;
|
||||
use super::multivalued_index::SerializableMultivalueIndex;
|
||||
use crate::column_index::ColumnIndex;
|
||||
use crate::column_index::multivalued_index::serialize_multivalued_index;
|
||||
use crate::column_index::optional_index::serialize_optional_index;
|
||||
use crate::column_index::ColumnIndex;
|
||||
use crate::iterable::Iterable;
|
||||
use crate::{Cardinality, RowId, Version};
|
||||
|
||||
@@ -31,7 +32,7 @@ pub enum SerializableColumnIndex<'a> {
|
||||
Multivalued(SerializableMultivalueIndex<'a>),
|
||||
}
|
||||
|
||||
impl<'a> SerializableColumnIndex<'a> {
|
||||
impl SerializableColumnIndex<'_> {
|
||||
pub fn get_cardinality(&self) -> Cardinality {
|
||||
match self {
|
||||
SerializableColumnIndex::Full => Cardinality::Full,
|
||||
@@ -65,27 +66,28 @@ pub fn serialize_column_index(
|
||||
|
||||
/// Open a serialized column index.
|
||||
pub fn open_column_index(
|
||||
mut bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
format_version: Version,
|
||||
) -> io::Result<ColumnIndex> {
|
||||
if bytes.is_empty() {
|
||||
if file_slice.len() == 0 {
|
||||
return Err(io::Error::new(
|
||||
io::ErrorKind::UnexpectedEof,
|
||||
"Failed to deserialize column index. Empty buffer.",
|
||||
));
|
||||
}
|
||||
let cardinality_code = bytes[0];
|
||||
let (header, body) = file_slice.split(1);
|
||||
let cardinality_code = header.read_bytes()?.as_slice()[0];
|
||||
let cardinality = Cardinality::try_from_code(cardinality_code)?;
|
||||
bytes.advance(1);
|
||||
|
||||
match cardinality {
|
||||
Cardinality::Full => Ok(ColumnIndex::Full),
|
||||
Cardinality::Optional => {
|
||||
let optional_index = super::optional_index::open_optional_index(bytes)?;
|
||||
let optional_index = super::optional_index::open_optional_index(body)?;
|
||||
Ok(ColumnIndex::Optional(optional_index))
|
||||
}
|
||||
Cardinality::Multivalued => {
|
||||
let multivalue_index =
|
||||
super::multivalued_index::open_multivalued_index(bytes, format_version)?;
|
||||
super::multivalued_index::open_multivalued_index(body, format_version)?;
|
||||
Ok(ColumnIndex::Multivalued(multivalue_index))
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,139 +0,0 @@
|
||||
use std::sync::Arc;
|
||||
|
||||
use common::OwnedBytes;
|
||||
use rand::rngs::StdRng;
|
||||
use rand::{Rng, SeedableRng};
|
||||
use test::{self, Bencher};
|
||||
|
||||
use super::*;
|
||||
use crate::column_values::u64_based::*;
|
||||
|
||||
fn get_data() -> Vec<u64> {
|
||||
let mut rng = StdRng::seed_from_u64(2u64);
|
||||
let mut data: Vec<_> = (100..55000_u64)
|
||||
.map(|num| num + rng.gen::<u8>() as u64)
|
||||
.collect();
|
||||
data.push(99_000);
|
||||
data.insert(1000, 2000);
|
||||
data.insert(2000, 100);
|
||||
data.insert(3000, 4100);
|
||||
data.insert(4000, 100);
|
||||
data.insert(5000, 800);
|
||||
data
|
||||
}
|
||||
|
||||
fn compute_stats(vals: impl Iterator<Item = u64>) -> ColumnStats {
|
||||
let mut stats_collector = StatsCollector::default();
|
||||
for val in vals {
|
||||
stats_collector.collect(val);
|
||||
}
|
||||
stats_collector.stats()
|
||||
}
|
||||
|
||||
#[inline(never)]
|
||||
fn value_iter() -> impl Iterator<Item = u64> {
|
||||
0..20_000
|
||||
}
|
||||
|
||||
fn get_reader_for_bench<Codec: ColumnCodec>(data: &[u64]) -> Codec::ColumnValues {
|
||||
let mut bytes = Vec::new();
|
||||
let stats = compute_stats(data.iter().cloned());
|
||||
let mut codec_serializer = Codec::estimator();
|
||||
for val in data {
|
||||
codec_serializer.collect(*val);
|
||||
}
|
||||
codec_serializer
|
||||
.serialize(&stats, Box::new(data.iter().copied()).as_mut(), &mut bytes)
|
||||
.unwrap();
|
||||
|
||||
Codec::load(OwnedBytes::new(bytes)).unwrap()
|
||||
}
|
||||
|
||||
fn bench_get<Codec: ColumnCodec>(b: &mut Bencher, data: &[u64]) {
|
||||
let col = get_reader_for_bench::<Codec>(data);
|
||||
b.iter(|| {
|
||||
let mut sum = 0u64;
|
||||
for pos in value_iter() {
|
||||
let val = col.get_val(pos as u32);
|
||||
sum = sum.wrapping_add(val);
|
||||
}
|
||||
sum
|
||||
});
|
||||
}
|
||||
|
||||
#[inline(never)]
|
||||
fn bench_get_dynamic_helper(b: &mut Bencher, col: Arc<dyn ColumnValues>) {
|
||||
b.iter(|| {
|
||||
let mut sum = 0u64;
|
||||
for pos in value_iter() {
|
||||
let val = col.get_val(pos as u32);
|
||||
sum = sum.wrapping_add(val);
|
||||
}
|
||||
sum
|
||||
});
|
||||
}
|
||||
|
||||
fn bench_get_dynamic<Codec: ColumnCodec>(b: &mut Bencher, data: &[u64]) {
|
||||
let col = Arc::new(get_reader_for_bench::<Codec>(data));
|
||||
bench_get_dynamic_helper(b, col);
|
||||
}
|
||||
fn bench_create<Codec: ColumnCodec>(b: &mut Bencher, data: &[u64]) {
|
||||
let stats = compute_stats(data.iter().cloned());
|
||||
|
||||
let mut bytes = Vec::new();
|
||||
b.iter(|| {
|
||||
bytes.clear();
|
||||
let mut codec_serializer = Codec::estimator();
|
||||
for val in data.iter().take(1024) {
|
||||
codec_serializer.collect(*val);
|
||||
}
|
||||
|
||||
codec_serializer.serialize(&stats, Box::new(data.iter().copied()).as_mut(), &mut bytes)
|
||||
});
|
||||
}
|
||||
|
||||
#[bench]
|
||||
fn bench_fastfield_bitpack_create(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_create::<BitpackedCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_linearinterpol_create(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_create::<LinearCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_multilinearinterpol_create(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_create::<BlockwiseLinearCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_bitpack_get(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_get::<BitpackedCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_bitpack_get_dynamic(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_get_dynamic::<BitpackedCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_linearinterpol_get(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_get::<LinearCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_linearinterpol_get_dynamic(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_get_dynamic::<LinearCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_multilinearinterpol_get(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_get::<BlockwiseLinearCodec>(b, &data);
|
||||
}
|
||||
#[bench]
|
||||
fn bench_fastfield_multilinearinterpol_get_dynamic(b: &mut Bencher) {
|
||||
let data: Vec<_> = get_data();
|
||||
bench_get_dynamic::<BlockwiseLinearCodec>(b, &data);
|
||||
}
|
||||
@@ -10,7 +10,7 @@ pub(crate) struct MergedColumnValues<'a, T> {
|
||||
pub(crate) merge_row_order: &'a MergeRowOrder,
|
||||
}
|
||||
|
||||
impl<'a, T: Copy + PartialOrd + Debug + 'static> Iterable<T> for MergedColumnValues<'a, T> {
|
||||
impl<T: Copy + PartialOrd + Debug + 'static> Iterable<T> for MergedColumnValues<'_, T> {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = T> + '_> {
|
||||
match self.merge_row_order {
|
||||
MergeRowOrder::Stack(_) => Box::new(
|
||||
|
||||
@@ -7,13 +7,15 @@
|
||||
//! - Monotonically map values to u64/u128
|
||||
|
||||
use std::fmt::Debug;
|
||||
use std::ops::{Range, RangeInclusive};
|
||||
use std::ops::Range;
|
||||
use std::sync::Arc;
|
||||
|
||||
use downcast_rs::DowncastSync;
|
||||
pub use monotonic_mapping::{MonotonicallyMappableToU64, StrictlyMonotonicFn};
|
||||
pub use monotonic_mapping_u128::MonotonicallyMappableToU128;
|
||||
|
||||
use crate::column::ValueRange;
|
||||
|
||||
mod merge;
|
||||
pub(crate) mod monotonic_mapping;
|
||||
pub(crate) mod monotonic_mapping_u128;
|
||||
@@ -26,13 +28,12 @@ mod monotonic_column;
|
||||
|
||||
pub(crate) use merge::MergedColumnValues;
|
||||
pub use stats::ColumnStats;
|
||||
pub use u128_based::{
|
||||
open_u128_as_compact_u64, open_u128_mapped, serialize_column_values_u128,
|
||||
CompactSpaceU64Accessor,
|
||||
};
|
||||
pub use u64_based::{
|
||||
load_u64_based_column_values, serialize_and_load_u64_based_column_values,
|
||||
serialize_u64_based_column_values, CodecType, ALL_U64_CODEC_TYPES,
|
||||
ALL_U64_CODEC_TYPES, CodecType, load_u64_based_column_values, serialize_u64_based_column_values,
|
||||
};
|
||||
pub use u128_based::{
|
||||
CompactSpaceU64Accessor, open_u128_as_compact_u64, open_u128_mapped,
|
||||
serialize_column_values_u128,
|
||||
};
|
||||
pub use vec_column::VecColumn;
|
||||
|
||||
@@ -109,6 +110,307 @@ pub trait ColumnValues<T: PartialOrd = u64>: Send + Sync + DowncastSync {
|
||||
}
|
||||
}
|
||||
|
||||
/// Load the values for the provided docids.
|
||||
///
|
||||
/// The values are filtered by the provided value range.
|
||||
fn get_vals_in_value_range(
|
||||
&self,
|
||||
input_indexes: &[u32],
|
||||
input_doc_ids: &[u32],
|
||||
output: &mut Vec<crate::ComparableDoc<Option<T>, crate::DocId>>,
|
||||
value_range: ValueRange<T>,
|
||||
) {
|
||||
let len = input_indexes.len();
|
||||
let mut read_head = 0;
|
||||
|
||||
match value_range {
|
||||
ValueRange::All => {
|
||||
while read_head + 3 < len {
|
||||
let idx0 = input_indexes[read_head];
|
||||
let idx1 = input_indexes[read_head + 1];
|
||||
let idx2 = input_indexes[read_head + 2];
|
||||
let idx3 = input_indexes[read_head + 3];
|
||||
|
||||
let doc0 = input_doc_ids[read_head];
|
||||
let doc1 = input_doc_ids[read_head + 1];
|
||||
let doc2 = input_doc_ids[read_head + 2];
|
||||
let doc3 = input_doc_ids[read_head + 3];
|
||||
|
||||
let val0 = self.get_val(idx0);
|
||||
let val1 = self.get_val(idx1);
|
||||
let val2 = self.get_val(idx2);
|
||||
let val3 = self.get_val(idx3);
|
||||
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc0,
|
||||
sort_key: Some(val0),
|
||||
});
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc1,
|
||||
sort_key: Some(val1),
|
||||
});
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc2,
|
||||
sort_key: Some(val2),
|
||||
});
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc3,
|
||||
sort_key: Some(val3),
|
||||
});
|
||||
|
||||
read_head += 4;
|
||||
}
|
||||
}
|
||||
ValueRange::Inclusive(ref range) => {
|
||||
while read_head + 3 < len {
|
||||
let idx0 = input_indexes[read_head];
|
||||
let idx1 = input_indexes[read_head + 1];
|
||||
let idx2 = input_indexes[read_head + 2];
|
||||
let idx3 = input_indexes[read_head + 3];
|
||||
|
||||
let doc0 = input_doc_ids[read_head];
|
||||
let doc1 = input_doc_ids[read_head + 1];
|
||||
let doc2 = input_doc_ids[read_head + 2];
|
||||
let doc3 = input_doc_ids[read_head + 3];
|
||||
|
||||
let val0 = self.get_val(idx0);
|
||||
let val1 = self.get_val(idx1);
|
||||
let val2 = self.get_val(idx2);
|
||||
let val3 = self.get_val(idx3);
|
||||
|
||||
if range.contains(&val0) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc0,
|
||||
sort_key: Some(val0),
|
||||
});
|
||||
}
|
||||
if range.contains(&val1) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc1,
|
||||
sort_key: Some(val1),
|
||||
});
|
||||
}
|
||||
if range.contains(&val2) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc2,
|
||||
sort_key: Some(val2),
|
||||
});
|
||||
}
|
||||
if range.contains(&val3) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc3,
|
||||
sort_key: Some(val3),
|
||||
});
|
||||
}
|
||||
|
||||
read_head += 4;
|
||||
}
|
||||
}
|
||||
ValueRange::GreaterThan(ref threshold, _) => {
|
||||
while read_head + 3 < len {
|
||||
let idx0 = input_indexes[read_head];
|
||||
let idx1 = input_indexes[read_head + 1];
|
||||
let idx2 = input_indexes[read_head + 2];
|
||||
let idx3 = input_indexes[read_head + 3];
|
||||
|
||||
let doc0 = input_doc_ids[read_head];
|
||||
let doc1 = input_doc_ids[read_head + 1];
|
||||
let doc2 = input_doc_ids[read_head + 2];
|
||||
let doc3 = input_doc_ids[read_head + 3];
|
||||
|
||||
let val0 = self.get_val(idx0);
|
||||
let val1 = self.get_val(idx1);
|
||||
let val2 = self.get_val(idx2);
|
||||
let val3 = self.get_val(idx3);
|
||||
|
||||
if val0 > *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc0,
|
||||
sort_key: Some(val0),
|
||||
});
|
||||
}
|
||||
if val1 > *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc1,
|
||||
sort_key: Some(val1),
|
||||
});
|
||||
}
|
||||
if val2 > *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc2,
|
||||
sort_key: Some(val2),
|
||||
});
|
||||
}
|
||||
if val3 > *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc3,
|
||||
sort_key: Some(val3),
|
||||
});
|
||||
}
|
||||
|
||||
read_head += 4;
|
||||
}
|
||||
}
|
||||
ValueRange::GreaterThanOrEqual(ref threshold, _) => {
|
||||
while read_head + 3 < len {
|
||||
let idx0 = input_indexes[read_head];
|
||||
let idx1 = input_indexes[read_head + 1];
|
||||
let idx2 = input_indexes[read_head + 2];
|
||||
let idx3 = input_indexes[read_head + 3];
|
||||
|
||||
let doc0 = input_doc_ids[read_head];
|
||||
let doc1 = input_doc_ids[read_head + 1];
|
||||
let doc2 = input_doc_ids[read_head + 2];
|
||||
let doc3 = input_doc_ids[read_head + 3];
|
||||
|
||||
let val0 = self.get_val(idx0);
|
||||
let val1 = self.get_val(idx1);
|
||||
let val2 = self.get_val(idx2);
|
||||
let val3 = self.get_val(idx3);
|
||||
|
||||
if val0 >= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc0,
|
||||
sort_key: Some(val0),
|
||||
});
|
||||
}
|
||||
if val1 >= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc1,
|
||||
sort_key: Some(val1),
|
||||
});
|
||||
}
|
||||
if val2 >= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc2,
|
||||
sort_key: Some(val2),
|
||||
});
|
||||
}
|
||||
if val3 >= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc3,
|
||||
sort_key: Some(val3),
|
||||
});
|
||||
}
|
||||
|
||||
read_head += 4;
|
||||
}
|
||||
}
|
||||
ValueRange::LessThan(ref threshold, _) => {
|
||||
while read_head + 3 < len {
|
||||
let idx0 = input_indexes[read_head];
|
||||
let idx1 = input_indexes[read_head + 1];
|
||||
let idx2 = input_indexes[read_head + 2];
|
||||
let idx3 = input_indexes[read_head + 3];
|
||||
|
||||
let doc0 = input_doc_ids[read_head];
|
||||
let doc1 = input_doc_ids[read_head + 1];
|
||||
let doc2 = input_doc_ids[read_head + 2];
|
||||
let doc3 = input_doc_ids[read_head + 3];
|
||||
|
||||
let val0 = self.get_val(idx0);
|
||||
let val1 = self.get_val(idx1);
|
||||
let val2 = self.get_val(idx2);
|
||||
let val3 = self.get_val(idx3);
|
||||
|
||||
if val0 < *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc0,
|
||||
sort_key: Some(val0),
|
||||
});
|
||||
}
|
||||
if val1 < *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc1,
|
||||
sort_key: Some(val1),
|
||||
});
|
||||
}
|
||||
if val2 < *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc2,
|
||||
sort_key: Some(val2),
|
||||
});
|
||||
}
|
||||
if val3 < *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc3,
|
||||
sort_key: Some(val3),
|
||||
});
|
||||
}
|
||||
|
||||
read_head += 4;
|
||||
}
|
||||
}
|
||||
ValueRange::LessThanOrEqual(ref threshold, _) => {
|
||||
while read_head + 3 < len {
|
||||
let idx0 = input_indexes[read_head];
|
||||
let idx1 = input_indexes[read_head + 1];
|
||||
let idx2 = input_indexes[read_head + 2];
|
||||
let idx3 = input_indexes[read_head + 3];
|
||||
|
||||
let doc0 = input_doc_ids[read_head];
|
||||
let doc1 = input_doc_ids[read_head + 1];
|
||||
let doc2 = input_doc_ids[read_head + 2];
|
||||
let doc3 = input_doc_ids[read_head + 3];
|
||||
|
||||
let val0 = self.get_val(idx0);
|
||||
let val1 = self.get_val(idx1);
|
||||
let val2 = self.get_val(idx2);
|
||||
let val3 = self.get_val(idx3);
|
||||
|
||||
if val0 <= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc0,
|
||||
sort_key: Some(val0),
|
||||
});
|
||||
}
|
||||
if val1 <= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc1,
|
||||
sort_key: Some(val1),
|
||||
});
|
||||
}
|
||||
if val2 <= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc2,
|
||||
sort_key: Some(val2),
|
||||
});
|
||||
}
|
||||
if val3 <= *threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc: doc3,
|
||||
sort_key: Some(val3),
|
||||
});
|
||||
}
|
||||
|
||||
read_head += 4;
|
||||
}
|
||||
}
|
||||
}
|
||||
// Process remaining elements (0 to 3)
|
||||
while read_head < len {
|
||||
let idx = input_indexes[read_head];
|
||||
let doc = input_doc_ids[read_head];
|
||||
let val = self.get_val(idx);
|
||||
let matches = match value_range {
|
||||
// 'value_range' is still moved here. This is the outer `value_range`
|
||||
ValueRange::All => true,
|
||||
ValueRange::Inclusive(ref r) => r.contains(&val),
|
||||
ValueRange::GreaterThan(ref t, _) => val > *t,
|
||||
ValueRange::GreaterThanOrEqual(ref t, _) => val >= *t,
|
||||
ValueRange::LessThan(ref t, _) => val < *t,
|
||||
ValueRange::LessThanOrEqual(ref t, _) => val <= *t,
|
||||
};
|
||||
if matches {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(val),
|
||||
});
|
||||
}
|
||||
read_head += 1;
|
||||
}
|
||||
}
|
||||
|
||||
/// Fills an output buffer with the fast field values
|
||||
/// associated with the `DocId` going from
|
||||
/// `start` to `start + output.len()`.
|
||||
@@ -129,15 +431,54 @@ pub trait ColumnValues<T: PartialOrd = u64>: Send + Sync + DowncastSync {
|
||||
/// Note that position == docid for single value fast fields
|
||||
fn get_row_ids_for_value_range(
|
||||
&self,
|
||||
value_range: RangeInclusive<T>,
|
||||
value_range: ValueRange<T>,
|
||||
row_id_range: Range<RowId>,
|
||||
row_id_hits: &mut Vec<RowId>,
|
||||
) {
|
||||
let row_id_range = row_id_range.start..row_id_range.end.min(self.num_vals());
|
||||
for idx in row_id_range {
|
||||
let val = self.get_val(idx);
|
||||
if value_range.contains(&val) {
|
||||
row_id_hits.push(idx);
|
||||
match value_range {
|
||||
ValueRange::Inclusive(range) => {
|
||||
for idx in row_id_range {
|
||||
let val = self.get_val(idx);
|
||||
if range.contains(&val) {
|
||||
row_id_hits.push(idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::GreaterThan(threshold, _) => {
|
||||
for idx in row_id_range {
|
||||
let val = self.get_val(idx);
|
||||
if val > threshold {
|
||||
row_id_hits.push(idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::GreaterThanOrEqual(threshold, _) => {
|
||||
for idx in row_id_range {
|
||||
let val = self.get_val(idx);
|
||||
if val >= threshold {
|
||||
row_id_hits.push(idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::LessThan(threshold, _) => {
|
||||
for idx in row_id_range {
|
||||
let val = self.get_val(idx);
|
||||
if val < threshold {
|
||||
row_id_hits.push(idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::LessThanOrEqual(threshold, _) => {
|
||||
for idx in row_id_range {
|
||||
let val = self.get_val(idx);
|
||||
if val <= threshold {
|
||||
row_id_hits.push(idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::All => {
|
||||
row_id_hits.extend(row_id_range);
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -193,6 +534,17 @@ impl<T: PartialOrd + Default> ColumnValues<T> for EmptyColumnValues {
|
||||
fn num_vals(&self) -> u32 {
|
||||
0
|
||||
}
|
||||
|
||||
fn get_vals_in_value_range(
|
||||
&self,
|
||||
input_indexes: &[u32],
|
||||
input_doc_ids: &[u32],
|
||||
output: &mut Vec<crate::ComparableDoc<Option<T>, crate::DocId>>,
|
||||
value_range: ValueRange<T>,
|
||||
) {
|
||||
let _ = (input_indexes, input_doc_ids, output, value_range);
|
||||
panic!("Internal Error: Called get_vals_in_value_range of empty column.")
|
||||
}
|
||||
}
|
||||
|
||||
impl<T: Copy + PartialOrd + Debug + 'static> ColumnValues<T> for Arc<dyn ColumnValues<T>> {
|
||||
@@ -206,6 +558,18 @@ impl<T: Copy + PartialOrd + Debug + 'static> ColumnValues<T> for Arc<dyn ColumnV
|
||||
self.as_ref().get_vals_opt(indexes, output)
|
||||
}
|
||||
|
||||
#[inline(always)]
|
||||
fn get_vals_in_value_range(
|
||||
&self,
|
||||
input_indexes: &[u32],
|
||||
input_doc_ids: &[u32],
|
||||
output: &mut Vec<crate::ComparableDoc<Option<T>, crate::DocId>>,
|
||||
value_range: ValueRange<T>,
|
||||
) {
|
||||
self.as_ref()
|
||||
.get_vals_in_value_range(input_indexes, input_doc_ids, output, value_range)
|
||||
}
|
||||
|
||||
#[inline(always)]
|
||||
fn min_value(&self) -> T {
|
||||
self.as_ref().min_value()
|
||||
@@ -234,7 +598,7 @@ impl<T: Copy + PartialOrd + Debug + 'static> ColumnValues<T> for Arc<dyn ColumnV
|
||||
#[inline(always)]
|
||||
fn get_row_ids_for_value_range(
|
||||
&self,
|
||||
range: RangeInclusive<T>,
|
||||
range: ValueRange<T>,
|
||||
doc_id_range: Range<u32>,
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
@@ -242,6 +606,3 @@ impl<T: Copy + PartialOrd + Debug + 'static> ColumnValues<T> for Arc<dyn ColumnV
|
||||
.get_row_ids_for_value_range(range, doc_id_range, positions)
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(all(test, feature = "unstable"))]
|
||||
mod bench;
|
||||
|
||||
@@ -1,9 +1,10 @@
|
||||
use std::fmt::Debug;
|
||||
use std::marker::PhantomData;
|
||||
use std::ops::{Range, RangeInclusive};
|
||||
use std::ops::Range;
|
||||
|
||||
use crate::column_values::monotonic_mapping::StrictlyMonotonicFn;
|
||||
use crate::ColumnValues;
|
||||
use crate::column::ValueRange;
|
||||
use crate::column_values::monotonic_mapping::StrictlyMonotonicFn;
|
||||
|
||||
struct MonotonicMappingColumn<C, T, Input> {
|
||||
from_column: C,
|
||||
@@ -80,16 +81,52 @@ where
|
||||
|
||||
fn get_row_ids_for_value_range(
|
||||
&self,
|
||||
range: RangeInclusive<Output>,
|
||||
range: ValueRange<Output>,
|
||||
doc_id_range: Range<u32>,
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
self.from_column.get_row_ids_for_value_range(
|
||||
self.monotonic_mapping.inverse(range.start().clone())
|
||||
..=self.monotonic_mapping.inverse(range.end().clone()),
|
||||
doc_id_range,
|
||||
positions,
|
||||
)
|
||||
match range {
|
||||
ValueRange::Inclusive(range) => self.from_column.get_row_ids_for_value_range(
|
||||
ValueRange::Inclusive(
|
||||
self.monotonic_mapping.inverse(range.start().clone())
|
||||
..=self.monotonic_mapping.inverse(range.end().clone()),
|
||||
),
|
||||
doc_id_range,
|
||||
positions,
|
||||
),
|
||||
ValueRange::All => self.from_column.get_row_ids_for_value_range(
|
||||
ValueRange::All,
|
||||
doc_id_range,
|
||||
positions,
|
||||
),
|
||||
ValueRange::GreaterThan(threshold, _) => self.from_column.get_row_ids_for_value_range(
|
||||
ValueRange::GreaterThan(self.monotonic_mapping.inverse(threshold), false),
|
||||
doc_id_range,
|
||||
positions,
|
||||
),
|
||||
ValueRange::GreaterThanOrEqual(threshold, _) => {
|
||||
self.from_column.get_row_ids_for_value_range(
|
||||
ValueRange::GreaterThanOrEqual(
|
||||
self.monotonic_mapping.inverse(threshold),
|
||||
false,
|
||||
),
|
||||
doc_id_range,
|
||||
positions,
|
||||
)
|
||||
}
|
||||
ValueRange::LessThan(threshold, _) => self.from_column.get_row_ids_for_value_range(
|
||||
ValueRange::LessThan(self.monotonic_mapping.inverse(threshold), false),
|
||||
doc_id_range,
|
||||
positions,
|
||||
),
|
||||
ValueRange::LessThanOrEqual(threshold, _) => {
|
||||
self.from_column.get_row_ids_for_value_range(
|
||||
ValueRange::LessThanOrEqual(self.monotonic_mapping.inverse(threshold), false),
|
||||
doc_id_range,
|
||||
positions,
|
||||
)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// We voluntarily do not implement get_range as it yields a regression,
|
||||
@@ -99,10 +136,10 @@ where
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::column_values::VecColumn;
|
||||
use crate::column_values::monotonic_mapping::{
|
||||
StrictlyMonotonicMappingInverter, StrictlyMonotonicMappingToInternal,
|
||||
};
|
||||
use crate::column_values::VecColumn;
|
||||
|
||||
#[test]
|
||||
fn test_monotonic_mapping_iter() {
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
use std::fmt::Debug;
|
||||
use std::net::Ipv6Addr;
|
||||
|
||||
/// Montonic maps a value to u128 value space
|
||||
/// Monotonic maps a value to u128 value space
|
||||
/// Monotonic mapping enables `PartialOrd` on u128 space without conversion to original space.
|
||||
pub trait MonotonicallyMappableToU128: 'static + PartialOrd + Copy + Debug + Send + Sync {
|
||||
/// Converts a value to u128.
|
||||
|
||||
@@ -2,7 +2,8 @@ use std::io;
|
||||
use std::io::Write;
|
||||
use std::num::NonZeroU64;
|
||||
|
||||
use common::{BinarySerializable, VInt};
|
||||
use common::file_slice::FileSlice;
|
||||
use common::{BinarySerializable, HasLen, VInt};
|
||||
|
||||
use crate::RowId;
|
||||
|
||||
@@ -27,6 +28,55 @@ impl ColumnStats {
|
||||
}
|
||||
}
|
||||
|
||||
impl ColumnStats {
|
||||
/// Deserialize the column stats from the start of the given FileSlice, and return them together
/// with the remaining (suffix) FileSlice.
|
||||
pub fn deserialize_from_tail(file_slice: FileSlice) -> io::Result<(Self, FileSlice)> {
|
||||
// [`deserialize_with_size`] deserializes 4 variable-width encoded u64s, each of which can
// take up to 9 bytes in the worst case; 4 * 9 = 36 bytes is therefore an upper bound.
|
||||
let (stats, _) = file_slice.clone().split(36.min(file_slice.len())); // hope that's enough bytes
|
||||
let mut stats = stats.read_bytes()?;
|
||||
let (stats, stats_nbytes) = ColumnStats::deserialize_with_size(&mut stats)?;
|
||||
let (_, remainder) = file_slice.split(stats_nbytes);
|
||||
Ok((stats, remainder))
|
||||
}
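    // Hypothetical usage sketch (not part of this diff): given a FileSlice whose first bytes
    // are the serialized stats, a caller could split them off and keep the remainder:
    //     let (stats, remainder) = ColumnStats::deserialize_from_tail(column_slice)?;
    //     debug_assert!(stats.min_value <= stats.max_value);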
|
||||
|
||||
/// Same as [`BinarySerializable::deserialize`] but also returns the number of bytes
|
||||
/// consumed from the reader `R`
|
||||
fn deserialize_with_size<R: io::Read>(reader: &mut R) -> io::Result<(Self, usize)> {
|
||||
let mut nbytes = 0;
|
||||
|
||||
let (min_value, len) = VInt::deserialize_with_size(reader)?;
|
||||
let min_value = min_value.0;
|
||||
nbytes += len;
|
||||
|
||||
let (gcd, len) = VInt::deserialize_with_size(reader)?;
|
||||
let gcd = gcd.0;
|
||||
let gcd = NonZeroU64::new(gcd)
|
||||
.ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "GCD of 0 is forbidden"))?;
|
||||
nbytes += len;
|
||||
|
||||
let (amplitude, len) = VInt::deserialize_with_size(reader)?;
|
||||
let amplitude = amplitude.0 * gcd.get();
|
||||
let max_value = min_value + amplitude;
|
||||
nbytes += len;
|
||||
|
||||
let (num_rows, len) = VInt::deserialize_with_size(reader)?;
|
||||
let num_rows = num_rows.0 as RowId;
|
||||
nbytes += len;
|
||||
|
||||
Ok((
|
||||
ColumnStats {
|
||||
min_value,
|
||||
max_value,
|
||||
num_rows,
|
||||
gcd,
|
||||
},
|
||||
nbytes,
|
||||
))
|
||||
}
|
||||
}
|
||||
|
||||
impl BinarySerializable for ColumnStats {
|
||||
fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
|
||||
VInt(self.min_value).serialize(writer)?;
|
||||
|
||||
@@ -185,10 +185,10 @@ impl CompactSpaceBuilder {
|
||||
let mut covered_space = Vec::with_capacity(self.blanks.len());
|
||||
|
||||
// beginning of the blanks
|
||||
if let Some(first_blank_start) = self.blanks.first().map(RangeInclusive::start) {
|
||||
if *first_blank_start != 0 {
|
||||
covered_space.push(0..=first_blank_start - 1);
|
||||
}
|
||||
if let Some(first_blank_start) = self.blanks.first().map(RangeInclusive::start)
|
||||
&& *first_blank_start != 0
|
||||
{
|
||||
covered_space.push(0..=first_blank_start - 1);
|
||||
}
|
||||
|
||||
// Between the blanks
|
||||
@@ -202,10 +202,10 @@ impl CompactSpaceBuilder {
|
||||
covered_space.extend(between_blanks);
|
||||
|
||||
// end of the blanks
|
||||
if let Some(last_blank_end) = self.blanks.last().map(RangeInclusive::end) {
|
||||
if *last_blank_end != u128::MAX {
|
||||
covered_space.push(last_blank_end + 1..=u128::MAX);
|
||||
}
|
||||
if let Some(last_blank_end) = self.blanks.last().map(RangeInclusive::end)
|
||||
&& *last_blank_end != u128::MAX
|
||||
{
|
||||
covered_space.push(last_blank_end + 1..=u128::MAX);
|
||||
}
|
||||
|
||||
if covered_space.is_empty() {
|
||||
|
||||
@@ -24,8 +24,9 @@ use build_compact_space::get_compact_space;
|
||||
use common::{BinarySerializable, CountingWriter, OwnedBytes, VInt, VIntU128};
|
||||
use tantivy_bitpacker::{BitPacker, BitUnpacker};
|
||||
|
||||
use crate::column_values::ColumnValues;
|
||||
use crate::RowId;
|
||||
use crate::column::ValueRange;
|
||||
use crate::column_values::ColumnValues;
|
||||
|
||||
/// The cost per blank is quite hard to estimate: since blanks are delta encoded, the actual cost
/// of a blank depends on the number of blanks.
|
||||
@@ -338,14 +339,48 @@ impl ColumnValues<u64> for CompactSpaceU64Accessor {
|
||||
#[inline]
|
||||
fn get_row_ids_for_value_range(
|
||||
&self,
|
||||
value_range: RangeInclusive<u64>,
|
||||
value_range: ValueRange<u64>,
|
||||
position_range: Range<u32>,
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
let value_range = self.0.compact_to_u128(*value_range.start() as u32)
|
||||
..=self.0.compact_to_u128(*value_range.end() as u32);
|
||||
self.0
|
||||
.get_row_ids_for_value_range(value_range, position_range, positions)
|
||||
match value_range {
|
||||
ValueRange::Inclusive(value_range) => {
|
||||
let value_range = ValueRange::Inclusive(
|
||||
self.0.compact_to_u128(*value_range.start() as u32)
|
||||
..=self.0.compact_to_u128(*value_range.end() as u32),
|
||||
);
|
||||
self.0
|
||||
.get_row_ids_for_value_range(value_range, position_range, positions)
|
||||
}
|
||||
ValueRange::All => {
|
||||
let position_range = position_range.start..position_range.end.min(self.num_vals());
|
||||
positions.extend(position_range);
|
||||
}
|
||||
ValueRange::GreaterThan(threshold, _) => {
|
||||
let value_range =
|
||||
ValueRange::GreaterThan(self.0.compact_to_u128(threshold as u32), false);
|
||||
self.0
|
||||
.get_row_ids_for_value_range(value_range, position_range, positions)
|
||||
}
|
||||
ValueRange::GreaterThanOrEqual(threshold, _) => {
|
||||
let value_range =
|
||||
ValueRange::GreaterThanOrEqual(self.0.compact_to_u128(threshold as u32), false);
|
||||
self.0
|
||||
.get_row_ids_for_value_range(value_range, position_range, positions)
|
||||
}
|
||||
ValueRange::LessThan(threshold, _) => {
|
||||
let value_range =
|
||||
ValueRange::LessThan(self.0.compact_to_u128(threshold as u32), false);
|
||||
self.0
|
||||
.get_row_ids_for_value_range(value_range, position_range, positions)
|
||||
}
|
||||
ValueRange::LessThanOrEqual(threshold, _) => {
|
||||
let value_range =
|
||||
ValueRange::LessThanOrEqual(self.0.compact_to_u128(threshold as u32), false);
|
||||
self.0
|
||||
.get_row_ids_for_value_range(value_range, position_range, positions)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -375,10 +410,47 @@ impl ColumnValues<u128> for CompactSpaceDecompressor {
#[inline]
fn get_row_ids_for_value_range(
&self,
value_range: RangeInclusive<u128>,
value_range: ValueRange<u128>,
position_range: Range<u32>,
positions: &mut Vec<u32>,
) {
let value_range = match value_range {
ValueRange::Inclusive(value_range) => value_range,
ValueRange::All => {
let position_range = position_range.start..position_range.end.min(self.num_vals());
positions.extend(position_range);
return;
}
ValueRange::GreaterThan(threshold, _) => {
let max = self.max_value();
if threshold >= max {
return;
}
(threshold + 1)..=max
}
ValueRange::GreaterThanOrEqual(threshold, _) => {
let max = self.max_value();
if threshold > max {
return;
}
threshold..=max
}
ValueRange::LessThan(threshold, _) => {
let min = self.min_value();
if threshold <= min {
return;
}
min..=(threshold - 1)
}
ValueRange::LessThanOrEqual(threshold, _) => {
let min = self.min_value();
if threshold < min {
return;
}
min..=threshold
}
};

if value_range.start() > value_range.end() {
return;
}
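The decompressor handles the same variants by collapsing each threshold into an inclusive range clamped to the column's min/max before falling through to the existing range scan. The same reduction in isolation, as a hedged sketch over plain `u64` (illustrative names only, not the crate's code):

```rust
use std::ops::RangeInclusive;

// Illustrative reduction of threshold filters to an inclusive range,
// mirroring the match above.
fn greater_than(threshold: u64, max: u64) -> Option<RangeInclusive<u64>> {
    if threshold >= max {
        None // nothing above the maximum can match
    } else {
        Some((threshold + 1)..=max)
    }
}

fn less_than(threshold: u64, min: u64) -> Option<RangeInclusive<u64>> {
    if threshold <= min {
        None // nothing below the minimum can match
    } else {
        Some(min..=(threshold - 1))
    }
}

fn main() {
    assert_eq!(greater_than(10, 100), Some(11..=100));
    assert_eq!(greater_than(100, 100), None);
    assert_eq!(less_than(5, 0), Some(0..=4));
    assert_eq!(less_than(0, 0), None);
}
```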
@@ -560,7 +632,7 @@ mod tests {
.collect::<Vec<_>>();
let mut positions = Vec::new();
decompressor.get_row_ids_for_value_range(
range,
ValueRange::Inclusive(range),
0..decompressor.num_vals(),
&mut positions,
);
@@ -604,7 +676,11 @@ mod tests {
let val = *val;
let pos = pos as u32;
let mut positions = Vec::new();
decomp.get_row_ids_for_value_range(val..=val, pos..pos + 1, &mut positions);
decomp.get_row_ids_for_value_range(
ValueRange::Inclusive(val..=val),
pos..pos + 1,
&mut positions,
);
assert_eq!(positions, vec![pos]);
}

@@ -653,12 +729,14 @@ mod tests {
),
&[3]
);
assert!(get_positions_for_value_range_helper(
&decomp,
99998u128..=99998u128,
complete_range.clone()
)
.is_empty());
assert!(
get_positions_for_value_range_helper(
&decomp,
99998u128..=99998u128,
complete_range.clone()
)
.is_empty()
);
assert_eq!(
&get_positions_for_value_range_helper(
&decomp,
@@ -744,7 +822,11 @@ mod tests {
doc_id_range: Range<u32>,
) -> Vec<u32> {
let mut positions = Vec::new();
column.get_row_ids_for_value_range(value_range, doc_id_range, &mut positions);
column.get_row_ids_for_value_range(
ValueRange::Inclusive(value_range),
doc_id_range,
&mut positions,
);
positions
}

@@ -767,7 +849,7 @@ mod tests {
];
let mut out = Vec::new();
serialize_column_values_u128(&&vals[..], &mut out).unwrap();
let decomp = open_u128_mapped(OwnedBytes::new(out)).unwrap();
let decomp = open_u128_mapped(FileSlice::from(out)).unwrap();
let complete_range = 0..vals.len() as u32;

assert_eq!(
@@ -821,6 +903,7 @@ mod tests {
let _data = test_aux_vals(vals);
}

use common::file_slice::FileSlice;
use proptest::prelude::*;

fn num_strategy() -> impl Strategy<Value = u128> {

@@ -5,7 +5,8 @@ use std::sync::Arc;

mod compact_space;

use common::{BinarySerializable, OwnedBytes, VInt};
use common::file_slice::FileSlice;
use common::{BinarySerializable, VInt};
pub use compact_space::{
CompactSpaceCompressor, CompactSpaceDecompressor, CompactSpaceU64Accessor,
};
@@ -101,8 +102,9 @@ impl U128FastFieldCodecType {

/// Returns the correct codec reader wrapped in the `Arc` for the data.
pub fn open_u128_mapped<T: MonotonicallyMappableToU128 + Debug>(
mut bytes: OwnedBytes,
file_slice: FileSlice,
) -> io::Result<Arc<dyn ColumnValues<T>>> {
let mut bytes = file_slice.read_bytes()?;
let header = U128Header::deserialize(&mut bytes)?;
assert_eq!(header.codec_type, U128FastFieldCodecType::CompactSpace);
let reader = CompactSpaceDecompressor::open(bytes)?;
@@ -120,7 +122,8 @@ pub fn open_u128_mapped<T: MonotonicallyMappableToU128 + Debug>(
/// # Notice
/// In case there are new codecs added, check for usages of `CompactSpaceDecompressorU64` and
/// also handle the new codecs.
pub fn open_u128_as_compact_u64(mut bytes: OwnedBytes) -> io::Result<Arc<dyn ColumnValues<u64>>> {
pub fn open_u128_as_compact_u64(file_slice: FileSlice) -> io::Result<Arc<dyn ColumnValues<u64>>> {
let mut bytes = file_slice.read_bytes()?;
let header = U128Header::deserialize(&mut bytes)?;
assert_eq!(header.codec_type, U128FastFieldCodecType::CompactSpace);
let reader = CompactSpaceU64Accessor::open(bytes)?;
@@ -128,13 +131,13 @@ pub fn open_u128_as_compact_u64(mut bytes: OwnedBytes) -> io::Result<Arc<dyn Col
}

#[cfg(test)]
pub mod tests {
pub(crate) mod tests {
use super::*;
use crate::column_values::u64_based::{
serialize_and_load_u64_based_column_values, serialize_u64_based_column_values,
ALL_U64_CODEC_TYPES,
};
use crate::column_values::CodecType;
use crate::column_values::u64_based::{
ALL_U64_CODEC_TYPES, serialize_and_load_u64_based_column_values,
serialize_u64_based_column_values,
};

#[test]
fn test_serialize_deserialize_u128_header() {

@@ -1,11 +1,14 @@
use std::io::{self, Write};
use std::num::NonZeroU64;
use std::ops::{Range, RangeInclusive};
use std::sync::{Arc, OnceLock};

use common::{BinarySerializable, OwnedBytes};
use common::file_slice::FileSlice;
use common::{BinarySerializable, HasLen, OwnedBytes};
use fastdivide::DividerU64;
use tantivy_bitpacker::{compute_num_bits, BitPacker, BitUnpacker};
use tantivy_bitpacker::{BitPacker, BitUnpacker, compute_num_bits};

use crate::column::ValueRange;
use crate::column_values::u64_based::{ColumnCodec, ColumnCodecEstimator, ColumnStats};
use crate::{ColumnValues, RowId};

@@ -13,9 +16,40 @@ use crate::{ColumnValues, RowId};
/// fast field is required.
#[derive(Clone)]
pub struct BitpackedReader {
data: OwnedBytes,
data: FileSlice,
bit_unpacker: BitUnpacker,
stats: ColumnStats,
blocks: Arc<[OnceLock<Block>]>,
}

impl BitpackedReader {
#[inline(always)]
fn unpack_val(&self, doc: u32) -> u64 {
let block_num = self.bit_unpacker.block_num(doc);

if block_num == 0 && self.blocks.len() == 0 {
return 0;
}

let block = self.blocks[block_num].get_or_init(|| {
let block_range = self.bit_unpacker.block(block_num, self.data.len());
let offset = block_range.start;
let data = self
.data
.slice(block_range)
.read_bytes()
.expect("Failed to read column values.");
Block { offset, data }
});

self.bit_unpacker
.get_from_subset(doc, block.offset, &block.data)
}
}

struct Block {
offset: usize,
data: OwnedBytes,
}

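The reader now keeps a `FileSlice` plus per-block `OnceLock` slots so that bytes are only pulled in when a block is first touched. A stripped-down sketch of that lazy-block pattern (generic, not tantivy's types; an in-memory `Vec` stands in for the file):

```rust
use std::sync::{Arc, OnceLock};

// Minimal model of per-block lazy caching as used by the reader above.
struct LazyBlocks {
    source: Vec<u8>,                    // stands in for the FileSlice
    blocks: Arc<[OnceLock<Vec<u8>>]>,   // one slot per block, filled on demand
    block_len: usize,
}

impl LazyBlocks {
    fn block(&self, block_num: usize) -> &[u8] {
        self.blocks[block_num].get_or_init(|| {
            let start = block_num * self.block_len;
            let end = (start + self.block_len).min(self.source.len());
            // A real reader would read these bytes from disk here.
            self.source[start..end].to_vec()
        })
    }
}

fn main() {
    let lazy = LazyBlocks {
        source: (0u8..=255).collect(),
        blocks: (0..4).map(|_| OnceLock::new()).collect(),
        block_len: 64,
    };
    assert_eq!(lazy.block(1)[0], 64); // first access materializes the block
    assert_eq!(lazy.block(1)[0], 64); // second access reuses the cached bytes
}
```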
#[inline(always)]
@@ -23,11 +57,7 @@ const fn div_ceil(n: u64, q: NonZeroU64) -> u64 {
// copied from unstable rust standard library.
let d = n / q.get();
let r = n % q.get();
if r > 0 {
d + 1
} else {
d
}
if r > 0 { d + 1 } else { d }
}

// The bitpacked codec applies a linear transformation `f` over data that are bitpacked.
@@ -61,8 +91,9 @@ fn transform_range_before_linear_transformation(
impl ColumnValues for BitpackedReader {
#[inline(always)]
fn get_val(&self, doc: u32) -> u64 {
self.stats.min_value + self.stats.gcd.get() * self.bit_unpacker.get(doc, &self.data)
self.stats.min_value + self.stats.gcd.get() * self.unpack_val(doc)
}

#[inline]
fn min_value(&self) -> u64 {
self.stats.min_value
@@ -76,24 +107,329 @@ impl ColumnValues for BitpackedReader {
self.stats.num_rows
}

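`get_val` above reconstructs a value as `min_value + gcd * raw`, i.e. the codec bitpacks `(value - min) / gcd` and undoes that transform on read. A toy round-trip of that linear mapping (illustrative helpers, not the crate's API):

```rust
// Toy model of the bitpacked codec's linear mapping; names are illustrative.
fn to_raw(val: u64, min: u64, gcd: u64) -> u64 {
    (val - min) / gcd
}

fn from_raw(raw: u64, min: u64, gcd: u64) -> u64 {
    min + gcd * raw
}

fn main() {
    let (min, gcd) = (1_000u64, 500u64);
    for val in [1_000u64, 1_500, 4_000] {
        let raw = to_raw(val, min, gcd); // 0, 1, 6: small integers, few bits
        assert_eq!(from_raw(raw, min, gcd), val);
    }
}
```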
fn get_vals_in_value_range(
|
||||
&self,
|
||||
input_indexes: &[u32],
|
||||
input_doc_ids: &[u32],
|
||||
output: &mut Vec<crate::ComparableDoc<Option<u64>, crate::DocId>>,
|
||||
value_range: ValueRange<u64>,
|
||||
) {
|
||||
match value_range {
|
||||
ValueRange::All => {
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(self.get_val(idx)),
|
||||
});
|
||||
}
|
||||
}
|
||||
ValueRange::Inclusive(range) => {
|
||||
if let Some(transformed_range) =
|
||||
transform_range_before_linear_transformation(&self.stats, range)
|
||||
{
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
let raw_val = self.unpack_val(idx);
|
||||
if transformed_range.contains(&raw_val) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(
|
||||
self.stats.min_value + self.stats.gcd.get() * raw_val,
|
||||
),
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::GreaterThan(threshold, _) => {
|
||||
if threshold < self.stats.min_value {
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(self.get_val(idx)),
|
||||
});
|
||||
}
|
||||
} else if threshold >= self.stats.max_value {
|
||||
// All filtered out
|
||||
} else {
|
||||
let raw_threshold = (threshold - self.stats.min_value) / self.stats.gcd.get();
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
let raw_val = self.unpack_val(idx);
|
||||
if raw_val > raw_threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(
|
||||
self.stats.min_value + self.stats.gcd.get() * raw_val,
|
||||
),
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::GreaterThanOrEqual(threshold, _) => {
|
||||
if threshold <= self.stats.min_value {
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(self.get_val(idx)),
|
||||
});
|
||||
}
|
||||
} else if threshold > self.stats.max_value {
|
||||
// All filtered out
|
||||
} else {
|
||||
let diff = threshold - self.stats.min_value;
|
||||
let gcd = self.stats.gcd.get();
|
||||
let raw_threshold = (diff + gcd - 1) / gcd;
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
let raw_val = self.unpack_val(idx);
|
||||
if raw_val >= raw_threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(
|
||||
self.stats.min_value + self.stats.gcd.get() * raw_val,
|
||||
),
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::LessThan(threshold, _) => {
|
||||
if threshold > self.stats.max_value {
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(self.get_val(idx)),
|
||||
});
|
||||
}
|
||||
} else if threshold <= self.stats.min_value {
|
||||
// All filtered out
|
||||
} else {
|
||||
let diff = threshold - self.stats.min_value;
|
||||
let gcd = self.stats.gcd.get();
|
||||
let raw_threshold = if diff % gcd == 0 {
|
||||
diff / gcd
|
||||
} else {
|
||||
diff / gcd + 1
|
||||
};
|
||||
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
let raw_val = self.unpack_val(idx);
|
||||
if raw_val < raw_threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(
|
||||
self.stats.min_value + self.stats.gcd.get() * raw_val,
|
||||
),
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
ValueRange::LessThanOrEqual(threshold, _) => {
|
||||
if threshold >= self.stats.max_value {
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(self.get_val(idx)),
|
||||
});
|
||||
}
|
||||
} else if threshold < self.stats.min_value {
|
||||
// All filtered out
|
||||
} else {
|
||||
let diff = threshold - self.stats.min_value;
|
||||
let gcd = self.stats.gcd.get();
|
||||
let raw_threshold = diff / gcd;
|
||||
|
||||
for (&idx, &doc) in input_indexes.iter().zip(input_doc_ids.iter()) {
|
||||
let raw_val = self.unpack_val(idx);
|
||||
if raw_val <= raw_threshold {
|
||||
output.push(crate::ComparableDoc {
|
||||
doc,
|
||||
sort_key: Some(
|
||||
self.stats.min_value + self.stats.gcd.get() * raw_val,
|
||||
),
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
fn get_row_ids_for_value_range(
|
||||
&self,
|
||||
range: RangeInclusive<u64>,
|
||||
range: ValueRange<u64>,
|
||||
doc_id_range: Range<u32>,
|
||||
positions: &mut Vec<u32>,
|
||||
) {
|
||||
let Some(transformed_range) =
|
||||
transform_range_before_linear_transformation(&self.stats, range)
|
||||
else {
|
||||
positions.clear();
|
||||
return;
|
||||
};
|
||||
self.bit_unpacker.get_ids_for_value_range(
|
||||
transformed_range,
|
||||
doc_id_range,
|
||||
&self.data,
|
||||
positions,
|
||||
);
|
||||
match range {
|
||||
ValueRange::All => {
|
||||
positions.extend(doc_id_range);
|
||||
return;
|
||||
}
|
||||
ValueRange::Inclusive(range) => {
|
||||
let Some(transformed_range) =
|
||||
transform_range_before_linear_transformation(&self.stats, range)
|
||||
else {
|
||||
positions.clear();
|
||||
return;
|
||||
};
|
||||
// TODO: This does not use the `self.blocks` cache, because callers are usually
|
||||
// already doing sequential, and fairly dense reads. Fix it to
|
||||
// iterate over blocks if that assumption turns out to be incorrect!
|
||||
let data_range = self
|
||||
.bit_unpacker
|
||||
.block_oblivious_range(doc_id_range.clone(), self.data.len());
|
||||
let data_offset = data_range.start;
|
||||
let data_subset = self
|
||||
.data
|
||||
.slice(data_range)
|
||||
.read_bytes()
|
||||
.expect("Failed to read column values.");
|
||||
self.bit_unpacker.get_ids_for_value_range_from_subset(
|
||||
transformed_range,
|
||||
doc_id_range,
|
||||
data_offset,
|
||||
&data_subset,
|
||||
positions,
|
||||
);
|
||||
}
|
||||
ValueRange::GreaterThan(threshold, _) => {
|
||||
if threshold < self.stats.min_value {
|
||||
positions.extend(doc_id_range);
|
||||
return;
|
||||
}
|
||||
if threshold >= self.stats.max_value {
|
||||
return;
|
||||
}
|
||||
let raw_threshold = (threshold - self.stats.min_value) / self.stats.gcd.get();
|
||||
// We want raw > raw_threshold.
|
||||
// bit_unpacker.get_ids_for_value_range_from_subset takes a RangeInclusive.
|
||||
// We can construct a RangeInclusive: (raw_threshold + 1) ..= u64::MAX
|
||||
// But max raw value is known? (max_value - min_value) / gcd.
|
||||
let max_raw = (self.stats.max_value - self.stats.min_value) / self.stats.gcd.get();
|
||||
let transformed_range = (raw_threshold + 1)..=max_raw;
|
||||
|
||||
let data_range = self
|
||||
.bit_unpacker
|
||||
.block_oblivious_range(doc_id_range.clone(), self.data.len());
|
||||
let data_offset = data_range.start;
|
||||
let data_subset = self
|
||||
.data
|
||||
.slice(data_range)
|
||||
.read_bytes()
|
||||
.expect("Failed to read column values.");
|
||||
self.bit_unpacker.get_ids_for_value_range_from_subset(
|
||||
transformed_range,
|
||||
doc_id_range,
|
||||
data_offset,
|
||||
&data_subset,
|
||||
positions,
|
||||
);
|
||||
}
|
||||
ValueRange::GreaterThanOrEqual(threshold, _) => {
|
||||
if threshold <= self.stats.min_value {
|
||||
positions.extend(doc_id_range);
|
||||
return;
|
||||
}
|
||||
if threshold > self.stats.max_value {
|
||||
return;
|
||||
}
|
||||
let diff = threshold - self.stats.min_value;
|
||||
let gcd = self.stats.gcd.get();
|
||||
let raw_threshold = (diff + gcd - 1) / gcd;
|
||||
// We want raw >= raw_threshold.
|
||||
let max_raw = (self.stats.max_value - self.stats.min_value) / self.stats.gcd.get();
|
||||
let transformed_range = raw_threshold..=max_raw;
|
||||
|
||||
let data_range = self
|
||||
.bit_unpacker
|
||||
.block_oblivious_range(doc_id_range.clone(), self.data.len());
|
||||
let data_offset = data_range.start;
|
||||
let data_subset = self
|
||||
.data
|
||||
.slice(data_range)
|
||||
.read_bytes()
|
||||
.expect("Failed to read column values.");
|
||||
self.bit_unpacker.get_ids_for_value_range_from_subset(
|
||||
transformed_range,
|
||||
doc_id_range,
|
||||
data_offset,
|
||||
&data_subset,
|
||||
positions,
|
||||
);
|
||||
}
|
||||
ValueRange::LessThan(threshold, _) => {
|
||||
if threshold > self.stats.max_value {
|
||||
positions.extend(doc_id_range);
|
||||
return;
|
||||
}
|
||||
if threshold <= self.stats.min_value {
|
||||
return;
|
||||
}
|
||||
|
||||
let diff = threshold - self.stats.min_value;
|
||||
let gcd = self.stats.gcd.get();
|
||||
// We want raw < raw_threshold_limit
|
||||
// raw <= raw_threshold_limit - 1
|
||||
let raw_threshold_limit = if diff % gcd == 0 {
|
||||
diff / gcd
|
||||
} else {
|
||||
diff / gcd + 1
|
||||
};
|
||||
|
||||
if raw_threshold_limit == 0 {
|
||||
return;
|
||||
}
|
||||
let transformed_range = 0..=(raw_threshold_limit - 1);
|
||||
|
||||
let data_range = self
|
||||
.bit_unpacker
|
||||
.block_oblivious_range(doc_id_range.clone(), self.data.len());
|
||||
let data_offset = data_range.start;
|
||||
let data_subset = self
|
||||
.data
|
||||
.slice(data_range)
|
||||
.read_bytes()
|
||||
.expect("Failed to read column values.");
|
||||
self.bit_unpacker.get_ids_for_value_range_from_subset(
|
||||
transformed_range,
|
||||
doc_id_range,
|
||||
data_offset,
|
||||
&data_subset,
|
||||
positions,
|
||||
);
|
||||
}
|
||||
ValueRange::LessThanOrEqual(threshold, _) => {
|
||||
if threshold >= self.stats.max_value {
|
||||
positions.extend(doc_id_range);
|
||||
return;
|
||||
}
|
||||
if threshold < self.stats.min_value {
|
||||
return;
|
||||
}
|
||||
let diff = threshold - self.stats.min_value;
|
||||
let gcd = self.stats.gcd.get();
|
||||
// We want raw <= raw_threshold.
|
||||
let raw_threshold = diff / gcd;
|
||||
let transformed_range = 0..=raw_threshold;
|
||||
|
||||
let data_range = self
|
||||
.bit_unpacker
|
||||
.block_oblivious_range(doc_id_range.clone(), self.data.len());
|
||||
let data_offset = data_range.start;
|
||||
let data_subset = self
|
||||
.data
|
||||
.slice(data_range)
|
||||
.read_bytes()
|
||||
.expect("Failed to read column values.");
|
||||
self.bit_unpacker.get_ids_for_value_range_from_subset(
|
||||
transformed_range,
|
||||
doc_id_range,
|
||||
data_offset,
|
||||
&data_subset,
|
||||
positions,
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -109,7 +445,7 @@ impl ColumnCodecEstimator for BitpackedCodecEstimator {
|
||||
|
||||
fn estimate(&self, stats: &ColumnStats) -> Option<u64> {
|
||||
let num_bits_per_value = num_bits(stats);
|
||||
Some(stats.num_bytes() + (stats.num_rows as u64 * (num_bits_per_value as u64) + 7) / 8)
|
||||
Some(stats.num_bytes() + (stats.num_rows as u64 * (num_bits_per_value as u64)).div_ceil(8))
|
||||
}
|
||||
|
||||
fn serialize(
|
||||
@@ -137,14 +473,20 @@ impl ColumnCodec for BitpackedCodec {
|
||||
type Estimator = BitpackedCodecEstimator;
|
||||
|
||||
/// Opens a fast field given a file.
|
||||
fn load(mut data: OwnedBytes) -> io::Result<Self::ColumnValues> {
|
||||
let stats = ColumnStats::deserialize(&mut data)?;
|
||||
fn load(file_slice: FileSlice) -> io::Result<Self::ColumnValues> {
|
||||
let (stats, data) = ColumnStats::deserialize_from_tail(file_slice)?;
|
||||
|
||||
let num_bits = num_bits(&stats);
|
||||
let bit_unpacker = BitUnpacker::new(num_bits);
|
||||
let block_count = bit_unpacker.block_count(data.len());
|
||||
Ok(BitpackedReader {
|
||||
data,
|
||||
bit_unpacker,
|
||||
stats,
|
||||
blocks: (0..block_count)
|
||||
.into_iter()
|
||||
.map(|_| OnceLock::new())
|
||||
.collect(),
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,15 +1,17 @@
|
||||
use std::io;
|
||||
use std::io::Write;
|
||||
use std::sync::Arc;
|
||||
use std::{io, iter};
|
||||
use std::ops::{Deref, DerefMut};
|
||||
use std::sync::{Arc, OnceLock};
|
||||
|
||||
use common::{BinarySerializable, CountingWriter, DeserializeFrom, OwnedBytes};
|
||||
use common::file_slice::FileSlice;
|
||||
use common::{BinarySerializable, CountingWriter, DeserializeFrom, HasLen, OwnedBytes};
|
||||
use fastdivide::DividerU64;
|
||||
use tantivy_bitpacker::{compute_num_bits, BitPacker, BitUnpacker};
|
||||
use tantivy_bitpacker::{BitPacker, BitUnpacker, compute_num_bits};
|
||||
|
||||
use crate::MonotonicallyMappableToU64;
|
||||
use crate::column_values::u64_based::line::Line;
|
||||
use crate::column_values::u64_based::{ColumnCodec, ColumnCodecEstimator, ColumnStats};
|
||||
use crate::column_values::{ColumnValues, VecColumn};
|
||||
use crate::MonotonicallyMappableToU64;
|
||||
|
||||
const BLOCK_SIZE: u32 = 512u32;
|
||||
|
||||
@@ -39,7 +41,7 @@ impl BinarySerializable for Block {
|
||||
}
|
||||
|
||||
fn compute_num_blocks(num_vals: u32) -> u32 {
|
||||
(num_vals + BLOCK_SIZE - 1) / BLOCK_SIZE
|
||||
num_vals.div_ceil(BLOCK_SIZE)
|
||||
}
|
||||
|
||||
pub struct BlockwiseLinearEstimator {
|
||||
@@ -172,32 +174,63 @@ impl ColumnCodec<u64> for BlockwiseLinearCodec {
|
||||
|
||||
type Estimator = BlockwiseLinearEstimator;
|
||||
|
||||
fn load(mut bytes: OwnedBytes) -> io::Result<Self::ColumnValues> {
|
||||
let stats = ColumnStats::deserialize(&mut bytes)?;
|
||||
let footer_len: u32 = (&bytes[bytes.len() - 4..]).deserialize()?;
|
||||
let footer_offset = bytes.len() - 4 - footer_len as usize;
|
||||
let (data, mut footer) = bytes.split(footer_offset);
|
||||
fn load(file_slice: FileSlice) -> io::Result<Self::ColumnValues> {
|
||||
let (stats, body) = ColumnStats::deserialize_from_tail(file_slice)?;
|
||||
|
||||
let (_, footer) = body.clone().split_from_end(4);
|
||||
|
||||
let footer_len: u32 = footer.read_bytes()?.as_slice().deserialize()?;
|
||||
let (data, footer) = body.split_from_end(footer_len as usize + 4);
|
||||
|
||||
let mut footer = footer.read_bytes()?;
|
||||
let num_blocks = compute_num_blocks(stats.num_rows);
|
||||
let mut blocks: Vec<Block> = iter::repeat_with(|| Block::deserialize(&mut footer))
|
||||
.take(num_blocks as usize)
|
||||
.collect::<io::Result<_>>()?;
|
||||
|
||||
let mut start_offset = 0;
|
||||
for block in &mut blocks {
|
||||
let mut blocks = Vec::with_capacity(num_blocks as usize);
|
||||
|
||||
for _ in 0..num_blocks {
|
||||
let mut block = Block::deserialize(&mut footer)?;
|
||||
let len = (block.bit_unpacker.bit_width() as usize) * BLOCK_SIZE as usize / 8;
|
||||
|
||||
block.data_start_offset = start_offset;
|
||||
start_offset += (block.bit_unpacker.bit_width() as usize) * BLOCK_SIZE as usize / 8;
|
||||
blocks.push(BlockWithData {
|
||||
block,
|
||||
file_slice: data.slice(start_offset..(start_offset + len).min(data.len())),
|
||||
data: Default::default(),
|
||||
});
|
||||
|
||||
start_offset += len;
|
||||
}
|
||||
Ok(BlockwiseLinearReader {
|
||||
blocks: blocks.into_boxed_slice().into(),
|
||||
data,
|
||||
stats,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
struct BlockWithData {
|
||||
block: Block,
|
||||
file_slice: FileSlice,
|
||||
data: OnceLock<OwnedBytes>,
|
||||
}
|
||||
|
||||
impl Deref for BlockWithData {
|
||||
type Target = Block;
|
||||
|
||||
fn deref(&self) -> &Self::Target {
|
||||
&self.block
|
||||
}
|
||||
}
|
||||
|
||||
impl DerefMut for BlockWithData {
|
||||
fn deref_mut(&mut self) -> &mut Self::Target {
|
||||
&mut self.block
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Clone)]
|
||||
pub struct BlockwiseLinearReader {
|
||||
blocks: Arc<[Block]>,
|
||||
data: OwnedBytes,
|
||||
blocks: Arc<[BlockWithData]>,
|
||||
stats: ColumnStats,
|
||||
}
|
||||
|
||||
@@ -208,7 +241,9 @@ impl ColumnValues for BlockwiseLinearReader {
|
||||
let idx_within_block = idx % BLOCK_SIZE;
|
||||
let block = &self.blocks[block_id];
|
||||
let interpoled_val: u64 = block.line.eval(idx_within_block);
|
||||
let block_bytes = &self.data[block.data_start_offset..];
|
||||
let block_bytes = block
|
||||
.data
|
||||
.get_or_init(|| block.file_slice.read_bytes().unwrap());
|
||||
let bitpacked_diff = block.bit_unpacker.get(idx_within_block, block_bytes);
|
||||
// TODO optimize me! the line parameters could be tweaked to include the multiplication and
|
||||
// remove the dependency.
|
||||
|
||||
@@ -8,7 +8,7 @@ use crate::column_values::ColumnValues;
const MID_POINT: u64 = (1u64 << 32) - 1u64;

/// `Line` describes a line function `y: ax + b` using integer
/// arithmetics.
/// arithmetic.
///
/// The slope is in fact a decimal split into a 32 bit integer value,
/// and a 32-bit decimal value.
@@ -94,7 +94,7 @@ impl Line {
// `(i, ys[])`.
//
// The best intercept therefore has the form
// `y[i] - line.eval(i)` (using wrapping arithmetics).
// `y[i] - line.eval(i)` (using wrapping arithmetic).
// In other words, the best intercept is one of the `y - Line::eval(ys[i])`
// and our task is just to pick the one that minimizes our error.
//

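The `Line` doc comment above describes a slope stored as a 32-bit integer part plus a 32-bit fractional part. A hedged illustration of evaluating such a 32.32 fixed-point slope (this is not `Line::eval` itself, whose internals may differ):

```rust
// Hypothetical 32.32 fixed-point evaluation of y = a*x + b; illustrative only.
fn eval(slope_fixed: u64, intercept: u64, x: u64) -> u64 {
    // slope_fixed encodes `a` as (integer_part << 32) | fractional_part.
    let product = ((slope_fixed as u128 * x as u128) >> 32) as u64;
    product.wrapping_add(intercept)
}

fn main() {
    // a = 2.5 -> integer part 2, fractional part 0.5 * 2^32.
    let slope = (2u64 << 32) | (1u64 << 31);
    assert_eq!(eval(slope, 7, 4), 17); // 2.5 * 4 + 7
}
```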
@@ -1,13 +1,14 @@
|
||||
use std::io;
|
||||
|
||||
use common::file_slice::FileSlice;
|
||||
use common::{BinarySerializable, OwnedBytes};
|
||||
use tantivy_bitpacker::{compute_num_bits, BitPacker, BitUnpacker};
|
||||
use tantivy_bitpacker::{BitPacker, BitUnpacker, compute_num_bits};
|
||||
|
||||
use super::line::Line;
|
||||
use super::ColumnValues;
|
||||
use crate::column_values::u64_based::{ColumnCodec, ColumnCodecEstimator, ColumnStats};
|
||||
use crate::column_values::VecColumn;
|
||||
use super::line::Line;
|
||||
use crate::RowId;
|
||||
use crate::column_values::VecColumn;
|
||||
use crate::column_values::u64_based::{ColumnCodec, ColumnCodecEstimator, ColumnStats};
|
||||
|
||||
const HALF_SPACE: u64 = u64::MAX / 2;
|
||||
const LINE_ESTIMATION_BLOCK_LEN: usize = 512;
|
||||
@@ -117,7 +118,7 @@ impl ColumnCodecEstimator for LinearCodecEstimator {
|
||||
Some(
|
||||
stats.num_bytes()
|
||||
+ linear_params.num_bytes()
|
||||
+ (num_bits as u64 * stats.num_rows as u64 + 7) / 8,
|
||||
+ (num_bits as u64 * stats.num_rows as u64).div_ceil(8),
|
||||
)
|
||||
}
|
||||
|
||||
@@ -190,7 +191,8 @@ impl ColumnCodec for LinearCodec {
|
||||
|
||||
type Estimator = LinearCodecEstimator;
|
||||
|
||||
fn load(mut data: OwnedBytes) -> io::Result<Self::ColumnValues> {
|
||||
fn load(file_slice: FileSlice) -> io::Result<Self::ColumnValues> {
|
||||
let mut data = file_slice.read_bytes()?;
|
||||
let stats = ColumnStats::deserialize(&mut data)?;
|
||||
let linear_params = LinearParams::deserialize(&mut data)?;
|
||||
Ok(LinearReader {
|
||||
|
||||
@@ -8,7 +8,8 @@ use std::io;
|
||||
use std::io::Write;
|
||||
use std::sync::Arc;
|
||||
|
||||
use common::{BinarySerializable, OwnedBytes};
|
||||
use common::BinarySerializable;
|
||||
use common::file_slice::FileSlice;
|
||||
|
||||
use crate::column_values::monotonic_mapping::{
|
||||
StrictlyMonotonicMappingInverter, StrictlyMonotonicMappingToInternal,
|
||||
@@ -17,7 +18,7 @@ pub use crate::column_values::u64_based::bitpacked::BitpackedCodec;
|
||||
pub use crate::column_values::u64_based::blockwise_linear::BlockwiseLinearCodec;
|
||||
pub use crate::column_values::u64_based::linear::LinearCodec;
|
||||
pub use crate::column_values::u64_based::stats_collector::StatsCollector;
|
||||
use crate::column_values::{monotonic_map_column, ColumnStats};
|
||||
use crate::column_values::{ColumnStats, monotonic_map_column};
|
||||
use crate::iterable::Iterable;
|
||||
use crate::{ColumnValues, MonotonicallyMappableToU64};
|
||||
|
||||
@@ -52,7 +53,7 @@ pub trait ColumnCodecEstimator<T = u64>: 'static {
) -> io::Result<()>;
}

/// A column codec describes a colunm serialization format.
/// A column codec describes a column serialization format.
pub trait ColumnCodec<T: PartialOrd = u64> {
/// Specialized `ColumnValues` type.
type ColumnValues: ColumnValues<T> + 'static;
@@ -60,7 +61,7 @@ pub trait ColumnCodec<T: PartialOrd = u64> {
|
||||
type Estimator: ColumnCodecEstimator + Default;
|
||||
|
||||
/// Loads a column that has been serialized using this codec.
|
||||
fn load(bytes: OwnedBytes) -> io::Result<Self::ColumnValues>;
|
||||
fn load(file_slice: FileSlice) -> io::Result<Self::ColumnValues>;
|
||||
|
||||
/// Returns an estimator.
|
||||
fn estimator() -> Self::Estimator {
|
||||
@@ -111,20 +112,22 @@ impl CodecType {
|
||||
|
||||
fn load<T: MonotonicallyMappableToU64>(
|
||||
&self,
|
||||
bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
) -> io::Result<Arc<dyn ColumnValues<T>>> {
|
||||
match self {
|
||||
CodecType::Bitpacked => load_specific_codec::<BitpackedCodec, T>(bytes),
|
||||
CodecType::Linear => load_specific_codec::<LinearCodec, T>(bytes),
|
||||
CodecType::BlockwiseLinear => load_specific_codec::<BlockwiseLinearCodec, T>(bytes),
|
||||
CodecType::Bitpacked => load_specific_codec::<BitpackedCodec, T>(file_slice),
|
||||
CodecType::Linear => load_specific_codec::<LinearCodec, T>(file_slice),
|
||||
CodecType::BlockwiseLinear => {
|
||||
load_specific_codec::<BlockwiseLinearCodec, T>(file_slice)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
fn load_specific_codec<C: ColumnCodec, T: MonotonicallyMappableToU64>(
|
||||
bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
) -> io::Result<Arc<dyn ColumnValues<T>>> {
|
||||
let reader = C::load(bytes)?;
|
||||
let reader = C::load(file_slice)?;
|
||||
let reader_typed = monotonic_map_column(
|
||||
reader,
|
||||
StrictlyMonotonicMappingInverter::from(StrictlyMonotonicMappingToInternal::<T>::new()),
|
||||
@@ -189,25 +192,28 @@ pub fn serialize_u64_based_column_values<T: MonotonicallyMappableToU64>(
|
||||
///
|
||||
/// This method first identifies the codec off the first byte.
|
||||
pub fn load_u64_based_column_values<T: MonotonicallyMappableToU64>(
|
||||
mut bytes: OwnedBytes,
|
||||
file_slice: FileSlice,
|
||||
) -> io::Result<Arc<dyn ColumnValues<T>>> {
|
||||
let codec_type: CodecType = bytes
|
||||
.first()
|
||||
.copied()
|
||||
let (header, body) = file_slice.split(1);
|
||||
let codec_type: CodecType = header
|
||||
.read_bytes()?
|
||||
.as_slice()
|
||||
.get(0)
|
||||
.cloned()
|
||||
.and_then(CodecType::try_from_code)
|
||||
.ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "Failed to read codec type"))?;
|
||||
bytes.advance(1);
|
||||
codec_type.load(bytes)
|
||||
codec_type.load(body)
|
||||
}
|
||||
|
||||
/// Helper function to serialize a column (autodetect from all codecs) and then open it
|
||||
#[cfg(test)]
|
||||
pub fn serialize_and_load_u64_based_column_values<T: MonotonicallyMappableToU64>(
|
||||
vals: &dyn Iterable,
|
||||
codec_types: &[CodecType],
|
||||
) -> Arc<dyn ColumnValues<T>> {
|
||||
let mut buffer = Vec::new();
|
||||
serialize_u64_based_column_values(vals, codec_types, &mut buffer).unwrap();
|
||||
load_u64_based_column_values::<T>(OwnedBytes::new(buffer)).unwrap()
|
||||
load_u64_based_column_values::<T>(FileSlice::from(buffer)).unwrap()
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@@ -2,8 +2,8 @@ use std::num::NonZeroU64;
|
||||
|
||||
use fastdivide::DividerU64;
|
||||
|
||||
use crate::column_values::ColumnStats;
|
||||
use crate::RowId;
|
||||
use crate::column_values::ColumnStats;
|
||||
|
||||
/// Compute the gcd of two non null numbers.
|
||||
///
|
||||
@@ -96,8 +96,8 @@ impl StatsCollector {
|
||||
mod tests {
|
||||
use std::num::NonZeroU64;
|
||||
|
||||
use crate::column_values::u64_based::stats_collector::{compute_gcd, StatsCollector};
|
||||
use crate::column_values::u64_based::ColumnStats;
|
||||
use crate::column_values::u64_based::stats_collector::{StatsCollector, compute_gcd};
|
||||
|
||||
fn compute_stats(vals: impl Iterator<Item = u64>) -> ColumnStats {
|
||||
let mut stats_collector = StatsCollector::default();
|
||||
|
||||
@@ -1,5 +1,7 @@
|
||||
use common::HasLen;
|
||||
use proptest::prelude::*;
|
||||
use proptest::{prop_oneof, proptest};
|
||||
use rand::Rng;
|
||||
|
||||
#[test]
|
||||
fn test_serialize_and_load_simple() {
|
||||
@@ -12,7 +14,7 @@ fn test_serialize_and_load_simple() {
|
||||
)
|
||||
.unwrap();
|
||||
assert_eq!(buffer.len(), 7);
|
||||
let col = load_u64_based_column_values::<u64>(OwnedBytes::new(buffer)).unwrap();
|
||||
let col = load_u64_based_column_values::<u64>(FileSlice::from(buffer)).unwrap();
|
||||
assert_eq!(col.num_vals(), 3);
|
||||
assert_eq!(col.get_val(0), 1);
|
||||
assert_eq!(col.get_val(1), 2);
|
||||
@@ -29,7 +31,7 @@ fn test_empty_column_i64() {
|
||||
continue;
|
||||
}
|
||||
num_acceptable_codecs += 1;
|
||||
let col = load_u64_based_column_values::<i64>(OwnedBytes::new(buffer)).unwrap();
|
||||
let col = load_u64_based_column_values::<i64>(FileSlice::from(buffer)).unwrap();
|
||||
assert_eq!(col.num_vals(), 0);
|
||||
assert_eq!(col.min_value(), i64::MIN);
|
||||
assert_eq!(col.max_value(), i64::MIN);
|
||||
@@ -47,7 +49,7 @@ fn test_empty_column_u64() {
|
||||
continue;
|
||||
}
|
||||
num_acceptable_codecs += 1;
|
||||
let col = load_u64_based_column_values::<u64>(OwnedBytes::new(buffer)).unwrap();
|
||||
let col = load_u64_based_column_values::<u64>(FileSlice::from(buffer)).unwrap();
|
||||
assert_eq!(col.num_vals(), 0);
|
||||
assert_eq!(col.min_value(), u64::MIN);
|
||||
assert_eq!(col.max_value(), u64::MIN);
|
||||
@@ -65,7 +67,7 @@ fn test_empty_column_f64() {
|
||||
continue;
|
||||
}
|
||||
num_acceptable_codecs += 1;
|
||||
let col = load_u64_based_column_values::<f64>(OwnedBytes::new(buffer)).unwrap();
|
||||
let col = load_u64_based_column_values::<f64>(FileSlice::from(buffer)).unwrap();
|
||||
assert_eq!(col.num_vals(), 0);
|
||||
// FIXME. f64::MIN would be better!
|
||||
assert!(col.min_value().is_nan());
|
||||
@@ -96,7 +98,7 @@ pub(crate) fn create_and_validate<TColumnCodec: ColumnCodec>(
|
||||
|
||||
let actual_compression = buffer.len() as u64;
|
||||
|
||||
let reader = TColumnCodec::load(OwnedBytes::new(buffer)).unwrap();
|
||||
let reader = TColumnCodec::load(FileSlice::from(buffer)).unwrap();
|
||||
assert_eq!(reader.num_vals(), vals.len() as u32);
|
||||
let mut buffer = Vec::new();
|
||||
for (doc, orig_val) in vals.iter().copied().enumerate() {
|
||||
@@ -130,7 +132,7 @@ pub(crate) fn create_and_validate<TColumnCodec: ColumnCodec>(
|
||||
.collect();
|
||||
let mut positions = Vec::new();
|
||||
reader.get_row_ids_for_value_range(
|
||||
vals[test_rand_idx]..=vals[test_rand_idx],
|
||||
crate::column::ValueRange::Inclusive(vals[test_rand_idx]..=vals[test_rand_idx]),
|
||||
0..vals.len() as u32,
|
||||
&mut positions,
|
||||
);
|
||||
@@ -325,7 +327,7 @@ fn test_fastfield_gcd_i64_with_codec(codec_type: CodecType, num_vals: usize) ->
|
||||
&[codec_type],
|
||||
&mut buffer,
|
||||
)?;
|
||||
let buffer = OwnedBytes::new(buffer);
|
||||
let buffer = FileSlice::from(buffer);
|
||||
let column = crate::column_values::load_u64_based_column_values::<i64>(buffer.clone())?;
|
||||
assert_eq!(column.get_val(0), -4000i64);
|
||||
assert_eq!(column.get_val(1), -3000i64);
|
||||
@@ -342,7 +344,7 @@ fn test_fastfield_gcd_i64_with_codec(codec_type: CodecType, num_vals: usize) ->
|
||||
&[codec_type],
|
||||
&mut buffer_without_gcd,
|
||||
)?;
|
||||
let buffer_without_gcd = OwnedBytes::new(buffer_without_gcd);
|
||||
let buffer_without_gcd = FileSlice::from(buffer_without_gcd);
|
||||
assert!(buffer_without_gcd.len() > buffer.len());
|
||||
|
||||
Ok(())
|
||||
@@ -368,7 +370,7 @@ fn test_fastfield_gcd_u64_with_codec(codec_type: CodecType, num_vals: usize) ->
|
||||
&[codec_type],
|
||||
&mut buffer,
|
||||
)?;
|
||||
let buffer = OwnedBytes::new(buffer);
|
||||
let buffer = FileSlice::from(buffer);
|
||||
let column = crate::column_values::load_u64_based_column_values::<u64>(buffer.clone())?;
|
||||
assert_eq!(column.get_val(0), 1000u64);
|
||||
assert_eq!(column.get_val(1), 2000u64);
|
||||
@@ -385,7 +387,7 @@ fn test_fastfield_gcd_u64_with_codec(codec_type: CodecType, num_vals: usize) ->
|
||||
&[codec_type],
|
||||
&mut buffer_without_gcd,
|
||||
)?;
|
||||
let buffer_without_gcd = OwnedBytes::new(buffer_without_gcd);
|
||||
let buffer_without_gcd = FileSlice::from(buffer_without_gcd);
|
||||
assert!(buffer_without_gcd.len() > buffer.len());
|
||||
Ok(())
|
||||
}
|
||||
@@ -404,7 +406,7 @@ fn test_fastfield_gcd_u64() -> io::Result<()> {
|
||||
|
||||
#[test]
|
||||
pub fn test_fastfield2() {
|
||||
let test_fastfield = crate::column_values::serialize_and_load_u64_based_column_values::<u64>(
|
||||
let test_fastfield = serialize_and_load_u64_based_column_values::<u64>(
|
||||
&&[100u64, 200u64, 300u64][..],
|
||||
&ALL_U64_CODEC_TYPES,
|
||||
);
|
||||
|
||||
@@ -4,8 +4,8 @@ use std::net::Ipv6Addr;
|
||||
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
use crate::value::NumericalType;
|
||||
use crate::InvalidData;
|
||||
use crate::value::NumericalType;
|
||||
|
||||
/// The column type represents the column type.
|
||||
/// Any changes need to be propagated to `COLUMN_TYPES`.
|
||||
|
||||
@@ -3,7 +3,7 @@ use std::io::{self, Write};
|
||||
use common::{BitSet, CountingWriter, ReadOnlyBitSet};
|
||||
use sstable::{SSTable, Streamer, TermOrdinal, VoidSSTable};
|
||||
|
||||
use super::term_merger::TermMerger;
|
||||
use super::term_merger::{TermMerger, TermsWithSegmentOrd};
|
||||
use crate::column::serialize_column_mappable_to_u64;
|
||||
use crate::column_index::SerializableColumnIndex;
|
||||
use crate::iterable::Iterable;
|
||||
@@ -39,7 +39,7 @@ struct RemappedTermOrdinalsValues<'a> {
|
||||
merge_row_order: &'a MergeRowOrder,
|
||||
}
|
||||
|
||||
impl<'a> Iterable for RemappedTermOrdinalsValues<'a> {
|
||||
impl Iterable for RemappedTermOrdinalsValues<'_> {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = u64> + '_> {
|
||||
match self.merge_row_order {
|
||||
MergeRowOrder::Stack(_) => self.boxed_iter_stacked(),
|
||||
@@ -50,7 +50,7 @@ impl<'a> Iterable for RemappedTermOrdinalsValues<'a> {
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> RemappedTermOrdinalsValues<'a> {
|
||||
impl RemappedTermOrdinalsValues<'_> {
|
||||
fn boxed_iter_stacked(&self) -> Box<dyn Iterator<Item = u64> + '_> {
|
||||
let iter = self
|
||||
.bytes_columns
|
||||
@@ -126,14 +126,17 @@ fn serialize_merged_dict(
|
||||
let mut term_ord_mapping = TermOrdinalMapping::default();
|
||||
|
||||
let mut field_term_streams = Vec::new();
|
||||
for column_opt in bytes_columns.iter() {
|
||||
for (segment_ord, column_opt) in bytes_columns.iter().enumerate() {
|
||||
if let Some(column) = column_opt {
|
||||
term_ord_mapping.add_segment(column.dictionary.num_terms());
|
||||
let terms: Streamer<VoidSSTable> = column.dictionary.stream()?;
|
||||
field_term_streams.push(terms);
|
||||
field_term_streams.push(TermsWithSegmentOrd { terms, segment_ord });
|
||||
} else {
|
||||
term_ord_mapping.add_segment(0);
|
||||
field_term_streams.push(Streamer::empty());
|
||||
field_term_streams.push(TermsWithSegmentOrd {
|
||||
terms: Streamer::empty(),
|
||||
segment_ord,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
@@ -191,6 +194,7 @@ fn serialize_merged_dict(
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
struct TermOrdinalMapping {
|
||||
/// Contains the new term ordinals for each segment.
|
||||
per_segment_new_term_ordinals: Vec<Vec<TermOrdinal>>,
|
||||
}
|
||||
|
||||
@@ -205,6 +209,6 @@ impl TermOrdinalMapping {
|
||||
}
|
||||
|
||||
fn get_segment(&self, segment_ord: u32) -> &[TermOrdinal] {
|
||||
&(self.per_segment_new_term_ordinals[segment_ord as usize])[..]
|
||||
&self.per_segment_new_term_ordinals[segment_ord as usize]
|
||||
}
|
||||
}
|
||||
|
||||
@@ -26,7 +26,7 @@ impl StackMergeOrder {
|
||||
let mut cumulated_row_ids: Vec<RowId> = Vec::with_capacity(columnars.len());
|
||||
let mut cumulated_row_id = 0;
|
||||
for columnar in columnars {
|
||||
cumulated_row_id += columnar.num_rows();
|
||||
cumulated_row_id += columnar.num_docs();
|
||||
cumulated_row_ids.push(cumulated_row_id);
|
||||
}
|
||||
StackMergeOrder { cumulated_row_ids }
|
||||
|
||||
@@ -4,17 +4,18 @@ mod term_merger;
|
||||
|
||||
use std::collections::{BTreeMap, HashSet};
|
||||
use std::io;
|
||||
use std::io::ErrorKind;
|
||||
use std::net::Ipv6Addr;
|
||||
use std::sync::Arc;
|
||||
|
||||
pub use merge_mapping::{MergeRowOrder, ShuffleMergeOrder, StackMergeOrder};
|
||||
|
||||
use super::writer::ColumnarSerializer;
|
||||
use crate::column::{serialize_column_mappable_to_u128, serialize_column_mappable_to_u64};
|
||||
use crate::column::{serialize_column_mappable_to_u64, serialize_column_mappable_to_u128};
|
||||
use crate::column_values::MergedColumnValues;
|
||||
use crate::columnar::ColumnarReader;
|
||||
use crate::columnar::merge::merge_dict_column::merge_bytes_or_str_column;
|
||||
use crate::columnar::writer::CompatibleNumericalTypes;
|
||||
use crate::columnar::ColumnarReader;
|
||||
use crate::dynamic_column::DynamicColumn;
|
||||
use crate::{
|
||||
BytesColumn, Column, ColumnIndex, ColumnType, ColumnValues, DynamicColumnHandle, NumericalType,
|
||||
@@ -78,31 +79,37 @@ pub fn merge_columnar(
|
||||
required_columns: &[(String, ColumnType)],
|
||||
merge_row_order: MergeRowOrder,
|
||||
output: &mut impl io::Write,
|
||||
cancel: impl Fn() -> bool,
|
||||
) -> io::Result<()> {
|
||||
let mut serializer = ColumnarSerializer::new(output);
|
||||
let num_rows_per_columnar = columnar_readers
|
||||
let num_docs_per_columnar = columnar_readers
|
||||
.iter()
|
||||
.map(|reader| reader.num_rows())
|
||||
.map(|reader| reader.num_docs())
|
||||
.collect::<Vec<u32>>();
|
||||
|
||||
let columns_to_merge =
|
||||
group_columns_for_merge(columnar_readers, required_columns, &merge_row_order)?;
|
||||
let columns_to_merge = group_columns_for_merge(columnar_readers, required_columns)?;
|
||||
for res in columns_to_merge {
|
||||
if cancel() {
|
||||
return Err(io::Error::new(ErrorKind::Interrupted, "Merge cancelled"));
|
||||
}
|
||||
let ((column_name, _column_type_category), grouped_columns) = res;
|
||||
let grouped_columns = grouped_columns.open(&merge_row_order)?;
|
||||
if grouped_columns.is_empty() {
|
||||
continue;
|
||||
}
|
||||
|
||||
let column_type = grouped_columns.column_type_after_merge();
|
||||
let column_type_after_merge = grouped_columns.column_type_after_merge();
|
||||
let mut columns = grouped_columns.columns;
|
||||
coerce_columns(column_type, &mut columns)?;
|
||||
// Make sure the number of columns is the same as the number of columnar readers.
|
||||
// Or num_docs_per_columnar would be incorrect.
|
||||
assert_eq!(columns.len(), columnar_readers.len());
|
||||
coerce_columns(column_type_after_merge, &mut columns)?;
|
||||
|
||||
let mut column_serializer =
|
||||
serializer.start_serialize_column(column_name.as_bytes(), column_type);
|
||||
serializer.start_serialize_column(column_name.as_bytes(), column_type_after_merge);
|
||||
merge_column(
|
||||
column_type,
|
||||
&num_rows_per_columnar,
|
||||
column_type_after_merge,
|
||||
&num_docs_per_columnar,
|
||||
columns,
|
||||
&merge_row_order,
|
||||
&mut column_serializer,
|
||||
@@ -128,7 +135,7 @@ fn dynamic_column_to_u64_monotonic(dynamic_column: DynamicColumn) -> Option<Colu
|
||||
fn merge_column(
|
||||
column_type: ColumnType,
|
||||
num_docs_per_column: &[u32],
|
||||
columns: Vec<Option<DynamicColumn>>,
|
||||
columns_to_merge: Vec<Option<DynamicColumn>>,
|
||||
merge_row_order: &MergeRowOrder,
|
||||
wrt: &mut impl io::Write,
|
||||
) -> io::Result<()> {
|
||||
@@ -138,20 +145,21 @@ fn merge_column(
|
||||
| ColumnType::F64
|
||||
| ColumnType::DateTime
|
||||
| ColumnType::Bool => {
|
||||
let mut column_indexes: Vec<ColumnIndex> = Vec::with_capacity(columns.len());
|
||||
let mut column_indexes: Vec<ColumnIndex> = Vec::with_capacity(columns_to_merge.len());
|
||||
let mut column_values: Vec<Option<Arc<dyn ColumnValues>>> =
|
||||
Vec::with_capacity(columns.len());
|
||||
for (i, dynamic_column_opt) in columns.into_iter().enumerate() {
|
||||
if let Some(Column { index: idx, values }) =
|
||||
dynamic_column_opt.and_then(dynamic_column_to_u64_monotonic)
|
||||
{
|
||||
column_indexes.push(idx);
|
||||
column_values.push(Some(values));
|
||||
} else {
|
||||
column_indexes.push(ColumnIndex::Empty {
|
||||
num_docs: num_docs_per_column[i],
|
||||
});
|
||||
column_values.push(None);
|
||||
Vec::with_capacity(columns_to_merge.len());
|
||||
for (i, dynamic_column_opt) in columns_to_merge.into_iter().enumerate() {
|
||||
match dynamic_column_opt.and_then(dynamic_column_to_u64_monotonic) {
|
||||
Some(Column { index: idx, values }) => {
|
||||
column_indexes.push(idx);
|
||||
column_values.push(Some(values));
|
||||
}
|
||||
None => {
|
||||
column_indexes.push(ColumnIndex::Empty {
|
||||
num_docs: num_docs_per_column[i],
|
||||
});
|
||||
column_values.push(None);
|
||||
}
|
||||
}
|
||||
}
|
||||
let merged_column_index =
|
||||
@@ -164,10 +172,10 @@ fn merge_column(
|
||||
serialize_column_mappable_to_u64(merged_column_index, &merge_column_values, wrt)?;
|
||||
}
|
||||
ColumnType::IpAddr => {
|
||||
let mut column_indexes: Vec<ColumnIndex> = Vec::with_capacity(columns.len());
|
||||
let mut column_indexes: Vec<ColumnIndex> = Vec::with_capacity(columns_to_merge.len());
|
||||
let mut column_values: Vec<Option<Arc<dyn ColumnValues<Ipv6Addr>>>> =
|
||||
Vec::with_capacity(columns.len());
|
||||
for (i, dynamic_column_opt) in columns.into_iter().enumerate() {
|
||||
Vec::with_capacity(columns_to_merge.len());
|
||||
for (i, dynamic_column_opt) in columns_to_merge.into_iter().enumerate() {
|
||||
if let Some(DynamicColumn::IpAddr(Column { index: idx, values })) =
|
||||
dynamic_column_opt
|
||||
{
|
||||
@@ -192,9 +200,10 @@ fn merge_column(
|
||||
serialize_column_mappable_to_u128(merged_column_index, &merge_column_values, wrt)?;
|
||||
}
|
||||
ColumnType::Bytes | ColumnType::Str => {
|
||||
let mut column_indexes: Vec<ColumnIndex> = Vec::with_capacity(columns.len());
|
||||
let mut bytes_columns: Vec<Option<BytesColumn>> = Vec::with_capacity(columns.len());
|
||||
for (i, dynamic_column_opt) in columns.into_iter().enumerate() {
|
||||
let mut column_indexes: Vec<ColumnIndex> = Vec::with_capacity(columns_to_merge.len());
|
||||
let mut bytes_columns: Vec<Option<BytesColumn>> =
|
||||
Vec::with_capacity(columns_to_merge.len());
|
||||
for (i, dynamic_column_opt) in columns_to_merge.into_iter().enumerate() {
|
||||
match dynamic_column_opt {
|
||||
Some(DynamicColumn::Str(str_column)) => {
|
||||
column_indexes.push(str_column.term_ord_column.index.clone());
|
||||
@@ -248,13 +257,15 @@ impl GroupedColumns {
|
||||
if column_type.len() == 1 {
|
||||
return column_type.into_iter().next().unwrap();
|
||||
}
|
||||
// At the moment, only the numerical categorical column type has more than one possible
|
||||
// At the moment, only the numerical column type category has more than one possible
|
||||
// column type.
|
||||
assert!(self
|
||||
.columns
|
||||
.iter()
|
||||
.flatten()
|
||||
.all(|el| ColumnTypeCategory::from(el.column_type()) == ColumnTypeCategory::Numerical));
|
||||
assert!(
|
||||
self.columns
|
||||
.iter()
|
||||
.flatten()
|
||||
.all(|el| ColumnTypeCategory::from(el.column_type())
|
||||
== ColumnTypeCategory::Numerical)
|
||||
);
|
||||
merged_numerical_columns_type(self.columns.iter().flatten()).into()
|
||||
}
|
||||
}
|
||||
@@ -361,7 +372,7 @@ fn is_empty_after_merge(
|
||||
ColumnIndex::Empty { .. } => true,
|
||||
ColumnIndex::Full => alive_bitset.len() == 0,
|
||||
ColumnIndex::Optional(optional_index) => {
|
||||
for doc in optional_index.iter_rows() {
|
||||
for doc in optional_index.iter_non_null_docs() {
|
||||
if alive_bitset.contains(doc) {
|
||||
return false;
|
||||
}
|
||||
@@ -391,7 +402,6 @@ fn is_empty_after_merge(
|
||||
fn group_columns_for_merge<'a>(
|
||||
columnar_readers: &'a [&'a ColumnarReader],
|
||||
required_columns: &'a [(String, ColumnType)],
|
||||
_merge_row_order: &'a MergeRowOrder,
|
||||
) -> io::Result<BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle>> {
|
||||
let mut columns: BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle> = BTreeMap::new();
|
||||
|
||||
|
||||
@@ -5,28 +5,29 @@ use sstable::TermOrdinal;
|
||||
|
||||
use crate::Streamer;
|
||||
|
||||
pub struct HeapItem<'a> {
|
||||
pub streamer: Streamer<'a>,
|
||||
/// The terms of a column with the ordinal of the segment.
|
||||
pub struct TermsWithSegmentOrd<'a> {
|
||||
pub terms: Streamer<'a>,
|
||||
pub segment_ord: usize,
|
||||
}
|
||||
|
||||
impl<'a> PartialEq for HeapItem<'a> {
|
||||
impl PartialEq for TermsWithSegmentOrd<'_> {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.segment_ord == other.segment_ord
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Eq for HeapItem<'a> {}
|
||||
impl Eq for TermsWithSegmentOrd<'_> {}
|
||||
|
||||
impl<'a> PartialOrd for HeapItem<'a> {
|
||||
fn partial_cmp(&self, other: &HeapItem<'a>) -> Option<Ordering> {
|
||||
impl<'a> PartialOrd for TermsWithSegmentOrd<'a> {
|
||||
fn partial_cmp(&self, other: &TermsWithSegmentOrd<'a>) -> Option<Ordering> {
|
||||
Some(self.cmp(other))
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Ord for HeapItem<'a> {
|
||||
fn cmp(&self, other: &HeapItem<'a>) -> Ordering {
|
||||
(&other.streamer.key(), &other.segment_ord).cmp(&(&self.streamer.key(), &self.segment_ord))
|
||||
impl<'a> Ord for TermsWithSegmentOrd<'a> {
|
||||
fn cmp(&self, other: &TermsWithSegmentOrd<'a>) -> Ordering {
|
||||
(&other.terms.key(), &other.segment_ord).cmp(&(&self.terms.key(), &self.segment_ord))
|
||||
}
|
||||
}
|
||||
|
||||
@@ -37,39 +38,32 @@ impl<'a> Ord for HeapItem<'a> {
|
||||
/// - the term
|
||||
/// - a slice with the ordinal of the segments containing the terms.
|
||||
pub struct TermMerger<'a> {
|
||||
heap: BinaryHeap<HeapItem<'a>>,
|
||||
current_streamers: Vec<HeapItem<'a>>,
|
||||
heap: BinaryHeap<TermsWithSegmentOrd<'a>>,
|
||||
term_streams_with_segment: Vec<TermsWithSegmentOrd<'a>>,
|
||||
}
|
||||
|
||||
impl<'a> TermMerger<'a> {
|
||||
/// Stream of merged term dictionary
|
||||
pub fn new(streams: Vec<Streamer<'a>>) -> TermMerger<'a> {
|
||||
pub fn new(term_streams_with_segment: Vec<TermsWithSegmentOrd<'a>>) -> TermMerger<'a> {
|
||||
TermMerger {
|
||||
heap: BinaryHeap::new(),
|
||||
current_streamers: streams
|
||||
.into_iter()
|
||||
.enumerate()
|
||||
.map(|(ord, streamer)| HeapItem {
|
||||
streamer,
|
||||
segment_ord: ord,
|
||||
})
|
||||
.collect(),
|
||||
term_streams_with_segment,
|
||||
}
|
||||
}
|
||||
|
||||
pub(crate) fn matching_segments<'b: 'a>(
|
||||
&'b self,
|
||||
) -> impl 'b + Iterator<Item = (usize, TermOrdinal)> {
|
||||
self.current_streamers
|
||||
self.term_streams_with_segment
|
||||
.iter()
|
||||
.map(|heap_item| (heap_item.segment_ord, heap_item.streamer.term_ord()))
|
||||
.map(|heap_item| (heap_item.segment_ord, heap_item.terms.term_ord()))
|
||||
}
|
||||
|
||||
fn advance_segments(&mut self) {
|
||||
let streamers = &mut self.current_streamers;
|
||||
let streamers = &mut self.term_streams_with_segment;
|
||||
let heap = &mut self.heap;
|
||||
for mut heap_item in streamers.drain(..) {
|
||||
if heap_item.streamer.advance() {
|
||||
if heap_item.terms.advance() {
|
||||
heap.push(heap_item);
|
||||
}
|
||||
}
|
||||
@@ -80,18 +74,19 @@ impl<'a> TermMerger<'a> {
|
||||
/// False if there is none.
|
||||
pub fn advance(&mut self) -> bool {
|
||||
self.advance_segments();
|
||||
if let Some(head) = self.heap.pop() {
|
||||
self.current_streamers.push(head);
|
||||
while let Some(next_streamer) = self.heap.peek() {
|
||||
if self.current_streamers[0].streamer.key() != next_streamer.streamer.key() {
|
||||
break;
|
||||
match self.heap.pop() {
|
||||
Some(head) => {
|
||||
self.term_streams_with_segment.push(head);
|
||||
while let Some(next_streamer) = self.heap.peek() {
|
||||
if self.term_streams_with_segment[0].terms.key() != next_streamer.terms.key() {
|
||||
break;
|
||||
}
|
||||
let next_heap_it = self.heap.pop().unwrap(); // safe : we peeked beforehand
|
||||
self.term_streams_with_segment.push(next_heap_it);
|
||||
}
|
||||
let next_heap_it = self.heap.pop().unwrap(); // safe : we peeked beforehand
|
||||
self.current_streamers.push(next_heap_it);
|
||||
true
|
||||
}
|
||||
true
|
||||
} else {
|
||||
false
|
||||
_ => false,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -101,6 +96,6 @@ impl<'a> TermMerger<'a> {
|
||||
/// if and only if advance() has been called before
|
||||
/// and "true" was returned.
|
||||
pub fn key(&self) -> &[u8] {
|
||||
self.current_streamers[0].streamer.key()
|
||||
self.term_streams_with_segment[0].terms.key()
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,7 +1,10 @@
|
||||
use itertools::Itertools;
|
||||
use proptest::collection::vec;
|
||||
use proptest::prelude::*;
|
||||
|
||||
use super::*;
|
||||
use crate::{Cardinality, ColumnarWriter, HasAssociatedColumnType, RowId};
|
||||
use crate::columnar::{ColumnarReader, MergeRowOrder, StackMergeOrder, merge_columnar};
|
||||
use crate::{Cardinality, ColumnarWriter, DynamicColumn, HasAssociatedColumnType, RowId};
|
||||
|
||||
fn make_columnar<T: Into<NumericalValue> + HasAssociatedColumnType + Copy>(
|
||||
column_name: &str,
|
||||
@@ -26,9 +29,8 @@ fn test_column_coercion_to_u64() {
|
||||
// u64 type
|
||||
let columnar2 = make_columnar("numbers", &[u64::MAX]);
|
||||
let columnars = &[&columnar1, &columnar2];
|
||||
let merge_order = StackMergeOrder::stack(columnars).into();
|
||||
let column_map: BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle> =
|
||||
group_columns_for_merge(columnars, &[], &merge_order).unwrap();
|
||||
group_columns_for_merge(columnars, &[]).unwrap();
|
||||
assert_eq!(column_map.len(), 1);
|
||||
assert!(column_map.contains_key(&("numbers".to_string(), ColumnTypeCategory::Numerical)));
|
||||
}
|
||||
@@ -38,9 +40,8 @@ fn test_column_coercion_to_i64() {
|
||||
let columnar1 = make_columnar("numbers", &[-1i64]);
|
||||
let columnar2 = make_columnar("numbers", &[2u64]);
|
||||
let columnars = &[&columnar1, &columnar2];
|
||||
let merge_order = StackMergeOrder::stack(columnars).into();
|
||||
let column_map: BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle> =
|
||||
group_columns_for_merge(columnars, &[], &merge_order).unwrap();
|
||||
group_columns_for_merge(columnars, &[]).unwrap();
|
||||
assert_eq!(column_map.len(), 1);
|
||||
assert!(column_map.contains_key(&("numbers".to_string(), ColumnTypeCategory::Numerical)));
|
||||
}
|
||||
@@ -63,14 +64,8 @@ fn test_group_columns_with_required_column() {
|
||||
let columnar1 = make_columnar("numbers", &[1i64]);
|
||||
let columnar2 = make_columnar("numbers", &[2u64]);
|
||||
let columnars = &[&columnar1, &columnar2];
|
||||
let merge_order = StackMergeOrder::stack(columnars).into();
|
||||
let column_map: BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle> =
|
||||
group_columns_for_merge(
|
||||
&[&columnar1, &columnar2],
|
||||
&[("numbers".to_string(), ColumnType::U64)],
|
||||
&merge_order,
|
||||
)
|
||||
.unwrap();
|
||||
group_columns_for_merge(columnars, &[("numbers".to_string(), ColumnType::U64)]).unwrap();
|
||||
assert_eq!(column_map.len(), 1);
|
||||
assert!(column_map.contains_key(&("numbers".to_string(), ColumnTypeCategory::Numerical)));
|
||||
}
|
||||
@@ -80,13 +75,9 @@ fn test_group_columns_required_column_with_no_existing_columns() {
|
||||
let columnar1 = make_columnar("numbers", &[2u64]);
|
||||
let columnar2 = make_columnar("numbers", &[2u64]);
|
||||
let columnars = &[&columnar1, &columnar2];
|
||||
let merge_order = StackMergeOrder::stack(columnars).into();
|
||||
let column_map: BTreeMap<_, _> = group_columns_for_merge(
|
||||
columnars,
|
||||
&[("required_col".to_string(), ColumnType::Str)],
|
||||
&merge_order,
|
||||
)
|
||||
.unwrap();
|
||||
let column_map: BTreeMap<_, _> =
|
||||
group_columns_for_merge(columnars, &[("required_col".to_string(), ColumnType::Str)])
|
||||
.unwrap();
|
||||
assert_eq!(column_map.len(), 2);
|
||||
let columns = &column_map
|
||||
.get(&("required_col".to_string(), ColumnTypeCategory::Str))
|
||||
@@ -102,14 +93,8 @@ fn test_group_columns_required_column_is_above_all_columns_have_the_same_type_ru
|
||||
let columnar1 = make_columnar("numbers", &[2i64]);
|
||||
let columnar2 = make_columnar("numbers", &[2i64]);
|
||||
let columnars = &[&columnar1, &columnar2];
|
||||
let merge_order = StackMergeOrder::stack(columnars).into();
|
||||
let column_map: BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle> =
|
||||
group_columns_for_merge(
|
||||
columnars,
|
||||
&[("numbers".to_string(), ColumnType::U64)],
|
||||
&merge_order,
|
||||
)
|
||||
.unwrap();
|
||||
group_columns_for_merge(columnars, &[("numbers".to_string(), ColumnType::U64)]).unwrap();
|
||||
assert_eq!(column_map.len(), 1);
|
||||
assert!(column_map.contains_key(&("numbers".to_string(), ColumnTypeCategory::Numerical)));
|
||||
}
|
||||
@@ -119,9 +104,8 @@ fn test_missing_column() {
|
||||
let columnar1 = make_columnar("numbers", &[-1i64]);
|
||||
let columnar2 = make_columnar("numbers2", &[2u64]);
|
||||
let columnars = &[&columnar1, &columnar2];
|
||||
let merge_order = StackMergeOrder::stack(columnars).into();
|
||||
let column_map: BTreeMap<(String, ColumnTypeCategory), GroupedColumnsHandle> =
|
||||
group_columns_for_merge(columnars, &[], &merge_order).unwrap();
|
||||
group_columns_for_merge(columnars, &[]).unwrap();
|
||||
assert_eq!(column_map.len(), 2);
|
||||
assert!(column_map.contains_key(&("numbers".to_string(), ColumnTypeCategory::Numerical)));
|
||||
{
|
||||
@@ -221,10 +205,11 @@ fn test_merge_columnar_numbers() {
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut buffer,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let columnar_reader = ColumnarReader::open(buffer).unwrap();
|
||||
assert_eq!(columnar_reader.num_rows(), 3);
|
||||
assert_eq!(columnar_reader.num_docs(), 3);
|
||||
assert_eq!(columnar_reader.num_columns(), 1);
|
||||
let cols = columnar_reader.read_columns("numbers").unwrap();
|
||||
let dynamic_column = cols[0].open().unwrap();
|
||||
@@ -249,10 +234,11 @@ fn test_merge_columnar_texts() {
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut buffer,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let columnar_reader = ColumnarReader::open(buffer).unwrap();
|
||||
assert_eq!(columnar_reader.num_rows(), 3);
|
||||
assert_eq!(columnar_reader.num_docs(), 3);
|
||||
assert_eq!(columnar_reader.num_columns(), 1);
|
||||
let cols = columnar_reader.read_columns("texts").unwrap();
|
||||
let dynamic_column = cols[0].open().unwrap();
|
||||
@@ -298,10 +284,11 @@ fn test_merge_columnar_byte() {
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut buffer,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let columnar_reader = ColumnarReader::open(buffer).unwrap();
|
||||
assert_eq!(columnar_reader.num_rows(), 4);
|
||||
assert_eq!(columnar_reader.num_docs(), 4);
|
||||
assert_eq!(columnar_reader.num_columns(), 1);
|
||||
let cols = columnar_reader.read_columns("bytes").unwrap();
|
||||
let dynamic_column = cols[0].open().unwrap();
|
||||
@@ -354,10 +341,11 @@ fn test_merge_columnar_byte_with_missing() {
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut buffer,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let columnar_reader = ColumnarReader::open(buffer).unwrap();
|
||||
assert_eq!(columnar_reader.num_rows(), 3 + 2 + 3);
|
||||
assert_eq!(columnar_reader.num_docs(), 3 + 2 + 3);
|
||||
assert_eq!(columnar_reader.num_columns(), 2);
|
||||
let cols = columnar_reader.read_columns("col").unwrap();
|
||||
let dynamic_column = cols[0].open().unwrap();
|
||||
@@ -406,10 +394,11 @@ fn test_merge_columnar_different_types() {
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut buffer,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let columnar_reader = ColumnarReader::open(buffer).unwrap();
|
||||
assert_eq!(columnar_reader.num_rows(), 4);
|
||||
assert_eq!(columnar_reader.num_docs(), 4);
|
||||
assert_eq!(columnar_reader.num_columns(), 2);
|
||||
let cols = columnar_reader.read_columns("mixed").unwrap();
|
||||
|
||||
@@ -419,11 +408,11 @@ fn test_merge_columnar_different_types() {
|
||||
panic!()
|
||||
};
|
||||
assert_eq!(vals.get_cardinality(), Cardinality::Optional);
|
||||
assert_eq!(vals.values_for_doc(0).collect_vec(), vec![]);
|
||||
assert_eq!(vals.values_for_doc(1).collect_vec(), vec![]);
|
||||
assert_eq!(vals.values_for_doc(2).collect_vec(), vec![]);
|
||||
assert_eq!(vals.values_for_doc(0).collect_vec(), Vec::<i64>::new());
|
||||
assert_eq!(vals.values_for_doc(1).collect_vec(), Vec::<i64>::new());
|
||||
assert_eq!(vals.values_for_doc(2).collect_vec(), Vec::<i64>::new());
|
||||
assert_eq!(vals.values_for_doc(3).collect_vec(), vec![1]);
|
||||
assert_eq!(vals.values_for_doc(4).collect_vec(), vec![]);
|
||||
assert_eq!(vals.values_for_doc(4).collect_vec(), Vec::<i64>::new());
|
||||
|
||||
// text column
|
||||
let dynamic_column = cols[1].open().unwrap();
|
||||
@@ -471,10 +460,11 @@ fn test_merge_columnar_different_empty_cardinality() {
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut buffer,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let columnar_reader = ColumnarReader::open(buffer).unwrap();
|
||||
assert_eq!(columnar_reader.num_rows(), 2);
|
||||
assert_eq!(columnar_reader.num_docs(), 2);
|
||||
assert_eq!(columnar_reader.num_columns(), 2);
|
||||
let cols = columnar_reader.read_columns("mixed").unwrap();
|
||||
|
||||
@@ -486,3 +476,121 @@ fn test_merge_columnar_different_empty_cardinality() {
|
||||
let dynamic_column = cols[1].open().unwrap();
|
||||
assert_eq!(dynamic_column.get_cardinality(), Cardinality::Optional);
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone)]
|
||||
struct ColumnSpec {
|
||||
column_name: String,
|
||||
/// (row_id, term)
|
||||
terms: Vec<(RowId, Vec<u8>)>,
|
||||
}
|
||||
|
||||
#[derive(Clone, Debug)]
|
||||
struct ColumnarSpec {
|
||||
columns: Vec<ColumnSpec>,
|
||||
}
|
||||
|
||||
/// Generate a random (row_id, term) pair:
|
||||
/// - row_id in 0..=10
|
||||
/// - term is either from POSSIBLE_TERMS or random bytes
|
||||
fn rowid_and_term_strategy() -> impl Strategy<Value = (RowId, Vec<u8>)> {
|
||||
const POSSIBLE_TERMS: &[&[u8]] = &[b"a", b"b", b"allo"];
|
||||
|
||||
let term_strat = prop_oneof![
|
||||
// pick from the fixed list
|
||||
(0..POSSIBLE_TERMS.len()).prop_map(|i| POSSIBLE_TERMS[i].to_vec()),
|
||||
// or random bytes (length 0..10)
|
||||
prop::collection::vec(any::<u8>(), 0..10),
|
||||
];
|
||||
|
||||
(0u32..11, term_strat)
|
||||
}
|
||||
|
||||
/// Generate one ColumnSpec, with a random name and a random list of (row_id, term).
|
||||
/// We sort it by row_id so that data is in ascending order.
|
||||
fn column_spec_strategy() -> impl Strategy<Value = ColumnSpec> {
|
||||
let column_name = prop_oneof![
|
||||
Just("col".to_string()),
|
||||
Just("col2".to_string()),
|
||||
"col.*".prop_map(|s| s),
|
||||
];
|
||||
|
||||
// We'll produce 0..8 (rowid,term) entries for this column
|
||||
let data_strat = vec(rowid_and_term_strategy(), 0..8).prop_map(|mut pairs| {
|
||||
// Sort by row_id
|
||||
pairs.sort_by_key(|(row_id, _)| *row_id);
|
||||
pairs
|
||||
});
|
||||
|
||||
(column_name, data_strat).prop_map(|(name, data)| ColumnSpec {
|
||||
column_name: name,
|
||||
terms: data,
|
||||
})
|
||||
}
|
||||
|
||||
/// Strategy to generate a `ColumnarSpec`.
|
||||
fn columnar_strategy() -> impl Strategy<Value = ColumnarSpec> {
|
||||
vec(column_spec_strategy(), 0..3).prop_map(|columns| ColumnarSpec { columns })
|
||||
}
|
||||
|
||||
/// Strategy to generate multiple ColumnarSpecs, each of which we will treat
|
||||
/// as one "columnar" to be merged together.
|
||||
fn columnars_strategy() -> impl Strategy<Value = Vec<ColumnarSpec>> {
|
||||
vec(columnar_strategy(), 1..4)
|
||||
}
|
||||
|
||||
/// Build a `ColumnarReader` from a `ColumnarSpec`
|
||||
fn build_columnar(spec: &ColumnarSpec) -> ColumnarReader {
|
||||
let mut writer = ColumnarWriter::default();
|
||||
let mut max_row_id = 0;
|
||||
for col in &spec.columns {
|
||||
for &(row_id, ref term) in &col.terms {
|
||||
writer.record_bytes(row_id, &col.column_name, term);
|
||||
max_row_id = max_row_id.max(row_id);
|
||||
}
|
||||
}
|
||||
|
||||
let mut buffer = Vec::new();
|
||||
writer.serialize(max_row_id + 1, &mut buffer).unwrap();
|
||||
ColumnarReader::open(buffer).unwrap()
|
||||
}
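For intuition, the strategies and the builder above compose as follows. This is a hand-written spec rather than a generated one, and the assertion is only a sketch of the expected behavior:

let spec = ColumnarSpec {
    columns: vec![ColumnSpec {
        column_name: "col".to_string(),
        terms: vec![(0, b"a".to_vec()), (3, b"allo".to_vec())],
    }],
};
// max_row_id is 3, so the columnar is serialized with 4 docs.
let reader = build_columnar(&spec);
assert_eq!(reader.num_docs(), 4);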
|
||||
|
||||
proptest! {
|
||||
// We just test that the merge_columnar function doesn't crash.
|
||||
#![proptest_config(ProptestConfig::with_cases(256))]
|
||||
#[test]
|
||||
fn test_merge_columnar_bytes_no_crash(columnars in columnars_strategy(), second_merge_columnars in columnars_strategy()) {
|
||||
let columnars: Vec<ColumnarReader> = columnars.iter()
|
||||
.map(build_columnar)
|
||||
.collect();
|
||||
|
||||
let mut out = Vec::new();
|
||||
let columnar_refs: Vec<&ColumnarReader> = columnars.iter().collect();
|
||||
let stack_merge_order = StackMergeOrder::stack(&columnar_refs);
|
||||
merge_columnar(
|
||||
&columnar_refs,
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut out,
|
||||
|| false,
|
||||
).unwrap();
|
||||
|
||||
let merged_reader = ColumnarReader::open(out).unwrap();
|
||||
|
||||
// Merge the second set of columnars with the result of the first merge
|
||||
let mut columnars: Vec<ColumnarReader> = second_merge_columnars.iter()
|
||||
.map(build_columnar)
|
||||
.collect();
|
||||
columnars.push(merged_reader);
|
||||
let mut out = Vec::new();
|
||||
let columnar_refs: Vec<&ColumnarReader> = columnars.iter().collect();
|
||||
let stack_merge_order = StackMergeOrder::stack(&columnar_refs);
|
||||
merge_columnar(
|
||||
&columnar_refs,
|
||||
&[],
|
||||
MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut out,
|
||||
|| false,
|
||||
).unwrap();
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
@@ -5,9 +5,9 @@ mod reader;
|
||||
mod writer;
|
||||
|
||||
pub use column_type::{ColumnType, HasAssociatedColumnType};
|
||||
pub use format_version::{Version, CURRENT_VERSION};
|
||||
pub use format_version::{CURRENT_VERSION, Version};
|
||||
#[cfg(test)]
|
||||
pub(crate) use merge::ColumnTypeCategory;
|
||||
pub use merge::{merge_columnar, MergeRowOrder, ShuffleMergeOrder, StackMergeOrder};
|
||||
pub use merge::{MergeRowOrder, ShuffleMergeOrder, StackMergeOrder, merge_columnar};
|
||||
pub use reader::ColumnarReader;
|
||||
pub use writer::ColumnarWriter;
|
||||
|
||||
@@ -1,10 +1,11 @@
|
||||
use std::{fmt, io, mem};
|
||||
|
||||
use common::file_slice::FileSlice;
|
||||
use common::BinarySerializable;
|
||||
use common::file_slice::FileSlice;
|
||||
use common::json_path_writer::JSON_PATH_SEGMENT_SEP;
|
||||
use sstable::{Dictionary, RangeSSTable};
|
||||
|
||||
use crate::columnar::{format_version, ColumnType};
|
||||
use crate::columnar::{ColumnType, format_version};
|
||||
use crate::dynamic_column::DynamicColumnHandle;
|
||||
use crate::{RowId, Version};
|
||||
|
||||
@@ -18,13 +19,13 @@ fn io_invalid_data(msg: String) -> io::Error {
|
||||
pub struct ColumnarReader {
|
||||
column_dictionary: Dictionary<RangeSSTable>,
|
||||
column_data: FileSlice,
|
||||
num_rows: RowId,
|
||||
num_docs: RowId,
|
||||
format_version: Version,
|
||||
}
|
||||
|
||||
impl fmt::Debug for ColumnarReader {
|
||||
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
|
||||
let num_rows = self.num_rows();
|
||||
let num_rows = self.num_docs();
|
||||
let columns = self.list_columns().unwrap();
|
||||
let num_cols = columns.len();
|
||||
let mut debug_struct = f.debug_struct("Columnar");
|
||||
@@ -76,6 +77,19 @@ fn read_all_columns_in_stream(
|
||||
Ok(results)
|
||||
}
|
||||
|
||||
fn column_dictionary_prefix_for_column_name(column_name: &str) -> String {
|
||||
// Each column is associated with a given `column_key`,
|
||||
// which starts with `column_name\0column_header`.
|
||||
//
|
||||
// Listing the columns associated with the given column name is therefore equivalent to
|
||||
// listing `column_key` with the prefix `column_name\0`.
|
||||
format!("{}{}", column_name, '\0')
|
||||
}
|
||||
|
||||
fn column_dictionary_prefix_for_subpath(root_path: &str) -> String {
|
||||
format!("{}{}", root_path, JSON_PATH_SEGMENT_SEP as char)
|
||||
}
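To make the key layout described in the comment above concrete, here is a small illustration (within this module); the column header encoding itself is internal, so only the prefixes are shown:

// All typed variants of the column "price" live under keys starting with "price\0".
let prefix = column_dictionary_prefix_for_column_name("price");
assert!(prefix.ends_with('\0'));
// JSON sub-columns of "attrs" are listed with the segment-separator prefix instead.
let subpath_prefix = column_dictionary_prefix_for_subpath("attrs");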
|
||||
|
||||
impl ColumnarReader {
|
||||
/// Opens a new Columnar file.
|
||||
pub fn open<F>(file_slice: F) -> io::Result<ColumnarReader>
|
||||
@@ -98,13 +112,13 @@ impl ColumnarReader {
|
||||
Ok(ColumnarReader {
|
||||
column_dictionary,
|
||||
column_data,
|
||||
num_rows,
|
||||
num_docs: num_rows,
|
||||
format_version,
|
||||
})
|
||||
}
|
||||
|
||||
pub fn num_rows(&self) -> RowId {
|
||||
self.num_rows
|
||||
pub fn num_docs(&self) -> RowId {
|
||||
self.num_docs
|
||||
}
|
||||
// Iterate over the columns in a sorted way
|
||||
pub fn iter_columns(
|
||||
@@ -144,32 +158,14 @@ impl ColumnarReader {
|
||||
Ok(self.iter_columns()?.collect())
|
||||
}
|
||||
|
||||
fn stream_for_column_range(&self, column_name: &str) -> sstable::StreamerBuilder<RangeSSTable> {
|
||||
// Each column is a associated to a given `column_key`,
|
||||
// that starts by `column_name\0column_header`.
|
||||
//
|
||||
// Listing the columns associated to the given column name is therefore equivalent to
|
||||
// listing `column_key` with the prefix `column_name\0`.
|
||||
//
|
||||
// This is in turn equivalent to searching for the range
|
||||
// `[column_name,\0`..column_name\1)`.
|
||||
// TODO can we get some more generic `prefix(..)` logic in the dictionary.
|
||||
let mut start_key = column_name.to_string();
|
||||
start_key.push('\0');
|
||||
let mut end_key = column_name.to_string();
|
||||
end_key.push(1u8 as char);
|
||||
self.column_dictionary
|
||||
.range()
|
||||
.ge(start_key.as_bytes())
|
||||
.lt(end_key.as_bytes())
|
||||
}
|
||||
|
||||
pub async fn read_columns_async(
|
||||
&self,
|
||||
column_name: &str,
|
||||
) -> io::Result<Vec<DynamicColumnHandle>> {
|
||||
let prefix = column_dictionary_prefix_for_column_name(column_name);
|
||||
let stream = self
|
||||
.stream_for_column_range(column_name)
|
||||
.column_dictionary
|
||||
.prefix_range(prefix)
|
||||
.into_stream_async()
|
||||
.await?;
|
||||
read_all_columns_in_stream(stream, &self.column_data, self.format_version)
|
||||
@@ -180,7 +176,35 @@ impl ColumnarReader {
|
||||
/// There can be more than one column associated to a given column name, provided they have
|
||||
/// different types.
|
||||
pub fn read_columns(&self, column_name: &str) -> io::Result<Vec<DynamicColumnHandle>> {
|
||||
let stream = self.stream_for_column_range(column_name).into_stream()?;
|
||||
let prefix = column_dictionary_prefix_for_column_name(column_name);
|
||||
let stream = self.column_dictionary.prefix_range(prefix).into_stream()?;
|
||||
read_all_columns_in_stream(stream, &self.column_data, self.format_version)
|
||||
}
|
||||
|
||||
pub async fn read_subpath_columns_async(
|
||||
&self,
|
||||
root_path: &str,
|
||||
) -> io::Result<Vec<DynamicColumnHandle>> {
|
||||
let prefix = column_dictionary_prefix_for_subpath(root_path);
|
||||
let stream = self
|
||||
.column_dictionary
|
||||
.prefix_range(prefix)
|
||||
.into_stream_async()
|
||||
.await?;
|
||||
read_all_columns_in_stream(stream, &self.column_data, self.format_version)
|
||||
}
|
||||
|
||||
/// Get all inner columns for a given JSON prefix, i.e. columns whose name starts
|
||||
/// with the prefix followed by the [`JSON_PATH_SEGMENT_SEP`].
|
||||
///
|
||||
/// There can be more than one column associated to each path within the JSON structure,
|
||||
/// provided they have different types.
|
||||
pub fn read_subpath_columns(&self, root_path: &str) -> io::Result<Vec<DynamicColumnHandle>> {
|
||||
let prefix = column_dictionary_prefix_for_subpath(root_path);
|
||||
let stream = self
|
||||
.column_dictionary
|
||||
.prefix_range(prefix.as_bytes())
|
||||
.into_stream()?;
|
||||
read_all_columns_in_stream(stream, &self.column_data, self.format_version)
|
||||
}
|
||||
|
||||
@@ -192,6 +216,8 @@ impl ColumnarReader {
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use common::json_path_writer::JSON_PATH_SEGMENT_SEP;
|
||||
|
||||
use crate::{ColumnType, ColumnarReader, ColumnarWriter};
|
||||
|
||||
#[test]
|
||||
@@ -224,6 +250,64 @@ mod tests {
|
||||
assert_eq!(columns[0].1.column_type(), ColumnType::U64);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_read_columns() {
|
||||
let mut columnar_writer = ColumnarWriter::default();
|
||||
columnar_writer.record_column_type("col", ColumnType::U64, false);
|
||||
columnar_writer.record_numerical(1, "col", 1u64);
|
||||
let mut buffer = Vec::new();
|
||||
columnar_writer.serialize(2, &mut buffer).unwrap();
|
||||
let columnar = ColumnarReader::open(buffer).unwrap();
|
||||
{
|
||||
let columns = columnar.read_columns("col").unwrap();
|
||||
assert_eq!(columns.len(), 1);
|
||||
assert_eq!(columns[0].column_type(), ColumnType::U64);
|
||||
}
|
||||
{
|
||||
let columns = columnar.read_columns("other").unwrap();
|
||||
assert_eq!(columns.len(), 0);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_read_subpath_columns() {
|
||||
let mut columnar_writer = ColumnarWriter::default();
|
||||
columnar_writer.record_str(
|
||||
0,
|
||||
&format!("col1{}subcol1", JSON_PATH_SEGMENT_SEP as char),
|
||||
"hello",
|
||||
);
|
||||
columnar_writer.record_numerical(
|
||||
0,
|
||||
&format!("col1{}subcol2", JSON_PATH_SEGMENT_SEP as char),
|
||||
1i64,
|
||||
);
|
||||
columnar_writer.record_str(1, "col1", "hello");
|
||||
columnar_writer.record_str(0, "col2", "hello");
|
||||
let mut buffer = Vec::new();
|
||||
columnar_writer.serialize(2, &mut buffer).unwrap();
|
||||
|
||||
let columnar = ColumnarReader::open(buffer).unwrap();
|
||||
{
|
||||
let columns = columnar.read_subpath_columns("col1").unwrap();
|
||||
assert_eq!(columns.len(), 2);
|
||||
assert_eq!(columns[0].column_type(), ColumnType::Str);
|
||||
assert_eq!(columns[1].column_type(), ColumnType::I64);
|
||||
}
|
||||
{
|
||||
let columns = columnar.read_subpath_columns("col1.subcol1").unwrap();
|
||||
assert_eq!(columns.len(), 0);
|
||||
}
|
||||
{
|
||||
let columns = columnar.read_subpath_columns("col2").unwrap();
|
||||
assert_eq!(columns.len(), 0);
|
||||
}
|
||||
{
|
||||
let columns = columnar.read_subpath_columns("other").unwrap();
|
||||
assert_eq!(columns.len(), 0);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
#[should_panic(expected = "Input type forbidden")]
|
||||
fn test_list_columns_strict_typing_panics_on_wrong_types() {
|
||||
|
||||
@@ -122,7 +122,6 @@ impl<T> From<T> for ColumnOperation<T> {
|
||||
// In order to limit memory usage, and in order
|
||||
// to benefit from the stacker, we do this by serializing our data
|
||||
// as "Symbols".
|
||||
#[allow(clippy::from_over_into)]
|
||||
pub(super) trait SymbolValue: Clone + Copy {
|
||||
// Serializes the symbol into the given buffer.
|
||||
// Returns the number of bytes written into the buffer.
|
||||
@@ -245,7 +244,7 @@ impl SymbolValue for UnorderedId {
|
||||
|
||||
fn compute_num_bytes_for_u64(val: u64) -> usize {
|
||||
let msb = (64u32 - val.leading_zeros()) as usize;
|
||||
(msb + 7) / 8
|
||||
msb.div_ceil(8)
|
||||
}
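The `div_ceil` form is equivalent to the old rounding expression; a quick sanity check (hypothetical, not part of the change):

for msb in [0usize, 1, 7, 8, 9, 63, 64] {
    assert_eq!((msb + 7) / 8, msb.div_ceil(8));
}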
|
||||
|
||||
fn encode_zig_zag(n: i64) -> u64 {
|
||||
|
||||
@@ -42,7 +42,7 @@ impl ColumnWriter {
|
||||
&self,
|
||||
arena: &MemoryArena,
|
||||
buffer: &'a mut Vec<u8>,
|
||||
) -> impl Iterator<Item = ColumnOperation<V>> + 'a {
|
||||
) -> impl Iterator<Item = ColumnOperation<V>> + 'a + use<'a, V> {
|
||||
buffer.clear();
|
||||
self.values.read_to_end(arena, buffer);
|
||||
let mut cursor: &[u8] = &buffer[..];
|
||||
@@ -104,9 +104,10 @@ pub(crate) struct NumericalColumnWriter {
|
||||
|
||||
impl NumericalColumnWriter {
|
||||
pub fn force_numerical_type(&mut self, numerical_type: NumericalType) {
|
||||
assert!(self
|
||||
.compatible_numerical_types
|
||||
.is_type_accepted(numerical_type));
|
||||
assert!(
|
||||
self.compatible_numerical_types
|
||||
.is_type_accepted(numerical_type)
|
||||
);
|
||||
self.compatible_numerical_types = CompatibleNumericalTypes::StaticType(numerical_type);
|
||||
}
|
||||
}
|
||||
@@ -211,7 +212,7 @@ impl NumericalColumnWriter {
|
||||
self,
|
||||
arena: &MemoryArena,
|
||||
buffer: &'a mut Vec<u8>,
|
||||
) -> impl Iterator<Item = ColumnOperation<NumericalValue>> + 'a {
|
||||
) -> impl Iterator<Item = ColumnOperation<NumericalValue>> + 'a + use<'a> {
|
||||
self.column_writer.operation_iterator(arena, buffer)
|
||||
}
|
||||
}
|
||||
@@ -255,7 +256,7 @@ impl StrOrBytesColumnWriter {
|
||||
&self,
|
||||
arena: &MemoryArena,
|
||||
byte_buffer: &'a mut Vec<u8>,
|
||||
) -> impl Iterator<Item = ColumnOperation<UnorderedId>> + 'a {
|
||||
) -> impl Iterator<Item = ColumnOperation<UnorderedId>> + 'a + use<'a> {
|
||||
self.column_writer.operation_iterator(arena, byte_buffer)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -8,13 +8,13 @@ use std::net::Ipv6Addr;
|
||||
|
||||
use column_operation::ColumnOperation;
|
||||
pub(crate) use column_writers::CompatibleNumericalTypes;
|
||||
use common::json_path_writer::JSON_END_OF_PATH;
|
||||
use common::CountingWriter;
|
||||
use common::json_path_writer::JSON_END_OF_PATH;
|
||||
pub(crate) use serializer::ColumnarSerializer;
|
||||
use stacker::{Addr, ArenaHashMap, MemoryArena};
|
||||
|
||||
use crate::column_index::{SerializableColumnIndex, SerializableOptionalIndex};
|
||||
use crate::column_values::{MonotonicallyMappableToU128, MonotonicallyMappableToU64};
|
||||
use crate::column_values::{MonotonicallyMappableToU64, MonotonicallyMappableToU128};
|
||||
use crate::columnar::column_type::ColumnType;
|
||||
use crate::columnar::writer::column_writers::{
|
||||
ColumnWriter, NumericalColumnWriter, StrOrBytesColumnWriter,
|
||||
@@ -285,7 +285,6 @@ impl ColumnarWriter {
|
||||
.map(|(column_name, addr)| (column_name, ColumnType::DateTime, addr)),
|
||||
);
|
||||
columns.sort_unstable_by_key(|(column_name, col_type, _)| (*column_name, *col_type));
|
||||
|
||||
let (arena, buffers, dictionaries) = (&self.arena, &mut self.buffers, &self.dictionaries);
|
||||
let mut symbol_byte_buffer: Vec<u8> = Vec::new();
|
||||
for (column_name, column_type, addr) in columns {
|
||||
@@ -392,7 +391,7 @@ impl ColumnarWriter {
|
||||
|
||||
// Serialize [Dictionary, Column, dictionary num bytes U32::LE]
|
||||
// Column: [Column Index, Column Values, column index num bytes U32::LE]
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
#[expect(clippy::too_many_arguments)]
|
||||
fn serialize_bytes_or_str_column(
|
||||
cardinality: Cardinality,
|
||||
num_docs: RowId,
|
||||
|
||||
@@ -3,11 +3,11 @@ use std::io::Write;
|
||||
|
||||
use common::json_path_writer::JSON_END_OF_PATH;
|
||||
use common::{BinarySerializable, CountingWriter};
|
||||
use sstable::value::RangeValueWriter;
|
||||
use sstable::RangeSSTable;
|
||||
use sstable::value::RangeValueWriter;
|
||||
|
||||
use crate::columnar::ColumnType;
|
||||
use crate::RowId;
|
||||
use crate::columnar::ColumnType;
|
||||
|
||||
pub struct ColumnarSerializer<W: io::Write> {
|
||||
wrt: CountingWriter<W>,
|
||||
@@ -67,7 +67,7 @@ pub struct ColumnSerializer<'a, W: io::Write> {
|
||||
start_offset: u64,
|
||||
}
|
||||
|
||||
impl<'a, W: io::Write> ColumnSerializer<'a, W> {
|
||||
impl<W: io::Write> ColumnSerializer<'_, W> {
|
||||
pub fn finalize(self) -> io::Result<()> {
|
||||
let end_offset: u64 = self.columnar_serializer.wrt.written_bytes();
|
||||
let byte_range = self.start_offset..end_offset;
|
||||
@@ -80,7 +80,7 @@ impl<'a, W: io::Write> ColumnSerializer<'a, W> {
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a, W: io::Write> io::Write for ColumnSerializer<'a, W> {
|
||||
impl<W: io::Write> io::Write for ColumnSerializer<'_, W> {
|
||||
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
|
||||
self.columnar_serializer.wrt.write(buf)
|
||||
}
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
use crate::RowId;
|
||||
use crate::column_index::{SerializableMultivalueIndex, SerializableOptionalIndex};
|
||||
use crate::iterable::Iterable;
|
||||
use crate::RowId;
|
||||
|
||||
/// The `IndexBuilder` interprets a sequence of
|
||||
/// calls of the form:
|
||||
@@ -31,12 +31,13 @@ pub struct OptionalIndexBuilder {
|
||||
|
||||
impl OptionalIndexBuilder {
|
||||
pub fn finish(&mut self, num_rows: RowId) -> impl Iterable<RowId> + '_ {
|
||||
debug_assert!(self
|
||||
.docs
|
||||
.last()
|
||||
.copied()
|
||||
.map(|last_doc| last_doc < num_rows)
|
||||
.unwrap_or(true));
|
||||
debug_assert!(
|
||||
self.docs
|
||||
.last()
|
||||
.copied()
|
||||
.map(|last_doc| last_doc < num_rows)
|
||||
.unwrap_or(true)
|
||||
);
|
||||
&self.docs[..]
|
||||
}
|
||||
|
||||
@@ -48,12 +49,13 @@ impl OptionalIndexBuilder {
|
||||
impl IndexBuilder for OptionalIndexBuilder {
|
||||
#[inline(always)]
|
||||
fn record_row(&mut self, doc: RowId) {
|
||||
debug_assert!(self
|
||||
.docs
|
||||
.last()
|
||||
.copied()
|
||||
.map(|prev_doc| doc > prev_doc)
|
||||
.unwrap_or(true));
|
||||
debug_assert!(
|
||||
self.docs
|
||||
.last()
|
||||
.copied()
|
||||
.map(|prev_doc| doc > prev_doc)
|
||||
.unwrap_or(true)
|
||||
);
|
||||
self.docs.push(doc);
|
||||
}
|
||||
}
|
||||
|
||||
22
columnar/src/comparable_doc.rs
Normal file
@@ -0,0 +1,22 @@
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
/// Contains a feature (field, score, etc.) of a document along with the document address.
|
||||
///
|
||||
/// Used only by TopNComputer, which implements the actual comparison via a `Comparator`.
|
||||
#[derive(Clone, Default, Eq, PartialEq, Serialize, Deserialize)]
|
||||
pub struct ComparableDoc<T, D> {
|
||||
/// The feature of the document. In practice, this is
|
||||
/// a type which can be compared with a `Comparator<T>`.
|
||||
pub sort_key: T,
|
||||
/// The document address. In practice, this is either a `DocId` or `DocAddress`.
|
||||
pub doc: D,
|
||||
}
|
||||
|
||||
impl<T: std::fmt::Debug, D: std::fmt::Debug> std::fmt::Debug for ComparableDoc<T, D> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
|
||||
f.debug_struct("ComparableDoc")
|
||||
.field("feature", &self.sort_key)
|
||||
.field("doc", &self.doc)
|
||||
.finish()
|
||||
}
|
||||
}
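A minimal usage sketch (hypothetical; in the real code the ordering is supplied by TopNComputer's `Comparator` rather than hard-coded like this):

let mut hits = vec![
    ComparableDoc { sort_key: 0.3f32, doc: 7u32 },
    ComparableDoc { sort_key: 0.9f32, doc: 2u32 },
];
// Sort descending by score.
hits.sort_by(|a, b| b.sort_key.partial_cmp(&a.sort_key).unwrap());
assert_eq!(hits[0].doc, 2);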
|
||||
@@ -3,8 +3,8 @@ use std::path::PathBuf;
|
||||
use itertools::Itertools;
|
||||
|
||||
use crate::{
|
||||
merge_columnar, Cardinality, Column, ColumnarReader, DynamicColumn, StackMergeOrder,
|
||||
CURRENT_VERSION,
|
||||
CURRENT_VERSION, Cardinality, Column, ColumnarReader, DynamicColumn, StackMergeOrder,
|
||||
merge_columnar,
|
||||
};
|
||||
|
||||
const NUM_DOCS: u32 = u16::MAX as u32;
|
||||
@@ -71,7 +71,14 @@ fn test_format(path: &str) {
|
||||
let columnar_readers = vec![&reader, &reader2];
|
||||
let merge_row_order = StackMergeOrder::stack(&columnar_readers[..]);
|
||||
let mut out = Vec::new();
|
||||
merge_columnar(&columnar_readers, &[], merge_row_order.into(), &mut out).unwrap();
|
||||
merge_columnar(
|
||||
&columnar_readers,
|
||||
&[],
|
||||
merge_row_order.into(),
|
||||
&mut out,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let reader = ColumnarReader::open(out).unwrap();
|
||||
check_columns(&reader);
|
||||
}
|
||||
|
||||
@@ -3,10 +3,11 @@ use std::sync::Arc;
|
||||
use std::{fmt, io};
|
||||
|
||||
use common::file_slice::FileSlice;
|
||||
use common::{ByteCount, DateTime, HasLen, OwnedBytes};
|
||||
use common::{ByteCount, DateTime};
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
use crate::column::{BytesColumn, Column, StrColumn};
|
||||
use crate::column_values::{monotonic_map_column, StrictlyMonotonicFn};
|
||||
use crate::column_values::{StrictlyMonotonicFn, monotonic_map_column};
|
||||
use crate::columnar::ColumnType;
|
||||
use crate::{Cardinality, ColumnIndex, ColumnValues, NumericalType, Version};
|
||||
|
||||
@@ -238,8 +239,7 @@ pub struct DynamicColumnHandle {
|
||||
impl DynamicColumnHandle {
|
||||
// TODO rename load
|
||||
pub fn open(&self) -> io::Result<DynamicColumn> {
|
||||
let column_bytes: OwnedBytes = self.file_slice.read_bytes()?;
|
||||
self.open_internal(column_bytes)
|
||||
self.open_internal(self.file_slice.clone())
|
||||
}
|
||||
|
||||
#[doc(hidden)]
|
||||
@@ -258,16 +258,15 @@ impl DynamicColumnHandle {
|
||||
/// If not, the fastfield reader will return the u64 value associated with the original
|
||||
/// FastValue.
|
||||
pub fn open_u64_lenient(&self) -> io::Result<Option<Column<u64>>> {
|
||||
let column_bytes = self.file_slice.read_bytes()?;
|
||||
match self.column_type {
|
||||
ColumnType::Str | ColumnType::Bytes => {
|
||||
let column: BytesColumn =
|
||||
crate::column::open_column_bytes(column_bytes, self.format_version)?;
|
||||
crate::column::open_column_bytes(self.file_slice.clone(), self.format_version)?;
|
||||
Ok(Some(column.term_ord_column))
|
||||
}
|
||||
ColumnType::IpAddr => {
|
||||
let column = crate::column::open_column_u128_as_compact_u64(
|
||||
column_bytes,
|
||||
self.file_slice.clone(),
|
||||
self.format_version,
|
||||
)?;
|
||||
Ok(Some(column))
|
||||
@@ -277,50 +276,129 @@ impl DynamicColumnHandle {
|
||||
| ColumnType::U64
|
||||
| ColumnType::F64
|
||||
| ColumnType::DateTime => {
|
||||
let column =
|
||||
crate::column::open_column_u64::<u64>(column_bytes, self.format_version)?;
|
||||
let column = crate::column::open_column_u64::<u64>(
|
||||
self.file_slice.clone(),
|
||||
self.format_version,
|
||||
)?;
|
||||
Ok(Some(column))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
fn open_internal(&self, column_bytes: OwnedBytes) -> io::Result<DynamicColumn> {
|
||||
fn open_internal(&self, file_slice: FileSlice) -> io::Result<DynamicColumn> {
|
||||
let dynamic_column: DynamicColumn = match self.column_type {
|
||||
ColumnType::Bytes => {
|
||||
crate::column::open_column_bytes(column_bytes, self.format_version)?.into()
|
||||
crate::column::open_column_bytes(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::Str => {
|
||||
crate::column::open_column_str(column_bytes, self.format_version)?.into()
|
||||
crate::column::open_column_str(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::I64 => {
|
||||
crate::column::open_column_u64::<i64>(column_bytes, self.format_version)?.into()
|
||||
crate::column::open_column_u64::<i64>(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::U64 => {
|
||||
crate::column::open_column_u64::<u64>(column_bytes, self.format_version)?.into()
|
||||
crate::column::open_column_u64::<u64>(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::F64 => {
|
||||
crate::column::open_column_u64::<f64>(column_bytes, self.format_version)?.into()
|
||||
crate::column::open_column_u64::<f64>(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::Bool => {
|
||||
crate::column::open_column_u64::<bool>(column_bytes, self.format_version)?.into()
|
||||
crate::column::open_column_u64::<bool>(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::IpAddr => {
|
||||
crate::column::open_column_u128::<Ipv6Addr>(column_bytes, self.format_version)?
|
||||
.into()
|
||||
crate::column::open_column_u128::<Ipv6Addr>(file_slice, self.format_version)?.into()
|
||||
}
|
||||
ColumnType::DateTime => {
|
||||
crate::column::open_column_u64::<DateTime>(column_bytes, self.format_version)?
|
||||
.into()
|
||||
crate::column::open_column_u64::<DateTime>(file_slice, self.format_version)?.into()
|
||||
}
|
||||
};
|
||||
Ok(dynamic_column)
|
||||
}
|
||||
|
||||
pub fn num_bytes(&self) -> ByteCount {
|
||||
self.file_slice.len().into()
|
||||
self.file_slice.num_bytes()
|
||||
}
|
||||
|
||||
/// Legacy helper returning the column space usage.
|
||||
pub fn column_and_dictionary_num_bytes(&self) -> io::Result<ColumnSpaceUsage> {
|
||||
self.space_usage()
|
||||
}
|
||||
|
||||
/// Return the space usage of the column, optionally broken down by dictionary and column
|
||||
/// values.
|
||||
///
|
||||
/// For dictionary encoded columns (strings and bytes), this splits the total footprint into
|
||||
/// the dictionary and the remaining column data (including index and values).
|
||||
/// For all other column types, the dictionary size is `None` and the column size
|
||||
/// equals the total bytes.
|
||||
pub fn space_usage(&self) -> io::Result<ColumnSpaceUsage> {
|
||||
let total_num_bytes = self.num_bytes();
|
||||
let dynamic_column = self.open()?;
|
||||
let dictionary_num_bytes = match &dynamic_column {
|
||||
DynamicColumn::Bytes(bytes_column) => bytes_column.dictionary().num_bytes(),
|
||||
DynamicColumn::Str(str_column) => str_column.dictionary().num_bytes(),
|
||||
_ => {
|
||||
return Ok(ColumnSpaceUsage::new(self.num_bytes(), None));
|
||||
}
|
||||
};
|
||||
assert!(dictionary_num_bytes <= total_num_bytes);
|
||||
let column_num_bytes =
|
||||
ByteCount::from(total_num_bytes.get_bytes() - dictionary_num_bytes.get_bytes());
|
||||
Ok(ColumnSpaceUsage::new(
|
||||
column_num_bytes,
|
||||
Some(dictionary_num_bytes),
|
||||
))
|
||||
}
|
||||
|
||||
pub fn column_type(&self) -> ColumnType {
|
||||
self.column_type
|
||||
}
|
||||
}
|
||||
|
||||
/// Represents space usage of a column.
|
||||
///
|
||||
/// `column_num_bytes` tracks the column payload (index, values and footer).
|
||||
/// For dictionary encoded columns, `dictionary_num_bytes` captures the dictionary footprint.
|
||||
/// [`ColumnSpaceUsage::total_num_bytes`] returns the sum of both parts.
|
||||
#[derive(Clone, Debug, Serialize, Deserialize)]
|
||||
pub struct ColumnSpaceUsage {
|
||||
column_num_bytes: ByteCount,
|
||||
dictionary_num_bytes: Option<ByteCount>,
|
||||
}
|
||||
|
||||
impl ColumnSpaceUsage {
|
||||
pub(crate) fn new(
|
||||
column_num_bytes: ByteCount,
|
||||
dictionary_num_bytes: Option<ByteCount>,
|
||||
) -> Self {
|
||||
ColumnSpaceUsage {
|
||||
column_num_bytes,
|
||||
dictionary_num_bytes,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn column_num_bytes(&self) -> ByteCount {
|
||||
self.column_num_bytes
|
||||
}
|
||||
|
||||
pub fn dictionary_num_bytes(&self) -> Option<ByteCount> {
|
||||
self.dictionary_num_bytes
|
||||
}
|
||||
|
||||
pub fn total_num_bytes(&self) -> ByteCount {
|
||||
self.column_num_bytes + self.dictionary_num_bytes.unwrap_or_default()
|
||||
}
|
||||
|
||||
/// Merge two space usage values by summing their components.
|
||||
pub fn merge(&self, other: &ColumnSpaceUsage) -> ColumnSpaceUsage {
|
||||
let dictionary_num_bytes = match (self.dictionary_num_bytes, other.dictionary_num_bytes) {
|
||||
(Some(lhs), Some(rhs)) => Some(lhs + rhs),
|
||||
(Some(val), None) | (None, Some(val)) => Some(val),
|
||||
(None, None) => None,
|
||||
};
|
||||
ColumnSpaceUsage {
|
||||
column_num_bytes: self.column_num_bytes + other.column_num_bytes,
|
||||
dictionary_num_bytes,
|
||||
}
|
||||
}
|
||||
}
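Within the crate, the accounting works out like this (hypothetical byte counts; the `ByteCount` conversions assume its `From<u64>` impl):

let dict_col = ColumnSpaceUsage::new(ByteCount::from(100u64), Some(ByteCount::from(40u64)));
let plain_col = ColumnSpaceUsage::new(ByteCount::from(60u64), None);
let merged = dict_col.merge(&plain_col);
assert_eq!(merged.column_num_bytes(), ByteCount::from(160u64));
assert_eq!(merged.dictionary_num_bytes(), Some(ByteCount::from(40u64)));
assert_eq!(merged.total_num_bytes(), ByteCount::from(200u64));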
|
||||
|
||||
@@ -7,7 +7,7 @@ pub trait Iterable<T = u64> {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = T> + '_>;
|
||||
}
|
||||
|
||||
impl<'a, T: Copy> Iterable<T> for &'a [T] {
|
||||
impl<T: Copy> Iterable<T> for &[T] {
|
||||
fn boxed_iter(&self) -> Box<dyn Iterator<Item = T> + '_> {
|
||||
Box::new(self.iter().copied())
|
||||
}
|
||||
|
||||
@@ -17,15 +17,10 @@
|
||||
//! column.
|
||||
//! - [column_values]: Stores the values of a column in a dense format.
|
||||
|
||||
#![cfg_attr(all(feature = "unstable", test), feature(test))]
|
||||
|
||||
#[cfg(test)]
|
||||
#[macro_use]
|
||||
extern crate more_asserts;
|
||||
|
||||
#[cfg(all(test, feature = "unstable"))]
|
||||
extern crate test;
|
||||
|
||||
use std::fmt::Display;
|
||||
use std::io;
|
||||
|
||||
@@ -34,6 +29,7 @@ mod column;
|
||||
pub mod column_index;
|
||||
pub mod column_values;
|
||||
mod columnar;
|
||||
mod comparable_doc;
|
||||
mod dictionary;
|
||||
mod dynamic_column;
|
||||
mod iterable;
|
||||
@@ -41,19 +37,20 @@ pub(crate) mod utils;
|
||||
mod value;
|
||||
|
||||
pub use block_accessor::ColumnBlockAccessor;
|
||||
pub use column::{BytesColumn, Column, StrColumn};
|
||||
pub use column::{BytesColumn, Column, StrColumn, ValueRange};
|
||||
pub use column_index::ColumnIndex;
|
||||
pub use column_values::{
|
||||
ColumnValues, EmptyColumnValues, MonotonicallyMappableToU128, MonotonicallyMappableToU64,
|
||||
ColumnValues, EmptyColumnValues, MonotonicallyMappableToU64, MonotonicallyMappableToU128,
|
||||
};
|
||||
pub use columnar::{
|
||||
merge_columnar, ColumnType, ColumnarReader, ColumnarWriter, HasAssociatedColumnType,
|
||||
MergeRowOrder, ShuffleMergeOrder, StackMergeOrder, Version, CURRENT_VERSION,
|
||||
CURRENT_VERSION, ColumnType, ColumnarReader, ColumnarWriter, HasAssociatedColumnType,
|
||||
MergeRowOrder, ShuffleMergeOrder, StackMergeOrder, Version, merge_columnar,
|
||||
};
|
||||
pub use comparable_doc::ComparableDoc;
|
||||
use sstable::VoidSSTable;
|
||||
pub use value::{NumericalType, NumericalValue};
|
||||
|
||||
pub use self::dynamic_column::{DynamicColumn, DynamicColumnHandle};
|
||||
pub use self::dynamic_column::{ColumnSpaceUsage, DynamicColumn, DynamicColumnHandle};
|
||||
|
||||
pub type RowId = u32;
|
||||
pub type DocId = u32;
|
||||
|
||||
@@ -380,7 +380,7 @@ fn assert_columnar_eq(
|
||||
right: &ColumnarReader,
|
||||
lenient_on_numerical_value: bool,
|
||||
) {
|
||||
assert_eq!(left.num_rows(), right.num_rows());
|
||||
assert_eq!(left.num_docs(), right.num_docs());
|
||||
let left_columns = left.list_columns().unwrap();
|
||||
let right_columns = right.list_columns().unwrap();
|
||||
assert_eq!(left_columns.len(), right_columns.len());
|
||||
@@ -588,7 +588,7 @@ proptest! {
|
||||
#[test]
|
||||
fn test_single_columnar_builder_proptest(docs in columnar_docs_strategy()) {
|
||||
let columnar = build_columnar(&docs[..]);
|
||||
assert_eq!(columnar.num_rows() as usize, docs.len());
|
||||
assert_eq!(columnar.num_docs() as usize, docs.len());
|
||||
let mut expected_columns: HashMap<(&str, ColumnTypeCategory), HashMap<u32, Vec<&ColumnValue>> > = Default::default();
|
||||
for (doc_id, doc_vals) in docs.iter().enumerate() {
|
||||
for (col_name, col_val) in doc_vals {
|
||||
@@ -641,7 +641,7 @@ proptest! {
|
||||
let columnar_readers_arr: Vec<&ColumnarReader> = columnar_readers.iter().collect();
|
||||
let mut output: Vec<u8> = Vec::new();
|
||||
let stack_merge_order = StackMergeOrder::stack(&columnar_readers_arr[..]).into();
|
||||
crate::merge_columnar(&columnar_readers_arr[..], &[], stack_merge_order, &mut output).unwrap();
|
||||
crate::merge_columnar(&columnar_readers_arr[..], &[], stack_merge_order, &mut output, || false,).unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
let concat_rows: Vec<Vec<(&'static str, ColumnValue)>> = columnar_docs.iter().flatten().cloned().collect();
|
||||
let expected_merged_columnar = build_columnar(&concat_rows[..]);
|
||||
@@ -665,6 +665,7 @@ fn test_columnar_merging_empty_columnar() {
|
||||
&[],
|
||||
crate::MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut output,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
@@ -702,6 +703,7 @@ fn test_columnar_merging_number_columns() {
|
||||
&[],
|
||||
crate::MergeRowOrder::Stack(stack_merge_order),
|
||||
&mut output,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
@@ -715,8 +717,9 @@ fn test_columnar_merging_number_columns() {
|
||||
// TODO test required_columns
|
||||
// TODO document edge case: required_columns incompatible with values.
|
||||
|
||||
fn columnar_docs_and_remap(
|
||||
) -> impl Strategy<Value = (Vec<Vec<Vec<(&'static str, ColumnValue)>>>, Vec<RowAddr>)> {
|
||||
#[allow(clippy::type_complexity)]
|
||||
fn columnar_docs_and_remap()
|
||||
-> impl Strategy<Value = (Vec<Vec<Vec<(&'static str, ColumnValue)>>>, Vec<RowAddr>)> {
|
||||
proptest::collection::vec(columnar_docs_strategy(), 2..=3).prop_flat_map(
|
||||
|columnars_docs: Vec<Vec<Vec<(&str, ColumnValue)>>>| {
|
||||
let row_addrs: Vec<RowAddr> = columnars_docs
|
||||
@@ -774,6 +777,7 @@ fn test_columnar_merge_and_remap(
|
||||
&[],
|
||||
shuffle_merge_order.into(),
|
||||
&mut output,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
@@ -816,10 +820,11 @@ fn test_columnar_merge_empty() {
|
||||
&[],
|
||||
shuffle_merge_order.into(),
|
||||
&mut output,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
assert_eq!(merged_columnar.num_rows(), 0);
|
||||
assert_eq!(merged_columnar.num_docs(), 0);
|
||||
assert_eq!(merged_columnar.num_columns(), 0);
|
||||
}
|
||||
|
||||
@@ -842,10 +847,11 @@ fn test_columnar_merge_single_str_column() {
|
||||
&[],
|
||||
shuffle_merge_order.into(),
|
||||
&mut output,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
assert_eq!(merged_columnar.num_rows(), 1);
|
||||
assert_eq!(merged_columnar.num_docs(), 1);
|
||||
assert_eq!(merged_columnar.num_columns(), 1);
|
||||
}
|
||||
|
||||
@@ -874,10 +880,11 @@ fn test_delete_decrease_cardinality() {
|
||||
&[],
|
||||
shuffle_merge_order.into(),
|
||||
&mut output,
|
||||
|| false,
|
||||
)
|
||||
.unwrap();
|
||||
let merged_columnar = ColumnarReader::open(output).unwrap();
|
||||
assert_eq!(merged_columnar.num_rows(), 1);
|
||||
assert_eq!(merged_columnar.num_docs(), 1);
|
||||
assert_eq!(merged_columnar.num_columns(), 1);
|
||||
let cols = merged_columnar.read_columns("c").unwrap();
|
||||
assert_eq!(cols.len(), 1);
|
||||
|
||||
@@ -1,3 +1,5 @@
|
||||
use std::str::FromStr;
|
||||
|
||||
use common::DateTime;
|
||||
|
||||
use crate::InvalidData;
|
||||
@@ -9,6 +11,23 @@ pub enum NumericalValue {
|
||||
F64(f64),
|
||||
}
|
||||
|
||||
impl FromStr for NumericalValue {
|
||||
type Err = ();
|
||||
|
||||
fn from_str(s: &str) -> Result<Self, ()> {
|
||||
if let Ok(val_i64) = s.parse::<i64>() {
|
||||
return Ok(val_i64.into());
|
||||
}
|
||||
if let Ok(val_u64) = s.parse::<u64>() {
|
||||
return Ok(val_u64.into());
|
||||
}
|
||||
if let Ok(val_f64) = s.parse::<f64>() {
|
||||
return Ok(NumericalValue::from(val_f64).normalize());
|
||||
}
|
||||
Err(())
|
||||
}
|
||||
}
|
||||
|
||||
impl NumericalValue {
|
||||
pub fn numerical_type(&self) -> NumericalType {
|
||||
match self {
|
||||
@@ -26,7 +45,7 @@ impl NumericalValue {
|
||||
if val <= i64::MAX as u64 {
|
||||
NumericalValue::I64(val as i64)
|
||||
} else {
|
||||
NumericalValue::F64(val as f64)
|
||||
NumericalValue::U64(val)
|
||||
}
|
||||
}
|
||||
NumericalValue::I64(val) => NumericalValue::I64(val),
|
||||
@@ -141,6 +160,7 @@ impl Coerce for DateTime {
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::NumericalType;
|
||||
use crate::NumericalValue;
|
||||
|
||||
#[test]
|
||||
fn test_numerical_type_code() {
|
||||
@@ -153,4 +173,58 @@ mod tests {
|
||||
}
|
||||
assert_eq!(num_numerical_type, 3);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_parse_numerical() {
|
||||
assert_eq!(
|
||||
"123".parse::<NumericalValue>().unwrap(),
|
||||
NumericalValue::I64(123)
|
||||
);
|
||||
assert_eq!(
|
||||
"18446744073709551615".parse::<NumericalValue>().unwrap(),
|
||||
NumericalValue::U64(18446744073709551615u64)
|
||||
);
|
||||
assert_eq!(
|
||||
"1.0".parse::<NumericalValue>().unwrap(),
|
||||
NumericalValue::I64(1i64)
|
||||
);
|
||||
assert_eq!(
|
||||
"1.1".parse::<NumericalValue>().unwrap(),
|
||||
NumericalValue::F64(1.1f64)
|
||||
);
|
||||
assert_eq!(
|
||||
"-1.0".parse::<NumericalValue>().unwrap(),
|
||||
NumericalValue::I64(-1i64)
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_normalize_numerical() {
|
||||
assert_eq!(
|
||||
NumericalValue::from(1u64).normalize(),
|
||||
NumericalValue::I64(1i64),
|
||||
);
|
||||
let limit_val = i64::MAX as u64 + 1u64;
|
||||
assert_eq!(
|
||||
NumericalValue::from(limit_val).normalize(),
|
||||
NumericalValue::U64(limit_val),
|
||||
);
|
||||
assert_eq!(
|
||||
NumericalValue::from(-1i64).normalize(),
|
||||
NumericalValue::I64(-1i64),
|
||||
);
|
||||
assert_eq!(
|
||||
NumericalValue::from(-2.0f64).normalize(),
|
||||
NumericalValue::I64(-2i64),
|
||||
);
|
||||
assert_eq!(
|
||||
NumericalValue::from(-2.1f64).normalize(),
|
||||
NumericalValue::F64(-2.1f64),
|
||||
);
|
||||
let large_float = 2.0f64.powf(70.0f64);
|
||||
assert_eq!(
|
||||
NumericalValue::from(large_float).normalize(),
|
||||
NumericalValue::F64(large_float),
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
[package]
|
||||
name = "tantivy-common"
|
||||
version = "0.7.0"
|
||||
version = "0.10.0"
|
||||
authors = ["Paul Masurel <paul@quickwit.io>", "Pascal Seitz <pascal@quickwit.io>"]
|
||||
license = "MIT"
|
||||
edition = "2021"
|
||||
edition = "2024"
|
||||
description = "common traits and utility functions used by multiple tantivy subcrates"
|
||||
documentation = "https://docs.rs/tantivy_common/"
|
||||
homepage = "https://github.com/quickwit-oss/tantivy"
|
||||
@@ -13,13 +13,13 @@ repository = "https://github.com/quickwit-oss/tantivy"
|
||||
|
||||
[dependencies]
|
||||
byteorder = "1.4.3"
|
||||
ownedbytes = { version= "0.7", path="../ownedbytes" }
|
||||
ownedbytes = { version= "0.9", path="../ownedbytes" }
|
||||
async-trait = "0.1"
|
||||
time = { version = "0.3.10", features = ["serde-well-known"] }
|
||||
serde = { version = "1.0.136", features = ["derive"] }
|
||||
|
||||
[dev-dependencies]
|
||||
binggan = "0.10.0"
|
||||
binggan = "0.14.0"
|
||||
proptest = "1.0.0"
|
||||
rand = "0.8.4"
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
use binggan::{black_box, BenchRunner};
|
||||
use binggan::{BenchRunner, black_box};
|
||||
use rand::seq::IteratorRandom;
|
||||
use rand::thread_rng;
|
||||
use tantivy_common::{serialize_vint_u32, BitSet, TinySet};
|
||||
use tantivy_common::{BitSet, TinySet, serialize_vint_u32};
|
||||
|
||||
fn bench_vint() {
|
||||
let mut runner = BenchRunner::new();
|
||||
@@ -15,7 +15,6 @@ fn bench_vint() {
|
||||
out += u64::from(buf[0]);
|
||||
}
|
||||
black_box(out);
|
||||
None
|
||||
});
|
||||
|
||||
let vals: Vec<u32> = (0..20_000).choose_multiple(&mut thread_rng(), 100_000);
|
||||
@@ -27,7 +26,6 @@ fn bench_vint() {
|
||||
out += u64::from(buf[0]);
|
||||
}
|
||||
black_box(out);
|
||||
None
|
||||
});
|
||||
}
|
||||
|
||||
@@ -43,24 +41,20 @@ fn bench_bitset() {
|
||||
tinyset.pop_lowest();
|
||||
tinyset.pop_lowest();
|
||||
black_box(tinyset);
|
||||
None
|
||||
});
|
||||
|
||||
let tiny_set = TinySet::empty().insert(10u32).insert(14u32).insert(21u32);
|
||||
runner.bench_function("bench_tinyset_sum", move |_| {
|
||||
assert_eq!(black_box(tiny_set).into_iter().sum::<u32>(), 45u32);
|
||||
None
|
||||
});
|
||||
|
||||
let v = [10u32, 14u32, 21u32];
|
||||
runner.bench_function("bench_tinyarr_sum", move |_| {
|
||||
black_box(v.iter().cloned().sum::<u32>());
|
||||
None
|
||||
});
|
||||
|
||||
runner.bench_function("bench_bitset_initialize", move |_| {
|
||||
black_box(BitSet::with_max_value(1_000_000));
|
||||
None
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
@@ -183,7 +183,7 @@ pub struct BitSet {
|
||||
}
|
||||
|
||||
fn num_buckets(max_val: u32) -> u32 {
|
||||
(max_val + 63u32) / 64u32
|
||||
max_val.div_ceil(64u32)
|
||||
}
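Worked example: `BitSet::with_max_value(1_000_000)` (as in the benchmark above) needs num_buckets(1_000_000) = 1_000_000.div_ceil(64) = 15_625 sixty-four-bit words, the same result the old `(max_val + 63) / 64` expression produced.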
|
||||
|
||||
impl BitSet {
|
||||
|
||||
@@ -65,11 +65,11 @@ pub fn transform_bound_inner_res<TFrom, TTo>(
|
||||
) -> io::Result<Bound<TTo>> {
|
||||
use self::Bound::*;
|
||||
Ok(match bound {
|
||||
Excluded(ref from_val) => match transform(from_val)? {
|
||||
Excluded(from_val) => match transform(from_val)? {
|
||||
TransformBound::NewBound(new_val) => new_val,
|
||||
TransformBound::Existing(new_val) => Excluded(new_val),
|
||||
},
|
||||
Included(ref from_val) => match transform(from_val)? {
|
||||
Included(from_val) => match transform(from_val)? {
|
||||
TransformBound::NewBound(new_val) => new_val,
|
||||
TransformBound::Existing(new_val) => Included(new_val),
|
||||
},
|
||||
@@ -85,11 +85,11 @@ pub fn transform_bound_inner<TFrom, TTo>(
|
||||
) -> Bound<TTo> {
|
||||
use self::Bound::*;
|
||||
match bound {
|
||||
Excluded(ref from_val) => match transform(from_val) {
|
||||
Excluded(from_val) => match transform(from_val) {
|
||||
TransformBound::NewBound(new_val) => new_val,
|
||||
TransformBound::Existing(new_val) => Excluded(new_val),
|
||||
},
|
||||
Included(ref from_val) => match transform(from_val) {
|
||||
Included(from_val) => match transform(from_val) {
|
||||
TransformBound::NewBound(new_val) => new_val,
|
||||
TransformBound::Existing(new_val) => Included(new_val),
|
||||
},
|
||||
@@ -111,8 +111,8 @@ pub fn map_bound<TFrom, TTo>(
|
||||
) -> Bound<TTo> {
|
||||
use self::Bound::*;
|
||||
match bound {
|
||||
Excluded(ref from_val) => Bound::Excluded(transform(from_val)),
|
||||
Included(ref from_val) => Bound::Included(transform(from_val)),
|
||||
Excluded(from_val) => Bound::Excluded(transform(from_val)),
|
||||
Included(from_val) => Bound::Included(transform(from_val)),
|
||||
Unbounded => Unbounded,
|
||||
}
|
||||
}
|
||||
@@ -123,8 +123,8 @@ pub fn map_bound_res<TFrom, TTo, Err>(
|
||||
) -> Result<Bound<TTo>, Err> {
|
||||
use self::Bound::*;
|
||||
Ok(match bound {
|
||||
Excluded(ref from_val) => Excluded(transform(from_val)?),
|
||||
Included(ref from_val) => Included(transform(from_val)?),
|
||||
Excluded(from_val) => Excluded(transform(from_val)?),
|
||||
Included(from_val) => Included(transform(from_val)?),
|
||||
Unbounded => Unbounded,
|
||||
})
|
||||
}
|
||||
|
||||
106
common/src/buffered_file_slice.rs
Normal file
@@ -0,0 +1,106 @@
|
||||
use std::cell::RefCell;
|
||||
use std::cmp::min;
|
||||
use std::io;
|
||||
use std::ops::Range;
|
||||
|
||||
use super::file_slice::FileSlice;
|
||||
use super::{HasLen, OwnedBytes};
|
||||
|
||||
const DEFAULT_BUFFER_MAX_SIZE: usize = 512 * 1024; // 512K
|
||||
|
||||
/// A buffered reader for a FileSlice.
|
||||
///
|
||||
/// Reads the underlying `FileSlice` in large, sequential chunks to amortize
|
||||
/// the cost of `read_bytes` calls, while keeping peak memory usage under control.
|
||||
///
|
||||
/// TODO: Rather than wrapping a `FileSlice` in buffering, it will usually be better to adjust a
|
||||
/// `FileHandle` to directly handle buffering itself.
|
||||
/// TODO: See: https://github.com/paradedb/paradedb/issues/3374
|
||||
pub struct BufferedFileSlice {
|
||||
file_slice: FileSlice,
|
||||
buffer: RefCell<OwnedBytes>,
|
||||
buffer_range: RefCell<Range<u64>>,
|
||||
buffer_max_size: usize,
|
||||
}
|
||||
|
||||
impl BufferedFileSlice {
|
||||
/// Creates a new `BufferedFileSlice`.
|
||||
///
|
||||
/// The `buffer_max_size` is the amount of data that will be read from the
|
||||
/// `FileSlice` on a buffer miss.
|
||||
pub fn new(file_slice: FileSlice, buffer_max_size: usize) -> Self {
|
||||
Self {
|
||||
file_slice,
|
||||
buffer: RefCell::new(OwnedBytes::empty()),
|
||||
buffer_range: RefCell::new(0..0),
|
||||
buffer_max_size,
|
||||
}
|
||||
}
|
||||
|
||||
/// Creates a new `BufferedFileSlice` with a default buffer max size.
|
||||
pub fn new_with_default_buffer_size(file_slice: FileSlice) -> Self {
|
||||
Self::new(file_slice, DEFAULT_BUFFER_MAX_SIZE)
|
||||
}
|
||||
|
||||
/// Creates an empty `BufferedFileSlice`.
|
||||
pub fn empty() -> Self {
|
||||
Self::new(FileSlice::empty(), 0)
|
||||
}
|
||||
|
||||
/// Returns an `OwnedBytes` corresponding to the given `required_range`.
|
||||
///
|
||||
/// If the requested range is not in the buffer, this will trigger a read
|
||||
/// from the underlying `FileSlice`.
|
||||
///
|
||||
/// If the requested range is larger than `buffer_max_size`, it will be read directly from the
|
||||
/// source without buffering.
|
||||
///
|
||||
/// # Errors
|
||||
///
|
||||
/// Returns an `io::Error` if the underlying read fails or the range is
|
||||
/// out of bounds.
|
||||
pub fn get_bytes(&self, required_range: Range<u64>) -> io::Result<OwnedBytes> {
|
||||
let buffer_range = self.buffer_range.borrow();
|
||||
|
||||
// Cache miss condition: the required range is not fully contained in the current buffer.
|
||||
if required_range.start < buffer_range.start || required_range.end > buffer_range.end {
|
||||
drop(buffer_range); // release borrow before mutating
|
||||
|
||||
if required_range.end > self.file_slice.len() as u64 {
|
||||
return Err(io::Error::new(
|
||||
io::ErrorKind::UnexpectedEof,
|
||||
"Requested range extends beyond the end of the file slice.",
|
||||
));
|
||||
}
|
||||
|
||||
if (required_range.end - required_range.start) as usize > self.buffer_max_size {
|
||||
// This read is larger than our buffer max size.
|
||||
// Read it directly and bypass the buffer to avoid churning.
|
||||
return self
|
||||
.file_slice
|
||||
.read_bytes_slice(required_range.start as usize..required_range.end as usize);
|
||||
}
|
||||
|
||||
let new_buffer_start = required_range.start;
|
||||
let new_buffer_end = min(
|
||||
new_buffer_start + self.buffer_max_size as u64,
|
||||
self.file_slice.len() as u64,
|
||||
);
|
||||
let read_range = new_buffer_start..new_buffer_end;
|
||||
|
||||
let new_buffer = self
|
||||
.file_slice
|
||||
.read_bytes_slice(read_range.start as usize..read_range.end as usize)?;
|
||||
|
||||
self.buffer.replace(new_buffer);
|
||||
self.buffer_range.replace(read_range);
|
||||
}
|
||||
|
||||
// Now the data is guaranteed to be in the buffer.
|
||||
let buffer = self.buffer.borrow();
|
||||
let buffer_range = self.buffer_range.borrow();
|
||||
let local_start = (required_range.start - buffer_range.start) as usize;
|
||||
let local_end = (required_range.end - buffer_range.start) as usize;
|
||||
Ok(buffer.slice(local_start..local_end))
|
||||
}
|
||||
}
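A short usage sketch under the defaults above, inside a function returning `io::Result`; the file name and offsets are hypothetical:

let file = FileSlice::open(Path::new("segment.store"))?;
let buffered = BufferedFileSlice::new_with_default_buffer_size(file);
let header = buffered.get_bytes(0..16)?;   // miss: reads up to 512K starting at offset 0
let body = buffered.get_bytes(16..4096)?;  // hit (if the file is at least 4 KiB): served from the same buffer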
|
||||
@@ -1,6 +1,7 @@
|
||||
use std::fs::File;
|
||||
use std::ops::{Deref, Range, RangeBounds};
|
||||
use std::sync::Arc;
|
||||
use std::path::Path;
|
||||
use std::sync::{Arc, OnceLock};
|
||||
use std::{fmt, io};
|
||||
|
||||
use async_trait::async_trait;
|
||||
@@ -73,7 +74,7 @@ impl FileHandle for WrapFile {
|
||||
{
|
||||
use std::io::{Read, Seek};
|
||||
let mut file = self.file.try_clone()?; // Clone the file to read from it separately
|
||||
// Seek to the start position in the file
|
||||
// Seek to the start position in the file
|
||||
file.seek(io::SeekFrom::Start(start as u64))?;
|
||||
// Read the data into the buffer
|
||||
file.read_exact(&mut buffer)?;
|
||||
@@ -177,6 +178,12 @@ fn combine_ranges<R: RangeBounds<usize>>(orig_range: Range<usize>, rel_range: R)
|
||||
}
|
||||
|
||||
impl FileSlice {
|
||||
/// Creates a FileSlice from a path.
|
||||
pub fn open(path: &Path) -> io::Result<FileSlice> {
|
||||
let wrap_file = WrapFile::new(File::open(path)?)?;
|
||||
Ok(FileSlice::new(Arc::new(wrap_file)))
|
||||
}
|
||||
|
||||
/// Wraps a FileHandle.
|
||||
pub fn new(file_handle: Arc<dyn FileHandle>) -> Self {
|
||||
let num_bytes = file_handle.len();
|
||||
@@ -332,6 +339,27 @@ impl FileHandle for OwnedBytes {
|
||||
}
|
||||
}
|
||||
|
||||
pub struct DeferredFileSlice {
|
||||
opener: Arc<dyn Fn() -> io::Result<FileSlice> + Send + Sync + 'static>,
|
||||
file_slice: OnceLock<std::io::Result<FileSlice>>,
|
||||
}
|
||||
|
||||
impl DeferredFileSlice {
|
||||
pub fn new(opener: impl Fn() -> io::Result<FileSlice> + Send + Sync + 'static) -> Self {
|
||||
DeferredFileSlice {
|
||||
opener: Arc::new(opener),
|
||||
file_slice: OnceLock::default(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn open(&self) -> io::Result<&FileSlice> {
|
||||
match self.file_slice.get_or_init(|| (self.opener)()) {
|
||||
Ok(file_slice) => Ok(file_slice),
|
||||
Err(e) => Err(io::Error::new(io::ErrorKind::Other, e.to_string())),
|
||||
}
|
||||
}
|
||||
}
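Usage sketch (hypothetical path; the opener closure runs at most once thanks to the `OnceLock`):

let deferred = DeferredFileSlice::new(|| FileSlice::open(Path::new("huge.store")));
// Nothing is opened yet; the first call to open() invokes the closure and caches the result.
let slice: &FileSlice = deferred.open()?;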
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use std::io;
|
||||
@@ -339,8 +367,8 @@ mod tests {
|
||||
use std::sync::Arc;
|
||||
|
||||
use super::{FileHandle, FileSlice};
|
||||
use crate::file_slice::combine_ranges;
|
||||
use crate::HasLen;
|
||||
use crate::file_slice::combine_ranges;
|
||||
|
||||
#[test]
|
||||
fn test_file_slice() -> io::Result<()> {
|
||||
|
||||
@@ -6,6 +6,7 @@ pub use byteorder::LittleEndian as Endianness;
|
||||
|
||||
mod bitset;
|
||||
pub mod bounds;
|
||||
pub mod buffered_file_slice;
|
||||
mod byte_count;
|
||||
mod datetime;
|
||||
pub mod file_slice;
|
||||
@@ -22,7 +23,7 @@ pub use json_path_writer::JsonPathWriter;
|
||||
pub use ownedbytes::{OwnedBytes, StableDeref};
|
||||
pub use serialize::{BinarySerializable, DeserializeFrom, FixedSize};
|
||||
pub use vint::{
|
||||
read_u32_vint, read_u32_vint_no_advance, serialize_vint_u32, write_u32_vint, VInt, VIntU128,
|
||||
VInt, VIntU128, read_u32_vint, read_u32_vint_no_advance, serialize_vint_u32, write_u32_vint,
|
||||
};
|
||||
pub use writer::{AntiCallToken, CountingWriter, TerminatingWrite};
|
||||
|
||||
@@ -130,11 +131,11 @@ pub fn replace_in_place(needle: u8, replacement: u8, bytes: &mut [u8]) {
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
pub mod test {
|
||||
pub(crate) mod test {
|
||||
|
||||
use proptest::prelude::*;
|
||||
|
||||
use super::{f64_to_u64, i64_to_u64, u64_to_f64, u64_to_i64, BinarySerializable, FixedSize};
|
||||
use super::{f64_to_u64, i64_to_u64, u64_to_f64, u64_to_i64};
|
||||
|
||||
fn test_i64_converter_helper(val: i64) {
|
||||
assert_eq!(u64_to_i64(i64_to_u64(val)), val);
|
||||
@@ -144,12 +145,6 @@ pub mod test {
|
||||
assert_eq!(u64_to_f64(f64_to_u64(val)), val);
|
||||
}
|
||||
|
||||
pub fn fixed_size_test<O: BinarySerializable + FixedSize + Default>() {
|
||||
let mut buffer = Vec::new();
|
||||
O::default().serialize(&mut buffer).unwrap();
|
||||
assert_eq!(buffer.len(), O::SIZE_IN_BYTES);
|
||||
}
|
||||
|
||||
proptest! {
|
||||
#[test]
|
||||
fn test_f64_converter_monotonicity_proptest((left, right) in (proptest::num::f64::NORMAL, proptest::num::f64::NORMAL)) {
|
||||
@@ -183,8 +178,10 @@ pub mod test {

     #[test]
     fn test_f64_order() {
-        assert!(!(f64_to_u64(f64::NEG_INFINITY)..f64_to_u64(f64::INFINITY))
-            .contains(&f64_to_u64(f64::NAN))); // nan is not a number
+        assert!(
+            !(f64_to_u64(f64::NEG_INFINITY)..f64_to_u64(f64::INFINITY))
+                .contains(&f64_to_u64(f64::NAN))
+        ); // nan is not a number
         assert!(f64_to_u64(1.5) > f64_to_u64(1.0)); // same exponent, different mantissa
         assert!(f64_to_u64(2.0) > f64_to_u64(1.0)); // same mantissa, different exponent
         assert!(f64_to_u64(2.0) > f64_to_u64(1.5)); // different exponent and mantissa
@@ -74,14 +74,14 @@ impl FixedSize for () {

 impl<T: BinarySerializable> BinarySerializable for Vec<T> {
     fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
-        VInt(self.len() as u64).serialize(writer)?;
+        BinarySerializable::serialize(&VInt(self.len() as u64), writer)?;
         for it in self {
             it.serialize(writer)?;
         }
         Ok(())
     }
     fn deserialize<R: Read>(reader: &mut R) -> io::Result<Vec<T>> {
-        let num_items = VInt::deserialize(reader)?.val();
+        let num_items = <VInt as BinarySerializable>::deserialize(reader)?.val();
         let mut items: Vec<T> = Vec::with_capacity(num_items as usize);
         for _ in 0..num_items {
             let item = T::deserialize(reader)?;
@@ -236,12 +236,12 @@ impl FixedSize for bool {
 impl BinarySerializable for String {
     fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
         let data: &[u8] = self.as_bytes();
-        VInt(data.len() as u64).serialize(writer)?;
+        BinarySerializable::serialize(&VInt(data.len() as u64), writer)?;
         writer.write_all(data)
     }

     fn deserialize<R: Read>(reader: &mut R) -> io::Result<String> {
-        let string_length = VInt::deserialize(reader)?.val() as usize;
+        let string_length = <VInt as BinarySerializable>::deserialize(reader)?.val() as usize;
         let mut result = String::with_capacity(string_length);
         reader
             .take(string_length as u64)
@@ -253,12 +253,12 @@ impl BinarySerializable for String {
 impl<'a> BinarySerializable for Cow<'a, str> {
     fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
         let data: &[u8] = self.as_bytes();
-        VInt(data.len() as u64).serialize(writer)?;
+        BinarySerializable::serialize(&VInt(data.len() as u64), writer)?;
         writer.write_all(data)
     }

     fn deserialize<R: Read>(reader: &mut R) -> io::Result<Cow<'a, str>> {
-        let string_length = VInt::deserialize(reader)?.val() as usize;
+        let string_length = <VInt as BinarySerializable>::deserialize(reader)?.val() as usize;
         let mut result = String::with_capacity(string_length);
         reader
             .take(string_length as u64)
@@ -269,18 +269,18 @@ impl<'a> BinarySerializable for Cow<'a, str> {

 impl<'a> BinarySerializable for Cow<'a, [u8]> {
     fn serialize<W: Write + ?Sized>(&self, writer: &mut W) -> io::Result<()> {
-        VInt(self.len() as u64).serialize(writer)?;
+        BinarySerializable::serialize(&VInt(self.len() as u64), writer)?;
         for it in self.iter() {
-            it.serialize(writer)?;
+            BinarySerializable::serialize(it, writer)?;
         }
         Ok(())
     }

     fn deserialize<R: Read>(reader: &mut R) -> io::Result<Cow<'a, [u8]>> {
-        let num_items = VInt::deserialize(reader)?.val();
+        let num_items = <VInt as BinarySerializable>::deserialize(reader)?.val();
         let mut items: Vec<u8> = Vec::with_capacity(num_items as usize);
         for _ in 0..num_items {
-            let item = u8::deserialize(reader)?;
+            let item = <u8 as BinarySerializable>::deserialize(reader)?;
             items.push(item);
         }
         Ok(Cow::Owned(items))
@@ -28,7 +28,9 @@ impl BinarySerializable for VIntU128 {
         writer.write_all(&buffer)
     }

+    #[allow(clippy::unbuffered_bytes)]
     fn deserialize<R: Read>(reader: &mut R) -> io::Result<Self> {
+        #[allow(clippy::unbuffered_bytes)]
         let mut bytes = reader.bytes();
         let mut result = 0u128;
         let mut shift = 0u64;
@@ -56,6 +58,33 @@ impl BinarySerializable for VIntU128 {
 #[derive(Clone, Copy, Debug, Eq, PartialEq)]
 pub struct VInt(pub u64);

+impl VInt {
+    pub fn deserialize_with_size<R: Read>(reader: &mut R) -> io::Result<(Self, usize)> {
+        let mut nbytes = 0;
+        let mut bytes = reader.bytes();
+        let mut result = 0u64;
+        let mut shift = 0u64;
+        loop {
+            match bytes.next() {
+                Some(Ok(b)) => {
+                    nbytes += 1;
+                    result |= u64::from(b % 128u8) << shift;
+                    if b >= STOP_BIT {
+                        return Ok((VInt(result), nbytes));
+                    }
+                    shift += 7;
+                }
+                _ => {
+                    return Err(io::Error::new(
+                        io::ErrorKind::InvalidData,
+                        "Reach end of buffer while reading VInt",
+                    ));
+                }
+            }
+        }
+    }
+}
+
 const STOP_BIT: u8 = 128;

 #[inline]
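A small round-trip sketch (not part of the diff) for the `VInt::deserialize_with_size` method added above, written as if it lived next to the existing vint tests so the `VInt` and `BinarySerializable` imports are already in scope; with the stop-bit encoding shown here, the value 300 occupies two bytes.

```rust
#[test]
fn test_vint_deserialize_with_size_roundtrip() -> std::io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    BinarySerializable::serialize(&VInt(300), &mut buf)?;
    // 300 needs two 7-bit groups, hence two bytes on the wire.
    assert_eq!(buf.len(), 2);
    let (value, nbytes) = VInt::deserialize_with_size(&mut &buf[..])?;
    assert_eq!(value, VInt(300));
    assert_eq!(nbytes, 2);
    Ok(())
}
```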
@@ -195,7 +224,9 @@ impl BinarySerializable for VInt {
         writer.write_all(&buffer[0..num_bytes])
     }

+    #[allow(clippy::unbuffered_bytes)]
     fn deserialize<R: Read>(reader: &mut R) -> io::Result<Self> {
+        #[allow(clippy::unbuffered_bytes)]
         let mut bytes = reader.bytes();
         let mut result = 0u64;
         let mut shift = 0u64;
@@ -221,8 +252,7 @@ impl BinarySerializable for VInt {

 #[cfg(test)]
 mod tests {
-
-    use super::{serialize_vint_u32, BinarySerializable, VInt};
+    use super::{BinarySerializable, VInt, serialize_vint_u32};

     fn aux_test_vint(val: u64) {
         let mut v = [14u8; 10];
@@ -87,7 +87,7 @@ impl<W: TerminatingWrite> TerminatingWrite for BufWriter<W> {
     }
 }

-impl<'a> TerminatingWrite for &'a mut Vec<u8> {
+impl TerminatingWrite for &mut Vec<u8> {
     fn terminate_ref(&mut self, _a: AntiCallToken) -> io::Result<()> {
         self.flush()
     }
@@ -2,7 +2,7 @@

 > Tantivy is a **search** engine **library** for Rust.

-If you are familiar with Lucene, it's an excellent approximation to consider tantivy as Lucene for rust. tantivy is heavily inspired by Lucene's design and
+If you are familiar with Lucene, it's an excellent approximation to consider tantivy as Lucene for Rust. Tantivy is heavily inspired by Lucene's design and
 they both have the same scope and targeted use cases.

 If you are not familiar with Lucene, let's break down our little tagline.
@@ -17,7 +17,7 @@ relevancy, collapsing, highlighting, spatial search.
 experience. But keep in mind this is just a toolbox.
 Which bring us to the second keyword...

-- **Library** means that you will have to write code. tantivy is not an *all-in-one* server solution like elastic search for instance.
+- **Library** means that you will have to write code. Tantivy is not an *all-in-one* server solution like Elasticsearch for instance.

 Sometimes a functionality will not be available in tantivy because it is too
 specific to your use case. By design, tantivy should make it possible to extend
@@ -31,4 +31,4 @@ relevancy, collapsing, highlighting, spatial search.
 index from a different format.

 Tantivy exposes a lot of low level API to do all of these things.

@@ -11,7 +11,7 @@ directory shipped with tantivy is the `MmapDirectory`.
 While this design has some downsides, this greatly simplifies the source code of
 tantivy. Caching is also entirely delegated to the OS.

-`tantivy` works entirely (or almost) by directly reading the datastructures as they are laid on disk. As a result, the act of opening an indexing does not involve loading different datastructures from the disk into random access memory : starting a process, opening an index, and performing your first query can typically be done in a matter of milliseconds.
+Tantivy works entirely (or almost) by directly reading the datastructures as they are laid on disk. As a result, the act of opening an indexing does not involve loading different datastructures from the disk into random access memory : starting a process, opening an index, and performing your first query can typically be done in a matter of milliseconds.

 This is an interesting property for a command line search engine, or for some multi-tenant log search engine : spawning a new process for each new query can be a perfectly sensible solution in some use case.
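A hedged illustration (not part of the diff) of the property described in this hunk, using tantivy's public API; the index path is a placeholder. Opening an mmap-backed index does not eagerly load its data structures, so the whole sequence below typically completes in milliseconds.

```rust
use tantivy::Index;

fn main() -> tantivy::Result<()> {
    // Opening only mmaps the segment files; nothing is bulk-loaded into RAM.
    let index = Index::open_in_dir("/path/to/existing/index")?;
    let reader = index.reader()?;
    let searcher = reader.searcher();
    println!("index contains {} documents", searcher.num_docs());
    Ok(())
}
```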
@@ -31,13 +31,13 @@ Compression ratio is mainly affected on the fast field of the sorted property, e
 When data is presorted by a field and search queries request sorting by the same field, we can leverage the natural order of the documents.
 E.g. if the data is sorted by timestamp and want the top n newest docs containing a term, we can simply leveraging the order of the docids.

-Note: Tantivy 0.16 does not do this optimization yet.
+Note: tantivy 0.16 does not do this optimization yet.

 ### Pruning

 Let's say we want all documents and want to apply the filter `>= 2010-08-11`. When the data is sorted, we could make a lookup in the fast field to find the docid range and use this as the filter.

-Note: Tantivy 0.16 does not do this optimization yet.
+Note: tantivy 0.16 does not do this optimization yet.

 ### Other?
@@ -45,7 +45,7 @@ In principle there are many algorithms possible that exploit the monotonically i

 ## Usage

-The index sorting can be configured setting [`sort_by_field`](https://github.com/quickwit-oss/tantivy/blob/000d76b11a139a84b16b9b95060a1c93e8b9851c/src/core/index_meta.rs#L238) on `IndexSettings` and passing it to a `IndexBuilder`. As of Tantivy 0.16 only fast fields are allowed to be used.
+The index sorting can be configured setting [`sort_by_field`](https://github.com/quickwit-oss/tantivy/blob/000d76b11a139a84b16b9b95060a1c93e8b9851c/src/core/index_meta.rs#L238) on `IndexSettings` and passing it to a `IndexBuilder`. As of tantivy 0.16 only fast fields are allowed to be used.

 ```rust
 let settings = IndexSettings {
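A hedged sketch (not from the diff) of the configuration the Usage section refers to, using the tantivy 0.16-era API; the `timestamp` field name is made up for illustration.

```rust
use tantivy::schema::{Schema, FAST, STORED};
use tantivy::{Index, IndexSettings, IndexSortByField, Order};

fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();
    schema_builder.add_u64_field("timestamp", FAST | STORED);
    let schema = schema_builder.build();

    // Presort the index by the `timestamp` fast field, newest first.
    let settings = IndexSettings {
        sort_by_field: Some(IndexSortByField {
            field: "timestamp".to_string(),
            order: Order::Desc,
        }),
        ..Default::default()
    };
    let _index = Index::builder()
        .schema(schema)
        .settings(settings)
        .create_in_ram()?;
    Ok(())
}
```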
@@ -39,7 +39,7 @@ Its representation is done by separating segments by a unicode char `\x01`, and
 - `value`: The value representation is just the regular Value representation.

 This representation is designed to align the natural sort of Terms with the lexicographical sort
-of their binary representation (Tantivy's dictionary (whether fst or sstable) is sorted and does prefix encoding).
+of their binary representation (tantivy's dictionary (whether fst or sstable) is sorted and does prefix encoding).

 In the example above, the terms will be sorted as
Some files were not shown because too many files have changed in this diff.