update CHANGELOG for tantivy 0.26 release (#2857 )

* update CHANGELOG for tantivy 0.26 release * add CHANGELOG skill Signed-off-by: Pascal Seitz <pascal.seitz@gmail.com> * update CHANGELOG, add CHANGELOG skill Signed-off-by: Pascal Seitz <pascal.seitz@gmail.com> * use sketches from crates.io * update lz4_flex * update CHANGELOG.md --------- Signed-off-by: Pascal Seitz <pascal.seitz@gmail.com>
fix: deduplicate doc counts in term aggregation for multi-valued fields (#2854 )
2026-03-25 22:50:42 +00:00 · 2026-03-24 08:02:12 +01:00 · 2026-03-24 02:02:30 +01:00
11 changed files with 279 additions and 264 deletions
--- a/.claude/skills/update-changelog/SKILL.md
+++ b/.claude/skills/update-changelog/SKILL.md
@@ -0,0 +1,87 @@
+---
+name: update-changelog
+description: Update CHANGELOG.md with merged PRs since the last changelog update, categorized by type
+---
+
+# Update Changelog
+
+This skill updates CHANGELOG.md with merged PRs that aren't already listed.
+
+## Step 1: Determine the changelog scope
+
+Read `CHANGELOG.md` to identify the current unreleased version section at the top (e.g., `Tantivy 0.26 (Unreleased)`).
+
+Collect all PR numbers already mentioned in the unreleased section by extracting `#NNNN` references.
+
+## Step 2: Find merged PRs not yet in the changelog
+
+Use `gh` to list recently merged PRs from the upstream repo:
+
+```bash
+gh pr list --repo quickwit-oss/tantivy --state merged --limit 100 --json number,title,author,labels,mergedAt
+```
+
+Filter out any PRs whose number already appears in the unreleased section of the changelog.
+
+## Step 3: Consolidate related PRs
+
+Before categorizing, group PRs that belong to the same logical change. This is critical for producing a clean changelog. Use PR descriptions, titles, cross-references, and the files touched to identify relationships.
+
+**Merge follow-up PRs into the original:**
+- If a PR is a bugfix, refinement, or follow-up to another PR in the same unreleased cycle, combine them into a single changelog entry with multiple `[#N](url)` links.
+- Also consolidate PRs that touch the same feature area even if not explicitly linked — e.g., a PR fixing an edge case in a new API should be folded into the entry for the PR that introduced that API.
+
+**Filter out bugfixes on unreleased features:**
+- If a bugfix PR fixes something introduced by another PR in the **same unreleased version**, it must NOT appear as a separate Bugfixes entry. Instead, silently fold it into the original feature/improvement entry. The changelog should describe the final shipped state, not the development history.
+- To detect this: check if the bugfix PR references or reverts changes from another PR in the same release cycle, or if it touches code that was newly added (not present in the previous release).
+
+## Step 4: Review the actual code diff
+
+**Do not rely on PR titles or descriptions alone.** For every candidate PR, run `gh pr diff <number> --repo quickwit-oss/tantivy` and read the actual changes. PR titles are often misleading — the diff is the source of truth.
+
+**What to look for in the diff:**
+- Does it change observable behavior, public API surface, or performance characteristics?
+- Is the change something a user of the library would notice or need to know about?
+- Could the change break existing code (API changes, removed features)?
+
+**Skip PRs where the diff reveals the change is not meaningful enough for the changelog** — e.g., cosmetic renames, trivial visibility tweaks, test-only changes, etc.
+
+## Step 5: Categorize each PR group
+
+For each PR (or consolidated group) that survived the diff review, determine its category:
+
+- **Bugfixes** — fixes to behavior that existed in the **previous release**. NOT fixes to features introduced in this release cycle.
+- **Features/Improvements** — new features, API additions, new options, improvements that change user-facing behavior or add new capabilities.
+- **Performance** — optimizations, speed improvements, memory reductions. **If a PR adds new API whose primary purpose is enabling a performance optimization, categorize it as Performance, not Features.** The deciding question is: does a user benefit from this because of new functionality, or because things got faster/leaner? For example, a new trait method that exists solely to enable cheaper intersection ordering is Performance, not a Feature.
+
+If a PR doesn't clearly fit any category (e.g., CI-only changes, internal refactors with no user-facing impact, dependency bumps with no behavior change), skip it — not everything belongs in the changelog.
+
+When unclear, use your best judgment or ask the user.
+
+## Step 6: Format entries
+
+Each entry must follow this exact format:
+
+```
+- Description [#NUMBER](https://github.com/quickwit-oss/tantivy/pull/NUMBER)(@author)
+```
+
+Rules:
+- The description should be concise and describe the user-facing change (not the implementation). Describe the final shipped state, not the incremental development steps.
+- Use sub-categories with bold headers when multiple entries relate to the same area (e.g., `- **Aggregation**` with indented entries beneath). Follow the existing grouping style in the changelog.
+- Author is the GitHub username from the PR, prefixed with `@`. For consolidated entries, include all contributing authors.
+- For consolidated PRs, list all PR links in a single entry: `[#100](url) [#110](url)` (see existing entries for examples).
+
+## Step 7: Present changes to the user
+
+Show the user the proposed changelog entries grouped by category **before** editing the file. Ask for confirmation or adjustments.
+
+## Step 8: Update CHANGELOG.md
+
+Insert the new entries into the appropriate sections of the unreleased version block. If a section doesn't exist yet, create it following the order: Bugfixes, Features/Improvements, Performance.
+
+Append new entries at the end of each section (before the next section header or version header).
+
+## Step 9: Verify
+
+Read back the updated unreleased section and display it to the user for final review.
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,51 @@
+Tantivy 0.26 (Unreleased)
+================================
+
+## Bugfixes
+- Align float query coercion during search with the columnar coercion rules [#2692](https://github.com/quickwit-oss/tantivy/pull/2692)(@fulmicoton)
+- Fix lenient elastic range queries with trailing closing parentheses [#2816](https://github.com/quickwit-oss/tantivy/pull/2816)(@evance-br)
+- Fix intersection `seek()` advancing below current doc id [#2812](https://github.com/quickwit-oss/tantivy/pull/2812)(@fulmicoton)
+- Fix phrase query prefixed with `*` [#2751](https://github.com/quickwit-oss/tantivy/pull/2751)(@Darkheir)
+- Fix `vint` buffer overflow during index creation [#2778](https://github.com/quickwit-oss/tantivy/pull/2778)(@rebasedming)
+- Fix integer overflow in `ExpUnrolledLinkedList` for large datasets [#2735](https://github.com/quickwit-oss/tantivy/pull/2735)(@mdashti)
+- Fix integer overflow in segment sorting and merge policy truncation [#2846](https://github.com/quickwit-oss/tantivy/pull/2846)(@anaslimem)
+- Fix merging of intermediate aggregation results [#2719](https://github.com/quickwit-oss/tantivy/pull/2719)(@PSeitz)
+- Fix deduplicate doc counts in term aggregation for multi-valued fields [#2854](https://github.com/quickwit-oss/tantivy/pull/2854)(@nuri-yoo)
+
+## Features/Improvements
+- **Aggregation**
+    - Add filter aggregation [#2711](https://github.com/quickwit-oss/tantivy/pull/2711)(@mdashti)
+    - Add include/exclude filtering for term aggregations [#2717](https://github.com/quickwit-oss/tantivy/pull/2717)(@PSeitz)
+    - Add public accessors for intermediate aggregation results [#2829](https://github.com/quickwit-oss/tantivy/pull/2829)(@congx4)
+    - Replace HyperLogLog++ with Apache DataSketches HLL for cardinality aggregation [#2837](https://github.com/quickwit-oss/tantivy/pull/2837) [#2842](https://github.com/quickwit-oss/tantivy/pull/2842)(@congx4)
+    - Add composite aggregation [#2856](https://github.com/quickwit-oss/tantivy/pull/2856)(@fulmicoton)
+- **Fast Fields**
+    - Add fast field fallback for `TermQuery` when the field is not indexed [#2693](https://github.com/quickwit-oss/tantivy/pull/2693)(@PSeitz-dd)
+    - Add fast field support for `Bytes` values [#2830](https://github.com/quickwit-oss/tantivy/pull/2830)(@mdashti)
+- **Query Parser**
+    - Add support for regexes in the query grammar [#2677](https://github.com/quickwit-oss/tantivy/pull/2677) [#2818](https://github.com/quickwit-oss/tantivy/pull/2818)(@Darkheir)
+    - Deduplicate queries in query parser [#2698](https://github.com/quickwit-oss/tantivy/pull/2698)(@PSeitz-dd)
+- Add erased `SortKeyComputer` for sorting on column types unknown until runtime [#2770](https://github.com/quickwit-oss/tantivy/pull/2770) [#2790](https://github.com/quickwit-oss/tantivy/pull/2790)(@stuhood @PSeitz)
+- Add natural-order-with-none-highest support in `TopDocs::order_by` [#2780](https://github.com/quickwit-oss/tantivy/pull/2780)(@stuhood)
+- Move stemming behing `stemmer` feature flag [#2791](https://github.com/quickwit-oss/tantivy/pull/2791)(@fulmicoton)
+- Make `DeleteMeta`, `AddOperation`, `advance_deletes`, `with_max_doc`, `serializer` module, and `delete_queue` public [#2762](https://github.com/quickwit-oss/tantivy/pull/2762) [#2765](https://github.com/quickwit-oss/tantivy/pull/2765) [#2766](https://github.com/quickwit-oss/tantivy/pull/2766) [#2835](https://github.com/quickwit-oss/tantivy/pull/2835)(@philippemnoel @PSeitz)
+- Make `Language` hashable [#2763](https://github.com/quickwit-oss/tantivy/pull/2763)(@philippemnoel)
+- Improve `space_usage` reporting for JSON fields and columnar data [#2761](https://github.com/quickwit-oss/tantivy/pull/2761)(@PSeitz-dd)
+- Split `Term` into `Term` and `IndexingTerm` [#2744](https://github.com/quickwit-oss/tantivy/pull/2744) [#2750](https://github.com/quickwit-oss/tantivy/pull/2750)(@PSeitz-dd @PSeitz)
+
+## Performance
+- **Aggregation**
+    - Large speed up and memory reduction for nested high cardinality aggregations by using one collector per request instead of one per bucket, and adding `PagedTermMap` for faster medium cardinality term aggregations [#2715](https://github.com/quickwit-oss/tantivy/pull/2715) [#2759](https://github.com/quickwit-oss/tantivy/pull/2759)(@PSeitz @PSeitz-dd)
+    - Optimize low-cardinality term aggregations by using a `Vec` instead of a `HashMap` [#2740](https://github.com/quickwit-oss/tantivy/pull/2740)(@fulmicoton-dd)
+- Optimize `ExistsQuery` for a high number of dynamic columns [#2694](https://github.com/quickwit-oss/tantivy/pull/2694)(@PSeitz-dd)
+- Add lazy scorers to stop score evaluation early when a doc won't reach the top-K threshold [#2726](https://github.com/quickwit-oss/tantivy/pull/2726) [#2777](https://github.com/quickwit-oss/tantivy/pull/2777)(@fulmicoton @stuhood)
+- Add `DocSet::cost()` and use it to order scorers in intersections [#2707](https://github.com/quickwit-oss/tantivy/pull/2707)(@PSeitz)
+- Add `collect_block` support for collector wrappers [#2727](https://github.com/quickwit-oss/tantivy/pull/2727)(@stuhood)
+- Optimize saturated posting lists by replacing them with `AllScorer` in boolean queries [#2745](https://github.com/quickwit-oss/tantivy/pull/2745) [#2760](https://github.com/quickwit-oss/tantivy/pull/2760) [#2774](https://github.com/quickwit-oss/tantivy/pull/2774)(@fulmicoton @mdashti @trinity-1686a)
+- Add `seek_danger` on `DocSet` for more efficient intersections [#2538](https://github.com/quickwit-oss/tantivy/pull/2538) [#2810](https://github.com/quickwit-oss/tantivy/pull/2810)(@PSeitz @stuhood @fulmicoton)
+- Skip column traversal in `RangeDocSet` when query range does not overlap with column bounds [#2783](https://github.com/quickwit-oss/tantivy/pull/2783)(@ChangRui-Ryan)
+- Speed up exclude queries by supporting multiple excluded `DocSet`s without intermediate union [#2825](https://github.com/quickwit-oss/tantivy/pull/2825)(@PSeitz)
+
 Tantivy 0.25
 ================================

--- a/Cargo.toml
+++ b/Cargo.toml
@@ -27,7 +27,7 @@ regex = { version = "1.5.5", default-features = false, features = [
 aho-corasick = "1.0"
 tantivy-fst = "0.5"
 memmap2 = { version = "0.9.0", optional = true }
-lz4_flex = { version = "0.12", default-features = false, optional = true }
+lz4_flex = { version = "0.13", default-features = false, optional = true }
 zstd = { version = "0.13", optional = true, default-features = false }
 tempfile = { version = "3.12.0", optional = true }
 log = "0.4.16"
@@ -64,7 +64,7 @@ query-grammar = { version = "0.25.0", path = "./query-grammar", package = "tanti
 tantivy-bitpacker = { version = "0.9", path = "./bitpacker" }
 common = { version = "0.10", path = "./common/", package = "tantivy-common" }
 tokenizer-api = { version = "0.6", path = "./tokenizer-api", package = "tantivy-tokenizer-api" }
-sketches-ddsketch = { git = "https://github.com/quickwit-oss/rust-sketches-ddsketch.git", rev = "555caf1", features = ["use_serde"] }
+sketches-ddsketch = { version = "0.4", features = ["use_serde"] }
 datasketches = "0.2.0"
 futures-util = { version = "0.3.28", optional = true }
 futures-channel = { version = "0.3.28", optional = true }
--- a/benches/and_or_queries.rs
+++ b/benches/and_or_queries.rs
@@ -141,13 +141,6 @@ fn main() {
            0.01,
            0.15,
        ),
-        (
-            "N=1M, p(a)=1%, p(b)=30%, p(c)=30%".to_string(),
-            1_000_000,
-            0.01,
-            0.30,
-            0.30,
-        ),
    ];

    let queries = &["a", "+a +b", "+a +b +c", "a OR b", "a OR b OR c"];
--- a/columnar/src/block_accessor.rs
+++ b/columnar/src/block_accessor.rs
@@ -58,6 +58,78 @@ impl<T: PartialOrd + Copy + std::fmt::Debug + Send + Sync + 'static + Default>
        }
    }

+    /// Like `fetch_block_with_missing`, but deduplicates (doc_id, value) pairs
+    /// so that each unique value per document is returned only once.
+    ///
+    /// This is necessary for correct document counting in aggregations,
+    /// where multi-valued fields can produce duplicate entries that inflate counts.
+    #[inline]
+    pub fn fetch_block_with_missing_unique_per_doc(
+        &mut self,
+        docs: &[u32],
+        accessor: &Column<T>,
+        missing: Option<T>,
+    ) where
+        T: Ord,
+    {
+        self.fetch_block_with_missing(docs, accessor, missing);
+        if accessor.index.get_cardinality().is_multivalue() {
+            self.dedup_docid_val_pairs();
+        }
+    }
+
+    /// Removes duplicate (doc_id, value) pairs from the caches.
+    ///
+    /// After `fetch_block`, entries are sorted by doc_id, but values within
+    /// the same doc may not be sorted (e.g. `(0,1), (0,2), (0,1)`).
+    /// We group consecutive entries by doc_id, sort values within each group
+    /// if it has more than 2 elements, then deduplicate adjacent pairs.
+    ///
+    /// Skips entirely if no doc_id appears more than once in the block.
+    fn dedup_docid_val_pairs(&mut self)
+    where T: Ord {
+        if self.docid_cache.len() <= 1 {
+            return;
+        }
+
+        // Quick check: if no consecutive doc_ids are equal, no dedup needed.
+        let has_multivalue = self.docid_cache.windows(2).any(|w| w[0] == w[1]);
+        if !has_multivalue {
+            return;
+        }
+
+        // Sort values within each doc_id group so duplicates become adjacent.
+        let mut start = 0;
+        while start < self.docid_cache.len() {
+            let doc = self.docid_cache[start];
+            let mut end = start + 1;
+            while end < self.docid_cache.len() && self.docid_cache[end] == doc {
+                end += 1;
+            }
+            if end - start > 2 {
+                self.val_cache[start..end].sort();
+            }
+            start = end;
+        }
+
+        // Now duplicates are adjacent — deduplicate in place.
+        let mut write = 0;
+        for read in 1..self.docid_cache.len() {
+            if self.docid_cache[read] != self.docid_cache[write]
+                || self.val_cache[read] != self.val_cache[write]
+            {
+                write += 1;
+                if write != read {
+                    self.docid_cache[write] = self.docid_cache[read];
+                    self.val_cache[write] = self.val_cache[read];
+                }
+            }
+        }
+        let new_len = write + 1;
+        self.docid_cache.truncate(new_len);
+        self.val_cache.truncate(new_len);
+    }
+
    #[inline]
    pub fn iter_vals(&self) -> impl Iterator<Item = T> + '_ {
        self.val_cache.iter().cloned()
@@ -163,4 +235,56 @@ mod tests {

        assert_eq!(missing_docs, vec![1, 2, 3, 4, 5]);
    }
+
+    #[test]
+    fn test_dedup_docid_val_pairs_consecutive() {
+        let mut accessor = ColumnBlockAccessor::<u64>::default();
+        accessor.docid_cache = vec![0, 0, 2, 3];
+        accessor.val_cache = vec![10, 10, 10, 10];
+        accessor.dedup_docid_val_pairs();
+        assert_eq!(accessor.docid_cache, vec![0, 2, 3]);
+        assert_eq!(accessor.val_cache, vec![10, 10, 10]);
+    }
+
+    #[test]
+    fn test_dedup_docid_val_pairs_non_consecutive() {
+        // (0,1), (0,2), (0,1) — duplicate value not adjacent
+        let mut accessor = ColumnBlockAccessor::<u64>::default();
+        accessor.docid_cache = vec![0, 0, 0];
+        accessor.val_cache = vec![1, 2, 1];
+        accessor.dedup_docid_val_pairs();
+        assert_eq!(accessor.docid_cache, vec![0, 0]);
+        assert_eq!(accessor.val_cache, vec![1, 2]);
+    }
+
+    #[test]
+    fn test_dedup_docid_val_pairs_multi_doc() {
+        // doc 0: values [3, 1, 3], doc 1: values [5, 5]
+        let mut accessor = ColumnBlockAccessor::<u64>::default();
+        accessor.docid_cache = vec![0, 0, 0, 1, 1];
+        accessor.val_cache = vec![3, 1, 3, 5, 5];
+        accessor.dedup_docid_val_pairs();
+        assert_eq!(accessor.docid_cache, vec![0, 0, 1]);
+        assert_eq!(accessor.val_cache, vec![1, 3, 5]);
+    }
+
+    #[test]
+    fn test_dedup_docid_val_pairs_no_duplicates() {
+        let mut accessor = ColumnBlockAccessor::<u64>::default();
+        accessor.docid_cache = vec![0, 0, 1];
+        accessor.val_cache = vec![1, 2, 3];
+        accessor.dedup_docid_val_pairs();
+        assert_eq!(accessor.docid_cache, vec![0, 0, 1]);
+        assert_eq!(accessor.val_cache, vec![1, 2, 3]);
+    }
+
+    #[test]
+    fn test_dedup_docid_val_pairs_single_element() {
+        let mut accessor = ColumnBlockAccessor::<u64>::default();
+        accessor.docid_cache = vec![0];
+        accessor.val_cache = vec![1];
+        accessor.dedup_docid_val_pairs();
+        assert_eq!(accessor.docid_cache, vec![0]);
+        assert_eq!(accessor.val_cache, vec![1]);
+    }
 }
--- a/common/src/bitset.rs
+++ b/common/src/bitset.rs
@@ -47,9 +47,6 @@ impl TinySet {
        TinySet(val)
    }

-    /// An empty `TinySet` constant.
-    pub const EMPTY: TinySet = TinySet(0u64);
-
    /// Returns an empty `TinySet`.
    #[inline]
    pub fn empty() -> TinySet {
--- a/src/aggregation/bucket/term_agg.rs
+++ b/src/aggregation/bucket/term_agg.rs
@@ -807,11 +807,13 @@ impl<TermMap: TermAggregationMap, C: SubAggCache> SegmentAggregationCollector

        let req_data = &mut self.terms_req_data;

-        agg_data.column_block_accessor.fetch_block_with_missing(
-            docs,
-            &req_data.accessor,
-            req_data.missing_value_for_accessor,
-        );
+        agg_data
+            .column_block_accessor
+            .fetch_block_with_missing_unique_per_doc(
+                docs,
+                &req_data.accessor,
+                req_data.missing_value_for_accessor,
+            );

        if let Some(sub_agg) = &mut self.sub_agg {
            let term_buckets = &mut self.parent_buckets[parent_bucket_id as usize];
@@ -2347,7 +2349,7 @@ mod tests {

        // text field
        assert_eq!(res["my_texts"]["buckets"][0]["key"], "Hello Hello");
-        assert_eq!(res["my_texts"]["buckets"][0]["doc_count"], 5);
+        assert_eq!(res["my_texts"]["buckets"][0]["doc_count"], 4);
        assert_eq!(res["my_texts"]["buckets"][1]["key"], "Empty");
        assert_eq!(res["my_texts"]["buckets"][1]["doc_count"], 2);
        assert_eq!(
@@ -2356,7 +2358,7 @@ mod tests {
        );
        // text field with number as missing fallback
        assert_eq!(res["my_texts2"]["buckets"][0]["key"], "Hello Hello");
-        assert_eq!(res["my_texts2"]["buckets"][0]["doc_count"], 5);
+        assert_eq!(res["my_texts2"]["buckets"][0]["doc_count"], 4);
        assert_eq!(res["my_texts2"]["buckets"][1]["key"], 1337.0);
        assert_eq!(res["my_texts2"]["buckets"][1]["doc_count"], 2);
        assert_eq!(
@@ -2370,7 +2372,7 @@ mod tests {
        assert_eq!(res["my_ids"]["buckets"][0]["key"], 1337.0);
        assert_eq!(res["my_ids"]["buckets"][0]["doc_count"], 4);
        assert_eq!(res["my_ids"]["buckets"][1]["key"], 1.0);
-        assert_eq!(res["my_ids"]["buckets"][1]["doc_count"], 3);
+        assert_eq!(res["my_ids"]["buckets"][1]["doc_count"], 2);
        assert_eq!(res["my_ids"]["buckets"][2]["key"], serde_json::Value::Null);

        Ok(())
--- a/src/collector/count_collector.rs
+++ b/src/collector/count_collector.rs
@@ -1,6 +1,5 @@
 use super::Collector;
 use crate::collector::SegmentCollector;
-use crate::query::Weight;
 use crate::{DocId, Score, SegmentOrdinal, SegmentReader};

 /// `CountCollector` collector only counts how many
@@ -56,15 +55,6 @@ impl Collector for Count {
    fn merge_fruits(&self, segment_counts: Vec<usize>) -> crate::Result<usize> {
        Ok(segment_counts.into_iter().sum())
    }
-
-    fn collect_segment(
-        &self,
-        weight: &dyn Weight,
-        _segment_ord: u32,
-        reader: &SegmentReader,
-    ) -> crate::Result<usize> {
-        Ok(weight.count(reader)? as usize)
-    }
 }

 #[derive(Default)]
--- a/src/docset.rs
+++ b/src/docset.rs
@@ -1,7 +1,5 @@
 use std::borrow::{Borrow, BorrowMut};

-use common::TinySet;
-
 use crate::fastfield::AliveBitSet;
 use crate::DocId;

@@ -16,12 +14,6 @@ pub const TERMINATED: DocId = i32::MAX as u32;
 /// exactly this size as long as we can fill the buffer.
 pub const COLLECT_BLOCK_BUFFER_LEN: usize = 64;

-/// Number of `TinySet` (64-bit) buckets in a block used by [`DocSet::fill_bitset_block`].
-pub const BLOCK_NUM_TINYBITSETS: usize = 16;
-
-/// Number of doc IDs covered by one block: `BLOCK_NUM_TINYBITSETS * 64 = 1024`.
-pub const BLOCK_WINDOW: u32 = BLOCK_NUM_TINYBITSETS as u32 * 64;
-
 /// Represents an iterable set of sorted doc ids.
 pub trait DocSet: Send {
    /// Goes to the next element.
@@ -168,31 +160,6 @@ pub trait DocSet: Send {
        self.size_hint() as u64
    }

-    /// Fills a bitmask representing which documents in `[min_doc, min_doc + BLOCK_WINDOW)` are
-    /// present in this docset.
-    ///
-    /// The window is divided into `BLOCK_NUM_TINYBITSETS` buckets of 64 docs each.
-    /// Returns the next doc `>= min_doc + BLOCK_WINDOW`, or `TERMINATED` if exhausted.
-    fn fill_bitset_block(
-        &mut self,
-        min_doc: DocId,
-        mask: &mut [TinySet; BLOCK_NUM_TINYBITSETS],
-    ) -> DocId {
-        self.seek(min_doc);
-        let horizon = min_doc + BLOCK_WINDOW;
-        loop {
-            let doc = self.doc();
-            if doc >= horizon {
-                return doc;
-            }
-            let delta = doc - min_doc;
-            mask[(delta / 64) as usize].insert_mut(delta % 64);
-            if self.advance() == TERMINATED {
-                return TERMINATED;
-            }
-        }
-    }
-
    /// Returns the number documents matching.
    /// Calling this method consumes the `DocSet`.
    fn count(&mut self, alive_bitset: &AliveBitSet) -> u32 {
@@ -247,18 +214,6 @@ impl DocSet for &mut dyn DocSet {
        (**self).seek_danger(target)
    }

-    fn fill_buffer(&mut self, buffer: &mut [DocId; COLLECT_BLOCK_BUFFER_LEN]) -> usize {
-        (**self).fill_buffer(buffer)
-    }
-
-    fn fill_bitset_block(
-        &mut self,
-        min_doc: DocId,
-        mask: &mut [TinySet; BLOCK_NUM_TINYBITSETS],
-    ) -> DocId {
-        (**self).fill_bitset_block(min_doc, mask)
-    }
-
    fn doc(&self) -> u32 {
        (**self).doc()
    }
@@ -301,15 +256,6 @@ impl<TDocSet: DocSet + ?Sized> DocSet for Box<TDocSet> {
        unboxed.fill_buffer(buffer)
    }

-    fn fill_bitset_block(
-        &mut self,
-        min_doc: DocId,
-        mask: &mut [TinySet; BLOCK_NUM_TINYBITSETS],
-    ) -> DocId {
-        let unboxed: &mut TDocSet = self.borrow_mut();
-        unboxed.fill_bitset_block(min_doc, mask)
-    }
-
    fn doc(&self) -> DocId {
        let unboxed: &TDocSet = self.borrow();
        unboxed.doc()
--- a/src/query/intersection.rs
+++ b/src/query/intersection.rs
@@ -1,7 +1,5 @@
-use common::TinySet;
-
 use super::size_hint::estimate_intersection;
-use crate::docset::{DocSet, SeekDangerResult, BLOCK_NUM_TINYBITSETS, TERMINATED};
+use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
 use crate::query::term_query::TermScorer;
 use crate::query::{EmptyScorer, Scorer};
 use crate::{DocId, Score};
@@ -19,7 +17,7 @@ use crate::{DocId, Score};
 /// `size_hint` of the intersection.
 pub fn intersect_scorers(
    mut scorers: Vec<Box<dyn Scorer>>,
-    segment_num_docs: u32,
+    num_docs_segment: u32,
 ) -> Box<dyn Scorer> {
    if scorers.is_empty() {
        return Box::new(EmptyScorer);
@@ -44,14 +42,14 @@ pub fn intersect_scorers(
            left: *(left.downcast::<TermScorer>().map_err(|_| ()).unwrap()),
            right: *(right.downcast::<TermScorer>().map_err(|_| ()).unwrap()),
            others: scorers,
-            segment_num_docs,
+            num_docs: num_docs_segment,
        });
    }
    Box::new(Intersection {
        left,
        right,
        others: scorers,
-        segment_num_docs,
+        num_docs: num_docs_segment,
    })
 }

@@ -60,7 +58,7 @@ pub struct Intersection<TDocSet: DocSet, TOtherDocSet: DocSet = Box<dyn Scorer>>
    left: TDocSet,
    right: TDocSet,
    others: Vec<TOtherDocSet>,
-    segment_num_docs: u32,
+    num_docs: u32,
 }

 fn go_to_first_doc<TDocSet: DocSet>(docsets: &mut [TDocSet]) -> DocId {
@@ -80,10 +78,7 @@ fn go_to_first_doc<TDocSet: DocSet>(docsets: &mut [TDocSet]) -> DocId {

 impl<TDocSet: DocSet> Intersection<TDocSet, TDocSet> {
    /// num_docs is the number of documents in the segment.
-    pub(crate) fn new(
-        mut docsets: Vec<TDocSet>,
-        segment_num_docs: u32,
-    ) -> Intersection<TDocSet, TDocSet> {
+    pub(crate) fn new(mut docsets: Vec<TDocSet>, num_docs: u32) -> Intersection<TDocSet, TDocSet> {
        let num_docsets = docsets.len();
        assert!(num_docsets >= 2);
        docsets.sort_by_key(|docset| docset.cost());
@@ -102,7 +97,7 @@ impl<TDocSet: DocSet> Intersection<TDocSet, TDocSet> {
            left,
            right,
            others: docsets,
-            segment_num_docs,
+            num_docs,
        }
    }
 }
@@ -219,7 +214,7 @@ impl<TDocSet: DocSet, TOtherDocSet: DocSet> DocSet for Intersection<TDocSet, TOt
            [self.left.size_hint(), self.right.size_hint()]
                .into_iter()
                .chain(self.others.iter().map(DocSet::size_hint)),
-            self.segment_num_docs,
+            self.num_docs,
        )
    }

@@ -229,91 +224,6 @@ impl<TDocSet: DocSet, TOtherDocSet: DocSet> DocSet for Intersection<TDocSet, TOt
        // If there are docsets that are bad at skipping, they should also influence the cost.
        self.left.cost()
    }
-
-    fn count_including_deleted(&mut self) -> u32 {
-        const DENSITY_THRESHOLD_INVERSE: u32 = 32;
-        if self
-            .left
-            .size_hint()
-            .saturating_mul(DENSITY_THRESHOLD_INVERSE)
-            < self.segment_num_docs
-        {
-            // Sparse path: if the lead iterator covers less than ~3% of docs,
-            // the block approach wastes time on mostly-empty blocks.
-            self.count_including_deleted_sparse()
-        } else {
-            // Dense approach. We push documents into a block bitset to then
-            // perform count using popcount.
-            self.count_including_deleted_dense()
-        }
-    }
-}
-
-const EMPTY_BLOCK: [TinySet; BLOCK_NUM_TINYBITSETS] = [TinySet::EMPTY; BLOCK_NUM_TINYBITSETS];
-
-/// ANDs `other` into `mask` in-place. Returns `true` if the result is all zeros.
-#[inline]
-fn and_is_empty(
-    mask: &mut [TinySet; BLOCK_NUM_TINYBITSETS],
-    other: &[TinySet; BLOCK_NUM_TINYBITSETS],
-) -> bool {
-    let mut all_empty = true;
-    for (m, t) in mask.iter_mut().zip(other.iter()) {
-        *m = m.intersect(*t);
-        all_empty &= m.is_empty();
-    }
-    all_empty
-}
-
-impl<TDocSet: DocSet, TOtherDocSet: DocSet> Intersection<TDocSet, TOtherDocSet> {
-    fn count_including_deleted_sparse(&mut self) -> u32 {
-        let mut count = 0u32;
-        let mut doc = self.doc();
-        while doc != TERMINATED {
-            count += 1;
-            doc = self.advance();
-        }
-        count
-    }
-
-    /// Dense block-wise bitmask intersection count.
-    ///
-    /// Fills a 1024-doc window from each iterator, ANDs the bitmasks together,
-    /// and popcounts the result. `fill_bitset_block` handles seeking tails forward
-    /// when they lag behind the current block.
-    fn count_including_deleted_dense(&mut self) -> u32 {
-        let mut count = 0u32;
-        let mut next_base = self.left.doc();
-
-        while next_base < TERMINATED {
-            let base = next_base;
-
-            // Fill lead bitmask.
-            let mut mask = EMPTY_BLOCK;
-            next_base = next_base.max(self.left.fill_bitset_block(base, &mut mask));
-
-            let mut tail_mask = EMPTY_BLOCK;
-            next_base = next_base.max(self.right.fill_bitset_block(base, &mut tail_mask));
-
-            if and_is_empty(&mut mask, &tail_mask) {
-                continue;
-            }
-            // AND with each additional tail.
-            for other in &mut self.others {
-                let mut other_mask = EMPTY_BLOCK;
-                next_base = next_base.max(other.fill_bitset_block(base, &mut other_mask));
-                if and_is_empty(&mut mask, &other_mask) {
-                    continue;
-                }
-            }
-
-            for tinyset in &mask {
-                count += tinyset.len();
-            }
-        }
-
-        count
-    }
 }

 impl<TScorer, TOtherScorer> Scorer for Intersection<TScorer, TOtherScorer>
@@ -511,82 +421,6 @@ mod tests {
        }
    }

-    proptest! {
-        #[test]
-        fn prop_test_count_including_deleted_matches_default(
-            a in sorted_deduped_vec(1200, 400),
-            b in sorted_deduped_vec(1200, 400),
-            c in sorted_deduped_vec(1200, 400),
-            num_docs in 1200u32..2000u32,
-        ) {
-            // Compute expected count via set intersection.
-            let expected: u32 = a.iter()
-                .filter(|doc| b.contains(doc) && c.contains(doc))
-                .count() as u32;
-
-            // Test count_including_deleted (dense path).
-            let make_intersection = || {
-                Intersection::new(
-                    vec![
-                        VecDocSet::from(a.clone()),
-                        VecDocSet::from(b.clone()),
-                        VecDocSet::from(c.clone()),
-                    ],
-                    num_docs,
-                )
-            };
-
-            let mut intersection = make_intersection();
-            let count = intersection.count_including_deleted();
-            prop_assert_eq!(count, expected,
-                "count_including_deleted mismatch: a={:?}, b={:?}, c={:?}", a, b, c);
-        }
-    }
-
-    #[test]
-    fn test_count_including_deleted_two_way() {
-        let left = VecDocSet::from(vec![1, 3, 9]);
-        let right = VecDocSet::from(vec![3, 4, 9, 18]);
-        let mut intersection = Intersection::new(vec![left, right], 100);
-        assert_eq!(intersection.count_including_deleted(), 2);
-    }
-
-    #[test]
-    fn test_count_including_deleted_empty() {
-        let a = VecDocSet::from(vec![1, 3]);
-        let b = VecDocSet::from(vec![1, 4]);
-        let c = VecDocSet::from(vec![3, 9]);
-        let mut intersection = Intersection::new(vec![a, b, c], 100);
-        assert_eq!(intersection.count_including_deleted(), 0);
-    }
-
-    /// Test with enough documents to exercise the dense path (>= num_docs/32).
-    #[test]
-    fn test_count_including_deleted_dense_path() {
-        // Create dense docsets: many docs relative to segment size.
-        let docs_a: Vec<u32> = (0..2000).step_by(2).collect(); // even numbers 0..2000
-        let docs_b: Vec<u32> = (0..2000).step_by(3).collect(); // multiples of 3
-        let expected = docs_a.iter().filter(|d| *d % 3 == 0).count() as u32;
-
-        let a = VecDocSet::from(docs_a);
-        let b = VecDocSet::from(docs_b);
-        let mut intersection = Intersection::new(vec![a, b], 2000);
-        assert_eq!(intersection.count_including_deleted(), expected);
-    }
-
-    /// Test that spans multiple blocks (>1024 docs).
-    #[test]
-    fn test_count_including_deleted_multi_block() {
-        let docs_a: Vec<u32> = (0..5000).collect();
-        let docs_b: Vec<u32> = (0..5000).step_by(7).collect();
-        let expected = docs_b.len() as u32; // all of b is in a
-
-        let a = VecDocSet::from(docs_a);
-        let b = VecDocSet::from(docs_b);
-        let mut intersection = Intersection::new(vec![a, b], 5000);
-        assert_eq!(intersection.count_including_deleted(), expected);
-    }
-
    #[test]
    fn test_bug_2811_intersection_candidate_should_increase() {
        let mut schema_builder = Schema::builder();
--- a/src/query/term_query/term_scorer.rs
+++ b/src/query/term_query/term_scorer.rs
@@ -117,12 +117,6 @@ impl DocSet for TermScorer {
    fn size_hint(&self) -> u32 {
        self.postings.size_hint()
    }
-
-    // TODO
-    // It is probably possible to optimize fill_bitset_block for TermScorer,
-    // working directly with the blocks, enabling vectorization.
-    // I did not manage to get a performance improvement on Mac ARM,
-    // and do not have access to x86 to investigate.
 }

 impl Scorer for TermScorer {