Mirror of https://github.com/quickwit-oss/tantivy.git, synced 2025-12-28 13:02:55 +00:00

Compare commits: float...trinity/yo (99 commits)
| Author | SHA1 | Date |
|---|---|---|
|  | bcff3eb2d2 |  |
|  | 85f2588875 |  |
|  | db6cf65d53 |  |
|  | 654aa7f42c |  |
|  | 951a898633 |  |
|  | 003722d831 |  |
|  | 4e46f4f8c4 |  |
|  | 6647362464 |  |
|  | 7a80851e36 |  |
|  | cd952429d2 |  |
|  | d777c964da |  |
|  | bbb058d976 |  |
|  | 5f7d027a52 |  |
|  | 0c2bd36fe3 |  |
|  | fec2b63571 |  |
|  | 6213ea476a |  |
|  | 5e159c26bf |  |
|  | a5e59ab598 |  |
|  | e772d3170d |  |
|  | 8c2ba7bd55 |  |
|  | 02328b0151 |  |
|  | 7cc775256c |  |
|  | 07b40f8b8b |  |
|  | 9b6b6be5b9 |  |
|  | 6bb73a527f |  |
|  | 03885d0f3c |  |
|  | f2e5135870 |  |
|  | c24157f28b |  |
|  | 873382cdcb |  |
|  | 791350091c |  |
|  | 483b1d13d4 |  |
|  | 8de7fa9d95 |  |
|  | 94313b62f8 |  |
|  | f2b2628feb |  |
|  | 449f595832 |  |
|  | c9235df059 |  |
|  | a4485f7611 |  |
|  | 1082ff60f9 |  |
|  | 491854155c |  |
|  | 96c3d54ac7 |  |
|  | 6800fdec9d |  |
|  | c9cf9c952a |  |
|  | 024e53a99c |  |
|  | 8d75e451bd |  |
|  | fcfd76ec55 |  |
|  | 6b7b1cc4fa |  |
|  | 129f7422f5 |  |
|  | f39cce2c8b |  |
|  | d2478fac8a |  |
|  | 952b048341 |  |
|  | 80f9596ec8 |  |
|  | 84f9e77e1d |  |
|  | a602c248fb |  |
|  | 4b9d1fe828 |  |
|  | 63bc390b02 |  |
|  | 07393c2fa0 |  |
|  | 77a415cbe4 |  |
|  | 4b4c231bba |  |
|  | 11d3409286 |  |
|  | 9cb8cfbea8 |  |
|  | 8b69aab0fc |  |
|  | 3650d1f36a |  |
|  | 2efebdb1bb |  |
|  | e443ca63aa |  |
|  | 5c9cbee29d |  |
|  | b2ca83a93c |  |
|  | 3b189080d4 |  |
|  | 00a6586efe |  |
|  | b9b913510e |  |
|  | 534b1d33c3 |  |
|  | f465173872 |  |
|  | 96315df20d |  |
|  | 9a1609d364 |  |
|  | 39f4e58450 |  |
|  | a8a36b62cd |  |
|  | 226a49338f |  |
|  | 2864bf7123 |  |
|  | 5171ff611b |  |
|  | e50e74acf8 |  |
|  | 0b86658389 |  |
|  | 5d6602a8d9 |  |
|  | 4d29ff4d01 |  |
|  | cdc8e3a8be |  |
|  | 67f453b534 |  |
|  | 787a37bacf |  |
|  | f5039f1846 |  |
|  | eeb1f19093 |  |
|  | 087beaf328 |  |
|  | 309449dba3 |  |
|  | 5a76e6c5d3 |  |
|  | c8713a01ed |  |
|  | 6113e0408c |  |
|  | 400a20b7af |  |
|  | 5f565e77de |  |
|  | 516e60900d |  |
|  | 36e1c79f37 |  |
|  | c694bc039a |  |
|  | e5043d78d2 |  |
|  | 6d0bb82bd2 |  |
CHANGELOG.md (30 changed lines)
@@ -1,10 +1,32 @@
 Tantivy 0.19
 ================================
+- Limit fast fields to u32 (`get_val(u32)`) [#1644](https://github.com/quickwit-oss/tantivy/pull/1644) (@PSeitz)
+- Major bugfix: Fix missing fieldnorms for u64, i64, f64, bool, bytes and date [#1620](https://github.com/quickwit-oss/tantivy/pull/1620) (@PSeitz)
 - Updated [Date Field Type](https://github.com/quickwit-oss/tantivy/pull/1396)
   The `DateTime` type has been updated to hold timestamps with microseconds precision.
-  `DateOptions` and `DatePrecision` have been added to configure Date fields. The precision is used to hint on fast values compression. Otherwise, seconds precision is used everywhere else (i.e. terms, indexing).
-- Remove Searcher pool and make `Searcher` cloneable.
+  `DateOptions` and `DatePrecision` have been added to configure Date fields. The precision is used to hint on fast values compression. Otherwise, seconds precision is used everywhere else (i.e. terms, indexing). (@evanxg852000)
+- Add IP address field type [#1553](https://github.com/quickwit-oss/tantivy/pull/1553) (@PSeitz)
+- Add boolean field type [#1382](https://github.com/quickwit-oss/tantivy/pull/1382) (@boraarslan)
+- Remove Searcher pool and make `Searcher` cloneable. (@PSeitz)
+- Validate settings on create [#1570](https://github.com/quickwit-oss/tantivy/pull/1570) (@PSeitz)
+- Fix interpolation overflow in linear interpolation fastfield codec [#1480](https://github.com/quickwit-oss/tantivy/pull/1480) (@PSeitz @fulmicoton)
+- Detect and apply gcd on fastfield codecs [#1418](https://github.com/quickwit-oss/tantivy/pull/1418) (@PSeitz)
+- Doc store
+  - Use separate thread to compress block store [#1389](https://github.com/quickwit-oss/tantivy/pull/1389) [#1510](https://github.com/quickwit-oss/tantivy/pull/1510) (@PSeitz @fulmicoton)
+  - Expose doc store cache size [#1403](https://github.com/quickwit-oss/tantivy/pull/1403) (@PSeitz)
+  - Enable compression levels for doc store [#1378](https://github.com/quickwit-oss/tantivy/pull/1378) (@PSeitz)
+  - Make block size configurable [#1374](https://github.com/quickwit-oss/tantivy/pull/1374) (@kryesh)
+- Make `tantivy::TantivyError` cloneable [#1402](https://github.com/quickwit-oss/tantivy/pull/1402) (@PSeitz)
+- Add support for phrase slop in query language [#1393](https://github.com/quickwit-oss/tantivy/pull/1393) (@saroh)
+- Aggregation
+  - Add support for keyed parameter in range and histogram aggregations [#1424](https://github.com/quickwit-oss/tantivy/pull/1424) (@k-yomo)
+  - Add aggregation bucket limit [#1363](https://github.com/quickwit-oss/tantivy/pull/1363) (@PSeitz)
+- Faster indexing
+  - [#1610](https://github.com/quickwit-oss/tantivy/pull/1610) (@PSeitz)
+  - [#1594](https://github.com/quickwit-oss/tantivy/pull/1594) (@PSeitz)
+  - [#1582](https://github.com/quickwit-oss/tantivy/pull/1582) (@PSeitz)
+  - [#1611](https://github.com/quickwit-oss/tantivy/pull/1611) (@PSeitz)
 
 Tantivy 0.18
 ================================
@@ -22,6 +44,10 @@ Tantivy 0.18
 - Add terms aggregation (@PSeitz)
 - Add support for zstd compression (@kryesh)
 
+Tantivy 0.18.1
+================================
+- Hotfix: positions computation. #1629 (@fmassot, @fulmicoton, @PSeitz)
+
 Tantivy 0.17
 ================================

Cargo.toml (12 changed lines)

@@ -1,6 +1,6 @@
 [package]
 name = "tantivy"
-version = "0.18.0"
+version = "0.19.0-dev"
 authors = ["Paul Masurel <paul.masurel@gmail.com>"]
 license = "MIT"
 categories = ["database-implementations", "data-structures"]
@@ -14,12 +14,13 @@ edition = "2021"
 rust-version = "1.62"
 
 [dependencies]
-oneshot = "0.1.3"
+oneshot = "0.1.5"
 base64 = "0.13.0"
 byteorder = "1.4.3"
 crc32fast = "1.3.2"
 once_cell = "1.10.0"
 regex = { version = "1.5.5", default-features = false, features = ["std", "unicode"] }
+aho-corasick = "0.7"
 tantivy-fst = "0.4.0"
 memmap2 = { version = "0.5.3", optional = true }
 lz4_flex = { version = "0.9.2", default-features = false, features = ["checked-decode"], optional = true }
@@ -45,7 +46,7 @@ rust-stemmers = "1.2.0"
 downcast-rs = "1.2.0"
 bitpacking = { version = "0.8.4", default-features = false, features = ["bitpacker4x"] }
 census = "0.4.0"
-fnv = "1.0.7"
+rustc-hash = "1.1.0"
 thiserror = "1.0.30"
 htmlescape = "0.3.1"
 fail = "0.5.0"
@@ -57,9 +58,10 @@ lru = "0.7.5"
 fastdivide = "0.4.0"
 itertools = "0.10.3"
 measure_time = "0.8.2"
-serde_cbor = { version = "0.11.2", optional = true }
+ciborium = { version = "0.2", optional = true}
 async-trait = "0.1.53"
 arc-swap = "1.5.0"
+yoke = { version = "0.6.2", features = ["derive"] }
 
 [target.'cfg(windows)'.dependencies]
 winapi = "0.3.9"
@@ -101,7 +103,7 @@ zstd-compression = ["zstd"]
 failpoints = ["fail/failpoints"]
 unstable = [] # useful for benches.
 
-quickwit = ["serde_cbor"]
+quickwit = ["ciborium"]
 
 [workspace]
 members = ["query-grammar", "bitpacker", "common", "fastfield_codecs", "ownedbytes"]
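The `quickwit` feature swaps serde_cbor for ciborium; both speak the same CBOR wire format, only the API changes. A minimal round-trip sketch of the replacement API (not code from this repository; `Meta` is a made-up type, and ciborium 0.2 plus serde with derive are assumed):

```rust
use serde::{Deserialize, Serialize};

// `Meta` is a hypothetical stand-in for whatever gets serialized behind the feature.
#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct Meta {
    num_vals: u32,
}

fn main() {
    let meta = Meta { num_vals: 42 };
    let mut bytes = Vec::new();
    // serde_cbor::to_vec(&meta) becomes:
    ciborium::ser::into_writer(&meta, &mut bytes).unwrap();
    // serde_cbor::from_slice(&bytes) becomes:
    let back: Meta = ciborium::de::from_reader(bytes.as_slice()).unwrap();
    assert_eq!(meta, back);
}
```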
benches/hdfs_with_array.json (new file, 100000 lines): file diff suppressed because it is too large.
@@ -1,116 +1,159 @@
 use criterion::{criterion_group, criterion_main, Criterion};
+use itertools::Itertools;
 use pprof::criterion::{Output, PProfProfiler};
-use tantivy::schema::{INDEXED, STORED, STRING, TEXT};
-use tantivy::Index;
+use serde_json::{self, Value as JsonValue};
+use tantivy::directory::RamDirectory;
+use tantivy::schema::{
+    FieldValue, TextFieldIndexing, TextOptions, Value, INDEXED, STORED, STRING, TEXT,
+};
+use tantivy::{Document, Index, IndexBuilder};
 
 const HDFS_LOGS: &str = include_str!("hdfs.json");
-const NUM_REPEATS: usize = 2;
+const NUM_REPEATS: usize = 20;
 
 pub fn hdfs_index_benchmark(c: &mut Criterion) {
-    let schema = {
-        let mut schema_builder = tantivy::schema::SchemaBuilder::new();
-        schema_builder.add_u64_field("timestamp", INDEXED);
-        schema_builder.add_text_field("body", TEXT);
-        schema_builder.add_text_field("severity", STRING);
-        schema_builder.build()
-    };
-    let schema_with_store = {
-        let mut schema_builder = tantivy::schema::SchemaBuilder::new();
-        schema_builder.add_u64_field("timestamp", INDEXED | STORED);
-        schema_builder.add_text_field("body", TEXT | STORED);
-        schema_builder.add_text_field("severity", STRING | STORED);
-        schema_builder.build()
-    };
-    let dynamic_schema = {
-        let mut schema_builder = tantivy::schema::SchemaBuilder::new();
-        schema_builder.add_json_field("json", TEXT);
-        schema_builder.build()
-    };
+    let mut schema_builder = tantivy::schema::SchemaBuilder::new();
+    let text_indexing_options = TextFieldIndexing::default()
+        .set_tokenizer("default")
+        .set_fieldnorms(false)
+        .set_index_option(tantivy::schema::IndexRecordOption::WithFreqsAndPositions);
+    let mut text_options = TextOptions::default().set_indexing_options(text_indexing_options);
+    let text_field = schema_builder.add_text_field("body", text_options);
+    let schema = schema_builder.build();
+
+    // prepare doc
+    let mut documents_no_array = Vec::new();
+    let mut documents_with_array = Vec::new();
+    for doc_json in HDFS_LOGS.trim().split("\n") {
+        let json_obj: serde_json::Map<String, JsonValue> = serde_json::from_str(doc_json).unwrap();
+        let text = json_obj.get("body").unwrap().as_str().unwrap();
+        let mut doc_no_array = Document::new();
+        doc_no_array.add_text(text_field, text);
+        documents_no_array.push(doc_no_array);
+        let mut doc_with_array = Document::new();
+        doc_with_array.add_borrowed_values(text.to_owned(), |text| {
+            text.split(' ')
+                .map(|text| FieldValue::new(text_field, text.into()))
+                .collect()
+        });
+        documents_with_array.push(doc_with_array);
+    }
 
     let mut group = c.benchmark_group("index-hdfs");
     group.sample_size(20);
     group.bench_function("index-hdfs-no-commit", |b| {
         b.iter(|| {
-            let index = Index::create_in_ram(schema.clone());
-            let index_writer = index.writer_with_num_threads(1, 100_000_000).unwrap();
+            let ram_directory = RamDirectory::create();
+            let mut index_writer = IndexBuilder::new()
+                .schema(schema.clone())
+                .single_segment_index_writer(ram_directory, 100_000_000)
+                .unwrap();
             for _ in 0..NUM_REPEATS {
-                for doc_json in HDFS_LOGS.trim().split("\n") {
-                    let doc = schema.parse_document(doc_json).unwrap();
+                let documents_cloned = documents_no_array.clone();
+                for doc in documents_cloned {
                     index_writer.add_document(doc).unwrap();
                 }
             }
         })
     });
-    group.bench_function("index-hdfs-with-commit", |b| {
+    group.bench_function("index-hdfs-with-array-no-commit", |b| {
         b.iter(|| {
-            let index = Index::create_in_ram(schema.clone());
-            let mut index_writer = index.writer_with_num_threads(1, 100_000_000).unwrap();
+            let ram_directory = RamDirectory::create();
+            let mut index_writer = IndexBuilder::new()
+                .schema(schema.clone())
+                .single_segment_index_writer(ram_directory, 100_000_000)
+                .unwrap();
             for _ in 0..NUM_REPEATS {
-                for doc_json in HDFS_LOGS.trim().split("\n") {
-                    let doc = schema.parse_document(doc_json).unwrap();
-                    index_writer.add_document(doc).unwrap();
-                }
+                let documents_with_array_cloned = documents_with_array.clone();
+                for doc in documents_with_array_cloned {
+                    index_writer.add_document(doc).unwrap();
+                }
             }
-            index_writer.commit().unwrap();
         })
     });
-    group.bench_function("index-hdfs-no-commit-with-docstore", |b| {
-        b.iter(|| {
-            let index = Index::create_in_ram(schema_with_store.clone());
-            let index_writer = index.writer_with_num_threads(1, 100_000_000).unwrap();
-            for _ in 0..NUM_REPEATS {
-                for doc_json in HDFS_LOGS.trim().split("\n") {
-                    let doc = schema.parse_document(doc_json).unwrap();
-                    index_writer.add_document(doc).unwrap();
-                }
-            }
-        })
-    });
-    group.bench_function("index-hdfs-with-commit-with-docstore", |b| {
-        b.iter(|| {
-            let index = Index::create_in_ram(schema_with_store.clone());
-            let mut index_writer = index.writer_with_num_threads(1, 100_000_000).unwrap();
-            for _ in 0..NUM_REPEATS {
-                for doc_json in HDFS_LOGS.trim().split("\n") {
-                    let doc = schema.parse_document(doc_json).unwrap();
-                    index_writer.add_document(doc).unwrap();
-                }
-            }
-            index_writer.commit().unwrap();
-        })
-    });
-    group.bench_function("index-hdfs-no-commit-json-without-docstore", |b| {
-        b.iter(|| {
-            let index = Index::create_in_ram(dynamic_schema.clone());
-            let json_field = dynamic_schema.get_field("json").unwrap();
-            let mut index_writer = index.writer_with_num_threads(1, 100_000_000).unwrap();
-            for _ in 0..NUM_REPEATS {
-                for doc_json in HDFS_LOGS.trim().split("\n") {
-                    let json_val: serde_json::Map<String, serde_json::Value> =
-                        serde_json::from_str(doc_json).unwrap();
-                    let doc = tantivy::doc!(json_field=>json_val);
-                    index_writer.add_document(doc).unwrap();
-                }
-            }
-            index_writer.commit().unwrap();
-        })
-    });
-    group.bench_function("index-hdfs-with-commit-json-without-docstore", |b| {
-        b.iter(|| {
-            let index = Index::create_in_ram(dynamic_schema.clone());
-            let json_field = dynamic_schema.get_field("json").unwrap();
-            let mut index_writer = index.writer_with_num_threads(1, 100_000_000).unwrap();
-            for _ in 0..NUM_REPEATS {
-                for doc_json in HDFS_LOGS.trim().split("\n") {
-                    let json_val: serde_json::Map<String, serde_json::Value> =
-                        serde_json::from_str(doc_json).unwrap();
-                    let doc = tantivy::doc!(json_field=>json_val);
-                    index_writer.add_document(doc).unwrap();
-                }
-            }
-            index_writer.commit().unwrap();
-        })
-    });
+    // group.bench_function("index-hdfs-with-commit", |b| {
+    //     b.iter(|| {
+    //         let ram_directory = RamDirectory::create();
+    //         let mut index_writer = IndexBuilder::new()
+    //             .schema(schema.clone())
+    //             .single_segment_index_writer(ram_directory, 100_000_000)
+    //             .unwrap();
+    //         for _ in 0..NUM_REPEATS {
+    //             for doc_json in HDFS_LOGS.trim().split("\n") {
+    //                 let doc = schema.parse_document(doc_json).unwrap();
+    //                 index_writer.add_document(doc).unwrap();
+    //             }
+    //         }
+    //         index_writer.commit().unwrap();
+    //     })
+    // });
+    // group.bench_function("index-hdfs-no-commit-with-docstore", |b| {
+    //     b.iter(|| {
+    //         let ram_directory = RamDirectory::create();
+    //         let mut index_writer = IndexBuilder::new()
+    //             .schema(schema.clone())
+    //             .single_segment_index_writer(ram_directory, 100_000_000)
+    //             .unwrap();
+    //         for _ in 0..NUM_REPEATS {
+    //             for doc_json in HDFS_LOGS.trim().split("\n") {
+    //                 let doc = schema.parse_document(doc_json).unwrap();
+    //                 index_writer.add_document(doc).unwrap();
+    //             }
+    //         }
+    //     })
+    // });
+    // group.bench_function("index-hdfs-with-commit-with-docstore", |b| {
+    //     b.iter(|| {
+    //         let ram_directory = RamDirectory::create();
+    //         let mut index_writer = IndexBuilder::new()
+    //             .schema(schema.clone())
+    //             .single_segment_index_writer(ram_directory, 100_000_000)
+    //             .unwrap();
+    //         for _ in 0..NUM_REPEATS {
+    //             for doc_json in HDFS_LOGS.trim().split("\n") {
+    //                 let doc = schema.parse_document(doc_json).unwrap();
+    //                 index_writer.add_document(doc).unwrap();
+    //             }
+    //         }
+    //         index_writer.commit().unwrap();
+    //     })
+    // });
+    // group.bench_function("index-hdfs-no-commit-json-without-docstore", |b| {
+    //     b.iter(|| {
+    //         let ram_directory = RamDirectory::create();
+    //         let mut index_writer = IndexBuilder::new()
+    //             .schema(schema.clone())
+    //             .single_segment_index_writer(ram_directory, 100_000_000)
+    //             .unwrap();
+    //         for _ in 0..NUM_REPEATS {
+    //             for doc_json in HDFS_LOGS.trim().split("\n") {
+    //                 let json_val: serde_json::Map<String, serde_json::Value> =
+    //                     serde_json::from_str(doc_json).unwrap();
+    //                 let doc = tantivy::doc!(json_field=>json_val);
+    //                 index_writer.add_document(doc).unwrap();
+    //             }
+    //         }
+    //         index_writer.commit().unwrap();
+    //     })
+    // });
+    // group.bench_function("index-hdfs-with-commit-json-without-docstore", |b| {
+    //     b.iter(|| {
+    //         let ram_directory = RamDirectory::create();
+    //         let mut index_writer = IndexBuilder::new()
+    //             .schema(schema.clone())
+    //             .single_segment_index_writer(ram_directory, 100_000_000)
+    //             .unwrap();
+    //         for _ in 0..NUM_REPEATS {
+    //             for doc_json in HDFS_LOGS.trim().split("\n") {
+    //                 let json_val: serde_json::Map<String, serde_json::Value> =
+    //                     serde_json::from_str(doc_json).unwrap();
+    //                 let doc = tantivy::doc!(json_field=>json_val);
+    //                 index_writer.add_document(doc).unwrap();
+    //             }
+    //         }
+    //         index_writer.commit().unwrap();
+    //     })
+    //});
 }
 
 criterion_group! {
@@ -87,15 +87,15 @@ impl BitUnpacker {
     }
 
     #[inline]
-    pub fn get(&self, idx: u64, data: &[u8]) -> u64 {
+    pub fn get(&self, idx: u32, data: &[u8]) -> u64 {
         if self.num_bits == 0 {
             return 0u64;
         }
-        let addr_in_bits = idx * self.num_bits;
+        let addr_in_bits = idx * self.num_bits as u32;
         let addr = addr_in_bits >> 3;
         let bit_shift = addr_in_bits & 7;
         debug_assert!(
-            addr + 8 <= data.len() as u64,
+            addr + 8 <= data.len() as u32,
             "The fast field field should have been padded with 7 bytes."
         );
         let bytes: [u8; 8] = (&data[(addr as usize)..(addr as usize) + 8])
@@ -130,7 +130,7 @@ mod test {
     fn test_bitpacker_util(len: usize, num_bits: u8) {
         let (bitunpacker, vals, data) = create_fastfield_bitpacker(len, num_bits);
         for (i, val) in vals.iter().enumerate() {
-            assert_eq!(bitunpacker.get(i as u64, &data), *val);
+            assert_eq!(bitunpacker.get(i as u32, &data), *val);
        }
    }
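For readers following the `u64` to `u32` index change: a standalone sketch (not the crate's code) of the addressing scheme `BitUnpacker::get` uses. Value `idx` lives at bit offset `idx * num_bits`, so the byte address is the offset divided by 8 and the shift is the remainder; reading a fixed 8-byte window is why the data must be padded with 7 trailing bytes:

```rust
// Minimal reimplementation of the bit-addressing shown in the hunk above.
fn get_bitpacked(idx: u32, num_bits: u8, data: &[u8]) -> u64 {
    if num_bits == 0 {
        return 0;
    }
    let addr_in_bits = idx as usize * num_bits as usize;
    let addr = addr_in_bits >> 3; // byte offset
    let bit_shift = addr_in_bits & 7; // bit offset within that byte
    // fixed 8-byte read; requires 7 bytes of padding past the last value
    let bytes: [u8; 8] = data[addr..addr + 8].try_into().unwrap();
    let val_unshifted = u64::from_le_bytes(bytes);
    let mask = if num_bits == 64 { u64::MAX } else { (1u64 << num_bits) - 1 };
    (val_unshifted >> bit_shift) & mask
}

fn main() {
    // The values [5, 3, 7, 1] packed with 3 bits each (LSB first) give the
    // bytes [0b1101_1101, 0b0000_0011]; then pad with 7 zero bytes.
    let mut data = vec![0b1101_1101u8, 0b0000_0011];
    data.extend_from_slice(&[0u8; 7]);
    assert_eq!(get_bitpacked(0, 3, &data), 5);
    assert_eq!(get_bitpacked(1, 3, &data), 3);
    assert_eq!(get_bitpacked(2, 3, &data), 7);
    assert_eq!(get_bitpacked(3, 3, &data), 1);
}
```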
@@ -130,7 +130,7 @@ impl BlockedBitpacker {
         let pos_in_block = idx % BLOCK_SIZE as usize;
         if let Some(metadata) = self.offset_and_bits.get(metadata_pos) {
             let unpacked = BitUnpacker::new(metadata.num_bits()).get(
-                pos_in_block as u64,
+                pos_in_block as u32,
                 &self.compressed_blocks[metadata.offset() as usize..],
             );
             unpacked + metadata.base_value()
@@ -34,8 +34,7 @@ impl<T: Deref<Target = [u8]>> HasLen for T {
     }
 }
 
-const HIGHEST_BIT_64: u64 = 1 << 63;
-const HIGHEST_BIT_32: u32 = 1 << 31;
+const HIGHEST_BIT: u64 = 1 << 63;
 
 /// Maps a `i64` to `u64`
 ///
@@ -59,13 +58,13 @@ const HIGHEST_BIT_32: u32 = 1 << 31;
 /// The reverse mapping is [`u64_to_i64()`].
 #[inline]
 pub fn i64_to_u64(val: i64) -> u64 {
-    (val as u64) ^ HIGHEST_BIT_64
+    (val as u64) ^ HIGHEST_BIT
 }
 
 /// Reverse the mapping given by [`i64_to_u64()`].
 #[inline]
 pub fn u64_to_i64(val: u64) -> i64 {
-    (val ^ HIGHEST_BIT_64) as i64
+    (val ^ HIGHEST_BIT) as i64
 }
 
 /// Maps a `f64` to `u64`
@@ -89,7 +88,7 @@ pub fn u64_to_i64(val: u64) -> i64 {
 pub fn f64_to_u64(val: f64) -> u64 {
     let bits = val.to_bits();
     if val.is_sign_positive() {
-        bits ^ HIGHEST_BIT_64
+        bits ^ HIGHEST_BIT
     } else {
         !bits
     }
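A standalone sketch of the order-preserving mappings above, mirroring the diff with the new single `HIGHEST_BIT` constant. The whole point of both transforms is that `<` on the mapped `u64`s agrees with `<` on the original values, so range queries can run entirely in `u64` space:

```rust
const HIGHEST_BIT: u64 = 1 << 63;

fn i64_to_u64(val: i64) -> u64 {
    // flips the sign bit: i64::MIN maps to 0, i64::MAX maps to u64::MAX
    (val as u64) ^ HIGHEST_BIT
}

fn f64_to_u64(val: f64) -> u64 {
    let bits = val.to_bits();
    if val.is_sign_positive() {
        bits ^ HIGHEST_BIT // positives land in the upper half, still increasing
    } else {
        !bits // negatives: flipping all bits reverses their descending bit order
    }
}

fn main() {
    assert!(i64_to_u64(-2) < i64_to_u64(-1));
    assert!(i64_to_u64(-1) < i64_to_u64(0));
    assert!(i64_to_u64(0) < i64_to_u64(1));
    assert!(f64_to_u64(-2.0) < f64_to_u64(-1.5));
    assert!(f64_to_u64(-1.5) < f64_to_u64(1.0));
    assert!(f64_to_u64(1.0) < f64_to_u64(1.5));
}
```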
@@ -98,148 +97,26 @@ pub fn f64_to_u64(val: f64) -> u64 {
 /// Reverse the mapping given by [`f64_to_u64()`].
 #[inline]
 pub fn u64_to_f64(val: u64) -> f64 {
-    f64::from_bits(if val & HIGHEST_BIT_64 != 0 {
-        val ^ HIGHEST_BIT_64
+    f64::from_bits(if val & HIGHEST_BIT != 0 {
+        val ^ HIGHEST_BIT
     } else {
         !val
     })
 }
 
-/// Maps a `f32` to `u64`
-///
-/// # See also
-/// Similar mapping for f64 [`u64_to_f64()`].
-#[inline]
-pub fn f32_to_u64(val: f32) -> u64 {
-    let bits = val.to_bits();
-    let res32 = if val.is_sign_positive() {
-        bits ^ HIGHEST_BIT_32
-    } else {
-        !bits
-    };
-    res32 as u64
-}
-
-/// Reverse the mapping given by [`f32_to_u64()`].
-#[inline]
-pub fn u64_to_f32(val: u64) -> f32 {
-    debug_assert!(val <= 1 << 32);
-    let val = val as u32;
-    f32::from_bits(if val & HIGHEST_BIT_32 != 0 {
-        val ^ HIGHEST_BIT_32
-    } else {
-        !val
-    })
-}
-
-/// Maps a `f64` to a fixed point representation.
-/// Lower bound is inclusive, upper bound is exclusive.
-/// `precision` is the number of bits used to represent the number.
-///
-/// This is a lossy, affine transformation. All provided values must be finite and non-NaN.
-/// Care should be taken to not provide values which would cause loss of precision such as values
-/// low enough to get sub-normal numbers, value high enough rounding would cause ±Inf to appear, or
-/// a precision larger than 50b.
-///
-/// # See also
-/// The reverse mapping is [`fixed_point_to_f64()`].
-#[inline]
-pub fn f64_to_fixed_point(val: f64, min: f64, max: f64, precision: u8) -> u64 {
-    debug_assert!((1..53).contains(&precision));
-    debug_assert!(min < max);
-
-    let delta = max - min;
-    let mult = (1u64 << precision) as f64;
-    let bucket_size = delta / mult;
-    let upper_bound = f64_next_down(max).min(max - bucket_size);
-
-    // due to different cases of rounding error, we need to enforce upper_bound to be
-    // max-bucket_size, but also that upper_bound < max, which is not given for small enough
-    // bucket_size.
-    let val = val.clamp(min, upper_bound);
-
-    let res = (val - min) / bucket_size;
-    if res.fract() == 0.5 {
-        res as u64
-    } else {
-        // round down when getting x.5
-        res.round() as u64
-    }
-}
-
-/// Reverse the mapping given by [`f64_to_fixed_point()`].
-#[inline]
-pub fn fixed_point_to_f64(val: u64, min: f64, max: f64, precision: u8) -> f64 {
-    let delta = max - min;
-    let mult = (1u64 << precision) as f64;
-    let bucket_size = delta / mult;
-
-    bucket_size.mul_add(val as f64, min)
-}
-
-// taken from rfc/3173-float-next-up-down, commented out part about nan in infinity as it is not
-// needed.
-fn f64_next_down(this: f64) -> f64 {
-    const NEG_TINY_BITS: u64 = 0x8000_0000_0000_0001;
-    const CLEAR_SIGN_MASK: u64 = 0x7fff_ffff_ffff_ffff;
-
-    let bits = this.to_bits();
-    // if this.is_nan() || bits == f64::NEG_INFINITY.to_bits() {
-    //     return this;
-    // }
-    let abs = bits & CLEAR_SIGN_MASK;
-    let next_bits = if abs == 0 {
-        NEG_TINY_BITS
-    } else if bits == abs {
-        bits - 1
-    } else {
-        bits + 1
-    };
-    f64::from_bits(next_bits)
-}
-
 #[cfg(test)]
 pub mod test {
     use std::cmp::Ordering;
 
     use proptest::prelude::*;
 
-    use super::{
-        f32_to_u64, f64_to_fixed_point, f64_to_u64, fixed_point_to_f64, i64_to_u64, u64_to_f32,
-        u64_to_f64, u64_to_i64, BinarySerializable, FixedSize,
-    };
+    use super::{f64_to_u64, i64_to_u64, u64_to_f64, u64_to_i64, BinarySerializable, FixedSize};
 
     fn test_i64_converter_helper(val: i64) {
         assert_eq!(u64_to_i64(i64_to_u64(val)), val);
     }
 
     fn test_f64_converter_helper(val: f64) {
-        assert_eq!(u64_to_f64(f64_to_u64(val)).total_cmp(&val), Ordering::Equal);
-    }
-
-    fn test_f32_converter_helper(val: f32) {
-        assert_eq!(u64_to_f32(f32_to_u64(val)).total_cmp(&val), Ordering::Equal);
-    }
-
-    fn test_fixed_point_converter_helper(val: f64, min: f64, max: f64, precision: u8) {
-        let bucket_count = 1 << precision;
-
-        let packed = f64_to_fixed_point(val, min, max, precision);
-
-        assert!(packed < bucket_count, "used to much bits");
-
-        let depacked = fixed_point_to_f64(packed, min, max, precision);
-        let repacked = f64_to_fixed_point(depacked, min, max, precision);
-
-        assert_eq!(packed, repacked, "generational loss");
-
-        let error = (val.clamp(min, crate::f64_next_down(max)) - depacked).abs();
-
-        let expected = (max - min) / (bucket_count as f64);
-        assert!(
-            error <= (max - min) / (bucket_count as f64) * 2.0,
-            "error larger than expected"
-        );
+        assert_eq!(u64_to_f64(f64_to_u64(val)), val);
     }
 
     pub fn fixed_size_test<O: BinarySerializable + FixedSize + Default>() {
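For reference, a simplified worked sketch of the fixed-point mapping removed above. The real code also uses `f64_next_down` to keep the upper bound strictly below `max`; that refinement is dropped here, so this is only a sketch of the bucket arithmetic:

```rust
// With min = 0.0, max = 8.0 and precision = 2 bits there are 4 buckets of
// width (max - min) / 2^precision = 2.0.
fn f64_to_fixed_point(val: f64, min: f64, max: f64, precision: u8) -> u64 {
    let bucket_size = (max - min) / (1u64 << precision) as f64;
    // clamp into [min, max - bucket_size]; an exact x.5 truncates instead of rounding up
    let val = val.clamp(min, max - bucket_size);
    let res = (val - min) / bucket_size;
    if res.fract() == 0.5 { res as u64 } else { res.round() as u64 }
}

fn fixed_point_to_f64(val: u64, min: f64, max: f64, precision: u8) -> f64 {
    let bucket_size = (max - min) / (1u64 << precision) as f64;
    bucket_size.mul_add(val as f64, min)
}

fn main() {
    // 5.0 sits exactly between buckets 2 and 3 (res = 2.5) and truncates to 2.
    assert_eq!(f64_to_fixed_point(5.0, 0.0, 8.0, 2), 2);
    assert_eq!(fixed_point_to_f64(2, 0.0, 8.0, 2), 4.0);
    // 5.1 rounds to the nearest bucket instead (res = 2.55 -> 3).
    assert_eq!(f64_to_fixed_point(5.1, 0.0, 8.0, 2), 3);
    // Round-tripping a packed value is stable (no "generational loss").
    let packed = f64_to_fixed_point(5.1, 0.0, 8.0, 2);
    let reround = f64_to_fixed_point(fixed_point_to_f64(packed, 0.0, 8.0, 2), 0.0, 8.0, 2);
    assert_eq!(reround, packed);
}
```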
@@ -248,75 +125,12 @@ pub mod test {
         assert_eq!(buffer.len(), O::SIZE_IN_BYTES);
     }
 
-    fn fixed_point_bound() -> proptest::num::f64::Any {
-        proptest::num::f64::POSITIVE
-            | proptest::num::f64::NEGATIVE
-            | proptest::num::f64::NORMAL
-            | proptest::num::f64::ZERO
-    }
-
     proptest! {
         #[test]
-        fn test_f64_converter_monotonicity_proptest((left, right) in (proptest::num::f64::ANY, proptest::num::f64::ANY)) {
-            test_f64_converter_helper(left);
-            test_f64_converter_helper(right);
-
+        fn test_f64_converter_monotonicity_proptest((left, right) in (proptest::num::f64::NORMAL, proptest::num::f64::NORMAL)) {
             let left_u64 = f64_to_u64(left);
             let right_u64 = f64_to_u64(right);
-
-            assert_eq!(left_u64.cmp(&right_u64), left.total_cmp(&right));
-        }
-
-        #[test]
-        fn test_f32_converter_monotonicity_proptest((left, right) in (proptest::num::f32::ANY, proptest::num::f32::ANY)) {
-            test_f32_converter_helper(left);
-            test_f32_converter_helper(right);
-
-            let left_u64 = f32_to_u64(left);
-            let right_u64 = f32_to_u64(right);
-            assert_eq!(left_u64.cmp(&right_u64), left.total_cmp(&right));
-        }
-
-        #[test]
-        fn test_fixed_point_converter_proptest((left, right, min, max, precision) in
-            (fixed_point_bound(), fixed_point_bound(),
-             fixed_point_bound(), fixed_point_bound(),
-             proptest::num::u8::ANY)) {
-            // convert so all input are legal
-            let (min, max) = if min < max {
-                (min, max)
-            } else if min > max {
-                (max, min)
-            } else {
-                return Ok(()); // equals
-            };
-            if 1 > precision || precision >= 50 {
-                return Ok(());
-            }
-
-            let max_full_precision = 53.0 - precision as f64;
-            if (max / min).abs().log2().abs() > max_full_precision {
-                return Ok(());
-            }
-            // we will go in subnormal territories => loss of precision
-            if (((max - min).log2() - precision as f64) as i32) < f64::MIN_EXP {
-                return Ok(());
-            }
-
-            if (max - min).is_infinite() {
-                return Ok(());
-            }
-
-            test_fixed_point_converter_helper(left, min, max, precision);
-            test_fixed_point_converter_helper(right, min, max, precision);
-
-            let left_u64 = f64_to_fixed_point(left, min, max, precision);
-            let right_u64 = f64_to_fixed_point(right, min, max, precision);
-            if left < right {
-                assert!(left_u64 <= right_u64);
-            } else if left > right {
-                assert!(left_u64 >= right_u64)
-            }
+            assert_eq!(left_u64 < right_u64, left < right);
         }
     }
 
@@ -354,27 +168,4 @@ pub mod test {
         assert!(f64_to_u64(-2.0) < f64_to_u64(1.0));
         assert!(f64_to_u64(-2.0) < f64_to_u64(-1.5));
     }
-
-    #[test]
-    fn test_f32_converter() {
-        test_f32_converter_helper(f32::INFINITY);
-        test_f32_converter_helper(f32::NEG_INFINITY);
-        test_f32_converter_helper(0.0);
-        test_f32_converter_helper(-0.0);
-        test_f32_converter_helper(1.0);
-        test_f32_converter_helper(-1.0);
-    }
-
-    #[test]
-    fn test_f32_order() {
-        assert!(!(f32_to_u64(f32::NEG_INFINITY)..f32_to_u64(f32::INFINITY))
-            .contains(&f32_to_u64(f32::NAN))); // nan is not a number
-        assert!(f32_to_u64(1.5) > f32_to_u64(1.0)); // same exponent, different mantissa
-        assert!(f32_to_u64(2.0) > f32_to_u64(1.0)); // same mantissa, different exponent
-        assert!(f32_to_u64(2.0) > f32_to_u64(1.5)); // different exponent and mantissa
-        assert!(f32_to_u64(1.0) > f32_to_u64(-1.0)); // pos > neg
-        assert!(f32_to_u64(-1.5) < f32_to_u64(-1.0));
-        assert!(f32_to_u64(-2.0) < f32_to_u64(1.0));
-        assert!(f32_to_u64(-2.0) < f32_to_u64(-1.5));
-    }
 }
@@ -1,3 +1,4 @@
+use std::borrow::Cow;
 use std::io::{Read, Write};
 use std::{fmt, io};
 
@@ -107,6 +108,19 @@ impl FixedSize for u64 {
     const SIZE_IN_BYTES: usize = 8;
 }
 
+impl BinarySerializable for u128 {
+    fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
+        writer.write_u128::<Endianness>(*self)
+    }
+    fn deserialize<R: Read>(reader: &mut R) -> io::Result<Self> {
+        reader.read_u128::<Endianness>()
+    }
+}
+
+impl FixedSize for u128 {
+    const SIZE_IN_BYTES: usize = 16;
+}
+
 impl BinarySerializable for f32 {
     fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
         writer.write_f32::<Endianness>(*self)
@@ -197,6 +211,23 @@ impl BinarySerializable for String {
     }
 }
 
+impl<'a> BinarySerializable for Cow<'a, str> {
+    fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
+        let data: &[u8] = self.as_bytes();
+        VInt(data.len() as u64).serialize(writer)?;
+        writer.write_all(data)
+    }
+
+    fn deserialize<R: Read>(reader: &mut R) -> io::Result<Self> {
+        let string_length = VInt::deserialize(reader)?.val() as usize;
+        let mut result = String::with_capacity(string_length);
+        reader
+            .take(string_length as u64)
+            .read_to_string(&mut result)?;
+        Ok(Cow::Owned(result))
+    }
+}
+
 #[cfg(test)]
 pub mod test {
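A standalone sketch of the new `u128` support above: a fixed 16-byte encoding through byteorder. `Endianness` is a type alias inside the crate; little-endian is assumed here purely for illustration:

```rust
use std::io::{self, Read, Write};

use byteorder::{LittleEndian, ReadBytesExt, WriteBytesExt};

// Free-function stand-ins for the BinarySerializable impl in the hunk above.
fn serialize_u128<W: Write>(val: u128, writer: &mut W) -> io::Result<()> {
    writer.write_u128::<LittleEndian>(val)
}

fn deserialize_u128<R: Read>(reader: &mut R) -> io::Result<u128> {
    reader.read_u128::<LittleEndian>()
}

fn main() -> io::Result<()> {
    let mut buf = Vec::new();
    serialize_u128(u128::MAX, &mut buf)?;
    assert_eq!(buf.len(), 16); // matches FixedSize::SIZE_IN_BYTES for u128
    let mut slice = buf.as_slice();
    assert_eq!(deserialize_u128(&mut slice)?, u128::MAX);
    Ok(())
}
```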
@@ -105,7 +105,7 @@ impl SegmentCollector for StatsSegmentCollector {
     type Fruit = Option<Stats>;
 
     fn collect(&mut self, doc: u32, _score: Score) {
-        let value = self.fast_field_reader.get_val(doc as u64) as f64;
+        let value = self.fast_field_reader.get_val(doc) as f64;
         self.stats.count += 1;
         self.stats.sum += value;
         self.stats.squared_sum += value * value;
@@ -51,7 +51,7 @@ impl Warmer for DynamicPriceColumn {
         let product_id_reader = segment.fast_fields().u64(self.field)?;
         let product_ids: Vec<ProductId> = segment
             .doc_ids_alive()
-            .map(|doc| product_id_reader.get_val(doc as u64))
+            .map(|doc| product_id_reader.get_val(doc))
             .collect();
         let mut prices_it = self.price_fetcher.fetch_prices(&product_ids).into_iter();
         let mut price_vals: Vec<Price> = Vec::new();
@@ -65,7 +65,7 @@ mod tests {
         b.iter(|| {
             let mut a = 0u64;
             for _ in 0..n {
-                a = column.get_val(a as u64);
+                a = column.get_val(a as u32);
             }
             a
         });
@@ -100,9 +100,10 @@ mod tests {
     fn get_u128_column_from_data(data: &[u128]) -> Arc<dyn Column<u128>> {
         let mut out = vec![];
-        serialize_u128(VecColumn::from(&data), &mut out).unwrap();
+        let iter_gen = || data.iter().cloned();
+        serialize_u128(iter_gen, data.len() as u32, &mut out).unwrap();
         let out = OwnedBytes::new(out);
-        open_u128(out).unwrap()
+        open_u128::<u128>(out).unwrap()
     }
 
     #[bench]
@@ -110,7 +111,15 @@ mod tests {
         let (major_item, _minor_item, data) = get_data_50percent_item();
         let column = get_u128_column_from_data(&data);
 
-        b.iter(|| column.get_between_vals(major_item..=major_item));
+        b.iter(|| {
+            let mut positions = Vec::new();
+            column.get_positions_for_value_range(
+                major_item..=major_item,
+                0..data.len() as u32,
+                &mut positions,
+            );
+            positions
+        });
     }
 
     #[bench]
@@ -118,7 +127,15 @@ mod tests {
         let (_major_item, minor_item, data) = get_data_50percent_item();
         let column = get_u128_column_from_data(&data);
 
-        b.iter(|| column.get_between_vals(minor_item..=minor_item));
+        b.iter(|| {
+            let mut positions = Vec::new();
+            column.get_positions_for_value_range(
+                minor_item..=minor_item,
+                0..data.len() as u32,
+                &mut positions,
+            );
+            positions
+        });
     }
 
     #[bench]
@@ -126,7 +143,15 @@ mod tests {
        let (_major_item, _minor_item, data) = get_data_50percent_item();
        let column = get_u128_column_from_data(&data);
 
-        b.iter(|| column.get_between_vals(0..=u128::MAX));
+        b.iter(|| {
+            let mut positions = Vec::new();
+            column.get_positions_for_value_range(
+                0..=u128::MAX,
+                0..data.len() as u32,
+                &mut positions,
+            );
+            positions
+        });
    }
 
    #[bench]
@@ -136,7 +161,7 @@ mod tests {
         b.iter(|| {
             let mut a = 0u128;
             for i in 0u64..column.num_vals() as u64 {
-                a += column.get_val(i);
+                a += column.get_val(i as u32);
             }
             a
         });
@@ -150,7 +175,7 @@ mod tests {
             let n = column.num_vals();
             let mut a = 0u128;
             for i in (0..n / 5).map(|val| val * 5) {
-                a += column.get_val(i as u64);
+                a += column.get_val(i);
             }
             a
         });
@@ -175,9 +200,9 @@ mod tests {
         let n = permutation.len();
         let column: Arc<dyn Column<u64>> = serialize_and_load(&permutation);
         b.iter(|| {
-            let mut a = 0u64;
+            let mut a = 0;
             for i in (0..n / 7).map(|val| val * 7) {
-                a += column.get_val(i as u64);
+                a += column.get_val(i as u32);
             }
             a
         });
@@ -190,7 +215,7 @@ mod tests {
         let column: Arc<dyn Column<u64>> = serialize_and_load(&permutation);
         b.iter(|| {
             let mut a = 0u64;
-            for i in 0u64..n as u64 {
+            for i in 0u32..n as u32 {
                 a += column.get_val(i);
             }
             a
@@ -204,8 +229,8 @@ mod tests {
         let column: Arc<dyn Column<u64>> = serialize_and_load(&permutation);
         b.iter(|| {
             let mut a = 0u64;
-            for i in 0..n as u64 {
-                a += column.get_val(i);
+            for i in 0..n {
+                a += column.get_val(i as u32);
             }
             a
         });
@@ -17,7 +17,7 @@ pub struct BitpackedReader {
 
 impl Column for BitpackedReader {
     #[inline]
-    fn get_val(&self, doc: u64) -> u64 {
+    fn get_val(&self, doc: u32) -> u64 {
         self.bit_unpacker.get(doc, &self.data)
     }
     #[inline]
@@ -30,7 +30,7 @@ impl Column for BitpackedReader {
         self.normalized_header.max_value
     }
     #[inline]
-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         self.normalized_header.num_vals
     }
 }
@@ -36,7 +36,7 @@ impl BinarySerializable for Block {
     }
 }
 
-fn compute_num_blocks(num_vals: u64) -> usize {
+fn compute_num_blocks(num_vals: u32) -> usize {
     (num_vals as usize + CHUNK_SIZE - 1) / CHUNK_SIZE
 }
 
@@ -72,13 +72,13 @@ impl FastFieldCodec for BlockwiseLinearCodec {
     // Estimate first_chunk and extrapolate
     fn estimate(column: &dyn crate::Column) -> Option<f32> {
-        if column.num_vals() < 10 * CHUNK_SIZE as u64 {
+        if column.num_vals() < 10 * CHUNK_SIZE as u32 {
             return None;
         }
         let mut first_chunk: Vec<u64> = column.iter().take(CHUNK_SIZE as usize).collect();
         let line = Line::train(&VecColumn::from(&first_chunk));
         for (i, buffer_val) in first_chunk.iter_mut().enumerate() {
-            let interpolated_val = line.eval(i as u64);
+            let interpolated_val = line.eval(i as u32);
             *buffer_val = buffer_val.wrapping_sub(interpolated_val);
         }
         let estimated_bit_width = first_chunk
@@ -95,7 +95,7 @@ impl FastFieldCodec for BlockwiseLinearCodec {
         };
         let num_bits = estimated_bit_width as u64 * column.num_vals() as u64
             // function metadata per block
-            + metadata_per_block as u64 * (column.num_vals() / CHUNK_SIZE as u64);
+            + metadata_per_block as u64 * (column.num_vals() as u64 / CHUNK_SIZE as u64);
         let num_bits_uncompressed = 64 * column.num_vals();
         Some(num_bits as f32 / num_bits_uncompressed as f32)
     }
@@ -121,7 +121,7 @@ impl FastFieldCodec for BlockwiseLinearCodec {
         assert!(!buffer.is_empty());
 
         for (i, buffer_val) in buffer.iter_mut().enumerate() {
-            let interpolated_val = line.eval(i as u64);
+            let interpolated_val = line.eval(i as u32);
             *buffer_val = buffer_val.wrapping_sub(interpolated_val);
         }
         let bit_width = buffer.iter().copied().map(compute_num_bits).max().unwrap();
@@ -161,9 +161,9 @@ pub struct BlockwiseLinearReader {
 
 impl Column for BlockwiseLinearReader {
     #[inline(always)]
-    fn get_val(&self, idx: u64) -> u64 {
-        let block_id = (idx / CHUNK_SIZE as u64) as usize;
-        let idx_within_block = idx % (CHUNK_SIZE as u64);
+    fn get_val(&self, idx: u32) -> u64 {
+        let block_id = (idx / CHUNK_SIZE as u32) as usize;
+        let idx_within_block = idx % (CHUNK_SIZE as u32);
         let block = &self.blocks[block_id];
         let interpoled_val: u64 = block.line.eval(idx_within_block);
         let block_bytes = &self.data[block.data_start_offset..];
@@ -180,7 +180,7 @@ impl Column for BlockwiseLinearReader {
         self.normalized_header.max_value
     }
 
-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         self.normalized_header.num_vals
     }
 }
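A standalone sketch of the blockwise addressing used by `BlockwiseLinearReader::get_val` above: a value's block and its offset inside the block are pure arithmetic on the (now `u32`) index. `CHUNK_SIZE = 512` is an assumed value for illustration:

```rust
const CHUNK_SIZE: u32 = 512; // assumed block size

// Returns (block_id, index_within_block) for a global value index.
fn locate(idx: u32) -> (usize, u32) {
    let block_id = (idx / CHUNK_SIZE) as usize;
    let idx_within_block = idx % CHUNK_SIZE;
    (block_id, idx_within_block)
}

fn main() {
    assert_eq!(locate(0), (0, 0));
    assert_eq!(locate(511), (0, 511));
    assert_eq!(locate(512), (1, 0));
    assert_eq!(locate(1300), (2, 276));
}
```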
@@ -1,8 +1,11 @@
 use std::marker::PhantomData;
-use std::ops::RangeInclusive;
+use std::ops::{Range, RangeInclusive};
 
 use tantivy_bitpacker::minmax;
 
+use crate::monotonic_mapping::StrictlyMonotonicFn;
+
 /// `Column` provides columnar access on a field.
 pub trait Column<T: PartialOrd = u64>: Send + Sync {
     /// Return the value associated with the given idx.
     ///
@@ -11,7 +14,7 @@ pub trait Column<T: PartialOrd = u64>: Send + Sync {
     /// # Panics
     ///
     /// May panic if `idx` is greater than the column length.
-    fn get_val(&self, idx: u64) -> T;
+    fn get_val(&self, idx: u32) -> T;
 
     /// Fills an output buffer with the fast field values
     /// associated with the `DocId` going from
@@ -24,21 +27,28 @@ pub trait Column<T: PartialOrd = u64>: Send + Sync {
     #[inline]
     fn get_range(&self, start: u64, output: &mut [T]) {
         for (out, idx) in output.iter_mut().zip(start..) {
-            *out = self.get_val(idx);
+            *out = self.get_val(idx as u32);
         }
     }
 
-    /// Return the positions of values which are in the provided range.
+    /// Get the positions of values which are in the provided value range.
     ///
     /// Note that position == docid for single value fast fields
     #[inline]
-    fn get_between_vals(&self, range: RangeInclusive<T>) -> Vec<u64> {
-        let mut vals = Vec::new();
-        for idx in 0..self.num_vals() {
+    fn get_positions_for_value_range(
+        &self,
+        value_range: RangeInclusive<T>,
+        doc_id_range: Range<u32>,
+        positions: &mut Vec<u32>,
+    ) {
+        let doc_id_range = doc_id_range.start..doc_id_range.end.min(self.num_vals());
+
+        for idx in doc_id_range.start..doc_id_range.end {
             let val = self.get_val(idx);
-            if range.contains(&val) {
-                vals.push(idx);
+            if value_range.contains(&val) {
+                positions.push(idx);
             }
         }
-        vals
     }
 
     /// Returns the minimum value for this fast field.
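A minimal standalone model (not the crate's types) of the API change in the hunk above: `get_between_vals(range) -> Vec<u64>` becomes a scan over an explicit doc-id window that appends into a caller-owned `u32` buffer, so hot loops can reuse the allocation:

```rust
use std::ops::{Range, RangeInclusive};

// Stand-in for the default trait method; `values` plays the role of Column::get_val.
fn get_positions_for_value_range(
    values: &[u64],
    value_range: RangeInclusive<u64>,
    doc_id_range: Range<u32>,
    positions: &mut Vec<u32>,
) {
    // clamp the window to the column length, as the default impl does
    let end = doc_id_range.end.min(values.len() as u32);
    for idx in doc_id_range.start..end {
        if value_range.contains(&values[idx as usize]) {
            positions.push(idx);
        }
    }
}

fn main() {
    let values = [5u64, 15, 25, 12, 20];
    let mut positions = Vec::new();
    // Only docs 1..4 are scanned; doc 4 (value 20) is outside the window.
    get_positions_for_value_range(&values, 10..=20, 1..4, &mut positions);
    assert_eq!(positions, vec![1, 3]);
    // The caller owns the buffer, so repeated queries reuse the allocation.
    positions.clear();
    get_positions_for_value_range(&values, 10..=20, 0..values.len() as u32, &mut positions);
    assert_eq!(positions, vec![1, 3, 4]);
}
```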
@@ -57,7 +67,8 @@ pub trait Column<T: PartialOrd = u64>: Send + Sync {
     /// `.max_value()`.
     fn max_value(&self) -> T;
 
-    fn num_vals(&self) -> u64;
+    /// The number of values in the column.
+    fn num_vals(&self) -> u32;
 
     /// Returns a iterator over the data
     fn iter<'a>(&'a self) -> Box<dyn Iterator<Item = T> + 'a> {
@@ -65,6 +76,7 @@ pub trait Column<T: PartialOrd = u64>: Send + Sync {
     }
 }
 
+/// VecColumn provides `Column` over a slice.
 pub struct VecColumn<'a, T = u64> {
     values: &'a [T],
     min_value: T,
@@ -72,7 +84,7 @@ pub struct VecColumn<'a, T = u64> {
 }
 
 impl<'a, C: Column<T>, T: Copy + PartialOrd> Column<T> for &'a C {
-    fn get_val(&self, idx: u64) -> T {
+    fn get_val(&self, idx: u32) -> T {
         (*self).get_val(idx)
     }
 
@@ -84,7 +96,7 @@ impl<'a, C: Column<T>, T: Copy + PartialOrd> Column<T> for &'a C {
         (*self).max_value()
     }
 
-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         (*self).num_vals()
     }
 
@@ -98,7 +110,7 @@ impl<'a, C: Column<T>, T: Copy + PartialOrd> Column<T> for &'a C {
 }
 
 impl<'a, T: Copy + PartialOrd + Send + Sync> Column<T> for VecColumn<'a, T> {
-    fn get_val(&self, position: u64) -> T {
+    fn get_val(&self, position: u32) -> T {
         self.values[position as usize]
     }
 
@@ -114,8 +126,8 @@ impl<'a, T: Copy + PartialOrd + Send + Sync> Column<T> for VecColumn<'a, T> {
         self.max_value
     }
 
-    fn num_vals(&self) -> u64 {
-        self.values.len() as u64
+    fn num_vals(&self) -> u32 {
+        self.values.len() as u32
     }
 
     fn get_range(&self, start: u64, output: &mut [T]) {
@@ -143,16 +155,30 @@ struct MonotonicMappingColumn<C, T, Input> {
     _phantom: PhantomData<Input>,
 }
 
-/// Creates a view of a column transformed by a monotonic mapping.
-pub fn monotonic_map_column<C, T, Input: PartialOrd, Output: PartialOrd>(
+/// Creates a view of a column transformed by a strictly monotonic mapping. See
+/// [`StrictlyMonotonicFn`].
+///
+/// E.g. apply a gcd monotonic_mapping([100, 200, 300]) == [1, 2, 3]
+/// monotonic_mapping.mapping() is expected to be injective, and we should always have
+/// monotonic_mapping.inverse(monotonic_mapping.mapping(el)) == el
+///
+/// The inverse of the mapping is required for:
+/// `fn get_positions_for_value_range(&self, range: RangeInclusive<T>) -> Vec<u64> `
+/// The user provides the original value range and we need to monotonic map them in the same way the
+/// serialization does before calling the underlying column.
+///
+/// Note that when opening a codec, the monotonic_mapping should be the inverse of the mapping
+/// during serialization. And therefore the monotonic_mapping_inv when opening is the same as
+/// monotonic_mapping during serialization.
+pub fn monotonic_map_column<C, T, Input, Output>(
     from_column: C,
     monotonic_mapping: T,
 ) -> impl Column<Output>
 where
     C: Column<Input>,
-    T: Fn(Input) -> Output + Send + Sync,
-    Input: Send + Sync,
-    Output: Send + Sync,
+    T: StrictlyMonotonicFn<Input, Output> + Send + Sync,
+    Input: PartialOrd + Send + Sync + Clone,
+    Output: PartialOrd + Send + Sync + Clone,
 {
     MonotonicMappingColumn {
         from_column,
@@ -161,36 +187,53 @@ where
     }
 }
 
-impl<C, T, Input: PartialOrd, Output: PartialOrd> Column<Output>
-    for MonotonicMappingColumn<C, T, Input>
+impl<C, T, Input, Output> Column<Output> for MonotonicMappingColumn<C, T, Input>
 where
     C: Column<Input>,
-    T: Fn(Input) -> Output + Send + Sync,
-    Input: Send + Sync,
-    Output: Send + Sync,
+    T: StrictlyMonotonicFn<Input, Output> + Send + Sync,
+    Input: PartialOrd + Send + Sync + Clone,
+    Output: PartialOrd + Send + Sync + Clone,
 {
     #[inline]
-    fn get_val(&self, idx: u64) -> Output {
+    fn get_val(&self, idx: u32) -> Output {
         let from_val = self.from_column.get_val(idx);
-        (self.monotonic_mapping)(from_val)
+        self.monotonic_mapping.mapping(from_val)
     }
 
     fn min_value(&self) -> Output {
         let from_min_value = self.from_column.min_value();
-        (self.monotonic_mapping)(from_min_value)
+        self.monotonic_mapping.mapping(from_min_value)
     }
 
     fn max_value(&self) -> Output {
         let from_max_value = self.from_column.max_value();
-        (self.monotonic_mapping)(from_max_value)
+        self.monotonic_mapping.mapping(from_max_value)
     }
 
-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         self.from_column.num_vals()
     }
 
     fn iter(&self) -> Box<dyn Iterator<Item = Output> + '_> {
-        Box::new(self.from_column.iter().map(&self.monotonic_mapping))
+        Box::new(
+            self.from_column
+                .iter()
+                .map(|el| self.monotonic_mapping.mapping(el)),
+        )
     }
 
+    fn get_positions_for_value_range(
+        &self,
+        range: RangeInclusive<Output>,
+        doc_id_range: Range<u32>,
+        positions: &mut Vec<u32>,
+    ) {
+        self.from_column.get_positions_for_value_range(
+            self.monotonic_mapping.inverse(range.start().clone())
+                ..=self.monotonic_mapping.inverse(range.end().clone()),
+            doc_id_range,
+            positions,
+        )
+    }
 
     // We voluntarily do not implement get_range as it yields a regression,
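A standalone sketch of the `StrictlyMonotonicFn` contract relied on above (the real trait lives in `crate::monotonic_mapping`; `GcdMapping` here is a made-up implementor). `mapping` goes from the stored (internal) space to the user-visible space, and `inverse` translates user ranges back before the underlying column is scanned:

```rust
// Local mirror of the trait's mapping/inverse pair; not the crate's definition.
trait StrictlyMonotonicFn<Internal, External> {
    fn mapping(&self, inp: Internal) -> External;
    fn inverse(&self, out: External) -> Internal;
}

// Hypothetical GCD mapping: the column stores values divided by a common divisor.
struct GcdMapping {
    gcd: u64,
}

impl StrictlyMonotonicFn<u64, u64> for GcdMapping {
    // internal (stored) -> external (user-visible)
    fn mapping(&self, inp: u64) -> u64 {
        inp * self.gcd
    }
    // external -> internal, used to translate user ranges before scanning
    fn inverse(&self, out: u64) -> u64 {
        out / self.gcd
    }
}

fn main() {
    let m = GcdMapping { gcd: 100 };
    // The documented contract: inverse(mapping(el)) == el for every stored value.
    for el in [1u64, 2, 3] {
        assert_eq!(m.inverse(m.mapping(el)), el);
    }
    // A user-space value range is mapped back into the compressed space the
    // same way serialization did, then the underlying column is scanned.
    let user_range = 100u64..=200;
    let internal_range = m.inverse(*user_range.start())..=m.inverse(*user_range.end());
    assert_eq!(internal_range, 1..=2);
}
```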
@@ -212,7 +255,7 @@ where
     T: Iterator + Clone + ExactSizeIterator + Send + Sync,
     T::Item: PartialOrd,
 {
-    fn get_val(&self, idx: u64) -> T::Item {
+    fn get_val(&self, idx: u32) -> T::Item {
         self.0.clone().nth(idx as usize).unwrap()
     }
 
@@ -224,8 +267,8 @@ where
         self.0.clone().last().unwrap()
     }
 
-    fn num_vals(&self) -> u64 {
-        self.0.len() as u64
+    fn num_vals(&self) -> u32 {
+        self.0.len() as u32
     }
 
     fn iter(&self) -> Box<dyn Iterator<Item = T::Item> + '_> {
@@ -236,19 +279,22 @@ where
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::MonotonicallyMappableToU64;
+    use crate::monotonic_mapping::{
+        StrictlyMonotonicMappingInverter, StrictlyMonotonicMappingToInternalBaseval,
+        StrictlyMonotonicMappingToInternalGCDBaseval,
+    };
 
     #[test]
     fn test_monotonic_mapping() {
-        let vals = &[1u64, 3u64][..];
+        let vals = &[3u64, 5u64][..];
         let col = VecColumn::from(vals);
-        let mapped = monotonic_map_column(col, |el| el + 4);
-        assert_eq!(mapped.min_value(), 5u64);
-        assert_eq!(mapped.max_value(), 7u64);
+        let mapped = monotonic_map_column(col, StrictlyMonotonicMappingToInternalBaseval::new(2));
+        assert_eq!(mapped.min_value(), 1u64);
+        assert_eq!(mapped.max_value(), 3u64);
         assert_eq!(mapped.num_vals(), 2);
-        assert_eq!(mapped.get_val(0), 5);
-        assert_eq!(mapped.get_val(1), 7);
+        assert_eq!(mapped.get_val(0), 1);
+        assert_eq!(mapped.get_val(1), 3);
     }
 
     #[test]
@@ -260,10 +306,15 @@ mod tests {
     #[test]
     fn test_monotonic_mapping_iter() {
-        let vals: Vec<u64> = (-1..99).map(i64::to_u64).collect();
+        let vals: Vec<u64> = (10..110u64).map(|el| el * 10).collect();
         let col = VecColumn::from(&vals);
-        let mapped = monotonic_map_column(col, |el| i64::from_u64(el) * 10i64);
-        let val_i64s: Vec<i64> = mapped.iter().collect();
+        let mapped = monotonic_map_column(
+            col,
+            StrictlyMonotonicMappingInverter::from(
+                StrictlyMonotonicMappingToInternalGCDBaseval::new(10, 100),
+            ),
+        );
+        let val_i64s: Vec<u64> = mapped.iter().collect();
         for i in 0..100 {
             assert_eq!(val_i64s[i as usize], mapped.get_val(i));
         }
@@ -271,20 +322,26 @@ mod tests {
     #[test]
     fn test_monotonic_mapping_get_range() {
-        let vals: Vec<u64> = (-1..99).map(i64::to_u64).collect();
+        let vals: Vec<u64> = (0..100u64).map(|el| el * 10).collect();
         let col = VecColumn::from(&vals);
-        let mapped = monotonic_map_column(col, |el| i64::from_u64(el) * 10i64);
-        assert_eq!(mapped.min_value(), -10i64);
-        assert_eq!(mapped.max_value(), 980i64);
+        let mapped = monotonic_map_column(
+            col,
+            StrictlyMonotonicMappingInverter::from(
+                StrictlyMonotonicMappingToInternalGCDBaseval::new(10, 0),
+            ),
+        );
+
+        assert_eq!(mapped.min_value(), 0u64);
+        assert_eq!(mapped.max_value(), 9900u64);
         assert_eq!(mapped.num_vals(), 100);
-        let val_i64s: Vec<i64> = mapped.iter().collect();
-        assert_eq!(val_i64s.len(), 100);
+        let val_u64s: Vec<u64> = mapped.iter().collect();
+        assert_eq!(val_u64s.len(), 100);
         for i in 0..100 {
-            assert_eq!(val_i64s[i as usize], mapped.get_val(i));
-            assert_eq!(val_i64s[i as usize], i64::from_u64(vals[i as usize]) * 10);
+            assert_eq!(val_u64s[i as usize], mapped.get_val(i));
+            assert_eq!(val_u64s[i as usize], vals[i as usize] * 10);
         }
-        let mut buf = [0i64; 20];
+        let mut buf = [0u64; 20];
         mapped.get_range(7, &mut buf[..]);
-        assert_eq!(&val_i64s[7..][..20], &buf);
+        assert_eq!(&val_u64s[7..][..20], &buf);
     }
 }
@@ -57,7 +57,7 @@ fn num_bits(val: u128) -> u8 {
 /// metadata.
 pub fn get_compact_space(
     values_deduped_sorted: &BTreeSet<u128>,
-    total_num_values: u64,
+    total_num_values: u32,
     cost_per_blank: usize,
 ) -> CompactSpace {
     let mut compact_space_builder = CompactSpaceBuilder::new();
@@ -14,7 +14,7 @@ use std::{
     cmp::Ordering,
     collections::BTreeSet,
     io::{self, Write},
-    ops::RangeInclusive,
+    ops::{Range, RangeInclusive},
 };
 
 use common::{BinarySerializable, CountingWriter, VInt, VIntU128};
@@ -165,16 +165,16 @@ pub struct IPCodecParams {
     bit_unpacker: BitUnpacker,
     min_value: u128,
     max_value: u128,
-    num_vals: u64,
+    num_vals: u32,
     num_bits: u8,
 }
 
 impl CompactSpaceCompressor {
     /// Taking the vals as Vec may cost a lot of memory. It is used to sort the vals.
-    pub fn train_from(column: &impl Column<u128>) -> Self {
+    pub fn train_from(iter: impl Iterator<Item = u128>, num_vals: u32) -> Self {
         let mut values_sorted = BTreeSet::new();
-        values_sorted.extend(column.iter());
-        let total_num_values = column.num_vals();
+        values_sorted.extend(iter);
+        let total_num_values = num_vals;
 
         let compact_space =
             get_compact_space(&values_sorted, total_num_values, COST_PER_BLANK_IN_BITS);
@@ -200,7 +200,7 @@ impl CompactSpaceCompressor {
                 bit_unpacker: BitUnpacker::new(num_bits),
                 min_value,
                 max_value,
-                num_vals: total_num_values as u64,
+                num_vals: total_num_values,
                 num_bits,
             },
         }
@@ -267,7 +267,7 @@ impl BinarySerializable for IPCodecParams {
         let _header_flags = u64::deserialize(reader)?;
         let min_value = VIntU128::deserialize(reader)?.0;
         let max_value = VIntU128::deserialize(reader)?.0;
-        let num_vals = VIntU128::deserialize(reader)?.0 as u64;
+        let num_vals = VIntU128::deserialize(reader)?.0 as u32;
         let num_bits = u8::deserialize(reader)?;
         let compact_space = CompactSpace::deserialize(reader)?;
 
@@ -284,7 +284,7 @@ impl BinarySerializable for IPCodecParams {
 
 impl Column<u128> for CompactSpaceDecompressor {
     #[inline]
-    fn get_val(&self, doc: u64) -> u128 {
+    fn get_val(&self, doc: u32) -> u128 {
         self.get(doc)
     }
 
@@ -296,7 +296,7 @@ impl Column<u128> for CompactSpaceDecompressor {
         self.max_value()
     }
 
-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         self.params.num_vals
     }
 
@@ -304,8 +304,15 @@ impl Column<u128> for CompactSpaceDecompressor {
     fn iter(&self) -> Box<dyn Iterator<Item = u128> + '_> {
         Box::new(self.iter())
     }
-    fn get_between_vals(&self, range: RangeInclusive<u128>) -> Vec<u64> {
-        self.get_between_vals(range)
+
+    #[inline]
+    fn get_positions_for_value_range(
+        &self,
+        value_range: RangeInclusive<u128>,
+        doc_id_range: Range<u32>,
+        positions: &mut Vec<u32>,
+    ) {
+        self.get_positions_for_value_range(value_range, doc_id_range, positions)
     }
 }
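The range-scan rewrite in the next hunk keeps a 4-way manually unrolled loop: four independent loads per iteration so the CPU can overlap them, then a scalar tail from `cutoff`. A minimal standalone sketch of that pattern (not the crate's code):

```rust
// Collect the indices of values inside [lo, hi], unrolled by 4.
fn scan_range(values: &[u64], lo: u64, hi: u64, positions: &mut Vec<u32>) {
    let step_size = 4u32;
    let n = values.len() as u32;
    let cutoff = n - n % step_size;
    let mut push_if_in_range = |idx: u32, val: u64| {
        if (lo..=hi).contains(&val) {
            positions.push(idx);
        }
    };
    // unrolled body: the four loads do not depend on one another
    for idx in (0..cutoff).step_by(step_size as usize) {
        let (v1, v2, v3, v4) = (
            values[idx as usize],
            values[idx as usize + 1],
            values[idx as usize + 2],
            values[idx as usize + 3],
        );
        push_if_in_range(idx, v1);
        push_if_in_range(idx + 1, v2);
        push_if_in_range(idx + 2, v3);
        push_if_in_range(idx + 3, v4);
    }
    // handle rest
    for idx in cutoff..n {
        push_if_in_range(idx, values[idx as usize]);
    }
}

fn main() {
    let values = [3u64, 10, 7, 22, 9, 15];
    let mut positions = Vec::new();
    scan_range(&values, 7, 15, &mut positions);
    assert_eq!(positions, vec![1, 2, 4, 5]);
}
```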
@@ -340,12 +347,19 @@ impl CompactSpaceDecompressor {
     /// Comparing on compact space: Real dataset 1.08 GElements/s
     ///
     /// Comparing on original space: Real dataset .06 GElements/s (not completely optimized)
-    pub fn get_between_vals(&self, range: RangeInclusive<u128>) -> Vec<u64> {
-        if range.start() > range.end() {
-            return Vec::new();
+    #[inline]
+    pub fn get_positions_for_value_range(
+        &self,
+        value_range: RangeInclusive<u128>,
+        doc_id_range: Range<u32>,
+        positions: &mut Vec<u32>,
+    ) {
+        if value_range.start() > value_range.end() {
+            return;
         }
-        let from_value = *range.start();
-        let to_value = *range.end();
+        let doc_id_range = doc_id_range.start..doc_id_range.end.min(self.num_vals());
+        let from_value = *value_range.start();
+        let to_value = *value_range.end();
         assert!(to_value >= from_value);
         let compact_from = self.u128_to_compact(from_value);
         let compact_to = self.u128_to_compact(to_value);
@@ -353,7 +367,7 @@ impl CompactSpaceDecompressor {
         // Quick return, if both ranges fall into the same non-mapped space, the range can't cover
        // any values, so we can early exit
         match (compact_to, compact_from) {
-            (Err(pos1), Err(pos2)) if pos1 == pos2 => return Vec::new(),
+            (Err(pos1), Err(pos2)) if pos1 == pos2 => return,
             _ => {}
         }

@@ -375,27 +389,28 @@ impl CompactSpaceDecompressor {
         });

         let range = compact_from..=compact_to;
-        let mut positions = Vec::new();
+
+        let scan_num_docs = doc_id_range.end - doc_id_range.start;

         let step_size = 4;
-        let cutoff = self.params.num_vals - self.params.num_vals % step_size;
+        let cutoff = doc_id_range.start + scan_num_docs - scan_num_docs % step_size;

         let mut push_if_in_range = |idx, val| {
             if range.contains(&val) {
                 positions.push(idx);
             }
         };
-        let get_val = |idx| self.params.bit_unpacker.get(idx as u64, &self.data);
+        let get_val = |idx| self.params.bit_unpacker.get(idx, &self.data);
         // unrolled loop
-        for idx in (0..cutoff).step_by(step_size as usize) {
+        for idx in (doc_id_range.start..cutoff).step_by(step_size as usize) {
             let idx1 = idx;
             let idx2 = idx + 1;
             let idx3 = idx + 2;
             let idx4 = idx + 3;
-            let val1 = get_val(idx1);
-            let val2 = get_val(idx2);
-            let val3 = get_val(idx3);
-            let val4 = get_val(idx4);
+            let val1 = get_val(idx1 as u32);
+            let val2 = get_val(idx2 as u32);
+            let val3 = get_val(idx3 as u32);
+            let val4 = get_val(idx4 as u32);
             push_if_in_range(idx1, val1);
             push_if_in_range(idx2, val2);
             push_if_in_range(idx3, val3);
@@ -403,17 +418,15 @@ impl CompactSpaceDecompressor {
         }

         // handle rest
-        for idx in cutoff..self.params.num_vals {
-            push_if_in_range(idx, get_val(idx));
+        for idx in cutoff..doc_id_range.end {
+            push_if_in_range(idx, get_val(idx as u32));
         }
-
-        positions
     }
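Note the structure retained here: a four-way manually unrolled main loop followed by a scalar tail, with `doc_id_range.end` clamped to `num_vals()` up front so out-of-range doc ids are tolerated. The same pattern on a plain slice, as a self-contained illustration (names invented):

```rust
// Illustration of the 4-way unrolled predicate scan, on a slice instead of a
// bit-packed column. The unrolling gives the compiler four independent loads
// to schedule (and potentially vectorize) per iteration.
fn positions_in_range(vals: &[u64], lo: u64, hi: u64) -> Vec<u32> {
    let mut positions = Vec::new();
    let mut push_if_in_range = |idx: usize, val: u64| {
        if (lo..=hi).contains(&val) {
            positions.push(idx as u32);
        }
    };
    let cutoff = vals.len() - vals.len() % 4;
    for idx in (0..cutoff).step_by(4) {
        push_if_in_range(idx, vals[idx]);
        push_if_in_range(idx + 1, vals[idx + 1]);
        push_if_in_range(idx + 2, vals[idx + 2]);
        push_if_in_range(idx + 3, vals[idx + 3]);
    }
    // handle the rest
    for idx in cutoff..vals.len() {
        push_if_in_range(idx, vals[idx]);
    }
    positions
}
```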
     #[inline]
     fn iter_compact(&self) -> impl Iterator<Item = u64> + '_ {
         (0..self.params.num_vals)
-            .map(move |idx| self.params.bit_unpacker.get(idx as u64, &self.data) as u64)
+            .map(move |idx| self.params.bit_unpacker.get(idx, &self.data) as u64)
     }

     #[inline]
@@ -425,7 +438,7 @@ impl CompactSpaceDecompressor {
     }

     #[inline]
-    pub fn get(&self, idx: u64) -> u128 {
+    pub fn get(&self, idx: u32) -> u128 {
         let compact = self.params.bit_unpacker.get(idx, &self.data);
         self.compact_to_u128(compact)
     }
@@ -443,7 +456,7 @@ mod tests {

     use super::*;
-    use crate::{open_u128, serialize_u128, VecColumn};
+    use crate::{open_u128, serialize_u128};

     #[test]
     fn compact_space_test() {
@@ -452,7 +465,7 @@ mod tests {
         ]
         .into_iter()
         .collect();
-        let compact_space = get_compact_space(ips, ips.len() as u64, 11);
+        let compact_space = get_compact_space(ips, ips.len() as u32, 11);
         let amplitude = compact_space.amplitude_compact_space();
         assert_eq!(amplitude, 17);
         assert_eq!(1, compact_space.u128_to_compact(2).unwrap());
@@ -483,7 +496,7 @@ mod tests {
     #[test]
     fn compact_space_amplitude_test() {
         let ips = &[100000u128, 1000000].into_iter().collect();
-        let compact_space = get_compact_space(ips, ips.len() as u64, 1);
+        let compact_space = get_compact_space(ips, ips.len() as u32, 1);
         let amplitude = compact_space.amplitude_compact_space();
         assert_eq!(amplitude, 2);
     }
@@ -491,16 +504,21 @@ mod tests {
     fn test_all(data: OwnedBytes, expected: &[u128]) {
         let decompressor = CompactSpaceDecompressor::open(data).unwrap();
         for (idx, expected_val) in expected.iter().cloned().enumerate() {
-            let val = decompressor.get(idx as u64);
+            let val = decompressor.get(idx as u32);
             assert_eq!(val, expected_val);

             let test_range = |range: RangeInclusive<u128>| {
                 let expected_positions = expected
                     .iter()
                     .positions(|val| range.contains(val))
-                    .map(|pos| pos as u64)
+                    .map(|pos| pos as u32)
                     .collect::<Vec<_>>();
-                let positions = decompressor.get_between_vals(range);
+                let mut positions = Vec::new();
+                decompressor.get_positions_for_value_range(
+                    range,
+                    0..decompressor.num_vals(),
+                    &mut positions,
+                );
                 assert_eq!(positions, expected_positions);
             };

@@ -513,7 +531,12 @@ mod tests {

     fn test_aux_vals(u128_vals: &[u128]) -> OwnedBytes {
         let mut out = Vec::new();
-        serialize_u128(VecColumn::from(u128_vals), &mut out).unwrap();
+        serialize_u128(
+            || u128_vals.iter().cloned(),
+            u128_vals.len() as u32,
+            &mut out,
+        )
+        .unwrap();

         let data = OwnedBytes::new(out);
         test_all(data.clone(), u128_vals);
@@ -535,24 +558,107 @@ mod tests {
         ];
         let data = test_aux_vals(vals);
         let decomp = CompactSpaceDecompressor::open(data).unwrap();
-        let positions = decomp.get_between_vals(0..=1);
+        let complete_range = 0..vals.len() as u32;
+        for (pos, val) in vals.iter().enumerate() {
+            let val = *val as u128;
+            let pos = pos as u32;
+            let mut positions = Vec::new();
+            decomp.get_positions_for_value_range(val..=val, pos..pos + 1, &mut positions);
+            assert_eq!(positions, vec![pos]);
+        }
+
+        // handle docid range out of bounds
+        let positions = get_positions_for_value_range_helper(&decomp, 0..=1, 1..u32::MAX);
+        assert_eq!(positions, vec![]);
+
+        let positions =
+            get_positions_for_value_range_helper(&decomp, 0..=1, complete_range.clone());
         assert_eq!(positions, vec![0]);
-        let positions = decomp.get_between_vals(0..=2);
+        let positions =
+            get_positions_for_value_range_helper(&decomp, 0..=2, complete_range.clone());
         assert_eq!(positions, vec![0]);
-        let positions = decomp.get_between_vals(0..=3);
+        let positions =
+            get_positions_for_value_range_helper(&decomp, 0..=3, complete_range.clone());
         assert_eq!(positions, vec![0, 2]);
-        assert_eq!(decomp.get_between_vals(99999u128..=99999u128), vec![3]);
-        assert_eq!(decomp.get_between_vals(99999u128..=100000u128), vec![3, 4]);
-        assert_eq!(decomp.get_between_vals(99998u128..=100000u128), vec![3, 4]);
-        assert_eq!(decomp.get_between_vals(99998u128..=99999u128), vec![3]);
-        assert_eq!(decomp.get_between_vals(99998u128..=99998u128), vec![]);
-        assert_eq!(decomp.get_between_vals(333u128..=333u128), vec![8]);
-        assert_eq!(decomp.get_between_vals(332u128..=333u128), vec![8]);
-        assert_eq!(decomp.get_between_vals(332u128..=334u128), vec![8]);
-        assert_eq!(decomp.get_between_vals(333u128..=334u128), vec![8]);
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                99999u128..=99999u128,
+                complete_range.clone()
+            ),
+            vec![3]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                99999u128..=100000u128,
+                complete_range.clone()
+            ),
+            vec![3, 4]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                99998u128..=100000u128,
+                complete_range.clone()
+            ),
+            vec![3, 4]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                99998u128..=99999u128,
+                complete_range.clone()
+            ),
+            vec![3]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                99998u128..=99998u128,
+                complete_range.clone()
+            ),
+            vec![]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                333u128..=333u128,
+                complete_range.clone()
+            ),
+            vec![8]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                332u128..=333u128,
+                complete_range.clone()
+            ),
+            vec![8]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                332u128..=334u128,
+                complete_range.clone()
+            ),
+            vec![8]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &decomp,
+                333u128..=334u128,
+                complete_range.clone()
+            ),
+            vec![8]
+        );

         assert_eq!(
-            decomp.get_between_vals(4_000_211_221u128..=5_000_000_000u128),
+            get_positions_for_value_range_helper(
+                &decomp,
+                4_000_211_221u128..=5_000_000_000u128,
+                complete_range.clone()
+            ),
             vec![6, 7]
         );
     }
@@ -577,12 +683,29 @@ mod tests {
         ];
         let data = test_aux_vals(vals);
         let decomp = CompactSpaceDecompressor::open(data).unwrap();
-        let positions = decomp.get_between_vals(0..=5);
-        assert_eq!(positions, vec![]);
-        let positions = decomp.get_between_vals(0..=100);
-        assert_eq!(positions, vec![0]);
-        let positions = decomp.get_between_vals(0..=105);
-        assert_eq!(positions, vec![0]);
+        let complete_range = 0..vals.len() as u32;
+        assert_eq!(
+            get_positions_for_value_range_helper(&decomp, 0..=5, complete_range.clone()),
+            vec![]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(&decomp, 0..=100, complete_range.clone()),
+            vec![0]
+        );
+        assert_eq!(
+            get_positions_for_value_range_helper(&decomp, 0..=105, complete_range.clone()),
+            vec![0]
+        );
     }

+    fn get_positions_for_value_range_helper<C: Column<T> + ?Sized, T: PartialOrd>(
+        column: &C,
+        value_range: RangeInclusive<T>,
+        doc_id_range: Range<u32>,
+    ) -> Vec<u32> {
+        let mut positions = Vec::new();
+        column.get_positions_for_value_range(value_range, doc_id_range, &mut positions);
+        positions
+    }
+
     #[test]
@@ -603,13 +726,33 @@ mod tests {
             5_000_000_000,
         ];
         let mut out = Vec::new();
-        serialize_u128(VecColumn::from(vals), &mut out).unwrap();
-        let decomp = open_u128(OwnedBytes::new(out)).unwrap();
+        serialize_u128(|| vals.iter().cloned(), vals.len() as u32, &mut out).unwrap();
+        let decomp = open_u128::<u128>(OwnedBytes::new(out)).unwrap();
+        let complete_range = 0..vals.len() as u32;

-        assert_eq!(decomp.get_between_vals(199..=200), vec![0]);
-        assert_eq!(decomp.get_between_vals(199..=201), vec![0, 1]);
-        assert_eq!(decomp.get_between_vals(200..=200), vec![0]);
-        assert_eq!(decomp.get_between_vals(1_000_000..=1_000_000), vec![11]);
+        assert_eq!(
+            get_positions_for_value_range_helper(&*decomp, 199..=200, complete_range.clone()),
+            vec![0]
+        );

+        assert_eq!(
+            get_positions_for_value_range_helper(&*decomp, 199..=201, complete_range.clone()),
+            vec![0, 1]
+        );

+        assert_eq!(
+            get_positions_for_value_range_helper(&*decomp, 200..=200, complete_range.clone()),
+            vec![0]
+        );

+        assert_eq!(
+            get_positions_for_value_range_helper(
+                &*decomp,
+                1_000_000..=1_000_000,
+                complete_range.clone()
+            ),
+            vec![11]
+        );
     }

     #[test]

@@ -1,5 +1,12 @@
 #![warn(missing_docs)]
 #![cfg_attr(all(feature = "unstable", test), feature(test))]

+//! # `fastfield_codecs`
+//!
+//! - Columnar storage of data for tantivy [`Column`].
+//! - Encode data in different codecs.
+//! - Monotonically map values to u64/u128
+
 #[cfg(test)]
 #[macro_use]
 extern crate more_asserts;
@@ -13,6 +20,10 @@ use std::sync::Arc;

 use common::BinarySerializable;
 use compact_space::CompactSpaceDecompressor;
+use monotonic_mapping::{
+    StrictlyMonotonicMappingInverter, StrictlyMonotonicMappingToInternal,
+    StrictlyMonotonicMappingToInternalBaseval, StrictlyMonotonicMappingToInternalGCDBaseval,
+};
 use ownedbytes::OwnedBytes;
 use serialize::Header;

@@ -22,6 +33,7 @@ mod compact_space;
 mod line;
 mod linear;
 mod monotonic_mapping;
+mod monotonic_mapping_u128;

 mod column;
 mod gcd;
@@ -31,16 +43,24 @@ use self::bitpacked::BitpackedCodec;
 use self::blockwise_linear::BlockwiseLinearCodec;
 pub use self::column::{monotonic_map_column, Column, VecColumn};
 use self::linear::LinearCodec;
-pub use self::monotonic_mapping::MonotonicallyMappableToU64;
+pub use self::monotonic_mapping::{MonotonicallyMappableToU64, StrictlyMonotonicFn};
+pub use self::monotonic_mapping_u128::MonotonicallyMappableToU128;
 pub use self::serialize::{
     estimate, serialize, serialize_and_load, serialize_u128, NormalizedHeader,
 };

 #[derive(PartialEq, Eq, PartialOrd, Ord, Debug, Clone, Copy)]
 #[repr(u8)]
 /// Available codecs to use to encode the u64 (via [`MonotonicallyMappableToU64`]) converted data.
 pub enum FastFieldCodecType {
+    /// Bitpack all values in the value range. The number of bits is defined by the amplitude
+    /// `column.max_value() - column.min_value()`
     Bitpacked = 1,
+    /// Linear interpolation puts a line between the first and last value and then bitpacks the
+    /// values by the offset from the line. The number of bits is defined by the max deviation from
+    /// the line.
     Linear = 2,
+    /// Same as [`FastFieldCodecType::Linear`], but encodes in blocks of 512 elements.
     BlockwiseLinear = 3,
 }
@@ -58,11 +78,11 @@ impl BinarySerializable for FastFieldCodecType {
 }

 impl FastFieldCodecType {
-    pub fn to_code(self) -> u8 {
+    pub(crate) fn to_code(self) -> u8 {
         self as u8
     }

-    pub fn from_code(code: u8) -> Option<Self> {
+    pub(crate) fn from_code(code: u8) -> Option<Self> {
         match code {
             1 => Some(Self::Bitpacked),
             2 => Some(Self::Linear),
@@ -73,8 +93,13 @@ impl FastFieldCodecType {
 }

 /// Returns the correct codec reader wrapped in the `Arc` for the data.
-pub fn open_u128(bytes: OwnedBytes) -> io::Result<Arc<dyn Column<u128>>> {
-    Ok(Arc::new(CompactSpaceDecompressor::open(bytes)?))
+pub fn open_u128<Item: MonotonicallyMappableToU128>(
+    bytes: OwnedBytes,
+) -> io::Result<Arc<dyn Column<Item>>> {
+    let reader = CompactSpaceDecompressor::open(bytes)?;
+    let inverted: StrictlyMonotonicMappingInverter<StrictlyMonotonicMappingToInternal<Item>> =
+        StrictlyMonotonicMappingToInternal::<Item>::new().into();
+    Ok(Arc::new(monotonic_map_column(reader, inverted)))
 }
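`open_u128` is now generic over the external value type: the decompressor still works on `u128` internally, and the inverted strictly monotonic mapping converts values back to the caller's domain on read. A hedged usage sketch (the `first_ip` helper is invented; `bytes` is assumed to contain a column written by `serialize_u128`):

```rust
// Illustrative only: open a serialized u128 column and read values back as
// IPv6 addresses, relying on the `MonotonicallyMappableToU128` impl for
// `Ipv6Addr` introduced elsewhere in this change set.
use std::net::Ipv6Addr;
use std::sync::Arc;

use fastfield_codecs::{open_u128, Column};
use ownedbytes::OwnedBytes;

fn first_ip(bytes: OwnedBytes) -> std::io::Result<Ipv6Addr> {
    let column: Arc<dyn Column<Ipv6Addr>> = open_u128::<Ipv6Addr>(bytes)?;
    // Assumes a non-empty column; `get_val` takes a u32 doc id after this patch.
    Ok(column.get_val(0))
}
```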
 /// Returns the correct codec reader wrapped in the `Arc` for the data.
@@ -99,11 +124,15 @@ fn open_specific_codec<C: FastFieldCodec, Item: MonotonicallyMappableToU64>(
     let reader = C::open_from_bytes(bytes, normalized_header)?;
     let min_value = header.min_value;
     if let Some(gcd) = header.gcd {
-        let monotonic_mapping = move |val: u64| Item::from_u64(min_value + val * gcd.get());
-        Ok(Arc::new(monotonic_map_column(reader, monotonic_mapping)))
+        let mapping = StrictlyMonotonicMappingInverter::from(
+            StrictlyMonotonicMappingToInternalGCDBaseval::new(gcd.get(), min_value),
+        );
+        Ok(Arc::new(monotonic_map_column(reader, mapping)))
     } else {
-        let monotonic_mapping = move |val: u64| Item::from_u64(min_value + val);
-        Ok(Arc::new(monotonic_map_column(reader, monotonic_mapping)))
+        let mapping = StrictlyMonotonicMappingInverter::from(
+            StrictlyMonotonicMappingToInternalBaseval::new(min_value),
+        );
+        Ok(Arc::new(monotonic_map_column(reader, mapping)))
     }
 }

@@ -135,6 +164,7 @@ trait FastFieldCodec: 'static {
     fn estimate(column: &dyn Column) -> Option<f32>;
 }

+/// The list of all available codecs for u64 convertible data.
 pub const ALL_CODEC_TYPES: [FastFieldCodecType; 3] = [
     FastFieldCodecType::Bitpacked,
     FastFieldCodecType::BlockwiseLinear,
@@ -143,6 +173,7 @@ pub const ALL_CODEC_TYPES: [FastFieldCodecType; 3] = [

 #[cfg(test)]
 mod tests {

     use proptest::prelude::*;
     use proptest::strategy::Strategy;
     use proptest::{prop_oneof, proptest};
@@ -168,15 +199,32 @@ mod tests {
         let actual_compression = out.len() as f32 / (data.len() as f32 * 8.0);

         let reader = crate::open::<u64>(OwnedBytes::new(out)).unwrap();
-        assert_eq!(reader.num_vals(), data.len() as u64);
+        assert_eq!(reader.num_vals(), data.len() as u32);
         for (doc, orig_val) in data.iter().copied().enumerate() {
-            let val = reader.get_val(doc as u64);
+            let val = reader.get_val(doc as u32);
             assert_eq!(
                 val, orig_val,
                 "val `{val}` does not match orig_val {orig_val:?}, in data set {name}, data \
                 `{data:?}`",
             );
         }
+
+        if !data.is_empty() {
+            let test_rand_idx = rand::thread_rng().gen_range(0..=data.len() - 1);
+            let expected_positions: Vec<u32> = data
+                .iter()
+                .enumerate()
+                .filter(|(_, el)| **el == data[test_rand_idx])
+                .map(|(pos, _)| pos as u32)
+                .collect();
+            let mut positions = Vec::new();
+            reader.get_positions_for_value_range(
+                data[test_rand_idx]..=data[test_rand_idx],
+                0..data.len() as u32,
+                &mut positions,
+            );
+            assert_eq!(expected_positions, positions);
+        }
         Some((estimation, actual_compression))
     }

@@ -386,7 +434,7 @@ mod bench {
         b.iter(|| {
             let mut sum = 0u64;
             for pos in value_iter() {
-                let val = col.get_val(pos as u64);
+                let val = col.get_val(pos as u32);
                 sum = sum.wrapping_add(val);
             }
             sum
@@ -398,7 +446,7 @@ mod bench {
         b.iter(|| {
             let mut sum = 0u64;
             for pos in value_iter() {
-                let val = col.get_val(pos as u64);
+                let val = col.get_val(pos as u32);
                 sum = sum.wrapping_add(val);
             }
             sum

@@ -1,5 +1,5 @@
 use std::io;
-use std::num::NonZeroU64;
+use std::num::NonZeroU32;

 use common::{BinarySerializable, VInt};

@@ -29,7 +29,7 @@ pub struct Line {
 /// compute_slope(y0, y1)
 /// = compute_slope(y0 + X % 2^64, y1 + X % 2^64)
 /// `
-fn compute_slope(y0: u64, y1: u64, num_vals: NonZeroU64) -> u64 {
+fn compute_slope(y0: u64, y1: u64, num_vals: NonZeroU32) -> u64 {
     let dy = y1.wrapping_sub(y0);
     let sign = dy <= (1 << 63);
     let abs_dy = if sign {
@@ -43,7 +43,7 @@ fn compute_slope(y0: u64, y1: u64, num_vals: NonZeroU64) -> u64 {
         return 0u64;
     }

-    let abs_slope = (abs_dy << 32) / num_vals.get();
+    let abs_slope = (abs_dy << 32) / num_vals.get() as u64;
     if sign {
         abs_slope
     } else {
@@ -62,8 +62,8 @@ fn compute_slope(y0: u64, y1: u64, num_vals: NonZeroU64) -> u64 {

 impl Line {
     #[inline(always)]
-    pub fn eval(&self, x: u64) -> u64 {
-        let linear_part = (x.wrapping_mul(self.slope) >> 32) as i32 as u64;
+    pub fn eval(&self, x: u32) -> u64 {
+        let linear_part = ((x as u64).wrapping_mul(self.slope) >> 32) as i32 as u64;
         self.intercept.wrapping_add(linear_part)
     }

@@ -75,7 +75,7 @@ impl Line {
         Self::train_from(
             first_val,
             last_val,
-            num_vals,
+            num_vals as u32,
             sample_positions_and_values.iter().cloned(),
         )
     }
@@ -84,11 +84,11 @@ impl Line {
     fn train_from(
         first_val: u64,
         last_val: u64,
-        num_vals: u64,
+        num_vals: u32,
         positions_and_values: impl Iterator<Item = (u64, u64)>,
     ) -> Self {
         // TODO replace with let else
-        let idx_last_val = if let Some(idx_last_val) = NonZeroU64::new(num_vals - 1) {
+        let idx_last_val = if let Some(idx_last_val) = NonZeroU32::new(num_vals - 1) {
             idx_last_val
         } else {
             return Line::default();
@@ -129,7 +129,7 @@ impl Line {
         };
         let heuristic_shift = y0.wrapping_sub(MID_POINT);
         line.intercept = positions_and_values
-            .map(|(pos, y)| y.wrapping_sub(line.eval(pos)))
+            .map(|(pos, y)| y.wrapping_sub(line.eval(pos as u32)))
             .min_by_key(|&val| val.wrapping_sub(heuristic_shift))
             .unwrap_or(0u64); //< Never happens.
         line
@@ -199,7 +199,7 @@ mod tests {
         let line = Line::train(&VecColumn::from(&ys));
         ys.iter()
             .enumerate()
-            .map(|(x, y)| y.wrapping_sub(line.eval(x as u64)))
+            .map(|(x, y)| y.wrapping_sub(line.eval(x as u32)))
             .max()
     }

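`Line::eval` now takes a `u32` position while the slope stays a 32.32 fixed-point `u64`, so the multiply is widened to `u64` before the shift. A small worked example of that fixed-point arithmetic (all values invented):

```rust
// Worked example of the 32.32 fixed-point slope used by `Line::eval`.
fn main() {
    let (y0, y1): (u64, u64) = (1_000, 9_000);
    let idx_last_val: u64 = 4;
    // slope ≈ (y1 - y0) / idx_last_val, scaled by 2^32.
    let slope = ((y1 - y0) << 32) / idx_last_val; // 2_000 * 2^32
    let x: u32 = 3;
    // eval: widen x, multiply, then drop the 32 fractional bits.
    let linear_part = ((x as u64).wrapping_mul(slope) >> 32) as i32 as u64;
    assert_eq!(linear_part, 6_000); // 3 * 2_000
}
```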
@@ -19,7 +19,7 @@ pub struct LinearReader {

 impl Column for LinearReader {
     #[inline]
-    fn get_val(&self, doc: u64) -> u64 {
+    fn get_val(&self, doc: u32) -> u64 {
         let interpoled_val: u64 = self.linear_params.line.eval(doc);
         let bitpacked_diff = self.linear_params.bit_unpacker.get(doc, &self.data);
         interpoled_val.wrapping_add(bitpacked_diff)
@@ -37,7 +37,7 @@ impl Column for LinearReader {
     }

     #[inline]
-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         self.header.num_vals
     }
 }
@@ -93,7 +93,7 @@ impl FastFieldCodec for LinearCodec {
             .iter()
             .enumerate()
             .map(|(pos, actual_value)| {
-                let calculated_value = line.eval(pos as u64);
+                let calculated_value = line.eval(pos as u32);
                 actual_value.wrapping_sub(calculated_value)
             })
             .max()
@@ -108,7 +108,7 @@ impl FastFieldCodec for LinearCodec {

         let mut bit_packer = BitPacker::new();
         for (pos, actual_value) in column.iter().enumerate() {
-            let calculated_value = line.eval(pos as u64);
+            let calculated_value = line.eval(pos as u32);
             let offset = actual_value.wrapping_sub(calculated_value);
             bit_packer.write(offset, num_bits, write)?;
         }
@@ -140,7 +140,7 @@ impl FastFieldCodec for LinearCodec {
         let estimated_bit_width = sample_positions_and_values
             .into_iter()
             .map(|(pos, actual_value)| {
-                let interpolated_val = line.eval(pos as u64);
+                let interpolated_val = line.eval(pos as u32);
                 actual_value.wrapping_sub(interpolated_val)
             })
             .map(|diff| ((diff as f32 * 1.5) * 2.0) as u64)

@@ -90,7 +90,7 @@ fn bench_ip() {
     {
         let mut data = vec![];
         for dataset in dataset.chunks(500_000) {
-            serialize_u128(VecColumn::from(dataset), &mut data).unwrap();
+            serialize_u128(|| dataset.iter().cloned(), dataset.len() as u32, &mut data).unwrap();
         }
         let compression = data.len() as f64 / (dataset.len() * 16) as f64;
         println!("Compression 50_000 chunks {:.4}", compression);
@@ -101,7 +101,10 @@ fn bench_ip() {
     }

     let mut data = vec![];
-    serialize_u128(VecColumn::from(&dataset), &mut data).unwrap();
+    {
+        print_time!("creation");
+        serialize_u128(|| dataset.iter().cloned(), dataset.len() as u32, &mut data).unwrap();
+    }

     let compression = data.len() as f64 / (dataset.len() * 16) as f64;
     println!("Compression {:.2}", compression);
@@ -110,11 +113,17 @@ fn bench_ip() {
         (data.len() * 8) as f32 / dataset.len() as f32
     );

-    let decompressor = open_u128(OwnedBytes::new(data)).unwrap();
+    let decompressor = open_u128::<u128>(OwnedBytes::new(data)).unwrap();
     // Sample some ranges
+    let mut doc_values = Vec::new();
     for value in dataset.iter().take(1110).skip(1100).cloned() {
+        doc_values.clear();
         print_time!("get range");
-        let doc_values = decompressor.get_between_vals(value..=value);
+        decompressor.get_positions_for_value_range(
+            value..=value,
+            0..decompressor.num_vals(),
+            &mut doc_values,
+        );
         println!("{:?}", doc_values.len());
     }
 }

@@ -1,3 +1,11 @@
+use std::marker::PhantomData;
+
+use fastdivide::DividerU64;
+
+use crate::MonotonicallyMappableToU128;
+
 /// Monotonically maps a value to the u64 value space.
 /// Monotonic mapping enables `PartialOrd` on u64 space without conversion to original space.
 pub trait MonotonicallyMappableToU64: 'static + PartialOrd + Copy + Send + Sync {
     /// Converts a value to u64.
     ///
@@ -11,6 +19,145 @@ pub trait MonotonicallyMappableToU64: 'static + PartialOrd + Copy + Send + Sync
     fn from_u64(val: u64) -> Self;
 }

+/// Values need to be strictly monotonically mapped to an `Internal` value (u64 or u128) that can
+/// be used in fast field codecs.
+///
+/// The monotonic mapping is required so that `PartialOrd` can be used on `Internal` without
+/// converting to `External`.
+///
+/// All strictly monotonic functions are invertible because they are guaranteed to have a one-to-one
+/// mapping from their range to their domain. The `inverse` method is required when opening a codec,
+/// so a value can be converted back to its original domain (e.g. ip address or f64) from its
+/// internal representation.
+pub trait StrictlyMonotonicFn<External, Internal> {
+    /// Strictly monotonically maps the value from External to Internal.
+    fn mapping(&self, inp: External) -> Internal;
+    /// Inverse of `mapping`. Maps the value from Internal to External.
+    fn inverse(&self, out: Internal) -> External;
+}
+
+/// Inverts a strictly monotonic mapping from `StrictlyMonotonicFn<A, B>` to
+/// `StrictlyMonotonicFn<B, A>`.
+///
+/// # Warning
+///
+/// This type comes with a footgun. A type being strictly monotonic does not impose that the inverse
+/// mapping is strictly monotonic over the entire space External. e.g. a -> a * 2. Use at your own
+/// risk.
+pub(crate) struct StrictlyMonotonicMappingInverter<T> {
+    orig_mapping: T,
+}
+impl<T> From<T> for StrictlyMonotonicMappingInverter<T> {
+    fn from(orig_mapping: T) -> Self {
+        Self { orig_mapping }
+    }
+}
+
+impl<From, To, T> StrictlyMonotonicFn<To, From> for StrictlyMonotonicMappingInverter<T>
+where T: StrictlyMonotonicFn<From, To>
+{
+    fn mapping(&self, val: To) -> From {
+        self.orig_mapping.inverse(val)
+    }
+
+    fn inverse(&self, val: From) -> To {
+        self.orig_mapping.mapping(val)
+    }
+}
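Because `StrictlyMonotonicFn` is a plain two-method trait, any invertible order-preserving transform can plug into the codec pipeline. A hypothetical implementation (not in the patch), together with what wrapping it in the inverter does:

```rust
// Hypothetical: a strictly monotonic mapping that shifts values by a constant.
struct OffsetMapping {
    offset: u64,
}

impl StrictlyMonotonicFn<u64, u64> for OffsetMapping {
    fn mapping(&self, inp: u64) -> u64 {
        inp + self.offset
    }

    fn inverse(&self, out: u64) -> u64 {
        out - self.offset
    }
}
```

Converting it with `StrictlyMonotonicMappingInverter::from(OffsetMapping { offset: 10 })` yields a mapping whose `mapping` subtracts 10 and whose `inverse` adds it back, which is exactly how `open_u128` and `open_specific_codec` turn a write-side mapping into a read-side one.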
+/// Applies the strictly monotonic mapping from `T` without any additional changes.
+pub(crate) struct StrictlyMonotonicMappingToInternal<T> {
+    _phantom: PhantomData<T>,
+}
+
+impl<T> StrictlyMonotonicMappingToInternal<T> {
+    pub(crate) fn new() -> StrictlyMonotonicMappingToInternal<T> {
+        Self {
+            _phantom: PhantomData,
+        }
+    }
+}
+
+impl<External: MonotonicallyMappableToU128, T: MonotonicallyMappableToU128>
+    StrictlyMonotonicFn<External, u128> for StrictlyMonotonicMappingToInternal<T>
+where T: MonotonicallyMappableToU128
+{
+    fn mapping(&self, inp: External) -> u128 {
+        External::to_u128(inp)
+    }
+
+    fn inverse(&self, out: u128) -> External {
+        External::from_u128(out)
+    }
+}
+
+impl<External: MonotonicallyMappableToU64, T: MonotonicallyMappableToU64>
+    StrictlyMonotonicFn<External, u64> for StrictlyMonotonicMappingToInternal<T>
+where T: MonotonicallyMappableToU64
+{
+    fn mapping(&self, inp: External) -> u64 {
+        External::to_u64(inp)
+    }
+
+    fn inverse(&self, out: u64) -> External {
+        External::from_u64(out)
+    }
+}
+
+/// Mapping dividing by gcd and a base value.
+///
+/// The function is assumed to be only called on values divisible by the passed
+/// gcd value. (It is necessary for the function to be monotonic.)
+pub(crate) struct StrictlyMonotonicMappingToInternalGCDBaseval {
+    gcd_divider: DividerU64,
+    gcd: u64,
+    min_value: u64,
+}
+impl StrictlyMonotonicMappingToInternalGCDBaseval {
+    pub(crate) fn new(gcd: u64, min_value: u64) -> Self {
+        let gcd_divider = DividerU64::divide_by(gcd);
+        Self {
+            gcd_divider,
+            gcd,
+            min_value,
+        }
+    }
+}
+impl<External: MonotonicallyMappableToU64> StrictlyMonotonicFn<External, u64>
+    for StrictlyMonotonicMappingToInternalGCDBaseval
+{
+    fn mapping(&self, inp: External) -> u64 {
+        self.gcd_divider
+            .divide(External::to_u64(inp) - self.min_value)
+    }
+
+    fn inverse(&self, out: u64) -> External {
+        External::from_u64(self.min_value + out * self.gcd)
+    }
+}
+
+/// Strictly monotonic mapping with a base value.
+pub(crate) struct StrictlyMonotonicMappingToInternalBaseval {
+    min_value: u64,
+}
+impl StrictlyMonotonicMappingToInternalBaseval {
+    pub(crate) fn new(min_value: u64) -> Self {
+        Self { min_value }
+    }
+}
+
+impl<External: MonotonicallyMappableToU64> StrictlyMonotonicFn<External, u64>
+    for StrictlyMonotonicMappingToInternalBaseval
+{
+    fn mapping(&self, val: External) -> u64 {
+        External::to_u64(val) - self.min_value
+    }
+
+    fn inverse(&self, val: u64) -> External {
+        External::from_u64(self.min_value + val)
+    }
+}
+
 impl MonotonicallyMappableToU64 for u64 {
     fn to_u64(self) -> u64 {
         self
@@ -54,3 +201,33 @@ impl MonotonicallyMappableToU64 for f64 {
         common::u64_to_f64(val)
     }
 }
+
+#[cfg(test)]
+mod tests {
+
+    use super::*;
+
+    #[test]
+    fn strictly_monotonic_test() {
+        // identity mapping
+        test_round_trip(&StrictlyMonotonicMappingToInternal::<u64>::new(), 100u64);
+        // round trip to i64
+        test_round_trip(&StrictlyMonotonicMappingToInternal::<i64>::new(), 100u64);
+        // identity mapping
+        test_round_trip(&StrictlyMonotonicMappingToInternal::<u128>::new(), 100u128);
+
+        // base value to i64 round trip
+        let mapping = StrictlyMonotonicMappingToInternalBaseval::new(100);
+        test_round_trip::<_, _, u64>(&mapping, 100i64);
+        // base value and gcd to u64 round trip
+        let mapping = StrictlyMonotonicMappingToInternalGCDBaseval::new(10, 100);
+        test_round_trip::<_, _, u64>(&mapping, 100u64);
+    }
+
+    fn test_round_trip<T: StrictlyMonotonicFn<K, L>, K: std::fmt::Debug + Eq + Copy, L>(
+        mapping: &T,
+        test_val: K,
+    ) {
+        assert_eq!(mapping.inverse(mapping.mapping(test_val)), test_val);
+    }
+}

@@ -1,5 +1,7 @@
-use std::net::{IpAddr, Ipv6Addr};
+use std::net::Ipv6Addr;

 /// Monotonically maps a value to the u128 value space.
 /// Monotonic mapping enables `PartialOrd` on u128 space without conversion to original space.
 pub trait MonotonicallyMappableToU128: 'static + PartialOrd + Copy + Send + Sync {
     /// Converts a value to u128.
     ///
@@ -23,20 +25,16 @@ impl MonotonicallyMappableToU128 for u128 {
     }
 }

-impl MonotonicallyMappableToU128 for IpAddr {
+impl MonotonicallyMappableToU128 for Ipv6Addr {
     fn to_u128(self) -> u128 {
         ip_to_u128(self)
     }

     fn from_u128(val: u128) -> Self {
-        IpAddr::from(val.to_be_bytes())
+        Ipv6Addr::from(val.to_be_bytes())
     }
 }

-fn ip_to_u128(ip_addr: IpAddr) -> u128 {
-    let ip_addr_v6: Ipv6Addr = match ip_addr {
-        IpAddr::V4(v4) => v4.to_ipv6_mapped(),
-        IpAddr::V6(v6) => v6,
-    };
-    u128::from_be_bytes(ip_addr_v6.octets())
+fn ip_to_u128(ip_addr: Ipv6Addr) -> u128 {
+    u128::from_be_bytes(ip_addr.octets())
 }
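Restricting the impl to `Ipv6Addr` keeps the u128 mapping total and trivially monotonic; IPv4 addresses are now expected to be widened by the caller. A round-trip sketch (values invented):

```rust
// Round trip between Ipv6Addr and its big-endian u128 representation.
use std::net::{Ipv4Addr, Ipv6Addr};

fn main() {
    // IPv4 callers widen first, e.g. via the IPv4-mapped IPv6 form.
    let ip: Ipv6Addr = Ipv4Addr::new(127, 0, 0, 1).to_ipv6_mapped();
    let as_u128 = u128::from_be_bytes(ip.octets());
    assert_eq!(Ipv6Addr::from(as_u128.to_be_bytes()), ip);
}
```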
@@ -22,7 +22,6 @@ use std::num::NonZeroU64;
 use std::sync::Arc;

 use common::{BinarySerializable, VInt};
-use fastdivide::DividerU64;
 use log::warn;
 use ownedbytes::OwnedBytes;

@@ -30,6 +29,10 @@ use crate::bitpacked::BitpackedCodec;
 use crate::blockwise_linear::BlockwiseLinearCodec;
 use crate::compact_space::CompactSpaceCompressor;
 use crate::linear::LinearCodec;
+use crate::monotonic_mapping::{
+    StrictlyMonotonicFn, StrictlyMonotonicMappingToInternal,
+    StrictlyMonotonicMappingToInternalGCDBaseval,
+};
 use crate::{
     monotonic_map_column, Column, FastFieldCodec, FastFieldCodecType, MonotonicallyMappableToU64,
     VecColumn, ALL_CODEC_TYPES,
@@ -37,18 +40,20 @@ use crate::{

 /// The normalized header gives some parameters after applying the following
 /// normalization of the vector:
-/// val -> (val - min_value) / gcd
+/// `val -> (val - min_value) / gcd`
 ///
 /// By design, after normalization, `min_value = 0` and `gcd = 1`.
 #[derive(Debug, Copy, Clone)]
 pub struct NormalizedHeader {
-    pub num_vals: u64,
+    /// The number of values in the underlying column.
+    pub num_vals: u32,
+    /// The max value of the underlying column.
     pub max_value: u64,
 }

 #[derive(Debug, Copy, Clone)]
 pub(crate) struct Header {
-    pub num_vals: u64,
+    pub num_vals: u32,
     pub min_value: u64,
     pub max_value: u64,
     pub gcd: Option<NonZeroU64>,
@@ -57,8 +62,11 @@ pub(crate) struct Header {

 impl Header {
     pub fn normalized(self) -> NormalizedHeader {
-        let max_value =
-            (self.max_value - self.min_value) / self.gcd.map(|gcd| gcd.get()).unwrap_or(1);
+        let gcd = self.gcd.map(|gcd| gcd.get()).unwrap_or(1);
+        let gcd_min_val_mapping =
+            StrictlyMonotonicMappingToInternalGCDBaseval::new(gcd, self.min_value);
+
+        let max_value = gcd_min_val_mapping.mapping(self.max_value);
         NormalizedHeader {
             num_vals: self.num_vals,
             max_value,
@@ -66,10 +74,7 @@ impl Header {
     }

     pub fn normalize_column<C: Column>(&self, from_column: C) -> impl Column {
-        let min_value = self.min_value;
-        let gcd = self.gcd.map(|gcd| gcd.get()).unwrap_or(1);
-        let divider = DividerU64::divide_by(gcd);
-        monotonic_map_column(from_column, move |val| divider.divide(val - min_value))
+        normalize_column(from_column, self.min_value, self.gcd)
     }

     pub fn compute_header(
@@ -81,9 +86,8 @@ impl Header {
         let max_value = column.max_value();
         let gcd = crate::gcd::find_gcd(column.iter().map(|val| val - min_value))
             .filter(|gcd| gcd.get() > 1u64);
-        let divider = DividerU64::divide_by(gcd.map(|gcd| gcd.get()).unwrap_or(1u64));
-        let shifted_column = monotonic_map_column(&column, |val| divider.divide(val - min_value));
-        let codec_type = detect_codec(shifted_column, codecs)?;
+        let normalized_column = normalize_column(column, min_value, gcd);
+        let codec_type = detect_codec(normalized_column, codecs)?;
         Some(Header {
             num_vals,
             min_value,
@@ -94,9 +98,19 @@ impl Header {
     }
 }

+pub fn normalize_column<C: Column>(
+    from_column: C,
+    min_value: u64,
+    gcd: Option<NonZeroU64>,
+) -> impl Column {
+    let gcd = gcd.map(|gcd| gcd.get()).unwrap_or(1);
+    let mapping = StrictlyMonotonicMappingToInternalGCDBaseval::new(gcd, min_value);
+    monotonic_map_column(from_column, mapping)
+}
+
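Both the header and `normalize_column` now route through the same GCD/base-value mapping, so normalization is defined in exactly one place. A worked numeric example of `val -> (val - min_value) / gcd` (values invented):

```rust
// Worked example of column normalization as performed by the mapping above.
fn main() {
    let vals = [100u64, 120, 180, 220];
    let (min_value, gcd) = (100u64, 20u64); // gcd of the deltas 0, 20, 80, 120
    let normalized: Vec<u64> = vals.iter().map(|v| (v - min_value) / gcd).collect();
    // After normalization, min_value = 0 and gcd = 1, as `NormalizedHeader` promises.
    assert_eq!(normalized, vec![0, 1, 4, 6]);
}
```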
 impl BinarySerializable for Header {
     fn serialize<W: io::Write>(&self, writer: &mut W) -> io::Result<()> {
-        VInt(self.num_vals).serialize(writer)?;
+        VInt(self.num_vals as u64).serialize(writer)?;
         VInt(self.min_value).serialize(writer)?;
         VInt(self.max_value - self.min_value).serialize(writer)?;
         if let Some(gcd) = self.gcd {
@@ -109,7 +123,7 @@ impl BinarySerializable for Header {
     }

     fn deserialize<R: io::Read>(reader: &mut R) -> io::Result<Self> {
-        let num_vals = VInt::deserialize(reader)?.0;
+        let num_vals = VInt::deserialize(reader)?.0 as u32;
         let min_value = VInt::deserialize(reader)?.0;
         let amplitude = VInt::deserialize(reader)?.0;
         let max_value = min_value + amplitude;
@@ -125,16 +139,21 @@ impl BinarySerializable for Header {
     }
 }

+/// Returns the estimated compression for the given codec in the value range [0.0..1.0], where 1.0
+/// means no compression.
 pub fn estimate<T: MonotonicallyMappableToU64>(
     typed_column: impl Column<T>,
     codec_type: FastFieldCodecType,
 ) -> Option<f32> {
-    let column = monotonic_map_column(typed_column, T::to_u64);
+    let column = monotonic_map_column(typed_column, StrictlyMonotonicMappingToInternal::<T>::new());
     let min_value = column.min_value();
     let gcd = crate::gcd::find_gcd(column.iter().map(|val| val - min_value))
         .filter(|gcd| gcd.get() > 1u64);
-    let divider = DividerU64::divide_by(gcd.map(|gcd| gcd.get()).unwrap_or(1u64));
-    let normalized_column = monotonic_map_column(&column, |val| divider.divide(val - min_value));
+    let mapping = StrictlyMonotonicMappingToInternalGCDBaseval::new(
+        gcd.map(|gcd| gcd.get()).unwrap_or(1u64),
+        min_value,
+    );
+    let normalized_column = monotonic_map_column(&column, mapping);
     match codec_type {
         FastFieldCodecType::Bitpacked => BitpackedCodec::estimate(&normalized_column),
         FastFieldCodecType::Linear => LinearCodec::estimate(&normalized_column),
@@ -142,25 +161,26 @@ pub fn estimate<T: MonotonicallyMappableToU64>(
     }
 }

-pub fn serialize_u128(
-    typed_column: impl Column<u128>,
+/// Serializes u128 values with the compact space codec.
+pub fn serialize_u128<F: Fn() -> I, I: Iterator<Item = u128>>(
+    iter_gen: F,
+    num_vals: u32,
     output: &mut impl io::Write,
 ) -> io::Result<()> {
     // TODO write header, to later support more codecs
-    let compressor = CompactSpaceCompressor::train_from(&typed_column);
-    compressor
-        .compress_into(typed_column.iter(), output)
-        .unwrap();
+    let compressor = CompactSpaceCompressor::train_from(iter_gen(), num_vals);
+    compressor.compress_into(iter_gen(), output).unwrap();

     Ok(())
 }
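The new `serialize_u128` signature takes an iterator generator rather than a column, because the compressor needs two passes over the values: one to train the compact space and one to write. A hedged usage sketch (`write_column` is invented):

```rust
// Illustrative only: serialize a slice of u128 values with the compact space codec.
use fastfield_codecs::serialize_u128;

fn write_column(vals: &[u128]) -> std::io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // The closure lets the serializer restart the iteration for each pass.
    serialize_u128(|| vals.iter().cloned(), vals.len() as u32, &mut out)?;
    Ok(out)
}
```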
 /// Serializes the column with the codec with the best estimate on the data.
 pub fn serialize<T: MonotonicallyMappableToU64>(
     typed_column: impl Column<T>,
     output: &mut impl io::Write,
     codecs: &[FastFieldCodecType],
 ) -> io::Result<()> {
-    let column = monotonic_map_column(typed_column, T::to_u64);
+    let column = monotonic_map_column(typed_column, StrictlyMonotonicMappingToInternal::<T>::new());
     let header = Header::compute_header(&column, codecs).ok_or_else(|| {
         io::Error::new(
             io::ErrorKind::InvalidInput,
@@ -225,6 +245,7 @@ fn serialize_given_codec(
     Ok(())
 }

+/// Helper function to serialize a column (autodetect from all codecs) and then open it.
 pub fn serialize_and_load<T: MonotonicallyMappableToU64 + Ord + Default>(
     column: &[T],
 ) -> Arc<dyn Column<T>> {

@@ -62,6 +62,20 @@ fn word<'a>() -> impl Parser<&'a str, Output = String> {
     })
 }

+// word variant that allows more characters, e.g. for range queries that don't allow field
+// specifier
+fn relaxed_word<'a>() -> impl Parser<&'a str, Output = String> {
+    (
+        satisfy(|c: char| {
+            !c.is_whitespace() && !['`', '{', '}', '"', '[', ']', '(', ')'].contains(&c)
+        }),
+        many(satisfy(|c: char| {
+            !c.is_whitespace() && !['{', '}', '"', '[', ']', '(', ')'].contains(&c)
+        })),
+    )
+        .map(|(s1, s2): (char, String)| format!("{}{}", s1, s2))
+}
+
 /// Parses a date time according to rfc3339
 /// 2015-08-02T18:54:42+02
 /// 2021-04-13T19:46:26.266051969+00:00
@@ -181,8 +195,8 @@ fn spaces1<'a>() -> impl Parser<&'a str, Output = ()> {
 fn range<'a>() -> impl Parser<&'a str, Output = UserInputLeaf> {
     let range_term_val = || {
         attempt(date_time())
-            .or(word())
             .or(negative_number())
+            .or(relaxed_word())
             .or(char('*').with(value("*".to_string())))
     };

@@ -649,6 +663,34 @@ mod test {
             .expect("Cannot parse date range")
             .0;
         assert_eq!(res6, expected_flexible_dates);
+
+        // IP Range Unbounded
+        let expected_weight = UserInputLeaf::Range {
+            field: Some("ip".to_string()),
+            lower: UserInputBound::Inclusive("::1".to_string()),
+            upper: UserInputBound::Unbounded,
+        };
+        let res1 = range()
+            .parse("ip: >=::1")
+            .expect("Cannot parse ip v6 format")
+            .0;
+        let res2 = range()
+            .parse("ip:[::1 TO *}")
+            .expect("Cannot parse ip v6 format")
+            .0;
+        assert_eq!(res1, expected_weight);
+        assert_eq!(res2, expected_weight);
+
+        // IP Range Bounded
+        let expected_weight = UserInputLeaf::Range {
+            field: Some("ip".to_string()),
+            lower: UserInputBound::Inclusive("::0.0.0.50".to_string()),
+            upper: UserInputBound::Exclusive("::0.0.0.52".to_string()),
+        };
+        let res1 = range()
+            .parse("ip:[::0.0.0.50 TO ::0.0.0.52}")
+            .expect("Cannot parse ip v6 format")
+            .0;
+        assert_eq!(res1, expected_weight);
     }
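The `relaxed_word()` fallback is what lets range bounds contain `:` and `.`, so IPv6 literals survive parsing. For reference, the query strings exercised above (taken directly from the tests):

```rust
// Range syntaxes the grammar now accepts on an ip field.
let unbounded = "ip: >=::1";
let half_open = "ip:[::1 TO *}";
let bounded = "ip:[::0.0.0.50 TO ::0.0.0.52}";
```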
     #[test]

@@ -6,7 +6,7 @@

 use std::collections::HashMap;

-use fnv::FnvHashMap;
+use rustc_hash::FxHashMap;
 use serde::{Deserialize, Serialize};

 use super::agg_req::BucketAggregationInternal;
@@ -145,7 +145,7 @@ pub enum BucketEntries<T> {
     /// Vector format bucket entries
     Vec(Vec<T>),
     /// HashMap format bucket entries
-    HashMap(FnvHashMap<String, T>),
+    HashMap(FxHashMap<String, T>),
 }

 /// This is the default entry for a bucket, which contains a key, count, and optionally

@@ -331,10 +331,10 @@ impl SegmentHistogramCollector {
             .expect("unexpected fast field cardinality");
         let mut iter = doc.chunks_exact(4);
         for docs in iter.by_ref() {
-            let val0 = self.f64_from_fastfield_u64(accessor.get_val(docs[0] as u64));
-            let val1 = self.f64_from_fastfield_u64(accessor.get_val(docs[1] as u64));
-            let val2 = self.f64_from_fastfield_u64(accessor.get_val(docs[2] as u64));
-            let val3 = self.f64_from_fastfield_u64(accessor.get_val(docs[3] as u64));
+            let val0 = self.f64_from_fastfield_u64(accessor.get_val(docs[0]));
+            let val1 = self.f64_from_fastfield_u64(accessor.get_val(docs[1]));
+            let val2 = self.f64_from_fastfield_u64(accessor.get_val(docs[2]));
+            let val3 = self.f64_from_fastfield_u64(accessor.get_val(docs[3]));

             let bucket_pos0 = get_bucket_num(val0);
             let bucket_pos1 = get_bucket_num(val1);
@@ -371,7 +371,7 @@ impl SegmentHistogramCollector {
             )?;
         }
         for &doc in iter.remainder() {
-            let val = f64_from_fastfield_u64(accessor.get_val(doc as u64), &self.field_type);
+            let val = f64_from_fastfield_u64(accessor.get_val(doc), &self.field_type);
             if !bounds.contains(val) {
                 continue;
             }

@@ -1,7 +1,7 @@
 use std::fmt::Debug;
 use std::ops::Range;

-use fnv::FnvHashMap;
+use rustc_hash::FxHashMap;
 use serde::{Deserialize, Serialize};

 use crate::aggregation::agg_req_with_accessor::{
@@ -176,7 +176,7 @@ impl SegmentRangeCollector {
     ) -> crate::Result<IntermediateBucketResult> {
         let field_type = self.field_type;

-        let buckets: FnvHashMap<SerializedKey, IntermediateRangeBucketEntry> = self
+        let buckets: FxHashMap<SerializedKey, IntermediateRangeBucketEntry> = self
             .buckets
             .into_iter()
             .map(move |range_bucket| {
@@ -263,10 +263,10 @@ impl SegmentRangeCollector {
             .as_single()
             .expect("unexpected fast field cardinality");
         for docs in iter.by_ref() {
-            let val1 = accessor.get_val(docs[0] as u64);
-            let val2 = accessor.get_val(docs[1] as u64);
-            let val3 = accessor.get_val(docs[2] as u64);
-            let val4 = accessor.get_val(docs[3] as u64);
+            let val1 = accessor.get_val(docs[0]);
+            let val2 = accessor.get_val(docs[1]);
+            let val3 = accessor.get_val(docs[2]);
+            let val4 = accessor.get_val(docs[3]);
             let bucket_pos1 = self.get_bucket_pos(val1);
             let bucket_pos2 = self.get_bucket_pos(val2);
             let bucket_pos3 = self.get_bucket_pos(val3);
@@ -278,7 +278,7 @@ impl SegmentRangeCollector {
             self.increment_bucket(bucket_pos4, docs[3], &bucket_with_accessor.sub_aggregation)?;
         }
         for &doc in iter.remainder() {
-            let val = accessor.get_val(doc as u64);
+            let val = accessor.get_val(doc);
             let bucket_pos = self.get_bucket_pos(val);
             self.increment_bucket(bucket_pos, doc, &bucket_with_accessor.sub_aggregation)?;
         }

@@ -1,7 +1,7 @@
 use std::fmt::Debug;

-use fnv::FnvHashMap;
 use itertools::Itertools;
+use rustc_hash::FxHashMap;
 use serde::{Deserialize, Serialize};

 use super::{CustomOrder, Order, OrderTarget};
@@ -17,7 +17,11 @@ use crate::fastfield::MultiValuedFastFieldReader;
 use crate::schema::Type;
 use crate::{DocId, TantivyError};

-/// Creates a bucket for every unique term
+/// Creates a bucket for every unique term and counts the number of occurrences.
+/// Note that doc_count in the response buckets equals term count here.
+///
+/// If the text is untokenized and single value, that means one term per document and therefore it
+/// is in fact doc count.
 ///
 /// ### Terminology
 /// Shard parameters are supposed to be equivalent to the elasticsearch shard parameter.
@@ -64,6 +68,25 @@ use crate::{DocId, TantivyError};
 ///   }
 /// }
 /// ```
+///
+/// # Response JSON Format
+/// ```json
+/// {
+///   ...
+///   "aggregations": {
+///     "genres": {
+///       "doc_count_error_upper_bound": 0,
+///       "sum_other_doc_count": 0,
+///       "buckets": [
+///         { "key": "drumnbass", "doc_count": 6 },
+///         { "key": "raggae", "doc_count": 4 },
+///         { "key": "jazz", "doc_count": 2 }
+///       ]
+///     }
+///   }
+/// }
+/// ```
 #[derive(Clone, Debug, Default, PartialEq, Serialize, Deserialize)]
 pub struct TermsAggregation {
     /// The field to aggregate on.
@@ -176,7 +199,7 @@ impl TermsAggregationInternal {
 #[derive(Clone, Debug, PartialEq)]
 /// Container to store term_ids and their buckets.
 struct TermBuckets {
-    pub(crate) entries: FnvHashMap<u32, TermBucketEntry>,
+    pub(crate) entries: FxHashMap<u32, TermBucketEntry>,
     blueprint: Option<SegmentAggregationResultsCollector>,
 }

@@ -374,7 +397,7 @@ impl SegmentTermCollector {
             .expect("internal error: inverted index not loaded for term aggregation");
         let term_dict = inverted_index.terms();

-        let mut dict: FnvHashMap<String, IntermediateTermBucketEntry> = Default::default();
+        let mut dict: FxHashMap<String, IntermediateTermBucketEntry> = Default::default();
         let mut buffer = vec![];
         for (term_id, entry) in entries {
             term_dict
@@ -1106,9 +1129,9 @@ mod tests {

         assert_eq!(res["my_texts"]["buckets"][0]["key"], "terma");
         assert_eq!(res["my_texts"]["buckets"][0]["doc_count"], 4);
-        assert_eq!(res["my_texts"]["buckets"][1]["key"], "termb");
+        assert_eq!(res["my_texts"]["buckets"][1]["key"], "termc");
         assert_eq!(res["my_texts"]["buckets"][1]["doc_count"], 0);
-        assert_eq!(res["my_texts"]["buckets"][2]["key"], "termc");
+        assert_eq!(res["my_texts"]["buckets"][2]["key"], "termb");
         assert_eq!(res["my_texts"]["buckets"][2]["doc_count"], 0);
         assert_eq!(res["my_texts"]["sum_other_doc_count"], 0);
         assert_eq!(res["my_texts"]["doc_count_error_upper_bound"], 0);
@@ -1206,11 +1229,43 @@ mod tests {
         .collect();

         let res = exec_request_with_query(agg_req, &index, None);

         assert!(res.is_err());

         Ok(())
     }

+    #[test]
+    fn terms_aggregation_multi_token_per_doc() -> crate::Result<()> {
+        let terms = vec!["Hello Hello", "Hallo Hallo"];
+
+        let index = get_test_index_from_terms(true, &[terms])?;
+
+        let agg_req: Aggregations = vec![(
+            "my_texts".to_string(),
+            Aggregation::Bucket(BucketAggregation {
+                bucket_agg: BucketAggregationType::Terms(TermsAggregation {
+                    field: "text_id".to_string(),
+                    min_doc_count: Some(0),
+                    ..Default::default()
+                }),
+                sub_aggregation: Default::default(),
+            }),
+        )]
+        .into_iter()
+        .collect();
+
+        let res = exec_request_with_query(agg_req, &index, None).unwrap();
+
+        assert_eq!(res["my_texts"]["buckets"][0]["key"], "hello");
+        assert_eq!(res["my_texts"]["buckets"][0]["doc_count"], 2);
+
+        assert_eq!(res["my_texts"]["buckets"][1]["key"], "hallo");
+        assert_eq!(res["my_texts"]["buckets"][1]["doc_count"], 2);
+
+        Ok(())
+    }
+
     #[test]
     fn test_json_format() -> crate::Result<()> {
         let agg_req: Aggregations = vec![(

@@ -5,8 +5,8 @@
 use std::cmp::Ordering;
 use std::collections::HashMap;

-use fnv::FnvHashMap;
 use itertools::Itertools;
+use rustc_hash::FxHashMap;
 use serde::{Deserialize, Serialize};

 use super::agg_req::{
@@ -288,7 +288,7 @@ impl IntermediateBucketResult {
             .keyed;
         let buckets = if is_keyed {
             let mut bucket_map =
-                FnvHashMap::with_capacity_and_hasher(buckets.len(), Default::default());
+                FxHashMap::with_capacity_and_hasher(buckets.len(), Default::default());
             for bucket in buckets {
                 bucket_map.insert(bucket.key.to_string(), bucket);
             }
@@ -308,7 +308,7 @@ impl IntermediateBucketResult {

         let buckets = if req.as_histogram().unwrap().keyed {
             let mut bucket_map =
-                FnvHashMap::with_capacity_and_hasher(buckets.len(), Default::default());
+                FxHashMap::with_capacity_and_hasher(buckets.len(), Default::default());
             for bucket in buckets {
                 bucket_map.insert(bucket.key.to_string(), bucket);
             }
@@ -396,13 +396,13 @@ impl IntermediateBucketResult {
 #[derive(Default, Clone, Debug, PartialEq, Serialize, Deserialize)]
 /// Range aggregation including error counts
 pub struct IntermediateRangeBucketResult {
-    pub(crate) buckets: FnvHashMap<SerializedKey, IntermediateRangeBucketEntry>,
+    pub(crate) buckets: FxHashMap<SerializedKey, IntermediateRangeBucketEntry>,
 }

 #[derive(Default, Clone, Debug, PartialEq, Serialize, Deserialize)]
 /// Term aggregation including error counts
 pub struct IntermediateTermBucketResult {
-    pub(crate) entries: FnvHashMap<String, IntermediateTermBucketEntry>,
+    pub(crate) entries: FxHashMap<String, IntermediateTermBucketEntry>,
     pub(crate) sum_other_doc_count: u64,
     pub(crate) doc_count_error_upper_bound: u64,
 }
@@ -499,8 +499,8 @@ trait MergeFruits {
 }

 fn merge_maps<V: MergeFruits + Clone>(
-    entries_left: &mut FnvHashMap<SerializedKey, V>,
-    mut entries_right: FnvHashMap<SerializedKey, V>,
+    entries_left: &mut FxHashMap<SerializedKey, V>,
+    mut entries_right: FxHashMap<SerializedKey, V>,
 ) {
     for (name, entry_left) in entries_left.iter_mut() {
         if let Some(entry_right) = entries_right.remove(name) {
@@ -626,7 +626,7 @@ mod tests {

     fn get_sub_test_tree(data: &[(String, u64)]) -> IntermediateAggregationResults {
         let mut map = HashMap::new();
-        let mut buckets = FnvHashMap::default();
+        let mut buckets = FxHashMap::default();
         for (key, doc_count) in data {
             buckets.insert(
                 key.to_string(),
@@ -653,7 +653,7 @@ mod tests {
         data: &[(String, u64, String, u64)],
     ) -> IntermediateAggregationResults {
         let mut map = HashMap::new();
-        let mut buckets: FnvHashMap<_, _> = Default::default();
+        let mut buckets: FxHashMap<_, _> = Default::default();
         for (key, doc_count, sub_aggregation_key, sub_aggregation_count) in data {
             buckets.insert(
                 key.to_string(),

@@ -60,10 +60,10 @@ impl SegmentAverageCollector {
     pub(crate) fn collect_block(&mut self, doc: &[DocId], field: &dyn Column<u64>) {
         let mut iter = doc.chunks_exact(4);
         for docs in iter.by_ref() {
-            let val1 = field.get_val(docs[0] as u64);
-            let val2 = field.get_val(docs[1] as u64);
-            let val3 = field.get_val(docs[2] as u64);
-            let val4 = field.get_val(docs[3] as u64);
+            let val1 = field.get_val(docs[0]);
+            let val2 = field.get_val(docs[1]);
+            let val3 = field.get_val(docs[2]);
+            let val4 = field.get_val(docs[3]);
             let val1 = f64_from_fastfield_u64(val1, &self.field_type);
             let val2 = f64_from_fastfield_u64(val2, &self.field_type);
             let val3 = f64_from_fastfield_u64(val3, &self.field_type);
@@ -74,7 +74,7 @@ impl SegmentAverageCollector {
             self.data.collect(val4);
         }
         for &doc in iter.remainder() {
-            let val = field.get_val(doc as u64);
+            let val = field.get_val(doc);
             let val = f64_from_fastfield_u64(val, &self.field_type);
             self.data.collect(val);
         }

@@ -166,10 +166,10 @@ impl SegmentStatsCollector {
     pub(crate) fn collect_block(&mut self, doc: &[DocId], field: &dyn Column<u64>) {
         let mut iter = doc.chunks_exact(4);
         for docs in iter.by_ref() {
-            let val1 = field.get_val(docs[0] as u64);
-            let val2 = field.get_val(docs[1] as u64);
-            let val3 = field.get_val(docs[2] as u64);
-            let val4 = field.get_val(docs[3] as u64);
+            let val1 = field.get_val(docs[0]);
+            let val2 = field.get_val(docs[1]);
+            let val3 = field.get_val(docs[2]);
+            let val4 = field.get_val(docs[3]);
             let val1 = f64_from_fastfield_u64(val1, &self.field_type);
             let val2 = f64_from_fastfield_u64(val2, &self.field_type);
             let val3 = f64_from_fastfield_u64(val3, &self.field_type);
@@ -180,7 +180,7 @@ impl SegmentStatsCollector {
             self.stats.collect(val4);
         }
         for &doc in iter.remainder() {
-            let val = field.get_val(doc as u64);
+            let val = field.get_val(doc);
             let val = f64_from_fastfield_u64(val, &self.field_type);
             self.stats.collect(val);
         }

@@ -10,21 +10,19 @@
 //!
 //! There are two categories: [Metrics](metric) and [Buckets](bucket).
 //!
-//! # Usage
-//!
-//! ## Prerequisite
-//! Currently aggregations work only on [fast fields](`crate::fastfield`). Single value fast fields
-//! of type `u64`, `f64`, `i64` and fast fields on text fields.
-//!
+//! ## Usage
 //! To use aggregations, build an aggregation request by constructing
 //! [`Aggregations`](agg_req::Aggregations).
 //! Create an [`AggregationCollector`] from this request. `AggregationCollector` implements the
 //! [`Collector`](crate::collector::Collector) trait and can be passed as collector into
 //! [`Searcher::search()`](crate::Searcher::search).
 //!
+//! #### Limitations
+//!
+//! Currently aggregations work only on single value fast fields of type `u64`, `f64`, `i64` and
+//! fast fields on text fields.
+//!
-//! # JSON Format
+//! ## JSON Format
 //! Aggregations request and result structures de/serialize into elasticsearch compatible JSON.
 //!
 //! ```verbatim
@@ -35,7 +33,7 @@
 //! let json_response_string: String = &serde_json::to_string(&agg_res)?;
 //! ```
 //!
-//! # Supported Aggregations
+//! ## Supported Aggregations
 //! - [Bucket](bucket)
 //!   - [Histogram](bucket::HistogramAggregation)
 //!   - [Range](bucket::RangeAggregation)

@@ -177,7 +177,7 @@ where
     type Fruit = TSegmentCollector::Fruit;

     fn collect(&mut self, doc: u32, score: Score) {
-        let value = self.fast_field_reader.get_val(doc as u64);
+        let value = self.fast_field_reader.get_val(doc);
         if (self.predicate)(value) {
             self.segment_collector.collect(doc, score)
         }

@@ -94,7 +94,7 @@ impl SegmentCollector for SegmentHistogramCollector {
     type Fruit = Vec<u64>;

     fn collect(&mut self, doc: DocId, _score: Score) {
-        let value = self.ff_reader.get_val(doc as u64);
+        let value = self.ff_reader.get_val(doc);
         self.histogram_computer.add_value(value);
     }

@@ -201,7 +201,7 @@ impl SegmentCollector for FastFieldSegmentCollector {
|
||||
type Fruit = Vec<u64>;
|
||||
|
||||
fn collect(&mut self, doc: DocId, _score: Score) {
|
||||
let val = self.reader.get_val(doc as u64);
|
||||
let val = self.reader.get_val(doc);
|
||||
self.vals.push(val);
|
||||
}
|
||||
|
||||
|
||||
@@ -137,7 +137,7 @@ struct ScorerByFastFieldReader {
|
||||
|
||||
impl CustomSegmentScorer<u64> for ScorerByFastFieldReader {
|
||||
fn score(&mut self, doc: DocId) -> u64 {
|
||||
self.ff_reader.get_val(doc as u64)
|
||||
self.ff_reader.get_val(doc)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -458,7 +458,7 @@ impl TopDocs {
     ///
     ///       // We can now define our actual scoring function
     ///       move |doc: DocId, original_score: Score| {
-    ///           let popularity: u64 = popularity_reader.get_val(doc as u64);
+    ///           let popularity: u64 = popularity_reader.get_val(doc);
     ///           // Well.. For the sake of the example we use a simple logarithm
     ///           // function.
    ///           let popularity_boost_score = ((2u64 + popularity) as Score).log2();
@@ -567,8 +567,8 @@ impl TopDocs {
     ///
     ///        // We can now define our actual scoring function
     ///        move |doc: DocId| {
-    ///            let popularity: u64 = popularity_reader.get_val(doc as u64);
-    ///            let boosted: u64 = boosted_reader.get_val(doc as u64);
+    ///            let popularity: u64 = popularity_reader.get_val(doc);
+    ///            let boosted: u64 = boosted_reader.get_val(doc);
     ///            // Score do not have to be `f64` in tantivy.
     ///            // Here we return a couple to get lexicographical order
     ///            // for free.
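The "couple to get lexicographical order" remark in the doc example relies on Rust's built-in tuple ordering; a minimal self-contained illustration:

```rust
// Rust tuples compare field by field, so sorting (primary, secondary) pairs
// orders by the primary key and breaks ties with the secondary one for free.
fn main() {
    let mut scores = vec![(2u64, 7u64), (2, 9), (1, 100)];
    scores.sort();
    assert_eq!(scores, vec![(1, 100), (2, 7), (2, 9)]);
}
```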
@@ -571,9 +571,21 @@ mod tests {
         assert_eq!(mmap_directory.get_cache_info().mmapped.len(), 0);
     }

+    fn assert_eventually<P: Fn() -> Option<String>>(predicate: P) {
+        for _ in 0..30 {
+            if predicate().is_none() {
+                break;
+            }
+            std::thread::sleep(Duration::from_millis(200));
+        }
+        if let Some(error_msg) = predicate() {
+            panic!("{}", error_msg);
+        }
+    }
+
     #[test]
-    fn test_mmap_released() -> crate::Result<()> {
-        let mmap_directory = MmapDirectory::create_from_tempdir()?;
+    fn test_mmap_released() {
+        let mmap_directory = MmapDirectory::create_from_tempdir().unwrap();
         let mut schema_builder: SchemaBuilder = Schema::builder();
         let text_field = schema_builder.add_text_field("text", TEXT);
         let schema = schema_builder.build();
@@ -582,49 +594,56 @@ mod tests {
             let index =
                 Index::create(mmap_directory.clone(), schema, IndexSettings::default()).unwrap();

-            let mut index_writer = index.writer_for_tests()?;
+            let mut index_writer = index.writer_for_tests().unwrap();
             let mut log_merge_policy = LogMergePolicy::default();
             log_merge_policy.set_min_num_segments(3);
             index_writer.set_merge_policy(Box::new(log_merge_policy));
             for _num_commits in 0..10 {
                 for _ in 0..10 {
-                    index_writer.add_document(doc!(text_field=>"abc"))?;
+                    index_writer.add_document(doc!(text_field=>"abc")).unwrap();
                 }
-                index_writer.commit()?;
+                index_writer.commit().unwrap();
             }

             let reader = index
                 .reader_builder()
                 .reload_policy(ReloadPolicy::Manual)
-                .try_into()?;
+                .try_into()
+                .unwrap();

             for _ in 0..4 {
-                index_writer.add_document(doc!(text_field=>"abc"))?;
-                index_writer.commit()?;
-                reader.reload()?;
+                index_writer.add_document(doc!(text_field=>"abc")).unwrap();
+                index_writer.commit().unwrap();
+                reader.reload().unwrap();
             }
-            index_writer.wait_merging_threads()?;
+            index_writer.wait_merging_threads().unwrap();

-            reader.reload()?;
+            reader.reload().unwrap();
             let num_segments = reader.searcher().segment_readers().len();
             assert!(num_segments <= 4);
             let num_components_except_deletes_and_tempstore =
                 crate::core::SegmentComponent::iterator().len() - 2;
-            let num_mmapped = mmap_directory.get_cache_info().mmapped.len();
-            assert!(
-                num_mmapped <= num_segments * num_components_except_deletes_and_tempstore,
-                "Expected at most {} mmapped files, got {num_mmapped}",
-                num_segments * num_components_except_deletes_and_tempstore
-            );
+            let max_num_mmapped = num_components_except_deletes_and_tempstore * num_segments;
+            assert_eventually(|| {
+                let num_mmapped = mmap_directory.get_cache_info().mmapped.len();
+                if num_mmapped > max_num_mmapped {
+                    Some(format!(
+                        "Expected at most {max_num_mmapped} mmapped files, got {num_mmapped}"
+                    ))
+                } else {
+                    None
+                }
+            });
         }
-        // This test failed on CI. The last Mmap is dropped from the merging thread so there might
-        // be a race condition indeed.
-        for _ in 0..10 {
-            if mmap_directory.get_cache_info().mmapped.is_empty() {
-                return Ok(());
-            }
-            std::thread::sleep(Duration::from_millis(200));
-        }
-        panic!("The cache still contains information. One of the Mmap has not been dropped.");
+        assert_eventually(|| {
+            let num_mmapped = mmap_directory.get_cache_info().mmapped.len();
+            if num_mmapped > 0 {
+                Some(format!("Expected no mmapped files, got {num_mmapped}"))
+            } else {
+                None
+            }
+        });
     }
 }
@@ -32,10 +32,9 @@ impl BytesFastFieldReader {
         Ok(BytesFastFieldReader { idx_reader, values })
     }

-    fn range(&self, doc: DocId) -> Range<u64> {
-        let idx = doc as u64;
-        let start = self.idx_reader.get_val(idx);
-        let end = self.idx_reader.get_val(idx + 1);
+    fn range(&self, doc: DocId) -> Range<u32> {
+        let start = self.idx_reader.get_val(doc) as u32;
+        let end = self.idx_reader.get_val(doc + 1) as u32;
         start..end
     }

@@ -48,7 +47,7 @@ impl BytesFastFieldReader {
     /// Returns the length of the bytes associated with the given `doc`
     pub fn num_bytes(&self, doc: DocId) -> u64 {
         let range = self.range(doc);
-        range.end - range.start
+        (range.end - range.start) as u64
     }

     /// Returns the overall number of bytes in this bytes fast field.
@@ -58,7 +57,7 @@ impl BytesFastFieldReader {
 }

 impl MultiValueLength for BytesFastFieldReader {
-    fn get_range(&self, doc_id: DocId) -> std::ops::Range<u64> {
+    fn get_range(&self, doc_id: DocId) -> std::ops::Range<u32> {
         self.range(doc_id)
     }
     fn get_len(&self, doc_id: DocId) -> u64 {
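Behind `range(doc)` sits an offset column with one extra terminating entry: the bytes of document `d` live at `values[idx[d]..idx[d + 1]]`. A self-contained sketch of that layout, with plain slices standing in for the two columns:

```rust
// `idx` is hypothetical data standing in for the idx_reader offset column.
fn bytes_for_doc<'a>(idx: &[u32], values: &'a [u8], doc: usize) -> &'a [u8] {
    let start = idx[doc] as usize;
    let end = idx[doc + 1] as usize; // one extra offset terminates the last doc
    &values[start..end]
}

fn main() {
    let idx = [0u32, 3, 3, 7]; // doc 1 has no bytes
    let values = b"abcdefg";
    assert_eq!(bytes_for_doc(&idx, values, 0), b"abc");
    assert_eq!(bytes_for_doc(&idx, values, 1), b"");
    assert_eq!(bytes_for_doc(&idx, values, 2), b"defg");
}
```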
@@ -57,14 +57,15 @@ impl BytesFastFieldWriter {

     /// Shift to the next document and add all of the
     /// matching field values present in the document.
-    pub fn add_document(&mut self, doc: &Document) {
+    pub fn add_document(&mut self, doc: &Document) -> crate::Result<()> {
         self.next_doc();
         for field_value in doc.get_all(self.field) {
             if let Value::Bytes(ref bytes) = field_value {
                 self.vals.extend_from_slice(bytes);
-                return;
+                return Ok(());
             }
         }
+        Ok(())
     }

     /// Register the bytes associated with a document.
@@ -7,16 +7,15 @@
 //! It is designed for the fast random access of some document
 //! fields given a document id.
 //!
-//! `FastField` are useful when a field is required for all or most of
-//! the `DocSet` : for instance for scoring, grouping, filtering, or faceting.
+//! Fast fields are useful when a field is required for all or most of
+//! the `DocSet`: for instance for scoring, grouping, aggregation, filtering, or faceting.
 //!
 //!
-//! Fields have to be declared as `FAST` in the schema.
-//! Currently supported fields are: u64, i64, f64 and bytes.
+//! Fields have to be declared as `FAST` in the schema.
+//! Currently supported fields are: u64, i64, f64, bytes and text.
 //!
-//! u64, i64 and f64 fields are stored in a bit-packed fashion so that
-//! their memory usage is directly linear with the amplitude of the
-//! values stored.
+//! Fast fields are stored in with [different codecs](fastfield_codecs). The best codec is detected
+//! automatically, when serializing.
 //!
 //! Read access performance is comparable to that of an array lookup.
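A short usage sketch for the module docs above, written against this branch's API (`get_val` now takes a `u32` doc id); field names and heap size are illustrative:

```rust
use tantivy::schema::{Schema, FAST};
use tantivy::{doc, Index};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut schema_builder = Schema::builder();
    // Declaring the field FAST builds a fast field (column store) for it.
    let id_field = schema_builder.add_u64_field("id", FAST);
    let index = Index::create_in_ram(schema_builder.build());

    let mut writer = index.writer(15_000_000)?;
    writer.add_document(doc!(id_field => 42u64))?;
    writer.commit()?;

    let searcher = index.reader()?.searcher();
    let ff = searcher.segment_readers()[0].fast_fields().u64(id_field)?;
    assert_eq!(ff.get_val(0), 42); // random access by (u32) doc id
    Ok(())
}
```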
@@ -27,10 +26,14 @@ pub use self::bytes::{BytesFastFieldReader, BytesFastFieldWriter};
 pub use self::error::{FastFieldNotAvailableError, Result};
 pub use self::facet_reader::FacetReader;
 pub(crate) use self::multivalued::{get_fastfield_codecs_for_multivalue, MultivalueStartIndex};
-pub use self::multivalued::{MultiValuedFastFieldReader, MultiValuedFastFieldWriter};
+pub use self::multivalued::{
+    MultiValueU128FastFieldWriter, MultiValuedFastFieldReader, MultiValuedFastFieldWriter,
+    MultiValuedU128FastFieldReader,
+};
 pub use self::readers::FastFieldReaders;
 pub(crate) use self::readers::{type_and_cardinality, FastType};
 pub use self::serializer::{Column, CompositeFastFieldSerializer};
+use self::writer::unexpected_value;
 pub use self::writer::{FastFieldsWriter, IntFastFieldWriter};
 use crate::schema::{Type, Value};
 use crate::{DateTime, DocId};

@@ -48,7 +51,7 @@ mod writer;
 /// for a doc_id
 pub trait MultiValueLength {
     /// returns the positions for a docid
-    fn get_range(&self, doc_id: DocId) -> std::ops::Range<u64>;
+    fn get_range(&self, doc_id: DocId) -> std::ops::Range<u32>;
     /// returns the num of values associated with a doc_id
     fn get_len(&self, doc_id: DocId) -> u64;
     /// returns the sum of num values for all doc_ids

@@ -117,15 +120,16 @@ impl FastValue for DateTime {
     }
 }

-fn value_to_u64(value: &Value) -> u64 {
-    match value {
+fn value_to_u64(value: &Value) -> crate::Result<u64> {
+    let value = match value {
         Value::U64(val) => val.to_u64(),
         Value::I64(val) => val.to_u64(),
         Value::F64(val) => val.to_u64(),
         Value::Bool(val) => val.to_u64(),
         Value::Date(val) => val.to_u64(),
-        _ => panic!("Expected a u64/i64/f64/bool/date field, got {:?} ", value),
-    }
+        _ => return Err(unexpected_value("u64/i64/f64/bool/date", value)),
+    };
+    Ok(value)
 }

 /// The fast field type
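`value_to_u64` depends on order-preserving mappings into `u64` (tantivy's `MonotonicallyMappableToU64`). A standalone sketch of the two classic mappings, mirroring what `common::i64_to_u64` and the `f64` mapping do; the function names here are local stand-ins:

```rust
// Flip the sign bit so i64 ordering is preserved in u64 space.
fn i64_to_u64(val: i64) -> u64 {
    (val as u64) ^ (1u64 << 63)
}

// The standard "sortable double" trick: positives get the sign bit set,
// negatives are bitwise-inverted so larger floats map to larger u64s.
fn f64_to_u64(val: f64) -> u64 {
    let bits = val.to_bits();
    if bits >> 63 == 0 { bits | (1 << 63) } else { !bits }
}

fn main() {
    assert!(i64_to_u64(-1) < i64_to_u64(0));
    assert!(f64_to_u64(-2.5) < f64_to_u64(1.0));
}
```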
@@ -180,9 +184,9 @@ mod tests {
     #[test]
     pub fn test_fastfield() {
         let test_fastfield = fastfield_codecs::serialize_and_load(&[100u64, 200u64, 300u64][..]);
-        assert_eq!(test_fastfield.get_val(0u64), 100);
-        assert_eq!(test_fastfield.get_val(1u64), 200);
-        assert_eq!(test_fastfield.get_val(2u64), 300);
+        assert_eq!(test_fastfield.get_val(0), 100);
+        assert_eq!(test_fastfield.get_val(1), 200);
+        assert_eq!(test_fastfield.get_val(2), 300);
     }

     #[test]
@@ -199,9 +203,15 @@ mod tests {
         let write: WritePtr = directory.open_write(Path::new("test")).unwrap();
         let mut serializer = CompositeFastFieldSerializer::from_write(write).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(&SCHEMA);
-        fast_field_writers.add_document(&doc!(*FIELD=>13u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>14u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>2u64));
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>13u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>14u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>2u64))
+            .unwrap();
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
             .unwrap();
@@ -226,15 +236,33 @@ mod tests {
         let write: WritePtr = directory.open_write(Path::new("test"))?;
         let mut serializer = CompositeFastFieldSerializer::from_write(write)?;
         let mut fast_field_writers = FastFieldsWriter::from_schema(&SCHEMA);
-        fast_field_writers.add_document(&doc!(*FIELD=>4u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>14_082_001u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>3_052u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>9_002u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>15_001u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>777u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>1_002u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>1_501u64));
-        fast_field_writers.add_document(&doc!(*FIELD=>215u64));
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>4u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>14_082_001u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>3_052u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>9_002u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>15_001u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>777u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>1_002u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>1_501u64))
+            .unwrap();
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>215u64))
+            .unwrap();
         fast_field_writers.serialize(&mut serializer, &HashMap::new(), None)?;
         serializer.close()?;
     }
@@ -270,7 +298,9 @@ mod tests {
         let mut serializer = CompositeFastFieldSerializer::from_write(write).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(&SCHEMA);
         for _ in 0..10_000 {
-            fast_field_writers.add_document(&doc!(*FIELD=>100_000u64));
+            fast_field_writers
+                .add_document(&doc!(*FIELD=>100_000u64))
+                .unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -303,9 +333,13 @@ mod tests {
         let mut serializer = CompositeFastFieldSerializer::from_write(write).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(&SCHEMA);
         // forcing the amplitude to be high
-        fast_field_writers.add_document(&doc!(*FIELD=>0u64));
+        fast_field_writers
+            .add_document(&doc!(*FIELD=>0u64))
+            .unwrap();
         for i in 0u64..10_000u64 {
-            fast_field_writers.add_document(&doc!(*FIELD=>5_000_000_000_000_000_000u64 + i));
+            fast_field_writers
+                .add_document(&doc!(*FIELD=>5_000_000_000_000_000_000u64 + i))
+                .unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -347,7 +381,7 @@ mod tests {
         for i in -100i64..10_000i64 {
             let mut doc = Document::default();
             doc.add_i64(i64_field, i);
-            fast_field_writers.add_document(&doc);
+            fast_field_writers.add_document(&doc).unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -368,7 +402,7 @@ mod tests {
         assert_eq!(fast_field_reader.min_value(), -100i64);
         assert_eq!(fast_field_reader.max_value(), 9_999i64);
         for (doc, i) in (-100i64..10_000i64).enumerate() {
-            assert_eq!(fast_field_reader.get_val(doc as u64), i);
+            assert_eq!(fast_field_reader.get_val(doc as u32), i);
         }
         let mut buffer = vec![0i64; 100];
         fast_field_reader.get_range(53, &mut buffer[..]);
@@ -392,7 +426,7 @@ mod tests {
         let mut serializer = CompositeFastFieldSerializer::from_write(write).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(&schema);
         let doc = Document::default();
-        fast_field_writers.add_document(&doc);
+        fast_field_writers.add_document(&doc).unwrap();
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
             .unwrap();
@@ -435,7 +469,7 @@ mod tests {
         let mut serializer = CompositeFastFieldSerializer::from_write(write)?;
         let mut fast_field_writers = FastFieldsWriter::from_schema(&SCHEMA);
         for &x in &permutation {
-            fast_field_writers.add_document(&doc!(*FIELD=>x));
+            fast_field_writers.add_document(&doc!(*FIELD=>x)).unwrap();
         }
         fast_field_writers.serialize(&mut serializer, &HashMap::new(), None)?;
         serializer.close()?;
@@ -450,7 +484,7 @@ mod tests {
         let fast_field_reader = open::<u64>(data)?;

         for a in 0..n {
-            assert_eq!(fast_field_reader.get_val(a as u64), permutation[a as usize]);
+            assert_eq!(fast_field_reader.get_val(a as u32), permutation[a as usize]);
         }
     }
     Ok(())
@@ -785,10 +819,14 @@ mod tests {
         let write: WritePtr = directory.open_write(path).unwrap();
         let mut serializer = CompositeFastFieldSerializer::from_write(write).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(&schema);
-        fast_field_writers.add_document(&doc!(field=>true));
-        fast_field_writers.add_document(&doc!(field=>false));
-        fast_field_writers.add_document(&doc!(field=>true));
-        fast_field_writers.add_document(&doc!(field=>false));
+        fast_field_writers.add_document(&doc!(field=>true)).unwrap();
+        fast_field_writers
+            .add_document(&doc!(field=>false))
+            .unwrap();
+        fast_field_writers.add_document(&doc!(field=>true)).unwrap();
+        fast_field_writers
+            .add_document(&doc!(field=>false))
+            .unwrap();
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
             .unwrap();
@@ -822,8 +860,10 @@ mod tests {
         let mut serializer = CompositeFastFieldSerializer::from_write(write).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(&schema);
         for _ in 0..50 {
-            fast_field_writers.add_document(&doc!(field=>true));
-            fast_field_writers.add_document(&doc!(field=>false));
+            fast_field_writers.add_document(&doc!(field=>true)).unwrap();
+            fast_field_writers
+                .add_document(&doc!(field=>false))
+                .unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -857,7 +897,7 @@ mod tests {
         let mut serializer = CompositeFastFieldSerializer::from_write(write)?;
         let mut fast_field_writers = FastFieldsWriter::from_schema(&schema);
         let doc = Document::default();
-        fast_field_writers.add_document(&doc);
+        fast_field_writers.add_document(&doc).unwrap();
         fast_field_writers.serialize(&mut serializer, &HashMap::new(), None)?;
         serializer.close()?;
     }
@@ -883,7 +923,7 @@ mod tests {
             CompositeFastFieldSerializer::from_write_with_codec(write, codec_types).unwrap();
         let mut fast_field_writers = FastFieldsWriter::from_schema(schema);
         for doc in docs {
-            fast_field_writers.add_document(doc);
+            fast_field_writers.add_document(doc).unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -936,7 +976,7 @@ mod tests {
         let test_fastfield = open::<DateTime>(file.read_bytes()?)?;

         for (i, time) in times.iter().enumerate() {
-            assert_eq!(test_fastfield.get_val(i as u64), time.truncate(precision));
+            assert_eq!(test_fastfield.get_val(i as u32), time.truncate(precision));
         }
         Ok(len)
     }
@@ -3,9 +3,9 @@ mod writer;

 use fastfield_codecs::FastFieldCodecType;

-pub use self::reader::MultiValuedFastFieldReader;
-pub use self::writer::MultiValuedFastFieldWriter;
+pub use self::reader::{MultiValuedFastFieldReader, MultiValuedU128FastFieldReader};
+pub(crate) use self::writer::MultivalueStartIndex;
+pub use self::writer::{MultiValueU128FastFieldWriter, MultiValuedFastFieldWriter};

 /// The valid codecs for multivalue values excludes the linear interpolation codec.
 ///

@@ -515,7 +515,7 @@ mod bench {
             for val in block {
                 doc.add_u64(field, *val);
             }
-            fast_field_writers.add_document(&doc);
+            fast_field_writers.add_document(&doc).unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -573,7 +573,7 @@ mod bench {
             for val in block {
                 doc.add_u64(field, *val);
             }
-            fast_field_writers.add_document(&doc);
+            fast_field_writers.add_document(&doc).unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), None)
@@ -606,7 +606,7 @@ mod bench {
             for val in block {
                 doc.add_u64(field, *val);
             }
-            fast_field_writers.add_document(&doc);
+            fast_field_writers.add_document(&doc).unwrap();
         }
         fast_field_writers
             .serialize(&mut serializer, &HashMap::new(), Some(&doc_id_mapping))
@@ -1,7 +1,7 @@
-use std::ops::Range;
+use std::ops::{Range, RangeInclusive};
 use std::sync::Arc;

-use fastfield_codecs::Column;
+use fastfield_codecs::{Column, MonotonicallyMappableToU128};

 use crate::fastfield::{FastValue, MultiValueLength};
 use crate::DocId;
@@ -33,19 +33,19 @@ impl<Item: FastValue> MultiValuedFastFieldReader<Item> {
     /// Returns `[start, end)`, such that the values associated with
     /// the given document are `start..end`.
     #[inline]
-    fn range(&self, doc: DocId) -> Range<u64> {
-        let idx = doc as u64;
-        let start = self.idx_reader.get_val(idx);
-        let end = self.idx_reader.get_val(idx + 1);
+    fn range(&self, doc: DocId) -> Range<u32> {
+        let start = self.idx_reader.get_val(doc) as u32;
+        let end = self.idx_reader.get_val(doc + 1) as u32;
         start..end
     }

     /// Returns the array of values associated with the given `doc`.
     #[inline]
-    fn get_vals_for_range(&self, range: Range<u64>, vals: &mut Vec<Item>) {
+    fn get_vals_for_range(&self, range: Range<u32>, vals: &mut Vec<Item>) {
         let len = (range.end - range.start) as usize;
         vals.resize(len, Item::make_zero());
-        self.vals_reader.get_range(range.start, &mut vals[..]);
+        self.vals_reader
+            .get_range(range.start as u64, &mut vals[..]);
     }

     /// Returns the array of values associated with the given `doc`.
@@ -88,7 +88,7 @@ impl<Item: FastValue> MultiValuedFastFieldReader<Item> {
 }

 impl<Item: FastValue> MultiValueLength for MultiValuedFastFieldReader<Item> {
-    fn get_range(&self, doc_id: DocId) -> Range<u64> {
+    fn get_range(&self, doc_id: DocId) -> Range<u32> {
         self.range(doc_id)
     }
     fn get_len(&self, doc_id: DocId) -> u64 {
@@ -99,12 +99,183 @@ impl<Item: FastValue> MultiValueLength for MultiValuedFastFieldReader<Item> {
         self.total_num_vals() as u64
     }
 }
+
+/// Reader for a multivalued `u128` fast field.
+///
+/// The reader is implemented as a `u64` fast field for the index and a `u128` fast field.
+///
+/// The `vals_reader` will access the concatenated list of all
+/// values for all reader.
+/// The `idx_reader` associated, for each document, the index of its first value.
+#[derive(Clone)]
+pub struct MultiValuedU128FastFieldReader<T: MonotonicallyMappableToU128> {
+    idx_reader: Arc<dyn Column<u64>>,
+    vals_reader: Arc<dyn Column<T>>,
+}
+
+impl<T: MonotonicallyMappableToU128> MultiValuedU128FastFieldReader<T> {
+    pub(crate) fn open(
+        idx_reader: Arc<dyn Column<u64>>,
+        vals_reader: Arc<dyn Column<T>>,
+    ) -> MultiValuedU128FastFieldReader<T> {
+        Self {
+            idx_reader,
+            vals_reader,
+        }
+    }
+
+    /// Returns `[start, end)`, such that the values associated
+    /// to the given document are `start..end`.
+    #[inline]
+    fn range(&self, doc: DocId) -> Range<u32> {
+        let start = self.idx_reader.get_val(doc) as u32;
+        let end = self.idx_reader.get_val(doc + 1) as u32;
+        start..end
+    }
+
+    /// Returns the array of values associated to the given `doc`.
+    #[inline]
+    pub fn get_first_val(&self, doc: DocId) -> Option<T> {
+        let range = self.range(doc);
+        if range.is_empty() {
+            return None;
+        }
+        Some(self.vals_reader.get_val(range.start))
+    }
+
+    /// Returns the array of values associated to the given `doc`.
+    #[inline]
+    fn get_vals_for_range(&self, range: Range<u32>, vals: &mut Vec<T>) {
+        let len = (range.end - range.start) as usize;
+        vals.resize(len, T::from_u128(0));
+        self.vals_reader
+            .get_range(range.start as u64, &mut vals[..]);
+    }
+
+    /// Returns the array of values associated to the given `doc`.
+    #[inline]
+    pub fn get_vals(&self, doc: DocId, vals: &mut Vec<T>) {
+        let range = self.range(doc);
+        self.get_vals_for_range(range, vals);
+    }
+
+    /// Returns all docids which are in the provided value range
+    pub fn get_positions_for_value_range(
+        &self,
+        value_range: RangeInclusive<T>,
+        doc_id_range: Range<u32>,
+    ) -> Vec<DocId> {
+        let mut positions = Vec::new(); // TODO replace
+        self.vals_reader
+            .get_positions_for_value_range(value_range, doc_id_range, &mut positions);
+
+        positions_to_docids(&positions, self.idx_reader.as_ref())
+    }
+
+    /// Iterates over all elements in the fast field
+    pub fn iter(&self) -> impl Iterator<Item = T> + '_ {
+        self.vals_reader.iter()
+    }
+
+    /// Returns the minimum value for this fast field.
+    ///
+    /// The min value does not take in account of possible
+    /// deleted document, and should be considered as a lower bound
+    /// of the actual mimimum value.
+    pub fn min_value(&self) -> T {
+        self.vals_reader.min_value()
+    }
+
+    /// Returns the maximum value for this fast field.
+    ///
+    /// The max value does not take in account of possible
+    /// deleted document, and should be considered as an upper bound
+    /// of the actual maximum value.
+    pub fn max_value(&self) -> T {
+        self.vals_reader.max_value()
+    }
+
+    /// Returns the number of values associated with the document `DocId`.
+    #[inline]
+    pub fn num_vals(&self, doc: DocId) -> usize {
+        let range = self.range(doc);
+        (range.end - range.start) as usize
+    }
+
+    /// Returns the overall number of values in this field.
+    #[inline]
+    pub fn total_num_vals(&self) -> u64 {
+        self.idx_reader.max_value()
+    }
+}
+
+impl<T: MonotonicallyMappableToU128> MultiValueLength for MultiValuedU128FastFieldReader<T> {
+    fn get_range(&self, doc_id: DocId) -> std::ops::Range<u32> {
+        self.range(doc_id)
+    }
+    fn get_len(&self, doc_id: DocId) -> u64 {
+        self.num_vals(doc_id) as u64
+    }
+    fn get_total_len(&self) -> u64 {
+        self.total_num_vals() as u64
+    }
+}
+
+/// Converts a list of positions of values in a 1:n index to the corresponding list of DocIds.
+///
+/// Since there is no index for value pos -> docid, but docid -> value pos range, we scan the index.
+///
+/// Correctness: positions needs to be sorted. idx_reader needs to contain monotonically increasing
+/// positions.
+///
+/// TODO: Instead of a linear scan we can employ a expotential search into binary search to match a
+/// docid to its value position.
+fn positions_to_docids<C: Column + ?Sized>(positions: &[u32], idx_reader: &C) -> Vec<DocId> {
+    let mut docs = vec![];
+    let mut cur_doc = 0u32;
+    let mut last_doc = None;
+
+    for pos in positions {
+        loop {
+            let end = idx_reader.get_val(cur_doc + 1) as u32;
+            if end > *pos {
+                // avoid duplicates
+                if Some(cur_doc) == last_doc {
+                    break;
+                }
+                docs.push(cur_doc);
+                last_doc = Some(cur_doc);
+                break;
+            }
+            cur_doc += 1;
+        }
+    }
+
+    docs
+}
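The TODO on `positions_to_docids` suggests replacing the linear scan with an exponential search feeding a binary search. One possible shape of that idea, sketched over a plain offset slice standing in for `idx_reader`; this is not the committed implementation:

```rust
// Find the doc d with idx[d] <= pos < idx[d + 1] by galloping forward from
// `lo`, then binary-searching inside the bracketed window.
fn position_to_docid(idx: &[u32], pos: u32, mut lo: usize) -> usize {
    let mut step = 1;
    let mut hi = lo + 1;
    // Exponential phase: widen the window until its end offset exceeds `pos`.
    while hi < idx.len() - 1 && idx[hi + 1] <= pos {
        lo = hi;
        hi = (hi + step).min(idx.len() - 1);
        step *= 2;
    }
    // Binary phase: first doc whose end offset is greater than `pos`.
    while lo < hi {
        let mid = (lo + hi) / 2;
        if idx[mid + 1] > pos { hi = mid } else { lo = mid + 1 }
    }
    lo
}

fn main() {
    let idx = [0u32, 10, 12, 15, 22, 23]; // same offsets as the test below
    assert_eq!(position_to_docid(&idx, 10, 0), 1);
    assert_eq!(position_to_docid(&idx, 15, 0), 3);
    assert_eq!(position_to_docid(&idx, 21, 0), 3);
    assert_eq!(position_to_docid(&idx, 22, 0), 4);
}
```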
 #[cfg(test)]
 mod tests {

+    use fastfield_codecs::VecColumn;
+
     use crate::core::Index;
+    use crate::fastfield::multivalued::reader::positions_to_docids;
     use crate::schema::{Cardinality, Facet, FacetOptions, NumericOptions, Schema};

+    #[test]
+    fn test_positions_to_docid() {
+        let positions = vec![10u32, 11, 15, 20, 21, 22];
+
+        let offsets = vec![0, 10, 12, 15, 22, 23];
+        {
+            let column = VecColumn::from(&offsets);
+
+            let docids = positions_to_docids(&positions, &column);
+            assert_eq!(docids, vec![1, 3, 4]);
+        }
+    }
+
     #[test]
     fn test_multifastfield_reader() -> crate::Result<()> {
         let mut schema_builder = Schema::builder();
@@ -1,9 +1,12 @@
 use std::io;

-use fastfield_codecs::{Column, MonotonicallyMappableToU64, VecColumn};
-use fnv::FnvHashMap;
+use fastfield_codecs::{
+    Column, MonotonicallyMappableToU128, MonotonicallyMappableToU64, VecColumn,
+};
+use rustc_hash::FxHashMap;

 use super::get_fastfield_codecs_for_multivalue;
+use crate::fastfield::writer::unexpected_value;
 use crate::fastfield::{value_to_u64, CompositeFastFieldSerializer, FastFieldType};
 use crate::indexer::doc_id_mapping::DocIdMapping;
 use crate::postings::UnorderedTermId;
@@ -79,11 +82,11 @@ impl MultiValuedFastFieldWriter {

     /// Shift to the next document and adds
     /// all of the matching field values present in the document.
-    pub fn add_document(&mut self, doc: &Document) {
+    pub fn add_document(&mut self, doc: &Document) -> crate::Result<()> {
         self.next_doc();
         // facets/texts are indexed in the `SegmentWriter` as we encode their unordered id.
         if self.fast_field_type.is_storing_term_ids() {
-            return;
+            return Ok(());
         }
         for field_value in doc.field_values() {
             if field_value.field == self.field {
@@ -92,11 +95,12 @@ impl MultiValuedFastFieldWriter {
                     (Some(precision), Value::Date(date_val)) => {
                         date_val.truncate(precision).to_u64()
                     }
-                    _ => value_to_u64(value),
+                    _ => value_to_u64(value)?,
                 };
                 self.add_val(value_u64);
             }
         }
+        Ok(())
     }

     /// Returns an iterator over values per doc_id in ascending doc_id order.
@@ -140,7 +144,7 @@ impl MultiValuedFastFieldWriter {
     pub fn serialize(
         mut self,
         serializer: &mut CompositeFastFieldSerializer,
-        term_mapping_opt: Option<&FnvHashMap<UnorderedTermId, TermOrdinal>>,
+        term_mapping_opt: Option<&FxHashMap<UnorderedTermId, TermOrdinal>>,
         doc_id_map: Option<&DocIdMapping>,
     ) -> io::Result<()> {
         {
@@ -215,7 +219,7 @@ pub(crate) struct MultivalueStartIndex<'a, C: Column> {

 impl<'a, C: Column> MultivalueStartIndex<'a, C> {
     pub fn new(column: &'a C, doc_id_map: &'a DocIdMapping) -> Self {
-        assert_eq!(column.num_vals(), doc_id_map.num_old_doc_ids() as u64 + 1);
+        assert_eq!(column.num_vals(), doc_id_map.num_old_doc_ids() as u32 + 1);
         let (min, max) =
             tantivy_bitpacker::minmax(iter_remapped_multivalue_index(doc_id_map, column))
                 .unwrap_or((0u64, 0u64));
@@ -228,7 +232,7 @@ impl<'a, C: Column> MultivalueStartIndex<'a, C> {
     }
 }
 impl<'a, C: Column> Column for MultivalueStartIndex<'a, C> {
-    fn get_val(&self, _idx: u64) -> u64 {
+    fn get_val(&self, _idx: u32) -> u64 {
         unimplemented!()
     }

@@ -240,8 +244,8 @@ impl<'a, C: Column> Column for MultivalueStartIndex<'a, C> {
         self.max
     }

-    fn num_vals(&self) -> u64 {
-        (self.doc_id_map.num_new_doc_ids() + 1) as u64
+    fn num_vals(&self) -> u32 {
+        (self.doc_id_map.num_new_doc_ids() + 1) as u32
     }

     fn iter(&self) -> Box<dyn Iterator<Item = u64> + '_> {
@@ -258,12 +262,150 @@ fn iter_remapped_multivalue_index<'a, C: Column>(
 ) -> impl Iterator<Item = u64> + 'a {
     let mut offset = 0;
     std::iter::once(0).chain(doc_id_map.iter_old_doc_ids().map(move |old_doc| {
-        let num_vals_for_doc = column.get_val(old_doc as u64 + 1) - column.get_val(old_doc as u64);
+        let num_vals_for_doc = column.get_val(old_doc + 1) - column.get_val(old_doc);
         offset += num_vals_for_doc;
         offset as u64
     }))
 }

 /// Writer for multi-valued (as in, more than one value per document)
 /// int fast field.
 ///
 /// This `Writer` is only useful for advanced users.
 /// The normal way to get your multivalued int in your index
 /// is to
 /// - declare your field with fast set to `Cardinality::MultiValues`
 /// in your schema
 /// - add your document simply by calling `.add_document(...)`.
 ///
 /// The `MultiValuedFastFieldWriter` can be acquired from the

+pub struct MultiValueU128FastFieldWriter {
+    field: Field,
+    vals: Vec<u128>,
+    doc_index: Vec<u64>,
+}
+
+impl MultiValueU128FastFieldWriter {
+    /// Creates a new `U128MultiValueFastFieldWriter`
+    pub(crate) fn new(field: Field) -> Self {
+        MultiValueU128FastFieldWriter {
+            field,
+            vals: Vec::new(),
+            doc_index: Vec::new(),
+        }
+    }
+
+    /// The memory used (inclusive childs)
+    pub fn mem_usage(&self) -> usize {
+        self.vals.capacity() * std::mem::size_of::<UnorderedTermId>()
+            + self.doc_index.capacity() * std::mem::size_of::<u64>()
+    }
+
+    /// Finalize the current document.
+    pub(crate) fn next_doc(&mut self) {
+        self.doc_index.push(self.vals.len() as u64);
+    }
+
+    /// Pushes a new value to the current document.
+    pub(crate) fn add_val(&mut self, val: u128) {
+        self.vals.push(val);
+    }
+
+    /// Shift to the next document and adds
+    /// all of the matching field values present in the document.
+    pub fn add_document(&mut self, doc: &Document) -> crate::Result<()> {
+        self.next_doc();
+        for field_value in doc.field_values() {
+            if field_value.field == self.field {
+                let value = field_value.value();
+                let ip_addr = value
+                    .as_ip_addr()
+                    .ok_or_else(|| unexpected_value("ip", value))?;
+                let ip_addr_u128 = ip_addr.to_u128();
+                self.add_val(ip_addr_u128);
+            }
+        }
+        Ok(())
+    }
+
+    /// Returns an iterator over values per doc_id in ascending doc_id order.
+    ///
+    /// Normally the order is simply iterating self.doc_id_index.
+    /// With doc_id_map it accounts for the new mapping, returning values in the order of the
+    /// new doc_ids.
+    fn get_ordered_values<'a: 'b, 'b>(
+        &'a self,
+        doc_id_map: Option<&'b DocIdMapping>,
+    ) -> impl Iterator<Item = &'b [u128]> {
+        get_ordered_values(&self.vals, &self.doc_index, doc_id_map)
+    }
+
+    /// Serializes fast field values.
+    pub fn serialize(
+        mut self,
+        serializer: &mut CompositeFastFieldSerializer,
+        doc_id_map: Option<&DocIdMapping>,
+    ) -> io::Result<()> {
+        {
+            // writing the offset index
+            //
+            self.doc_index.push(self.vals.len() as u64);
+            let col = VecColumn::from(&self.doc_index[..]);
+            if let Some(doc_id_map) = doc_id_map {
+                let multi_value_start_index = MultivalueStartIndex::new(&col, doc_id_map);
+                serializer.create_auto_detect_u64_fast_field_with_idx(
+                    self.field,
+                    multi_value_start_index,
+                    0,
+                )?;
+            } else {
+                serializer.create_auto_detect_u64_fast_field_with_idx(self.field, col, 0)?;
+            }
+        }
+        {
+            let iter_gen = || self.get_ordered_values(doc_id_map).flatten().cloned();

+            serializer.create_u128_fast_field_with_idx(
+                self.field,
+                iter_gen,
+                self.vals.len() as u32,
+                1,
+            )?;
+        }
+        Ok(())
+    }
+}
+
+/// Returns an iterator over values per doc_id in ascending doc_id order.
+///
+/// Normally the order is simply iterating self.doc_id_index.
+/// With doc_id_map it accounts for the new mapping, returning values in the order of the
+/// new doc_ids.
+fn get_ordered_values<'a: 'b, 'b, T>(
+    vals: &'a [T],
+    doc_index: &'a [u64],
+    doc_id_map: Option<&'b DocIdMapping>,
+) -> impl Iterator<Item = &'b [T]> {
+    let doc_id_iter: Box<dyn Iterator<Item = u32>> = if let Some(doc_id_map) = doc_id_map {
+        Box::new(doc_id_map.iter_old_doc_ids())
+    } else {
+        let max_doc = doc_index.len() as DocId;
+        Box::new(0..max_doc)
+    };
+    doc_id_iter.map(move |doc_id| get_values_for_doc_id(doc_id, vals, doc_index))
+}
+
+/// returns all values for a doc_id
+fn get_values_for_doc_id<'a, T>(doc_id: u32, vals: &'a [T], doc_index: &'a [u64]) -> &'a [T] {
+    let start_pos = doc_index[doc_id as usize] as usize;
+    let end_pos = doc_index
+        .get(doc_id as usize + 1)
+        .cloned()
+        .unwrap_or(vals.len() as u64) as usize; // special case, last doc_id has no offset information
+    &vals[start_pos..end_pos]
+}

 #[cfg(test)]
 mod tests {
     use super::*;
@@ -1,7 +1,9 @@
+use std::net::Ipv6Addr;
 use std::sync::Arc;

-use fastfield_codecs::{open, Column};
+use fastfield_codecs::{open, open_u128, Column};

+use super::multivalued::MultiValuedU128FastFieldReader;
 use crate::directory::{CompositeFile, FileSlice};
 use crate::fastfield::{
     BytesFastFieldReader, FastFieldNotAvailableError, FastValue, MultiValuedFastFieldReader,
@@ -23,6 +25,7 @@ pub struct FastFieldReaders {
 pub(crate) enum FastType {
     I64,
     U64,
+    U128,
     F64,
     Bool,
     Date,
@@ -49,6 +52,9 @@ pub(crate) fn type_and_cardinality(field_type: &FieldType) -> Option<(FastType,
         FieldType::Str(options) if options.is_fast() => {
             Some((FastType::U64, Cardinality::MultiValues))
         }
+        FieldType::IpAddr(options) => options
+            .get_fastfield_cardinality()
+            .map(|cardinality| (FastType::U128, cardinality)),
         _ => None,
     }
 }
@@ -143,6 +149,59 @@ impl FastFieldReaders {
         self.typed_fast_field_reader(field)
     }

+    /// Returns the `ip` fast field reader reader associated to `field`.
+    ///
+    /// If `field` is not a u128 fast field, this method returns an Error.
+    pub fn ip_addr(&self, field: Field) -> crate::Result<Arc<dyn Column<Ipv6Addr>>> {
+        self.check_type(field, FastType::U128, Cardinality::SingleValue)?;
+        let bytes = self.fast_field_data(field, 0)?.read_bytes()?;
+        Ok(open_u128::<Ipv6Addr>(bytes)?)
+    }
+
+    /// Returns the `ip` fast field reader reader associated to `field`.
+    ///
+    /// If `field` is not a u128 fast field, this method returns an Error.
+    pub fn ip_addrs(
+        &self,
+        field: Field,
+    ) -> crate::Result<MultiValuedU128FastFieldReader<Ipv6Addr>> {
+        self.check_type(field, FastType::U128, Cardinality::MultiValues)?;
+        let idx_reader: Arc<dyn Column<u64>> = self.typed_fast_field_reader(field)?;
+
+        let bytes = self.fast_field_data(field, 1)?.read_bytes()?;
+        let vals_reader = open_u128::<Ipv6Addr>(bytes)?;
+
+        Ok(MultiValuedU128FastFieldReader::open(
+            idx_reader,
+            vals_reader,
+        ))
+    }
+
+    /// Returns the `u128` fast field reader reader associated to `field`.
+    ///
+    /// If `field` is not a u128 fast field, this method returns an Error.
+    pub(crate) fn u128(&self, field: Field) -> crate::Result<Arc<dyn Column<u128>>> {
+        self.check_type(field, FastType::U128, Cardinality::SingleValue)?;
+        let bytes = self.fast_field_data(field, 0)?.read_bytes()?;
+        Ok(open_u128::<u128>(bytes)?)
+    }
+
+    /// Returns the `u128` multi-valued fast field reader reader associated to `field`.
+    ///
+    /// If `field` is not a u128 multi-valued fast field, this method returns an Error.
+    pub fn u128s(&self, field: Field) -> crate::Result<MultiValuedU128FastFieldReader<u128>> {
+        self.check_type(field, FastType::U128, Cardinality::MultiValues)?;
+        let idx_reader: Arc<dyn Column<u64>> = self.typed_fast_field_reader(field)?;
+
+        let bytes = self.fast_field_data(field, 1)?.read_bytes()?;
+        let vals_reader = open_u128::<u128>(bytes)?;
+
+        Ok(MultiValuedU128FastFieldReader::open(
+            idx_reader,
+            vals_reader,
+        ))
+    }
+
     /// Returns the `u64` fast field reader reader associated with `field`, regardless of whether
     /// the given field is effectively of type `u64` or not.
     ///
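A consumption sketch for the new `ip_addrs` reader added above. The reader API is used as it appears in this diff, but the helper itself is illustrative:

```rust
use std::net::Ipv6Addr;
use tantivy::schema::Field;
use tantivy::{DocId, Searcher};

// Illustrative helper: collect all ip values of one doc in one segment.
fn ips_of_doc(
    searcher: &Searcher,
    field: Field,
    segment_ord: usize,
    doc: DocId,
) -> tantivy::Result<Vec<Ipv6Addr>> {
    let reader = searcher.segment_readers()[segment_ord]
        .fast_fields()
        .ip_addrs(field)?;
    let mut vals = Vec::new();
    reader.get_vals(doc, &mut vals);
    Ok(vals)
}
```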
@@ -84,6 +84,21 @@ impl CompositeFastFieldSerializer {
         Ok(())
     }

+    /// Serialize data into a new u128 fast field. The codec will be compact space compressor,
+    /// which is optimized for scanning the fast field for a given range.
+    pub fn create_u128_fast_field_with_idx<F: Fn() -> I, I: Iterator<Item = u128>>(
+        &mut self,
+        field: Field,
+        iter_gen: F,
+        num_vals: u32,
+        idx: usize,
+    ) -> io::Result<()> {
+        let field_write = self.composite_write.for_field_with_idx(field, idx);
+        fastfield_codecs::serialize_u128(iter_gen, num_vals, field_write)?;
+
+        Ok(())
+    }
+
     /// Start serializing a new [u8] fast field. Use the returned writer to write data into the
     /// bytes field. To associate the bytes with documents a seperate index must be created on
     /// index 0. See bytes/writer.rs::serialize for an example.
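`create_u128_fast_field_with_idx` takes `iter_gen`, a closure producing a fresh iterator, rather than a single iterator: the u128 codec needs more than one pass over the values (for instance, to learn its compression model before encoding). A minimal stand-in for that calling convention:

```rust
// Two passes over the same values: the first learns a model (here, the min),
// the second encodes relative to it. Only a Fn() -> Iterator makes this possible.
fn two_pass_encode<F, I>(iter_gen: F) -> (u128, Vec<u128>)
where
    F: Fn() -> I,
    I: Iterator<Item = u128>,
{
    let min = iter_gen().min().unwrap_or(0); // pass 1: learn the model
    let deltas = iter_gen().map(|v| v - min).collect(); // pass 2: encode
    (min, deltas)
}

fn main() {
    let vals = [100u128, 103, 101];
    assert_eq!(two_pass_encode(|| vals.iter().copied()), (100, vec![0, 3, 1]));
}
```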
@@ -2,11 +2,11 @@ use std::collections::HashMap;
 use std::io;

 use common;
-use fastfield_codecs::{Column, MonotonicallyMappableToU64};
-use fnv::FnvHashMap;
+use fastfield_codecs::{Column, MonotonicallyMappableToU128, MonotonicallyMappableToU64};
+use rustc_hash::FxHashMap;
 use tantivy_bitpacker::BlockedBitpacker;

-use super::multivalued::MultiValuedFastFieldWriter;
+use super::multivalued::{MultiValueU128FastFieldWriter, MultiValuedFastFieldWriter};
 use super::FastFieldType;
 use crate::fastfield::{BytesFastFieldWriter, CompositeFastFieldSerializer};
 use crate::indexer::doc_id_mapping::DocIdMapping;
@@ -19,10 +19,19 @@ use crate::DatePrecision;
 pub struct FastFieldsWriter {
     term_id_writers: Vec<MultiValuedFastFieldWriter>,
     single_value_writers: Vec<IntFastFieldWriter>,
+    u128_value_writers: Vec<U128FastFieldWriter>,
+    u128_multi_value_writers: Vec<MultiValueU128FastFieldWriter>,
     multi_values_writers: Vec<MultiValuedFastFieldWriter>,
     bytes_value_writers: Vec<BytesFastFieldWriter>,
 }

+pub(crate) fn unexpected_value(expected: &str, actual: &Value) -> crate::TantivyError {
+    crate::TantivyError::SchemaError(format!(
+        "Expected a {:?} in fast field, but got {:?}",
+        expected, actual
+    ))
+}
+
 fn fast_field_default_value(field_entry: &FieldEntry) -> u64 {
     match *field_entry.field_type() {
         FieldType::I64(_) | FieldType::Date(_) => common::i64_to_u64(0i64),
@@ -34,6 +43,8 @@ fn fast_field_default_value(field_entry: &FieldEntry) -> u64 {
 impl FastFieldsWriter {
     /// Create all `FastFieldWriter` required by the schema.
     pub fn from_schema(schema: &Schema) -> FastFieldsWriter {
+        let mut u128_value_writers = Vec::new();
+        let mut u128_multi_value_writers = Vec::new();
         let mut single_value_writers = Vec::new();
         let mut term_id_writers = Vec::new();
         let mut multi_values_writers = Vec::new();
@@ -97,10 +108,27 @@ impl FastFieldsWriter {
                         bytes_value_writers.push(fast_field_writer);
                     }
                 }
+                FieldType::IpAddr(opt) => {
+                    if opt.is_fast() {
+                        match opt.get_fastfield_cardinality() {
+                            Some(Cardinality::SingleValue) => {
+                                let fast_field_writer = U128FastFieldWriter::new(field);
+                                u128_value_writers.push(fast_field_writer);
+                            }
+                            Some(Cardinality::MultiValues) => {
+                                let fast_field_writer = MultiValueU128FastFieldWriter::new(field);
+                                u128_multi_value_writers.push(fast_field_writer);
+                            }
+                            None => {}
+                        }
+                    }
+                }
                 FieldType::Str(_) | FieldType::JsonObject(_) => {}
             }
         }
         FastFieldsWriter {
+            u128_value_writers,
+            u128_multi_value_writers,
             term_id_writers,
             single_value_writers,
             multi_values_writers,
@@ -129,6 +157,16 @@ impl FastFieldsWriter {
             .iter()
             .map(|w| w.mem_usage())
             .sum::<usize>()
+            + self
+                .u128_value_writers
+                .iter()
+                .map(|w| w.mem_usage())
+                .sum::<usize>()
+            + self
+                .u128_multi_value_writers
+                .iter()
+                .map(|w| w.mem_usage())
+                .sum::<usize>()
     }

     /// Get the `FastFieldWriter` associated with a field.
@@ -190,21 +228,27 @@ impl FastFieldsWriter {
             .iter_mut()
             .find(|field_writer| field_writer.field() == field)
     }

     /// Indexes all of the fastfields of a new document.
-    pub fn add_document(&mut self, doc: &Document) {
+    pub fn add_document(&mut self, doc: &Document) -> crate::Result<()> {
         for field_writer in &mut self.term_id_writers {
-            field_writer.add_document(doc);
+            field_writer.add_document(doc)?;
         }
         for field_writer in &mut self.single_value_writers {
-            field_writer.add_document(doc);
+            field_writer.add_document(doc)?;
         }
         for field_writer in &mut self.multi_values_writers {
-            field_writer.add_document(doc);
+            field_writer.add_document(doc)?;
        }
         for field_writer in &mut self.bytes_value_writers {
-            field_writer.add_document(doc);
+            field_writer.add_document(doc)?;
         }
+        for field_writer in &mut self.u128_value_writers {
+            field_writer.add_document(doc)?;
+        }
+        for field_writer in &mut self.u128_multi_value_writers {
+            field_writer.add_document(doc)?;
+        }
+        Ok(())
     }

     /// Serializes all of the `FastFieldWriter`s by pushing them in
@@ -212,7 +256,7 @@ impl FastFieldsWriter {
     pub fn serialize(
         self,
         serializer: &mut CompositeFastFieldSerializer,
-        mapping: &HashMap<Field, FnvHashMap<UnorderedTermId, TermOrdinal>>,
+        mapping: &HashMap<Field, FxHashMap<UnorderedTermId, TermOrdinal>>,
         doc_id_map: Option<&DocIdMapping>,
     ) -> io::Result<()> {
         for field_writer in self.term_id_writers {
@@ -230,6 +274,108 @@ impl FastFieldsWriter {
         for field_writer in self.bytes_value_writers {
             field_writer.serialize(serializer, doc_id_map)?;
         }
+        for field_writer in self.u128_value_writers {
+            field_writer.serialize(serializer, doc_id_map)?;
+        }
+        for field_writer in self.u128_multi_value_writers {
+            field_writer.serialize(serializer, doc_id_map)?;
+        }

         Ok(())
     }
 }

+/// Fast field writer for u128 values.
+/// The fast field writer just keeps the values in memory.
+///
+/// Only when the segment writer can be closed and
+/// persisted on disk, the fast field writer is
+/// sent to a `FastFieldSerializer` via the `.serialize(...)`
+/// method.
+///
+/// We cannot serialize earlier as the values are
+/// compressed to a compact number space and the number of
+/// bits required for bitpacking can only been known once
+/// we have seen all of the values.
+pub struct U128FastFieldWriter {
+    field: Field,
+    vals: Vec<u128>,
+    val_count: u32,
+}
+
+impl U128FastFieldWriter {
+    /// Creates a new `IntFastFieldWriter`
+    pub fn new(field: Field) -> Self {
+        Self {
+            field,
+            vals: vec![],
+            val_count: 0,
+        }
+    }
+
+    /// The memory used (inclusive childs)
+    pub fn mem_usage(&self) -> usize {
+        self.vals.len() * 16
+    }
+
+    /// Records a new value.
+    ///
+    /// The n-th value being recorded is implicitely
+    /// associated to the document with the `DocId` n.
+    /// (Well, `n-1` actually because of 0-indexing)
+    pub fn add_val(&mut self, val: u128) {
+        self.vals.push(val);
+    }
+
+    /// Extract the fast field value from the document
+    /// (or use the default value) and records it.
+    ///
+    /// Extract the value associated to the fast field for
+    /// this document.
+    pub fn add_document(&mut self, doc: &Document) -> crate::Result<()> {
+        match doc.get_first(self.field) {
+            Some(v) => {
+                let ip_addr = v.as_ip_addr().ok_or_else(|| unexpected_value("ip", v))?;
+                let value = ip_addr.to_u128();
+                self.add_val(value);
+            }
+            None => {
+                self.add_val(0); // TODO fix null handling
+            }
+        };
+        self.val_count += 1;
+        Ok(())
+    }
+
+    /// Push the fast fields value to the `FastFieldWriter`.
+    pub fn serialize(
+        &self,
+        serializer: &mut CompositeFastFieldSerializer,
+        doc_id_map: Option<&DocIdMapping>,
+    ) -> io::Result<()> {
+        if let Some(doc_id_map) = doc_id_map {
+            let iter_gen = || {
+                doc_id_map
+                    .iter_old_doc_ids()
+                    .map(|idx| self.vals[idx as usize])
+            };
+
+            serializer.create_u128_fast_field_with_idx(
+                self.field,
+                iter_gen,
+                self.val_count as u32,
+                0,
+            )?;
+        } else {
+            let iter_gen = || self.vals.iter().cloned();
+            serializer.create_u128_fast_field_with_idx(
+                self.field,
+                iter_gen,
+                self.val_count as u32,
+                0,
+            )?;
+        }

+        Ok(())
+    }
+}
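`U128FastFieldWriter::serialize` above re-emits values in the new doc-id order by walking `iter_old_doc_ids()`. The remapping in isolation, over plain slices:

```rust
// Values are emitted in new doc-id order by indexing the original values
// with each old doc id, in the order the mapping dictates.
fn remap<T: Copy>(vals: &[T], old_doc_ids_in_new_order: &[u32]) -> Vec<T> {
    old_doc_ids_in_new_order
        .iter()
        .map(|&old| vals[old as usize])
        .collect()
}

fn main() {
    let vals = ["a", "b", "c"];
    assert_eq!(remap(&vals, &[2, 0, 1]), vec!["c", "a", "b"]);
}
```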
@@ -238,7 +384,7 @@ impl FastFieldsWriter {
 /// The fast field writer just keeps the values in memory.
 ///
 /// Only when the segment writer can be closed and
-/// persisted on disc, the fast field writer is
+/// persisted on disk, the fast field writer is
 /// sent to a `FastFieldSerializer` via the `.serialize(...)`
 /// method.
 ///
@@ -325,14 +471,14 @@ impl IntFastFieldWriter {
     /// only the first one is taken in account.
     ///
     /// Values on text fast fields are skipped.
-    pub fn add_document(&mut self, doc: &Document) {
+    pub fn add_document(&mut self, doc: &Document) -> crate::Result<()> {
         match doc.get_first(self.field) {
             Some(v) => {
                 let value = match (self.precision_opt, v) {
                     (Some(precision), Value::Date(date_val)) => {
                         date_val.truncate(precision).to_u64()
                     }
-                    _ => super::value_to_u64(v),
+                    _ => super::value_to_u64(v)?,
                 };
                 self.add_val(value);
             }
@@ -340,6 +486,7 @@ impl IntFastFieldWriter {
                 self.add_val(self.val_if_missing);
             }
         };
+        Ok(())
     }

     /// get iterator over the data
@@ -364,7 +511,7 @@ impl IntFastFieldWriter {
             vals: &self.vals,
             min_value: min,
             max_value: max,
-            num_vals: self.val_count as u64,
+            num_vals: self.val_count as u32,
         };

         serializer.create_auto_detect_u64_fast_field(self.field, fastfield_accessor)?;
@@ -379,7 +526,7 @@ struct WriterFastFieldAccessProvider<'map, 'bitp> {
     vals: &'bitp BlockedBitpacker,
     min_value: u64,
     max_value: u64,
-    num_vals: u64,
+    num_vals: u32,
 }

 impl<'map, 'bitp> Column for WriterFastFieldAccessProvider<'map, 'bitp> {
@@ -391,7 +538,7 @@ impl<'map, 'bitp> Column for WriterFastFieldAccessProvider<'map, 'bitp> {
     /// # Panics
     ///
     /// May panic if `doc` is greater than the index.
-    fn get_val(&self, _doc: u64) -> u64 {
+    fn get_val(&self, _doc: u32) -> u64 {
         unimplemented!()
     }

@@ -415,7 +562,7 @@ impl<'map, 'bitp> Column for WriterFastFieldAccessProvider<'map, 'bitp> {
         self.max_value
     }

-    fn num_vals(&self) -> u64 {
+    fn num_vals(&self) -> u32 {
         self.num_vals
     }
 }
@@ -803,7 +803,9 @@ impl Drop for IndexWriter {
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use std::collections::{HashMap, HashSet};
|
||||
use std::net::Ipv6Addr;
|
||||
|
||||
use fastfield_codecs::MonotonicallyMappableToU128;
|
||||
use proptest::prelude::*;
|
||||
use proptest::prop_oneof;
|
||||
use proptest::strategy::Strategy;
|
||||
@@ -815,11 +817,13 @@ mod tests {
|
||||
use crate::indexer::NoMergePolicy;
|
||||
use crate::query::{BooleanQuery, Occur, Query, QueryParser, TermQuery};
|
||||
use crate::schema::{
|
||||
self, Cardinality, Facet, FacetOptions, IndexRecordOption, NumericOptions,
|
||||
self, Cardinality, Facet, FacetOptions, IndexRecordOption, IpAddrOptions, NumericOptions,
|
||||
TextFieldIndexing, TextOptions, FAST, INDEXED, STORED, STRING, TEXT,
|
||||
};
|
||||
use crate::store::DOCSTORE_CACHE_CAPACITY;
|
||||
use crate::{DocAddress, Index, IndexSettings, IndexSortByField, Order, ReloadPolicy, Term};
|
||||
use crate::{
|
||||
DateTime, DocAddress, Index, IndexSettings, IndexSortByField, Order, ReloadPolicy, Term,
|
||||
};
|
||||
|
||||
const LOREM: &str = "Doc Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do \
|
||||
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad \
|
||||
@@ -1468,7 +1472,7 @@ mod tests {
|
||||
let fast_field_reader = segment_reader.fast_fields().u64(id_field)?;
|
||||
let in_order_alive_ids: Vec<u64> = segment_reader
|
||||
.doc_ids_alive()
|
||||
.map(|doc| fast_field_reader.get_val(doc as u64))
|
||||
.map(|doc| fast_field_reader.get_val(doc))
|
||||
.collect();
|
||||
assert_eq!(&in_order_alive_ids[..], &[9, 8, 7, 6, 5, 4, 1, 0]);
|
||||
Ok(())
|
||||
@@ -1529,7 +1533,7 @@ mod tests {
|
||||
let fast_field_reader = segment_reader.fast_fields().u64(id_field)?;
|
||||
let in_order_alive_ids: Vec<u64> = segment_reader
|
||||
.doc_ids_alive()
|
||||
.map(|doc| fast_field_reader.get_val(doc as u64))
|
||||
.map(|doc| fast_field_reader.get_val(doc))
|
||||
.collect();
|
||||
assert_eq!(&in_order_alive_ids[..], &[9, 8, 7, 6, 5, 4, 2, 0]);
|
||||
Ok(())
|
||||
@@ -1593,7 +1597,15 @@ mod tests {
|
||||
force_end_merge: bool,
|
||||
) -> crate::Result<()> {
|
||||
let mut schema_builder = schema::Schema::builder();
|
||||
let ip_field = schema_builder.add_ip_addr_field("ip", FAST | INDEXED | STORED);
|
||||
let ips_field = schema_builder.add_ip_addr_field(
|
||||
"ips",
|
||||
IpAddrOptions::default().set_fast(Cardinality::MultiValues),
|
||||
);
|
||||
let id_field = schema_builder.add_u64_field("id", FAST | INDEXED | STORED);
|
||||
let i64_field = schema_builder.add_i64_field("i64", INDEXED);
|
||||
let f64_field = schema_builder.add_f64_field("f64", INDEXED);
|
||||
let date_field = schema_builder.add_date_field("date", INDEXED);
|
||||
let bytes_field = schema_builder.add_bytes_field("bytes", FAST | INDEXED | STORED);
|
||||
let bool_field = schema_builder.add_bool_field("bool", FAST | INDEXED | STORED);
|
||||
let text_field = schema_builder.add_text_field(
|
||||
@@ -1607,6 +1619,7 @@ mod tests {
|
||||
);
|
||||
|
||||
let large_text_field = schema_builder.add_text_field("large_text_field", TEXT | STORED);
|
||||
let multi_text_fields = schema_builder.add_text_field("multi_text_fields", TEXT | STORED);
|
||||
|
||||
let multi_numbers = schema_builder.add_u64_field(
|
||||
"multi_numbers",
|
||||
@@ -1644,21 +1657,61 @@ mod tests {
|
||||
|
||||
let old_reader = index.reader()?;
|
||||
|
||||
let ip_exists = |id| id % 3 != 0; // ids divisible by 3 (e.g. id 0) carry no ip field

let multi_text_field_text1 = "test1 test2 test3 test1 test2 test3";
// rotate left
let multi_text_field_text2 = "test2 test3 test1 test2 test3 test1";
// rotate right
let multi_text_field_text3 = "test3 test1 test2 test3 test1 test2";

for &op in ops {
match op {
IndexingOp::AddDoc { id } => {
let facet = Facet::from(&("/cola/".to_string() + &id.to_string()));
index_writer.add_document(doc!(id_field=>id,
bytes_field => id.to_le_bytes().as_slice(),
multi_numbers=> id,
multi_numbers => id,
bool_field => (id % 2u64) != 0,
multi_bools => (id % 2u64) != 0,
multi_bools => (id % 2u64) == 0,
text_field => id.to_string(),
facet_field => facet,
large_text_field=> LOREM
))?;
let ip_from_id = Ipv6Addr::from_u128(id as u128);

if !ip_exists(id) {
// every 3rd doc has no ip field
index_writer.add_document(doc!(id_field=>id,
bytes_field => id.to_le_bytes().as_slice(),
multi_numbers=> id,
multi_numbers => id,
bool_field => (id % 2u64) != 0,
i64_field => id as i64,
f64_field => id as f64,
date_field => DateTime::from_timestamp_secs(id as i64),
multi_bools => (id % 2u64) != 0,
multi_bools => (id % 2u64) == 0,
text_field => id.to_string(),
facet_field => facet,
large_text_field => LOREM,
multi_text_fields => multi_text_field_text1,
multi_text_fields => multi_text_field_text2,
multi_text_fields => multi_text_field_text3,
))?;
} else {
index_writer.add_document(doc!(id_field=>id,
bytes_field => id.to_le_bytes().as_slice(),
ip_field => ip_from_id,
ips_field => ip_from_id,
ips_field => ip_from_id,
multi_numbers=> id,
multi_numbers => id,
bool_field => (id % 2u64) != 0,
i64_field => id as i64,
f64_field => id as f64,
date_field => DateTime::from_timestamp_secs(id as i64),
multi_bools => (id % 2u64) != 0,
multi_bools => (id % 2u64) == 0,
text_field => id.to_string(),
facet_field => facet,
large_text_field => LOREM,
multi_text_fields => multi_text_field_text1,
multi_text_fields => multi_text_field_text2,
multi_text_fields => multi_text_field_text3,
))?;
}
}
IndexingOp::DeleteDoc { id } => {
index_writer.delete_term(Term::from_field_u64(id_field, id));
@@ -1707,7 +1760,7 @@ mod tests {
let ff_reader = segment_reader.fast_fields().u64(id_field).unwrap();
segment_reader
.doc_ids_alive()
.map(move |doc| ff_reader.get_val(doc as u64))
.map(move |doc| ff_reader.get_val(doc))
})
.collect();

@@ -1718,7 +1771,7 @@ mod tests {
let ff_reader = segment_reader.fast_fields().u64(id_field).unwrap();
segment_reader
.doc_ids_alive()
.map(move |doc| ff_reader.get_val(doc as u64))
.map(move |doc| ff_reader.get_val(doc))
})
.collect();

@@ -1744,6 +1797,60 @@ mod tests {
.collect::<HashSet<_>>()
);

// Load all ip addrs
let ips: HashSet<Ipv6Addr> = searcher
.segment_readers()
.iter()
.flat_map(|segment_reader| {
let ff_reader = segment_reader.fast_fields().ip_addr(ip_field).unwrap();
segment_reader.doc_ids_alive().flat_map(move |doc| {
let val = ff_reader.get_val(doc);
if val == Ipv6Addr::from_u128(0) {
// TODO Fix null handling
None
} else {
Some(val)
}
})
})
.collect();

let expected_ips = expected_ids_and_num_occurrences
.keys()
.flat_map(|id| {
if !ip_exists(*id) {
None
} else {
Some(Ipv6Addr::from_u128(*id as u128))
}
})
.collect::<HashSet<_>>();
assert_eq!(ips, expected_ips);

let expected_ips = expected_ids_and_num_occurrences
.keys()
.filter_map(|id| {
if !ip_exists(*id) {
None
} else {
Some(Ipv6Addr::from_u128(*id as u128))
}
})
.collect::<HashSet<_>>();
let ips: HashSet<Ipv6Addr> = searcher
.segment_readers()
.iter()
.flat_map(|segment_reader| {
let ff_reader = segment_reader.fast_fields().ip_addrs(ips_field).unwrap();
segment_reader.doc_ids_alive().flat_map(move |doc| {
let mut vals = vec![];
ff_reader.get_vals(doc, &mut vals);
vals.into_iter().filter(|val| val.to_u128() != 0) // TODO Fix null handling
})
})
.collect();
assert_eq!(ips, expected_ips);

// multivalue fast field tests
for segment_reader in searcher.segment_readers().iter() {
let id_reader = segment_reader.fast_fields().u64(id_field).unwrap();
@@ -1754,7 +1861,7 @@ mod tests {
ff_reader.get_vals(doc, &mut vals);
assert_eq!(vals.len(), 2);
assert_eq!(vals[0], vals[1]);
assert_eq!(id_reader.get_val(doc as u64), vals[0]);
assert_eq!(id_reader.get_val(doc), vals[0]);

let mut bool_vals = vec![];
bool_ff_reader.get_vals(doc, &mut bool_vals);
@@ -1808,10 +1915,8 @@ mod tests {
}
}
// test search
let my_text_field = index.schema().get_field("text_field").unwrap();

let do_search = |term: &str| {
let query = QueryParser::for_index(&index, vec![my_text_field])
let do_search = |term: &str, field| {
let query = QueryParser::for_index(&index, vec![field])
.parse_query(term)
.unwrap();
let top_docs: Vec<(f32, DocAddress)> =
@@ -1820,11 +1925,80 @@ mod tests {
top_docs.iter().map(|el| el.1).collect::<Vec<_>>()
};

for (existing_id, count) in expected_ids_and_num_occurrences {
assert_eq!(do_search(&existing_id.to_string()).len() as u64, count);
let do_search2 = |term: Term| {
let query = TermQuery::new(term, IndexRecordOption::Basic);
let top_docs: Vec<(f32, DocAddress)> =
searcher.search(&query, &TopDocs::with_limit(1000)).unwrap();

top_docs.iter().map(|el| el.1).collect::<Vec<_>>()
};

for (existing_id, count) in &expected_ids_and_num_occurrences {
let (existing_id, count) = (*existing_id, *count);
let get_num_hits = |field| do_search(&existing_id.to_string(), field).len() as u64;
assert_eq!(get_num_hits(text_field), count);
assert_eq!(get_num_hits(i64_field), count);
assert_eq!(get_num_hits(f64_field), count);
assert_eq!(get_num_hits(id_field), count);

// Test multi text
assert_eq!(
do_search("\"test1 test2\"", multi_text_fields).len(),
num_docs_expected
);
assert_eq!(
do_search("\"test2 test3\"", multi_text_fields).len(),
num_docs_expected
);

// Test bytes
let term = Term::from_field_bytes(bytes_field, existing_id.to_le_bytes().as_slice());
assert_eq!(do_search2(term).len() as u64, count);

// Test date
let term = Term::from_field_date(
date_field,
DateTime::from_timestamp_secs(existing_id as i64),
);
assert_eq!(do_search2(term).len() as u64, count);
}
for existing_id in deleted_ids {
assert_eq!(do_search(&existing_id.to_string()).len(), 0);
for deleted_id in deleted_ids {
let assert_field = |field| {
assert_eq!(do_search(&deleted_id.to_string(), field).len() as u64, 0);
};
assert_field(text_field);
assert_field(f64_field);
assert_field(i64_field);
assert_field(id_field);

// Test bytes
let term = Term::from_field_bytes(bytes_field, deleted_id.to_le_bytes().as_slice());
assert_eq!(do_search2(term).len() as u64, 0);

// Test date
let term =
Term::from_field_date(date_field, DateTime::from_timestamp_secs(deleted_id as i64));
assert_eq!(do_search2(term).len() as u64, 0);
}
// search ip address
//
for (existing_id, count) in &expected_ids_and_num_occurrences {
let (existing_id, count) = (*existing_id, *count);
if !ip_exists(existing_id) {
continue;
}
let do_search_ip_field = |term: &str| do_search(term, ip_field).len() as u64;
let ip_addr = Ipv6Addr::from_u128(existing_id as u128);
// Test incoming ip as ipv6
assert_eq!(do_search_ip_field(&format!("\"{}\"", ip_addr)), count);

let term = Term::from_field_ip_addr(ip_field, ip_addr);
assert_eq!(do_search2(term).len() as u64, count);

// Test incoming ip as ipv4
if let Some(ip_addr) = ip_addr.to_ipv4_mapped() {
assert_eq!(do_search_ip_field(&format!("\"{}\"", ip_addr)), count);
}
}
// test facets
for segment_reader in searcher.segment_readers().iter() {
@@ -1838,7 +2012,7 @@ mod tests {
facet_reader
.facet_from_ord(facet_ords[0], &mut facet)
.unwrap();
let id = ff_reader.get_val(doc_id as u64);
let id = ff_reader.get_val(doc_id);
let facet_expected = Facet::from(&("/cola/".to_string() + &id.to_string()));

assert_eq!(facet, facet_expected);
@@ -1847,6 +2021,36 @@ mod tests {
Ok(())
}

#[test]
fn test_minimal() {
assert!(test_operation_strategy(
&[
IndexingOp::AddDoc { id: 23 },
IndexingOp::AddDoc { id: 13 },
IndexingOp::DeleteDoc { id: 13 }
],
true,
false
)
.is_ok());

assert!(test_operation_strategy(
&[
IndexingOp::AddDoc { id: 23 },
IndexingOp::AddDoc { id: 13 },
IndexingOp::DeleteDoc { id: 13 }
],
false,
false
)
.is_ok());
}

#[test]
fn test_minimal_sort_merge() {
assert!(test_operation_strategy(&[IndexingOp::AddDoc { id: 3 },], true, true).is_ok());
}

proptest! {
#![proptest_config(ProptestConfig::with_cases(20))]
#[test]
@@ -1939,4 +2143,135 @@ mod tests {
index_writer.commit()?;
Ok(())
}

#[test]
fn test_bug_1617_3() {
assert!(test_operation_strategy(
&[
IndexingOp::DeleteDoc { id: 0 },
IndexingOp::AddDoc { id: 6 },
IndexingOp::DeleteDocQuery { id: 11 },
IndexingOp::Commit,
IndexingOp::Merge,
IndexingOp::Commit,
IndexingOp::Commit
],
false,
false
)
.is_ok());
}

#[test]
fn test_bug_1617_2() {
assert!(test_operation_strategy(
&[
IndexingOp::AddDoc { id: 13 },
IndexingOp::DeleteDoc { id: 13 },
IndexingOp::Commit,
IndexingOp::AddDoc { id: 30 },
IndexingOp::Commit,
IndexingOp::Merge,
],
false,
true
)
.is_ok());
}

#[test]
fn test_bug_1617() -> crate::Result<()> {
let mut schema_builder = schema::Schema::builder();
let id_field = schema_builder.add_u64_field("id", INDEXED);

let schema = schema_builder.build();
let index = Index::builder().schema(schema).create_in_ram()?;
let mut index_writer = index.writer_for_tests()?;
index_writer.set_merge_policy(Box::new(NoMergePolicy));

let existing_id = 16u64;
let deleted_id = 13u64;
index_writer.add_document(doc!(
id_field=>existing_id,
))?;
index_writer.add_document(doc!(
id_field=>deleted_id,
))?;
index_writer.delete_term(Term::from_field_u64(id_field, deleted_id));
index_writer.commit()?;

// Merge
{
assert!(index_writer.wait_merging_threads().is_ok());
let mut index_writer = index.writer_for_tests()?;
let segment_ids = index
.searchable_segment_ids()
.expect("Searchable segments failed.");
index_writer.merge(&segment_ids).wait().unwrap();
assert!(index_writer.wait_merging_threads().is_ok());
}
let searcher = index.reader()?.searcher();

let query = TermQuery::new(
Term::from_field_u64(id_field, existing_id),
IndexRecordOption::Basic,
);
let top_docs: Vec<(f32, DocAddress)> =
searcher.search(&query, &TopDocs::with_limit(10)).unwrap();

assert_eq!(top_docs.len(), 1); // Fails

Ok(())
}

#[test]
fn test_bug_1618() -> crate::Result<()> {
let mut schema_builder = schema::Schema::builder();
let id_field = schema_builder.add_i64_field("id", INDEXED);

let schema = schema_builder.build();
let index = Index::builder().schema(schema).create_in_ram()?;
let mut index_writer = index.writer_for_tests()?;
index_writer.set_merge_policy(Box::new(NoMergePolicy));

index_writer.add_document(doc!(
id_field=>10i64,
))?;
index_writer.add_document(doc!(
id_field=>30i64,
))?;
index_writer.commit()?;

// Merge
{
assert!(index_writer.wait_merging_threads().is_ok());
let mut index_writer = index.writer_for_tests()?;
let segment_ids = index
.searchable_segment_ids()
.expect("Searchable segments failed.");
index_writer.merge(&segment_ids).wait().unwrap();
assert!(index_writer.wait_merging_threads().is_ok());
}
let searcher = index.reader()?.searcher();

let query = TermQuery::new(
Term::from_field_i64(id_field, 10i64),
IndexRecordOption::Basic,
);
let top_docs: Vec<(f32, DocAddress)> =
searcher.search(&query, &TopDocs::with_limit(10)).unwrap();

assert_eq!(top_docs.len(), 1); // Fails

let query = TermQuery::new(
Term::from_field_i64(id_field, 30i64),
IndexRecordOption::Basic,
);
let top_docs: Vec<(f32, DocAddress)> =
searcher.search(&query, &TopDocs::with_limit(10)).unwrap();

assert_eq!(top_docs.len(), 1); // Fails

Ok(())
}
}

@@ -1,6 +1,6 @@
use fastfield_codecs::MonotonicallyMappableToU64;
use fnv::FnvHashMap;
use murmurhash32::murmurhash2;
use rustc_hash::FxHashMap;

use crate::fastfield::FastValue;
use crate::postings::{IndexingContext, IndexingPosition, PostingsWriter};
@@ -52,7 +52,7 @@ use crate::{DatePrecision, DateTime, DocId, Term};
/// path map to the same index position as long as the probability is relatively low.
#[derive(Default)]
struct IndexingPositionsPerPath {
positions_per_path: FnvHashMap<u32, IndexingPosition>,
positions_per_path: FxHashMap<u32, IndexingPosition>,
}

impl IndexingPositionsPerPath {
@@ -242,10 +242,12 @@ pub(crate) fn set_string_and_get_terms(
) -> Vec<(usize, Term)> {
let mut positions_and_terms = Vec::<(usize, Term)>::new();
json_term_writer.close_path_and_set_type(Type::Str);
let term_num_bytes = json_term_writer.term_buffer.as_slice().len();
let term_num_bytes = json_term_writer.term_buffer.len_bytes();
let mut token_stream = text_analyzer.token_stream(value);
token_stream.process(&mut |token| {
json_term_writer.term_buffer.truncate(term_num_bytes);
json_term_writer
.term_buffer
.truncate_value_bytes(term_num_bytes);
json_term_writer
.term_buffer
.append_bytes(token.text.as_bytes());
@@ -265,7 +267,7 @@ impl<'a> JsonTermWriter<'a> {
json_path: &str,
term_buffer: &'a mut Term,
) -> Self {
term_buffer.set_field(Type::Json, field);
term_buffer.set_field_and_type(field, Type::Json);
let mut json_term_writer = Self::wrap(term_buffer);
for segment in json_path.split('.') {
json_term_writer.push_path_segment(segment);
@@ -276,7 +278,7 @@ impl<'a> JsonTermWriter<'a> {
pub fn wrap(term_buffer: &'a mut Term) -> Self {
term_buffer.clear_with_type(Type::Json);
let mut path_stack = Vec::with_capacity(10);
path_stack.push(5);
path_stack.push(0);
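// Path offsets are now relative to the term's value bytes only; `Term`
// itself tracks the 4-byte field id and 1-byte type header, hence the
// initial offset of 0 rather than 5.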
Self {
term_buffer,
path_stack,
@@ -285,28 +287,28 @@ impl<'a> JsonTermWriter<'a> {

fn trim_to_end_of_path(&mut self) {
let end_of_path = *self.path_stack.last().unwrap();
self.term_buffer.truncate(end_of_path);
self.term_buffer.truncate_value_bytes(end_of_path);
}

pub fn close_path_and_set_type(&mut self, typ: Type) {
self.trim_to_end_of_path();
let buffer = self.term_buffer.as_mut();
let buffer = self.term_buffer.value_bytes_mut();
let buffer_len = buffer.len();
buffer[buffer_len - 1] = JSON_END_OF_PATH;
buffer.push(typ.to_code());
self.term_buffer.append_bytes(&[typ.to_code()]);
}

pub fn push_path_segment(&mut self, segment: &str) {
// the path stack should never be empty.
self.trim_to_end_of_path();
let buffer = self.term_buffer.as_mut();
let buffer = self.term_buffer.value_bytes_mut();
let buffer_len = buffer.len();
if self.path_stack.len() > 1 {
buffer[buffer_len - 1] = JSON_PATH_SEGMENT_SEP;
}
buffer.extend(segment.as_bytes());
buffer.push(JSON_PATH_SEGMENT_SEP);
self.path_stack.push(buffer.len());
self.term_buffer.append_bytes(segment.as_bytes());
self.term_buffer.append_bytes(&[JSON_PATH_SEGMENT_SEP]);
self.path_stack.push(self.term_buffer.len_bytes());
}

pub fn pop_path_segment(&mut self) {
@@ -318,8 +320,8 @@ impl<'a> JsonTermWriter<'a> {
/// Returns the json path of the term being currently built.
#[cfg(test)]
pub(crate) fn path(&self) -> &[u8] {
let end_of_path = self.path_stack.last().cloned().unwrap_or(6);
&self.term().as_slice()[5..end_of_path - 1]
let end_of_path = self.path_stack.last().cloned().unwrap_or(1);
&self.term().value_bytes()[..end_of_path - 1]
}

pub fn set_fast_value<T: FastValue>(&mut self, val: T) {
@@ -332,14 +334,13 @@ impl<'a> JsonTermWriter<'a> {
val.to_u64()
};
self.term_buffer
.as_mut()
.extend_from_slice(value.to_be_bytes().as_slice());
.append_bytes(value.to_be_bytes().as_slice());
}

#[cfg(test)]
pub(crate) fn set_str(&mut self, text: &str) {
self.close_path_and_set_type(Type::Str);
self.term_buffer.as_mut().extend_from_slice(text.as_bytes());
self.term_buffer.append_bytes(text.as_bytes());
}

pub fn term(&self) -> &Term {
@@ -356,8 +357,7 @@ mod tests {
#[test]
fn test_json_writer() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("attributes");
json_writer.push_path_segment("color");
@@ -391,8 +391,7 @@ mod tests {
#[test]
fn test_string_term() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
json_writer.set_str("red");
@@ -405,8 +404,7 @@ mod tests {
#[test]
fn test_i64_term() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
json_writer.set_fast_value(-4i64);
@@ -419,8 +417,7 @@ mod tests {
#[test]
fn test_u64_term() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
json_writer.set_fast_value(4u64);
@@ -433,8 +430,7 @@ mod tests {
#[test]
fn test_f64_term() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
json_writer.set_fast_value(4.0f64);
@@ -447,8 +443,7 @@ mod tests {
#[test]
fn test_bool_term() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
json_writer.set_fast_value(true);
@@ -461,8 +456,7 @@ mod tests {
#[test]
fn test_push_after_set_path_segment() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("attribute");
json_writer.set_str("something");
@@ -477,8 +471,7 @@ mod tests {
#[test]
fn test_pop_segment() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
json_writer.push_path_segment("hue");
@@ -493,8 +486,7 @@ mod tests {
#[test]
fn test_json_writer_path() {
let field = Field::from_field_id(1);
let mut term = Term::new();
term.set_field(Type::Json, field);
let mut term = Term::with_type_and_field(Type::Json, field);
let mut json_writer = JsonTermWriter::wrap(&mut term);
json_writer.push_path_segment("color");
assert_eq!(json_writer.path(), b"color");

@@ -6,13 +6,14 @@ use fastfield_codecs::VecColumn;
use itertools::Itertools;
use measure_time::debug_time;

use super::flat_map_with_buffer::FlatMapWithBufferIter;
use super::sorted_doc_id_multivalue_column::RemappedDocIdMultiValueIndexColumn;
use crate::core::{Segment, SegmentReader};
use crate::docset::{DocSet, TERMINATED};
use crate::error::DataCorruption;
use crate::fastfield::{
get_fastfield_codecs_for_multivalue, AliveBitSet, Column, CompositeFastFieldSerializer,
MultiValueLength, MultiValuedFastFieldReader,
MultiValueLength, MultiValuedFastFieldReader, MultiValuedU128FastFieldReader,
};
use crate::fieldnorm::{FieldNormReader, FieldNormReaders, FieldNormsSerializer, FieldNormsWriter};
use crate::indexer::doc_id_mapping::{expect_field_id_for_sort_field, SegmentDocIdMapping};
@@ -295,6 +296,24 @@ impl IndexMerger {
self.write_bytes_fast_field(field, fast_field_serializer, doc_id_mapping)?;
}
}
FieldType::IpAddr(options) => match options.get_fastfield_cardinality() {
Some(Cardinality::SingleValue) => {
self.write_u128_single_fast_field(
field,
fast_field_serializer,
doc_id_mapping,
)?;
}
Some(Cardinality::MultiValues) => {
self.write_u128_multi_fast_field(
field,
fast_field_serializer,
doc_id_mapping,
)?;
}
None => {}
},

FieldType::JsonObject(_) | FieldType::Facet(_) | FieldType::Str(_) => {
// We don't handle json fast field for the moment
// They can be implemented using what is done
@@ -305,6 +324,91 @@ impl IndexMerger {
Ok(())
}

// used to merge `u128` multivalued fast fields.
fn write_u128_multi_fast_field(
&self,
field: Field,
fast_field_serializer: &mut CompositeFastFieldSerializer,
doc_id_mapping: &SegmentDocIdMapping,
) -> crate::Result<()> {
let segment_and_ff_readers: Vec<(&SegmentReader, MultiValuedU128FastFieldReader<u128>)> =
self.readers
.iter()
.map(|segment_reader| {
let ff_reader: MultiValuedU128FastFieldReader<u128> =
segment_reader.fast_fields().u128s(field).expect(
"Failed to find index for multivalued field. This is a bug in \
tantivy, please report.",
);
(segment_reader, ff_reader)
})
.collect::<Vec<_>>();

Self::write_1_n_fast_field_idx_generic(
field,
fast_field_serializer,
doc_id_mapping,
&segment_and_ff_readers,
)?;

let fast_field_readers = segment_and_ff_readers
.into_iter()
.map(|(_, ff_reader)| ff_reader)
.collect::<Vec<_>>();

let iter_gen = || {
doc_id_mapping
.iter_old_doc_addrs()
.flat_map_with_buffer(|doc_addr, buffer| {
let fast_field_reader = &fast_field_readers[doc_addr.segment_ord as usize];
fast_field_reader.get_vals(doc_addr.doc_id, buffer);
})
};

fast_field_serializer.create_u128_fast_field_with_idx(
field,
iter_gen,
doc_id_mapping.len() as u32,
1,
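// Column index 1 presumably holds the flattened values, while index 0 (the
// per-doc offset index) was written above by `write_1_n_fast_field_idx_generic`.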
)?;

Ok(())
}

// used to merge `u128` single fast fields.
fn write_u128_single_fast_field(
&self,
field: Field,
fast_field_serializer: &mut CompositeFastFieldSerializer,
doc_id_mapping: &SegmentDocIdMapping,
) -> crate::Result<()> {
let fast_field_readers = self
.readers
.iter()
.map(|reader| {
let u128_reader: Arc<dyn Column<u128>> = reader.fast_fields().u128(field).expect(
"Failed to find a reader for single fast field. This is a tantivy bug and it \
should never happen.",
);
u128_reader
})
.collect::<Vec<_>>();

let iter_gen = || {
doc_id_mapping.iter_old_doc_addrs().map(|doc_addr| {
let fast_field_reader = &fast_field_readers[doc_addr.segment_ord as usize];
fast_field_reader.get_val(doc_addr.doc_id)
})
};
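// `iter_gen` is a closure rather than a bare iterator, presumably so the
// serializer can re-create the value stream for every pass it needs, always
// in the new (possibly re-sorted) doc-id order.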
fast_field_serializer.create_u128_fast_field_with_idx(
field,
iter_gen,
doc_id_mapping.len() as u32,
0,
)?;
Ok(())
}

// used both to merge field norms and `u64/i64` single fast fields.
fn write_single_fast_field(
&self,
@@ -406,8 +510,8 @@ impl IndexMerger {
doc_id_reader_pair
.into_iter()
.kmerge_by(|a, b| {
let val1 = a.2.get_val(a.0 as u64);
let val2 = b.2.get_val(b.0 as u64);
let val1 = a.2.get_val(a.0);
let val2 = b.2.get_val(b.0);
if sort_by_field.order == Order::Asc {
val1 < val2
} else {

@@ -190,13 +190,13 @@ mod tests {
assert_eq!(fast_field.get_val(4), 2u64);
assert_eq!(fast_field.get_val(3), 3u64);
if force_disjunct_segment_sort_values {
assert_eq!(fast_field.get_val(2u64), 20u64);
assert_eq!(fast_field.get_val(1u64), 100u64);
assert_eq!(fast_field.get_val(2), 20u64);
assert_eq!(fast_field.get_val(1), 100u64);
} else {
assert_eq!(fast_field.get_val(2u64), 10u64);
assert_eq!(fast_field.get_val(1u64), 20u64);
assert_eq!(fast_field.get_val(2), 10u64);
assert_eq!(fast_field.get_val(1), 20u64);
}
assert_eq!(fast_field.get_val(0u64), 1_000u64);
assert_eq!(fast_field.get_val(0), 1_000u64);

// test new field norm mapping
{
@@ -545,7 +545,7 @@ mod bench_sorted_index_merge {
// add values in order of the new doc_ids
let mut val = 0;
for (doc_id, _reader, field_reader) in sorted_doc_ids {
val = field_reader.get_val(doc_id as u64);
val = field_reader.get_val(doc_id);
}

val

@@ -30,8 +30,10 @@ impl SegmentSerializer {
StoreWriter::new(
store_write,
crate::store::Compressor::None,
0, // we want random access on the docs, so we choose a minimal block size. Every
// doc will get its own block.
// We want fast random access on the docs, so we choose a small block size.
// If this is zero, the skip index will contain too many checkpoints and
// therefore will be relatively slow.
16000,
settings.docstore_compress_dedicated_thread,
)?
} else {

@@ -12,11 +12,9 @@ use crate::postings::{
compute_table_size, serialize_postings, IndexingContext, IndexingPosition,
PerFieldPostingsWriter, PostingsWriter,
};
use crate::schema::{FieldEntry, FieldType, FieldValue, Schema, Term, Value};
use crate::schema::{FieldEntry, FieldType, Schema, Term, Value};
use crate::store::{StoreReader, StoreWriter};
use crate::tokenizer::{
BoxTokenStream, FacetTokenizer, PreTokenizedStream, TextAnalyzer, Tokenizer,
};
use crate::tokenizer::{FacetTokenizer, PreTokenizedStream, TextAnalyzer, Tokenizer};
use crate::{DatePrecision, DocId, Document, Opstamp, SegmentComponent};

/// Computes the initial size of the hash table.
@@ -116,7 +114,7 @@ impl SegmentWriter {
fast_field_writers: FastFieldsWriter::from_schema(&schema),
doc_opstamps: Vec::with_capacity(1_000),
per_field_text_analyzers,
term_buffer: Term::new(),
term_buffer: Term::with_capacity(16),
schema,
})
}
@@ -160,7 +158,6 @@ impl SegmentWriter {
let doc_id = self.max_doc;
let vals_grouped_by_field = doc
.field_values()
.iter()
.sorted_by_key(|el| el.field())
.group_by(|el| el.field());
for (field, field_values) in &vals_grouped_by_field {
@@ -176,10 +173,12 @@ impl SegmentWriter {
if !field_entry.is_indexed() {
continue;
}

let (term_buffer, ctx) = (&mut self.term_buffer, &mut self.ctx);
let postings_writer: &mut dyn PostingsWriter =
self.per_field_postings_writers.get_for_field_mut(field);
term_buffer.set_field(field_entry.field_type().value_type(), field);
term_buffer.clear_with_field_and_type(field_entry.field_type().value_type(), field);

match *field_entry.field_type() {
FieldType::Facet(_) => {
for value in values {
@@ -204,27 +203,23 @@ impl SegmentWriter {
}
}
FieldType::Str(_) => {
let mut token_streams: Vec<BoxTokenStream> = vec![];

let mut indexing_position = IndexingPosition::default();
for value in values {
match value {
let mut token_stream = match value {
Value::PreTokStr(tok_str) => {
token_streams
.push(PreTokenizedStream::from(tok_str.clone()).into());
PreTokenizedStream::from(tok_str.clone()).into()
}
Value::Str(ref text) => {
let text_analyzer =
&self.per_field_text_analyzers[field.field_id() as usize];
token_streams.push(text_analyzer.token_stream(text));
text_analyzer.token_stream(text)
}
_ => (),
}
}
_ => {
continue;
}
};

let mut indexing_position = IndexingPosition::default();

for mut token_stream in token_streams {
assert_eq!(term_buffer.as_slice().len(), 5);
assert!(term_buffer.is_empty());
postings_writer.index_text(
doc_id,
&mut *token_stream,
@@ -240,46 +235,76 @@ impl SegmentWriter {
}
}
FieldType::U64(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let u64_val = value.as_u64().ok_or_else(make_schema_error)?;
term_buffer.set_u64(u64_val);
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
FieldType::Date(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let date_val = value.as_date().ok_or_else(make_schema_error)?;
term_buffer.set_u64(date_val.truncate(DatePrecision::Seconds).to_u64());
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
FieldType::I64(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let i64_val = value.as_i64().ok_or_else(make_schema_error)?;
term_buffer.set_i64(i64_val);
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
FieldType::F64(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let f64_val = value.as_f64().ok_or_else(make_schema_error)?;
term_buffer.set_f64(f64_val);
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
FieldType::Bool(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let bool_val = value.as_bool().ok_or_else(make_schema_error)?;
term_buffer.set_bool(bool_val);
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
FieldType::Bytes(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let bytes = value.as_bytes().ok_or_else(make_schema_error)?;
term_buffer.set_bytes(bytes);
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
FieldType::JsonObject(_) => {
let text_analyzer = &self.per_field_text_analyzers[field.field_id() as usize];
@@ -294,6 +319,18 @@ impl SegmentWriter {
ctx,
)?;
}
FieldType::IpAddr(_) => {
let mut num_vals = 0;
for value in values {
num_vals += 1;
let ip_addr = value.as_ip_addr().ok_or_else(make_schema_error)?;
term_buffer.set_ip_addr(ip_addr);
postings_writer.subscribe(doc_id, 0u32, term_buffer, ctx);
}
if field_entry.has_fieldnorms() {
self.fieldnorms_writer.record(doc_id, field, num_vals);
}
}
}
}
Ok(())
@@ -305,11 +342,10 @@ impl SegmentWriter {
pub fn add_document(&mut self, add_operation: AddOperation) -> crate::Result<()> {
let doc = add_operation.document;
self.doc_opstamps.push(add_operation.opstamp);
self.fast_field_writers.add_document(&doc);
self.fast_field_writers.add_document(&doc)?;
self.index_document(&doc)?;
let prepared_doc = prepare_doc_for_store(doc, &self.schema);
let doc_writer = self.segment_serializer.get_store_writer();
doc_writer.store(&prepared_doc)?;
doc_writer.store(&doc, &self.schema)?;
self.max_doc += 1;
Ok(())
}
@@ -406,40 +442,24 @@ fn remap_and_write(
Ok(())
}

/// Prepares Document for being stored in the document store
///
/// Method transforms PreTokenizedString values into String
/// values.
pub fn prepare_doc_for_store(doc: Document, schema: &Schema) -> Document {
Document::from(
doc.into_iter()
.filter(|field_value| schema.get_field_entry(field_value.field()).is_stored())
.map(|field_value| match field_value {
FieldValue {
field,
value: Value::PreTokStr(pre_tokenized_text),
} => FieldValue {
field,
value: Value::Str(pre_tokenized_text.text),
},
field_value => field_value,
})
.collect::<Vec<_>>(),
)
}

#[cfg(test)]
mod tests {
use std::path::Path;

use super::compute_initial_table_size;
use crate::collector::Count;
use crate::directory::RamDirectory;
use crate::indexer::json_term_writer::JsonTermWriter;
use crate::postings::TermInfo;
use crate::query::PhraseQuery;
use crate::schema::{IndexRecordOption, Schema, Type, STORED, STRING, TEXT};
use crate::store::{Compressor, StoreReader, StoreWriter};
use crate::time::format_description::well_known::Rfc3339;
use crate::time::OffsetDateTime;
use crate::tokenizer::{PreTokenizedString, Token};
use crate::{DateTime, DocAddress, DocSet, Document, Index, Postings, Term, TERMINATED};
use crate::{
DateTime, Directory, DocAddress, DocSet, Document, Index, Postings, Term, TERMINATED,
};

#[test]
fn test_hashmap_size() {
@@ -469,14 +489,29 @@ mod tests {

doc.add_pre_tokenized_text(text_field, pre_tokenized_text);
doc.add_text(text_field, "title");
let prepared_doc = super::prepare_doc_for_store(doc, &schema);

assert_eq!(prepared_doc.field_values().len(), 2);
assert_eq!(prepared_doc.field_values()[0].value().as_text(), Some("A"));
let path = Path::new("store");
let directory = RamDirectory::create();
let store_wrt = directory.open_write(path).unwrap();

let mut store_writer = StoreWriter::new(store_wrt, Compressor::None, 0, false).unwrap();
store_writer.store(&doc, &schema).unwrap();
store_writer.close().unwrap();

let reader = StoreReader::open(directory.open_read(path).unwrap(), 0).unwrap();
let doc = reader.get(0).unwrap();

assert_eq!(doc.value_count(), 2);
let mut field_value_iter = doc.field_values();
assert_eq!(
prepared_doc.field_values()[1].value().as_text(),
field_value_iter.next().unwrap().value().as_text(),
Some("A")
);
assert_eq!(
field_value_iter.next().unwrap().value().as_text(),
Some("title")
);
assert!(field_value_iter.next().is_none());
}

#[test]
@@ -526,8 +561,7 @@ mod tests {
let inv_idx = segment_reader.inverted_index(json_field).unwrap();
let term_dict = inv_idx.terms();

let mut term = Term::new();
term.set_field(Type::Json, json_field);
let mut term = Term::with_type_and_field(Type::Json, json_field);
let mut term_stream = term_dict.stream().unwrap();

let mut json_term_writer = JsonTermWriter::wrap(&mut term);
@@ -620,8 +654,7 @@ mod tests {
let searcher = reader.searcher();
let segment_reader = searcher.segment_reader(0u32);
let inv_index = segment_reader.inverted_index(json_field).unwrap();
let mut term = Term::new();
term.set_field(Type::Json, json_field);
let mut term = Term::with_type_and_field(Type::Json, json_field);
let mut json_term_writer = JsonTermWriter::wrap(&mut term);
json_term_writer.push_path_segment("mykey");
json_term_writer.set_str("token");
@@ -665,8 +698,7 @@ mod tests {
let searcher = reader.searcher();
let segment_reader = searcher.segment_reader(0u32);
let inv_index = segment_reader.inverted_index(json_field).unwrap();
let mut term = Term::new();
term.set_field(Type::Json, json_field);
let mut term = Term::with_type_and_field(Type::Json, json_field);
let mut json_term_writer = JsonTermWriter::wrap(&mut term);
json_term_writer.push_path_segment("mykey");
json_term_writer.set_str("two tokens");
@@ -711,8 +743,7 @@ mod tests {
writer.commit().unwrap();
let reader = index.reader().unwrap();
let searcher = reader.searcher();
let mut term = Term::new();
term.set_field(Type::Json, json_field);
let mut term = Term::with_type_and_field(Type::Json, json_field);
let mut json_term_writer = JsonTermWriter::wrap(&mut term);
json_term_writer.push_path_segment("mykey");
json_term_writer.push_path_segment("field");
@@ -727,4 +758,124 @@ mod tests {
let phrase_query = PhraseQuery::new(vec![nothello_term, happy_term]);
assert_eq!(searcher.search(&phrase_query, &Count).unwrap(), 0);
}

#[test]
fn test_bug_regression_1629_position_when_array_with_a_field_value_that_does_not_contain_any_token(
) {
// We experienced a bug where we would have a position underflow when computing position
// delta in a horrible corner case.
//
// See the commit with this unit test if you want the details.
let mut schema_builder = Schema::builder();
let text = schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();
let doc = schema
.parse_document(r#"{"text": [ "bbb", "aaa", "", "aaa"]}"#)
.unwrap();
let index = Index::create_in_ram(schema);
let mut index_writer = index.writer_for_tests().unwrap();
index_writer.add_document(doc).unwrap();
// On debug this did panic on the underflow
index_writer.commit().unwrap();
let reader = index.reader().unwrap();
let searcher = reader.searcher();
let seg_reader = searcher.segment_reader(0);
let inv_index = seg_reader.inverted_index(text).unwrap();
let term = Term::from_field_text(text, "aaa");
let mut postings = inv_index
.read_postings(&term, IndexRecordOption::WithFreqsAndPositions)
.unwrap()
.unwrap();
assert_eq!(postings.doc(), 0u32);
let mut positions = Vec::new();
postings.positions(&mut positions);
// On release this was [2, 1]. (< note the decreasing values)
assert_eq!(positions, &[2, 5]);
}

#[test]
fn test_multiple_field_value_and_long_tokens() {
let mut schema_builder = Schema::builder();
let text = schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();
let mut doc = Document::default();
// This is a bit of a contrived example.
let tokens = PreTokenizedString {
text: "roller-coaster".to_string(),
tokens: vec![Token {
offset_from: 0,
offset_to: 14,
position: 0,
text: "rollercoaster".to_string(),
position_length: 2,
}],
};
doc.add_pre_tokenized_text(text, tokens.clone());
doc.add_pre_tokenized_text(text, tokens);
let index = Index::create_in_ram(schema);
let mut index_writer = index.writer_for_tests().unwrap();
index_writer.add_document(doc).unwrap();
index_writer.commit().unwrap();
let reader = index.reader().unwrap();
let searcher = reader.searcher();
let seg_reader = searcher.segment_reader(0);
let inv_index = seg_reader.inverted_index(text).unwrap();
let term = Term::from_field_text(text, "rollercoaster");
let mut postings = inv_index
.read_postings(&term, IndexRecordOption::WithFreqsAndPositions)
.unwrap()
.unwrap();
assert_eq!(postings.doc(), 0u32);
let mut positions = Vec::new();
postings.positions(&mut positions);
assert_eq!(positions, &[0, 3]); //< as opposed to 0, 2 if we had a position length of 1.
}

#[test]
fn test_last_token_not_ending_last() {
let mut schema_builder = Schema::builder();
let text = schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();
let mut doc = Document::default();
// This is a bit of a contrived example.
let tokens = PreTokenizedString {
text: "contrived-example".to_string(), //< I can't think of a use case where this corner case happens in real life.
tokens: vec![
Token {
// Not the last token, yet ends after the last token.
offset_from: 0,
offset_to: 14,
position: 0,
text: "long_token".to_string(),
position_length: 3,
},
Token {
offset_from: 0,
offset_to: 14,
position: 1,
text: "short".to_string(),
position_length: 1,
},
],
};
doc.add_pre_tokenized_text(text, tokens);
doc.add_text(text, "hello");
let index = Index::create_in_ram(schema);
let mut index_writer = index.writer_for_tests().unwrap();
index_writer.add_document(doc).unwrap();
index_writer.commit().unwrap();
let reader = index.reader().unwrap();
let searcher = reader.searcher();
let seg_reader = searcher.segment_reader(0);
let inv_index = seg_reader.inverted_index(text).unwrap();
let term = Term::from_field_text(text, "hello");
let mut postings = inv_index
.read_postings(&term, IndexRecordOption::WithFreqsAndPositions)
.unwrap()
.unwrap();
assert_eq!(postings.doc(), 0u32);
let mut positions = Vec::new();
postings.positions(&mut positions);
assert_eq!(positions, &[4]); //< as opposed to 3 if we had a position length of 1.
}
}

@@ -12,7 +12,7 @@ pub(crate) struct RemappedDocIdColumn<'a> {
fast_field_readers: Vec<Arc<dyn Column<u64>>>,
min_value: u64,
max_value: u64,
num_vals: u64,
num_vals: u32,
}

fn compute_min_max_val(
@@ -32,7 +32,7 @@ fn compute_min_max_val(
// we need to recompute the max / min
segment_reader
.doc_ids_alive()
.map(|doc_id| u64_reader.get_val(doc_id as u64))
.map(|doc_id| u64_reader.get_val(doc_id))
.minmax()
.into_option()
}
@@ -73,13 +73,13 @@ impl<'a> RemappedDocIdColumn<'a> {
fast_field_readers,
min_value,
max_value,
num_vals: doc_id_mapping.len() as u64,
num_vals: doc_id_mapping.len() as u32,
}
}
}

impl<'a> Column for RemappedDocIdColumn<'a> {
fn get_val(&self, _doc: u64) -> u64 {
fn get_val(&self, _doc: u32) -> u64 {
unimplemented!()
}

@@ -90,7 +90,7 @@ impl<'a> Column for RemappedDocIdColumn<'a> {
.map(|old_doc_addr| {
let fast_field_reader =
&self.fast_field_readers[old_doc_addr.segment_ord as usize];
fast_field_reader.get_val(old_doc_addr.doc_id as u64)
fast_field_reader.get_val(old_doc_addr.doc_id)
}),
)
}
@@ -102,7 +102,7 @@ impl<'a> Column for RemappedDocIdColumn<'a> {
self.max_value
}

fn num_vals(&self) -> u64 {
fn num_vals(&self) -> u32 {
self.num_vals
}
}

@@ -13,7 +13,7 @@ pub(crate) struct RemappedDocIdMultiValueColumn<'a> {
fast_field_readers: Vec<MultiValuedFastFieldReader<u64>>,
min_value: u64,
max_value: u64,
num_vals: u64,
num_vals: u32,
}

impl<'a> RemappedDocIdMultiValueColumn<'a> {
@@ -61,13 +61,13 @@ impl<'a> RemappedDocIdMultiValueColumn<'a> {
fast_field_readers,
min_value,
max_value,
num_vals: num_vals as u64,
num_vals: num_vals as u32,
}
}
}

impl<'a> Column for RemappedDocIdMultiValueColumn<'a> {
fn get_val(&self, _pos: u64) -> u64 {
fn get_val(&self, _pos: u32) -> u64 {
unimplemented!()
}

@@ -89,7 +89,7 @@ impl<'a> Column for RemappedDocIdMultiValueColumn<'a> {
self.max_value
}

fn num_vals(&self) -> u64 {
fn num_vals(&self) -> u32 {
self.num_vals
}
}
@@ -99,7 +99,7 @@ pub(crate) struct RemappedDocIdMultiValueIndexColumn<'a, T: MultiValueLength> {
multi_value_length_readers: Vec<&'a T>,
min_value: u64,
max_value: u64,
num_vals: u64,
num_vals: u32,
}

impl<'a, T: MultiValueLength> RemappedDocIdMultiValueIndexColumn<'a, T> {
@@ -123,7 +123,7 @@ impl<'a, T: MultiValueLength> RemappedDocIdMultiValueIndexColumn<'a, T> {
max_value += multi_value_length_reader.get_len(doc);
}
}
num_vals += segment_reader.num_docs() as u64;
num_vals += segment_reader.num_docs();
multi_value_length_readers.push(multi_value_length_reader);
}
Self {
@@ -137,7 +137,7 @@ impl<'a, T: MultiValueLength> RemappedDocIdMultiValueIndexColumn<'a, T> {
}

impl<'a, T: MultiValueLength + Send + Sync> Column for RemappedDocIdMultiValueIndexColumn<'a, T> {
fn get_val(&self, _pos: u64) -> u64 {
fn get_val(&self, _pos: u32) -> u64 {
unimplemented!()
}

@@ -162,7 +162,7 @@ impl<'a, T: MultiValueLength + Send + Sync> Column for RemappedDocIdMultiValueIn
self.max_value
}

fn num_vals(&self) -> u64 {
fn num_vals(&self) -> u32 {
self.num_vals
}
}

@@ -311,7 +311,7 @@ pub use crate::postings::Postings;
pub use crate::schema::{DateOptions, DatePrecision, Document, Term};

/// Index format version.
const INDEX_FORMAT_VERSION: u32 = 4;
const INDEX_FORMAT_VERSION: u32 = 5;

/// Structure version for the index.
#[derive(Clone, PartialEq, Eq, Serialize, Deserialize)]
@@ -819,7 +819,7 @@ pub mod tests {
fn test_indexedfield_not_in_documents() -> crate::Result<()> {
let mut schema_builder = Schema::builder();
let text_field = schema_builder.add_text_field("text", TEXT);
let absent_field = schema_builder.add_text_field("text", TEXT);
let absent_field = schema_builder.add_text_field("absent_text", TEXT);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema);
let mut index_writer = index.writer_for_tests()?;
@@ -1001,7 +1001,7 @@ pub mod tests {
let fast_field_signed = schema_builder.add_i64_field("signed", FAST);
let fast_field_float = schema_builder.add_f64_field("float", FAST);
let text_field = schema_builder.add_text_field("text", TEXT);
let stored_int_field = schema_builder.add_u64_field("text", STORED);
let stored_int_field = schema_builder.add_u64_field("stored_int", STORED);
let schema = schema_builder.build();

let index = Index::create_in_ram(schema);

@@ -3,7 +3,7 @@ use std::io;
use crate::fastfield::MultiValuedFastFieldWriter;
use crate::indexer::doc_id_mapping::DocIdMapping;
use crate::postings::postings_writer::SpecializedPostingsWriter;
use crate::postings::recorder::{BufferLender, NothingRecorder, Recorder};
use crate::postings::recorder::{BufferLender, DocIdRecorder, Recorder};
use crate::postings::stacker::Addr;
use crate::postings::{
FieldSerializer, IndexingContext, IndexingPosition, PostingsWriter, UnorderedTermId,
@@ -16,7 +16,7 @@ use crate::{DocId, Term};
#[derive(Default)]
pub(crate) struct JsonPostingsWriter<Rec: Recorder> {
str_posting_writer: SpecializedPostingsWriter<Rec>,
non_str_posting_writer: SpecializedPostingsWriter<NothingRecorder>,
non_str_posting_writer: SpecializedPostingsWriter<DocIdRecorder>,
}

impl<Rec: Recorder> From<JsonPostingsWriter<Rec>> for Box<dyn PostingsWriter> {
@@ -77,7 +77,7 @@ impl<Rec: Recorder> PostingsWriter for JsonPostingsWriter<Rec> {
serializer,
)?;
} else {
SpecializedPostingsWriter::<NothingRecorder>::serialize_one_term(
SpecializedPostingsWriter::<DocIdRecorder>::serialize_one_term(
term,
*addr,
doc_id_map,

@@ -1,6 +1,6 @@
use crate::postings::json_postings_writer::JsonPostingsWriter;
use crate::postings::postings_writer::SpecializedPostingsWriter;
use crate::postings::recorder::{NothingRecorder, TermFrequencyRecorder, TfAndPositionRecorder};
use crate::postings::recorder::{DocIdRecorder, TermFrequencyRecorder, TfAndPositionRecorder};
use crate::postings::PostingsWriter;
use crate::schema::{Field, FieldEntry, FieldType, IndexRecordOption, Schema};

@@ -34,7 +34,7 @@ fn posting_writer_from_field_entry(field_entry: &FieldEntry) -> Box<dyn Postings
.get_indexing_options()
.map(|indexing_options| match indexing_options.index_option() {
IndexRecordOption::Basic => {
SpecializedPostingsWriter::<NothingRecorder>::default().into()
SpecializedPostingsWriter::<DocIdRecorder>::default().into()
}
IndexRecordOption::WithFreqs => {
SpecializedPostingsWriter::<TermFrequencyRecorder>::default().into()
@@ -43,19 +43,20 @@ fn posting_writer_from_field_entry(field_entry: &FieldEntry) -> Box<dyn Postings
SpecializedPostingsWriter::<TfAndPositionRecorder>::default().into()
}
})
.unwrap_or_else(|| SpecializedPostingsWriter::<NothingRecorder>::default().into()),
.unwrap_or_else(|| SpecializedPostingsWriter::<DocIdRecorder>::default().into()),
FieldType::U64(_)
| FieldType::I64(_)
| FieldType::F64(_)
| FieldType::Bool(_)
| FieldType::Date(_)
| FieldType::Bytes(_)
| FieldType::Facet(_) => Box::new(SpecializedPostingsWriter::<NothingRecorder>::default()),
| FieldType::IpAddr(_)
| FieldType::Facet(_) => Box::new(SpecializedPostingsWriter::<DocIdRecorder>::default()),
FieldType::JsonObject(ref json_object_options) => {
if let Some(text_indexing_option) = json_object_options.get_text_indexing_options() {
match text_indexing_option.index_option() {
IndexRecordOption::Basic => {
JsonPostingsWriter::<NothingRecorder>::default().into()
JsonPostingsWriter::<DocIdRecorder>::default().into()
}
IndexRecordOption::WithFreqs => {
JsonPostingsWriter::<TermFrequencyRecorder>::default().into()
@@ -65,7 +66,7 @@ fn posting_writer_from_field_entry(field_entry: &FieldEntry) -> Box<dyn Postings
}
}
} else {
JsonPostingsWriter::<NothingRecorder>::default().into()
JsonPostingsWriter::<DocIdRecorder>::default().into()
}
}
}

@@ -3,7 +3,7 @@ use std::io;
use std::marker::PhantomData;
use std::ops::Range;

use fnv::FnvHashMap;
use rustc_hash::FxHashMap;

use super::stacker::Addr;
use crate::fastfield::MultiValuedFastFieldWriter;
@@ -56,12 +56,12 @@ pub(crate) fn serialize_postings(
doc_id_map: Option<&DocIdMapping>,
schema: &Schema,
serializer: &mut InvertedIndexSerializer,
) -> crate::Result<HashMap<Field, FnvHashMap<UnorderedTermId, TermOrdinal>>> {
) -> crate::Result<HashMap<Field, FxHashMap<UnorderedTermId, TermOrdinal>>> {
let mut term_offsets: Vec<(Term<&[u8]>, Addr, UnorderedTermId)> =
Vec::with_capacity(ctx.term_index.len());
term_offsets.extend(ctx.term_index.iter());
term_offsets.sort_unstable_by_key(|(k, _, _)| k.clone());
let mut unordered_term_mappings: HashMap<Field, FnvHashMap<UnorderedTermId, TermOrdinal>> =
let mut unordered_term_mappings: HashMap<Field, FxHashMap<UnorderedTermId, TermOrdinal>> =
HashMap::new();

let field_offsets = make_field_partition(&term_offsets);
@@ -74,7 +74,7 @@ pub(crate) fn serialize_postings(
let unordered_term_ids = term_offsets[byte_offsets.clone()]
.iter()
.map(|&(_, _, bucket)| bucket);
let mapping: FnvHashMap<UnorderedTermId, TermOrdinal> = unordered_term_ids
let mapping: FxHashMap<UnorderedTermId, TermOrdinal> = unordered_term_ids
.enumerate()
.map(|(term_ord, unord_term_id)| {
(unord_term_id as UnorderedTermId, term_ord as TermOrdinal)
@@ -89,6 +89,7 @@ pub(crate) fn serialize_postings(
| FieldType::Bool(_) => {}
FieldType::Bytes(_) => {}
FieldType::JsonObject(_) => {}
FieldType::IpAddr(_) => {}
}

let postings_writer = per_field_postings_writers.get_for_field(field);
@@ -152,9 +153,9 @@ pub(crate) trait PostingsWriter: Send + Sync {
indexing_position: &mut IndexingPosition,
mut term_id_fast_field_writer_opt: Option<&mut MultiValuedFastFieldWriter>,
) {
let end_of_path_idx = term_buffer.as_slice().len();
let end_of_path_idx = term_buffer.len_bytes();
let mut num_tokens = 0;
let mut end_position = 0;
let mut end_position = indexing_position.end_position;
token_stream.process(&mut |token: &Token| {
// We skip all tokens with a len greater than u16.
if token.text.len() > MAX_TOKEN_LEN {
@@ -166,10 +167,10 @@ pub(crate) trait PostingsWriter: Send + Sync {
);
return;
}
term_buffer.truncate(end_of_path_idx);
term_buffer.truncate_value_bytes(end_of_path_idx);
term_buffer.append_bytes(token.text.as_bytes());
let start_position = indexing_position.end_position + token.position as u32;
end_position = start_position + token.position_length as u32;
end_position = end_position.max(start_position + token.position_length as u32);
let unordered_term_id = self.subscribe(doc_id, start_position, term_buffer, ctx);
|
||||
if let Some(term_id_fast_field_writer) = term_id_fast_field_writer_opt.as_mut() {
|
||||
term_id_fast_field_writer.add_val(unordered_term_id);
|
||||
@@ -180,7 +181,7 @@ pub(crate) trait PostingsWriter: Send + Sync {
|
||||
|
||||
indexing_position.end_position = end_position + POSITION_GAP;
|
||||
indexing_position.num_tokens += num_tokens;
|
||||
term_buffer.truncate(end_of_path_idx);
|
||||
term_buffer.truncate_value_bytes(end_of_path_idx);
|
||||
}
|
||||
|
||||
fn total_num_tokens(&self) -> u64;
|
||||
|
||||
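The `end_position` change above is the behavioral part of this hunk: positions now continue from `indexing_position.end_position` rather than restarting at 0, so repeated values of one field no longer overlap in position space. A hedged illustration of the bookkeeping (the concrete `POSITION_GAP` value is an assumption for the example, not taken from this diff):

```rust
// Sketch only: position bookkeeping across two values of the same field,
// e.g. doc!(field => "a b", field => "c").
const POSITION_GAP: u32 = 2; // illustrative value
let mut end_position = 0u32;
// First value "a b": tokens at positions 0 and 1, each of length 1.
end_position = end_position.max(0 + 1).max(1 + 1); // == 2
// The gap is applied after the value, so "c" starts here instead of at 0,
// and phrase queries cannot accidentally match across value boundaries.
let next_value_start = end_position + POSITION_GAP;
assert_eq!(next_value_start, 4);
```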
@@ -83,21 +83,21 @@ pub(crate) trait Recorder: Copy + Default + Send + Sync + 'static {

 /// Only records the doc ids
 #[derive(Clone, Copy)]
-pub struct NothingRecorder {
+pub struct DocIdRecorder {
     stack: ExpUnrolledLinkedList,
     current_doc: DocId,
 }

-impl Default for NothingRecorder {
+impl Default for DocIdRecorder {
     fn default() -> Self {
-        NothingRecorder {
+        DocIdRecorder {
             stack: ExpUnrolledLinkedList::new(),
             current_doc: u32::MAX,
         }
     }
 }

-impl Recorder for NothingRecorder {
+impl Recorder for DocIdRecorder {
     fn current_doc(&self) -> DocId {
         self.current_doc
     }
@@ -98,7 +98,7 @@ impl<'a> Iterator for Iter<'a> {
 /// # Panics if n == 0
 fn compute_previous_power_of_two(n: usize) -> usize {
     assert!(n > 0);
-    let msb = (63u32 - n.leading_zeros()) as u8;
+    let msb = (63u32 - (n as u64).leading_zeros()) as u8;
     1 << msb
 }
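The fix above matters because `usize::leading_zeros` counts zeros within the platform word size, so `63 - n.leading_zeros()` is only correct when `usize` is 64 bits wide; casting to `u64` first pins the width. A quick check of the arithmetic, using the fixed function verbatim:

```rust
fn compute_previous_power_of_two(n: usize) -> usize {
    assert!(n > 0);
    let msb = (63u32 - (n as u64).leading_zeros()) as u8;
    1 << msb
}

// For n = 10 (0b1010), (10u64).leading_zeros() == 60, so msb == 3 and the
// result is 1 << 3 == 8: the largest power of two <= 10.
assert_eq!(compute_previous_power_of_two(10), 8);
// Exact powers of two map to themselves.
assert_eq!(compute_previous_power_of_two(8), 8);
```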
@@ -86,10 +86,7 @@ impl DocSet for BitSetDocSet {
         self.doc
     }

-    /// Returns half of the `max_doc`
-    /// This is quite a terrible heuristic,
-    /// but we don't have access to any better
-    /// value.
+    /// Returns the number of values set in the underlying bitset.
     fn size_hint(&self) -> u32 {
         self.docs.len() as u32
     }
@@ -212,12 +212,12 @@ pub fn block_wand(
 }

 /// Specialized version of [`block_wand`] for a single scorer.
-/// In this case, the algorithm is simple and readable and faster (~ x3)
+/// In this case, the algorithm is simple, readable and faster (~ x3)
 /// than the generic algorithm.
 /// The algorithm behaves as follows:
 /// - While we don't hit the end of the docset:
 ///   - While the block max score is under the `threshold`, go to the next block.
-///   - On a block, advance until the end and execute `callback`` when the doc score is greater or
+///   - On a block, advance until the end and execute `callback` when the doc score is greater or
 ///     equal to the `threshold`.
 pub fn block_wand_single_scorer(
     mut scorer: TermScorer,
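For readers unfamiliar with block-WAND, the doc comment above translates into a two-phase loop. A hedged sketch of that shape only; the method names (`block_max_score`, `move_to_next_block`, `last_doc_in_block`) are illustrative, not the exact `TermScorer` API:

```rust
// Phase 1 skips whole blocks, phase 2 scores docs inside a promising block.
while scorer.doc() != TERMINATED {
    if scorer.block_max_score() < threshold {
        // Best possible score in this block cannot beat the threshold: skip it.
        scorer.move_to_next_block();
        continue;
    }
    let last_doc = scorer.last_doc_in_block();
    while scorer.doc() != TERMINATED && scorer.doc() <= last_doc {
        let score = scorer.score();
        if score >= threshold {
            // The callback may raise the threshold (e.g. a top-k heap).
            threshold = callback(scorer.doc(), score);
        }
        scorer.advance();
    }
}
```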
@@ -18,6 +18,7 @@ mod phrase_query;
 mod query;
 mod query_parser;
 mod range_query;
+mod range_query_ip_fastfield;
 mod regex_query;
 mod reqopt_scorer;
 mod scorer;
@@ -31,7 +31,7 @@ pub struct MoreLikeThisQuery {
 #[derive(Debug, PartialEq, Clone)]
 enum TargetDocument {
     DocumentAdress(DocAddress),
-    DocumentFields(Vec<(Field, Vec<Value>)>),
+    DocumentFields(Vec<(Field, Vec<Value<'static>>)>),
 }

 impl MoreLikeThisQuery {
@@ -160,7 +160,10 @@ impl MoreLikeThisQueryBuilder {
     /// that will be used to compose the resulting query.
     /// This interface is meant to be used when you want to provide your own set of fields
     /// not necessarily from a specific document.
-    pub fn with_document_fields(self, doc_fields: Vec<(Field, Vec<Value>)>) -> MoreLikeThisQuery {
+    pub fn with_document_fields(
+        self,
+        doc_fields: Vec<(Field, Vec<Value<'static>>)>,
+    ) -> MoreLikeThisQuery {
         MoreLikeThisQuery {
             mlt: self.mlt,
             target: TargetDocument::DocumentFields(doc_fields),
@@ -1,4 +1,5 @@
 use std::collections::HashMap;
+use std::net::{AddrParseError, IpAddr};
 use std::num::{ParseFloatError, ParseIntError};
 use std::ops::Bound;
 use std::str::{FromStr, ParseBoolError};
@@ -15,7 +16,7 @@ use crate::query::{
     TermQuery,
 };
 use crate::schema::{
-    Facet, FacetParseError, Field, FieldType, IndexRecordOption, Schema, Term, Type,
+    Facet, FacetParseError, Field, FieldType, IndexRecordOption, IntoIpv6Addr, Schema, Term, Type,
 };
 use crate::time::format_description::well_known::Rfc3339;
 use crate::time::OffsetDateTime;
@@ -84,6 +85,9 @@ pub enum QueryParserError {
     /// The format for the facet field is invalid.
     #[error("The facet field is malformed: {0}")]
     FacetFormatError(#[from] FacetParseError),
+    /// The format for the ip field is invalid.
+    #[error("The ip field is malformed: {0}")]
+    IpFormatError(#[from] AddrParseError),
 }

 /// Recursively remove empty clause from the AST
@@ -400,6 +404,10 @@ impl QueryParser {
             let bytes = base64::decode(phrase).map_err(QueryParserError::ExpectedBase64)?;
             Ok(Term::from_field_bytes(field, &bytes))
         }
+        FieldType::IpAddr(_) => {
+            let ip_v6 = IpAddr::from_str(phrase)?.into_ipv6_addr();
+            Ok(Term::from_field_ip_addr(field, ip_v6))
+        }
     }
 }

@@ -506,6 +514,11 @@ impl QueryParser {
             let bytes_term = Term::from_field_bytes(field, &bytes);
             Ok(vec![LogicalLiteral::Term(bytes_term)])
         }
+        FieldType::IpAddr(_) => {
+            let ip_v6 = IpAddr::from_str(phrase)?.into_ipv6_addr();
+            let term = Term::from_field_ip_addr(field, ip_v6);
+            Ok(vec![LogicalLiteral::Term(term)])
+        }
     }
 }

@@ -730,7 +743,7 @@ fn generate_literals_for_json_object(
     index_record_option: IndexRecordOption,
 ) -> Result<Vec<LogicalLiteral>, QueryParserError> {
     let mut logical_literals = Vec::new();
-    let mut term = Term::new();
+    let mut term = Term::with_capacity(100);
     let mut json_term_writer =
         JsonTermWriter::from_field_and_json_path(field, json_path, &mut term);
     if let Some(term) = convert_to_fast_value_and_get_term(&mut json_term_writer, phrase) {
@@ -6,12 +6,13 @@ use common::BitSet;
 use crate::core::{Searcher, SegmentReader};
 use crate::error::TantivyError;
 use crate::query::explanation::does_not_match;
+use crate::query::range_query_ip_fastfield::IPFastFieldRangeWeight;
 use crate::query::{BitSetDocSet, ConstScorer, Explanation, Query, Scorer, Weight};
 use crate::schema::{Field, IndexRecordOption, Term, Type};
 use crate::termdict::{TermDictionary, TermStreamer};
 use crate::{DocId, Score};

-fn map_bound<TFrom, TTo, Transform: Fn(&TFrom) -> TTo>(
+pub(crate) fn map_bound<TFrom, TTo, Transform: Fn(&TFrom) -> TTo>(
     bound: &Bound<TFrom>,
     transform: &Transform,
 ) -> Bound<TTo> {
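`map_bound` is promoted to `pub(crate)` here so the new IP fast-field weight can reuse it. Given its signature, it re-wraps a bound around a transformed payload while preserving the bound kind; a small usage sketch:

```rust
use std::ops::Bound;

// The bound kind (Included/Excluded/Unbounded) is preserved; only the
// payload is transformed.
let included: Bound<u64> = Bound::Included(3);
let as_string: Bound<String> = map_bound(&included, &|v: &u64| v.to_string());
assert!(matches!(as_string, Bound::Included(ref s) if s == "3"));
```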
@@ -29,8 +30,17 @@ fn map_bound<TFrom, TTo, Transform: Fn(&TFrom) -> TTo>(
 ///
 /// # Implementation
 ///
-/// The current implement will iterate over the terms within the range
-/// and append all of the document cross into a `BitSet`.
+/// ## Default
+/// The default implementation collects all documents _upfront_ into a `BitSet`.
+/// This is done by iterating over the terms within the range, loading all docs for each
+/// `TermInfo` from the inverted index (posting list), and putting them into a `BitSet`.
+/// Depending on the number of terms matched, this is a potentially expensive operation.
+///
+/// ## IP fast field
+/// For IP fast fields a custom variant is used, which scans the fast field. Unlike the default
+/// variant, we can walk over it in a lazy fashion, since the fast field is implicitly ordered by
+/// DocId.
+///
 ///
 /// # Example
 ///
@@ -249,7 +259,8 @@ impl Query for RangeQuery {
         _scoring_enabled: bool,
     ) -> crate::Result<Box<dyn Weight>> {
         let schema = searcher.schema();
-        let value_type = schema.get_field_entry(self.field).field_type().value_type();
+        let field_type = schema.get_field_entry(self.field).field_type();
+        let value_type = field_type.value_type();
         if value_type != self.value_type {
             let err_msg = format!(
                 "Create a range query of the type {:?}, when the field given was of type {:?}",
@@ -257,11 +268,20 @@ impl Query for RangeQuery {
             );
             return Err(TantivyError::SchemaError(err_msg));
         }
-        Ok(Box::new(RangeWeight {
-            field: self.field,
-            left_bound: self.left_bound.clone(),
-            right_bound: self.right_bound.clone(),
-        }))
+
+        if field_type.is_ip_addr() && field_type.is_fast() {
+            Ok(Box::new(IPFastFieldRangeWeight::new(
+                self.field,
+                &self.left_bound,
+                &self.right_bound,
+            )))
+        } else {
+            Ok(Box::new(RangeWeight {
+                field: self.field,
+                left_bound: self.left_bound.clone(),
+                right_bound: self.right_bound.clone(),
+            }))
+        }
     }
 }
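With this dispatch in place, the same user-facing query can take either execution path, and the choice depends only on the schema. A hedged usage sketch (field name and setup are assumptions for illustration):

```rust
// If "ip" was declared INDEXED | FAST, this parsed range query is executed by
// IPFastFieldRangeWeight (lazy fast-field scan); with INDEXED only, it falls
// back to the BitSet-based RangeWeight.
let query = QueryParser::for_index(&index, vec![])
    .parse_query("ip:[127.0.0.10 TO 127.0.0.20]")
    .unwrap();
let num_hits = searcher.search(&query, &Count).unwrap();
```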
@@ -328,13 +348,15 @@ impl Weight for RangeWeight {
 #[cfg(test)]
 mod tests {

+    use std::net::IpAddr;
     use std::ops::Bound;
+    use std::str::FromStr;

     use super::RangeQuery;
     use crate::collector::{Count, TopDocs};
     use crate::query::QueryParser;
-    use crate::schema::{Document, Field, Schema, INDEXED, TEXT};
-    use crate::Index;
+    use crate::schema::{Document, Field, IntoIpv6Addr, Schema, FAST, INDEXED, STORED, TEXT};
+    use crate::{doc, Index};

     #[test]
     fn test_range_query_simple() -> crate::Result<()> {
@@ -506,4 +528,165 @@ mod tests {
         assert_eq!(top_docs.len(), 1);
         Ok(())
     }
+
+    #[test]
+    fn search_ip_range_test_posting_list() {
+        search_ip_range_test_opt(false);
+    }
+
+    #[test]
+    fn search_ip_range_test() {
+        search_ip_range_test_opt(true);
+    }
+
+    fn search_ip_range_test_opt(with_fast_field: bool) {
+        let mut schema_builder = Schema::builder();
+        let ip_field = if with_fast_field {
+            schema_builder.add_ip_addr_field("ip", INDEXED | STORED | FAST)
+        } else {
+            schema_builder.add_ip_addr_field("ip", INDEXED | STORED)
+        };
+        let text_field = schema_builder.add_text_field("text", TEXT | STORED);
+        let schema = schema_builder.build();
+        let index = Index::create_in_ram(schema);
+        let ip_addr_1 = IpAddr::from_str("127.0.0.10").unwrap().into_ipv6_addr();
+        let ip_addr_2 = IpAddr::from_str("127.0.0.20").unwrap().into_ipv6_addr();
+
+        {
+            let mut index_writer = index.writer(3_000_000).unwrap();
+            for _ in 0..1_000 {
+                index_writer
+                    .add_document(doc!(
+                        ip_field => ip_addr_1,
+                        text_field => "BLUBBER"
+                    ))
+                    .unwrap();
+            }
+            for _ in 0..1_000 {
+                index_writer
+                    .add_document(doc!(
+                        ip_field => ip_addr_2,
+                        text_field => "BLOBBER"
+                    ))
+                    .unwrap();
+            }
+
+            index_writer.commit().unwrap();
+        }
+        let reader = index.reader().unwrap();
+        let searcher = reader.searcher();
+
+        let get_num_hits = |query| {
+            let (_top_docs, count) = searcher
+                .search(&query, &(TopDocs::with_limit(10), Count))
+                .unwrap();
+            count
+        };
+        let query_from_text = |text: &str| {
+            QueryParser::for_index(&index, vec![])
+                .parse_query(text)
+                .unwrap()
+        };
+
+        // Inclusive range
+        assert_eq!(
+            get_num_hits(query_from_text("ip:[127.0.0.1 TO 127.0.0.20]")),
+            2000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:[127.0.0.10 TO 127.0.0.20]")),
+            2000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:[127.0.0.11 TO 127.0.0.20]")),
+            1000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:[127.0.0.11 TO 127.0.0.19]")),
+            0
+        );
+
+        assert_eq!(get_num_hits(query_from_text("ip:[127.0.0.11 TO *]")), 1000);
+        assert_eq!(get_num_hits(query_from_text("ip:[127.0.0.21 TO *]")), 0);
+        assert_eq!(get_num_hits(query_from_text("ip:[* TO 127.0.0.9]")), 0);
+        assert_eq!(get_num_hits(query_from_text("ip:[* TO 127.0.0.10]")), 1000);
+
+        // Exclusive range
+        assert_eq!(
+            get_num_hits(query_from_text("ip:{127.0.0.1 TO 127.0.0.20}")),
+            1000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:{127.0.0.1 TO 127.0.0.21}")),
+            2000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:{127.0.0.10 TO 127.0.0.20}")),
+            0
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:{127.0.0.11 TO 127.0.0.20}")),
+            0
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:{127.0.0.11 TO 127.0.0.19}")),
+            0
+        );
+
+        assert_eq!(get_num_hits(query_from_text("ip:{127.0.0.11 TO *}")), 1000);
+        assert_eq!(get_num_hits(query_from_text("ip:{127.0.0.10 TO *}")), 1000);
+        assert_eq!(get_num_hits(query_from_text("ip:{127.0.0.21 TO *}")), 0);
+        assert_eq!(get_num_hits(query_from_text("ip:{127.0.0.20 TO *}")), 0);
+        assert_eq!(get_num_hits(query_from_text("ip:{127.0.0.19 TO *}")), 1000);
+        assert_eq!(get_num_hits(query_from_text("ip:{* TO 127.0.0.9}")), 0);
+        assert_eq!(get_num_hits(query_from_text("ip:{* TO 127.0.0.10}")), 0);
+        assert_eq!(get_num_hits(query_from_text("ip:{* TO 127.0.0.11}")), 1000);
+
+        // Inclusive/Exclusive range
+        assert_eq!(
+            get_num_hits(query_from_text("ip:[127.0.0.1 TO 127.0.0.20}")),
+            1000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text("ip:{127.0.0.1 TO 127.0.0.20]")),
+            2000
+        );
+
+        // Intersection
+        assert_eq!(
+            get_num_hits(query_from_text(
+                "text:BLUBBER AND ip:[127.0.0.10 TO 127.0.0.10]"
+            )),
+            1000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text(
+                "text:BLOBBER AND ip:[127.0.0.10 TO 127.0.0.10]"
+            )),
+            0
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text(
+                "text:BLOBBER AND ip:[127.0.0.20 TO 127.0.0.20]"
+            )),
+            1000
+        );
+
+        assert_eq!(
+            get_num_hits(query_from_text(
+                "text:BLUBBER AND ip:[127.0.0.20 TO 127.0.0.20]"
+            )),
+            0
+        );
+    }
 }
src/query/range_query_ip_fastfield.rs (new file, 595 lines)
@@ -0,0 +1,595 @@
+//! IP Fastfields support efficient scanning for range queries.
+//! We use this variant only if the fastfield exists, otherwise the default in `range_query` is
+//! used, which uses the term dictionary + postings.
+
+use std::net::Ipv6Addr;
+use std::ops::{Bound, RangeInclusive};
+use std::sync::Arc;
+
+use common::BinarySerializable;
+use fastfield_codecs::{Column, MonotonicallyMappableToU128};
+
+use super::range_query::map_bound;
+use super::{ConstScorer, Explanation, Scorer, Weight};
+use crate::schema::{Cardinality, Field};
+use crate::{DocId, DocSet, Score, SegmentReader, TantivyError, TERMINATED};
+
+/// `IPFastFieldRangeWeight` uses the ip address fast field to execute range queries.
+pub struct IPFastFieldRangeWeight {
+    field: Field,
+    left_bound: Bound<Ipv6Addr>,
+    right_bound: Bound<Ipv6Addr>,
+}
+
+impl IPFastFieldRangeWeight {
+    pub fn new(field: Field, left_bound: &Bound<Vec<u8>>, right_bound: &Bound<Vec<u8>>) -> Self {
+        let ip_from_bound_raw_data = |data: &Vec<u8>| {
+            let left_ip_u128: u128 =
+                u128::from_be(BinarySerializable::deserialize(&mut &data[..]).unwrap());
+            Ipv6Addr::from_u128(left_ip_u128)
+        };
+        let left_bound = map_bound(left_bound, &ip_from_bound_raw_data);
+        let right_bound = map_bound(right_bound, &ip_from_bound_raw_data);
+        Self {
+            field,
+            left_bound,
+            right_bound,
+        }
+    }
+}
+
+impl Weight for IPFastFieldRangeWeight {
+    fn scorer(&self, reader: &SegmentReader, boost: Score) -> crate::Result<Box<dyn Scorer>> {
+        let field_type = reader.schema().get_field_entry(self.field).field_type();
+        match field_type.fastfield_cardinality().unwrap() {
+            Cardinality::SingleValue => {
+                let ip_addr_fast_field = reader.fast_fields().ip_addr(self.field)?;
+                let value_range = bound_to_value_range(
+                    &self.left_bound,
+                    &self.right_bound,
+                    ip_addr_fast_field.as_ref(),
+                );
+                let docset = IpRangeDocSet::new(value_range, ip_addr_fast_field);
+                Ok(Box::new(ConstScorer::new(docset, boost)))
+            }
+            Cardinality::MultiValues => unimplemented!(),
+        }
+    }
+
+    fn explain(&self, reader: &SegmentReader, doc: DocId) -> crate::Result<Explanation> {
+        let mut scorer = self.scorer(reader, 1.0)?;
+        if scorer.seek(doc) != doc {
+            return Err(TantivyError::InvalidArgument(format!(
+                "Document #({}) does not match",
+                doc
+            )));
+        }
+        let explanation = Explanation::new("Const", scorer.score());
+
+        Ok(explanation)
+    }
+}
+
+fn bound_to_value_range(
+    left_bound: &Bound<Ipv6Addr>,
+    right_bound: &Bound<Ipv6Addr>,
+    column: &dyn Column<Ipv6Addr>,
+) -> RangeInclusive<Ipv6Addr> {
+    let start_value = match left_bound {
+        Bound::Included(ip_addr) => *ip_addr,
+        Bound::Excluded(ip_addr) => Ipv6Addr::from(ip_addr.to_u128() + 1),
+        Bound::Unbounded => column.min_value(),
+    };
+
+    let end_value = match right_bound {
+        Bound::Included(ip_addr) => *ip_addr,
+        Bound::Excluded(ip_addr) => Ipv6Addr::from(ip_addr.to_u128() - 1),
+        Bound::Unbounded => column.max_value(),
+    };
+    start_value..=end_value
+}
+
+/// Helper to have a cursor over a vec of docids
+struct VecCursor {
+    docs: Vec<u32>,
+    current_pos: usize,
+}
+impl VecCursor {
+    fn new() -> Self {
+        Self {
+            docs: Vec::with_capacity(32),
+            current_pos: 0,
+        }
+    }
+    fn next(&mut self) -> Option<u32> {
+        self.current_pos += 1;
+        self.current()
+    }
+    #[inline]
+    fn current(&self) -> Option<u32> {
+        self.docs.get(self.current_pos).map(|el| *el as u32)
+    }
+
+    fn get_cleared_data(&mut self) -> &mut Vec<u32> {
+        self.docs.clear();
+        self.current_pos = 0;
+        &mut self.docs
+    }
+
+    fn is_empty(&self) -> bool {
+        self.current_pos >= self.docs.len()
+    }
+}
+
+struct IpRangeDocSet {
+    /// The range filter on the values.
+    value_range: RangeInclusive<Ipv6Addr>,
+    ip_addr_fast_field: Arc<dyn Column<Ipv6Addr>>,
+    /// The next docid start range to fetch (inclusive).
+    next_fetch_start: u32,
+    /// Number of docs range checked in a batch.
+    ///
+    /// There are two patterns.
+    /// - We do a full scan. => We can load large chunks. We don't know in advance if seek calls
+    ///   will come, so we start with small chunks.
+    /// - We load docs, interspersed with seek calls. When there are big jumps in the seek, we
+    ///   should load small chunks. When the seeks are small, we can employ the same strategy as
+    ///   on a full scan.
+    fetch_horizon: u32,
+    /// Current batch of loaded docs.
+    loaded_docs: VecCursor,
+    last_seek_pos_opt: Option<u32>,
+}
+
+const DEFAULT_FETCH_HORIZON: u32 = 128;
+impl IpRangeDocSet {
+    fn new(
+        value_range: RangeInclusive<Ipv6Addr>,
+        ip_addr_fast_field: Arc<dyn Column<Ipv6Addr>>,
+    ) -> Self {
+        let mut ip_range_docset = Self {
+            value_range,
+            ip_addr_fast_field,
+            loaded_docs: VecCursor::new(),
+            next_fetch_start: 0,
+            fetch_horizon: DEFAULT_FETCH_HORIZON,
+            last_seek_pos_opt: None,
+        };
+        ip_range_docset.reset_fetch_range();
+        ip_range_docset.fetch_block();
+        ip_range_docset
+    }
+
+    fn reset_fetch_range(&mut self) {
+        self.fetch_horizon = DEFAULT_FETCH_HORIZON;
+    }
+
+    /// Fetches blocks until at least one doc is found or the end of the fast field is reached.
+    fn fetch_block(&mut self) {
+        const MAX_HORIZON: u32 = 100_000;
+        while self.loaded_docs.is_empty() {
+            let finished_to_end = self.fetch_horizon(self.fetch_horizon);
+            if finished_to_end {
+                break;
+            }
+            // Fetch more data, increase horizon. Horizon only gets reset when doing a seek.
+            self.fetch_horizon = (self.fetch_horizon * 2).min(MAX_HORIZON);
+        }
+    }
+
+    /// Check if the distance between the seek calls is large.
+    fn is_last_seek_distance_large(&self, new_seek: DocId) -> bool {
+        if let Some(last_seek_pos) = self.last_seek_pos_opt {
+            (new_seek - last_seek_pos) >= 128
+        } else {
+            true
+        }
+    }
+
+    /// Fetches a block for docid range [next_fetch_start .. next_fetch_start + HORIZON]
+    fn fetch_horizon(&mut self, horizon: u32) -> bool {
+        let mut finished_to_end = false;
+
+        let limit = self.ip_addr_fast_field.num_vals();
+        let mut end = self.next_fetch_start + horizon;
+        if end >= limit {
+            end = limit;
+            finished_to_end = true;
+        }
+
+        let data = self.loaded_docs.get_cleared_data();
+        self.ip_addr_fast_field.get_positions_for_value_range(
+            self.value_range.clone(),
+            self.next_fetch_start..end,
+            data,
+        );
+        self.next_fetch_start = end;
+        finished_to_end
+    }
+}
+
+impl DocSet for IpRangeDocSet {
+    #[inline]
+    fn advance(&mut self) -> DocId {
+        if let Some(docid) = self.loaded_docs.next() {
+            docid as u32
+        } else {
+            if self.next_fetch_start >= self.ip_addr_fast_field.num_vals() as u32 {
+                return TERMINATED;
+            }
+            self.fetch_block();
+            self.loaded_docs.current().unwrap_or(TERMINATED)
+        }
+    }
+
+    #[inline]
+    fn doc(&self) -> DocId {
+        self.loaded_docs
+            .current()
+            .map(|el| el as u32)
+            .unwrap_or(TERMINATED)
+    }
+
+    /// Advances the `DocSet` forward until reaching the target, or going to the
+    /// lowest [`DocId`] greater than the target.
+    ///
+    /// If the end of the `DocSet` is reached, [`TERMINATED`] is returned.
+    ///
+    /// Calling `.seek(target)` on a terminated `DocSet` is legal. Implementations
+    /// of `DocSet` should support it.
+    ///
+    /// Calling `seek(TERMINATED)` is also legal and is the normal way to consume a `DocSet`.
+    fn seek(&mut self, target: DocId) -> DocId {
+        if self.is_last_seek_distance_large(target) {
+            self.reset_fetch_range();
+        }
+        if target > self.next_fetch_start {
+            self.next_fetch_start = target;
+        }
+        let mut doc = self.doc();
+        debug_assert!(doc <= target);
+        while doc < target {
+            doc = self.advance();
+        }
+        self.last_seek_pos_opt = Some(target);
+        doc
+    }
+
+    fn size_hint(&self) -> u32 {
+        0 // heuristic possible by checking number of hits when fetching a block
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use proptest::prelude::ProptestConfig;
+    use proptest::strategy::Strategy;
+    use proptest::{prop_oneof, proptest};
+
+    use super::*;
+    use crate::collector::Count;
+    use crate::query::QueryParser;
+    use crate::schema::{Schema, FAST, INDEXED, STORED, STRING};
+    use crate::Index;
+
+    #[derive(Clone, Debug)]
+    pub struct Doc {
+        pub id: String,
+        pub ip: Ipv6Addr,
+    }
+
+    fn operation_strategy() -> impl Strategy<Value = Doc> {
+        prop_oneof![
+            (0u64..100u64).prop_map(doc_from_id_1),
+            (1u64..100u64).prop_map(doc_from_id_2),
+        ]
+    }
+
+    pub fn doc_from_id_1(id: u64) -> Doc {
+        Doc {
+            // ip != id
+            id: id.to_string(),
+            ip: Ipv6Addr::from_u128(id as u128),
+        }
+    }
+    fn doc_from_id_2(id: u64) -> Doc {
+        Doc {
+            // ip != id
+            id: (id - 1).to_string(),
+            ip: Ipv6Addr::from_u128(id as u128),
+        }
+    }
+
+    proptest! {
+        #![proptest_config(ProptestConfig::with_cases(10))]
+        #[test]
+        fn test_ip_range_for_docs_prop(ops in proptest::collection::vec(operation_strategy(), 1..1000)) {
+            assert!(test_ip_range_for_docs(ops).is_ok());
+        }
+    }
+
+    #[test]
+    fn ip_range_regression1_test() {
+        let ops = vec![
+            doc_from_id_1(52),
+            doc_from_id_1(63),
+            doc_from_id_1(12),
+            doc_from_id_2(91),
+            doc_from_id_2(33),
+        ];
+        assert!(test_ip_range_for_docs(ops).is_ok());
+    }
+
+    #[test]
+    fn ip_range_regression2_test() {
+        let ops = vec![doc_from_id_1(0)];
+        assert!(test_ip_range_for_docs(ops).is_ok());
+    }
+
+    pub fn create_index_from_docs(docs: &[Doc]) -> Index {
+        let mut schema_builder = Schema::builder();
+        let ip_field = schema_builder.add_ip_addr_field("ip", INDEXED | STORED | FAST);
+        let text_field = schema_builder.add_text_field("id", STRING | STORED);
+        let schema = schema_builder.build();
+        let index = Index::create_in_ram(schema);
+
+        {
+            let mut index_writer = index.writer(3_000_000).unwrap();
+            for doc in docs.iter() {
+                index_writer
+                    .add_document(doc!(
+                        ip_field => doc.ip,
+                        text_field => doc.id.to_string(),
+                    ))
+                    .unwrap();
+            }
+
+            index_writer.commit().unwrap();
+        }
+        index
+    }
+
+    fn test_ip_range_for_docs(docs: Vec<Doc>) -> crate::Result<()> {
+        let index = create_index_from_docs(&docs);
+        let reader = index.reader().unwrap();
+        let searcher = reader.searcher();
+
+        let get_num_hits = |query| searcher.search(&query, &(Count)).unwrap();
+        let query_from_text = |text: &str| {
+            QueryParser::for_index(&index, vec![])
+                .parse_query(text)
+                .unwrap()
+        };
+
+        let gen_query_inclusive = |from: Ipv6Addr, to: Ipv6Addr| {
+            format!("ip:[{} TO {}]", &from.to_string(), &to.to_string())
+        };
+
+        let test_sample = |sample_docs: Vec<Doc>| {
+            let mut ips: Vec<Ipv6Addr> = sample_docs.iter().map(|doc| doc.ip).collect();
+            ips.sort();
+            let expected_num_hits = docs
+                .iter()
+                .filter(|doc| (ips[0]..=ips[1]).contains(&doc.ip))
+                .count();
+
+            let query = gen_query_inclusive(ips[0], ips[1]);
+            assert_eq!(get_num_hits(query_from_text(&query)), expected_num_hits);
+
+            // Intersection search
+            let id_filter = sample_docs[0].id.to_string();
+            let expected_num_hits = docs
+                .iter()
+                .filter(|doc| (ips[0]..=ips[1]).contains(&doc.ip) && doc.id == id_filter)
+                .count();
+            let query = format!("{} AND id:{}", query, &id_filter);
+            assert_eq!(get_num_hits(query_from_text(&query)), expected_num_hits);
+        };
+
+        test_sample(vec![docs[0].clone(), docs[0].clone()]);
+        if docs.len() > 1 {
+            test_sample(vec![docs[0].clone(), docs[1].clone()]);
+            test_sample(vec![docs[1].clone(), docs[1].clone()]);
+        }
+        if docs.len() > 2 {
+            test_sample(vec![docs[1].clone(), docs[2].clone()]);
+        }
+
+        Ok(())
+    }
+}
+
+#[cfg(all(test, feature = "unstable"))]
+mod bench {
+
+    use rand::{thread_rng, Rng};
+    use test::Bencher;
+
+    use super::tests::*;
+    use super::*;
+    use crate::collector::Count;
+    use crate::query::QueryParser;
+    use crate::Index;
+
+    fn get_index_0_to_100() -> Index {
+        let mut rng = thread_rng();
+        let num_vals = 100_000;
+        let docs: Vec<_> = (0..num_vals)
+            .map(|_i| {
+                let id = if rng.gen_bool(0.01) {
+                    "veryfew".to_string() // 1%
+                } else if rng.gen_bool(0.1) {
+                    "few".to_string() // 9%
+                } else {
+                    "many".to_string() // 90%
+                };
+                Doc {
+                    id,
+                    // Multiply by 1000, so that we create many buckets in the compact space
+                    ip: Ipv6Addr::from_u128(rng.gen_range(0..100) * 1000),
+                }
+            })
+            .collect();
+
+        let index = create_index_from_docs(&docs);
+        index
+    }
+    fn execute_query(
+        start_inclusive: Ipv6Addr,
+        end_inclusive: Ipv6Addr,
+        suffix: &str,
+        index: &Index,
+    ) -> usize {
+        let gen_query_inclusive = |from: Ipv6Addr, to: Ipv6Addr| {
+            format!(
+                "ip:[{} TO {}] {}",
+                &from.to_string(),
+                &to.to_string(),
+                suffix
+            )
+        };
+
+        let query = gen_query_inclusive(start_inclusive, end_inclusive);
+        let query_from_text = |text: &str| {
+            QueryParser::for_index(&index, vec![])
+                .parse_query(text)
+                .unwrap()
+        };
+        let query = query_from_text(&query);
+        let reader = index.reader().unwrap();
+        let searcher = reader.searcher();
+        searcher.search(&query, &(Count)).unwrap()
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_90_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(90 * 1000);
+
+            execute_query(start, end, "", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_10_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_1_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(10 * 1000);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_10_percent_intersect_with_10_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "AND id:few", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_1_percent_intersect_with_10_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(10 * 1000);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "AND id:few", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_1_percent_intersect_with_90_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(10 * 1000);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "AND id:many", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_1_percent_intersect_with_1_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(10 * 1000);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "AND id:veryfew", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_10_percent_intersect_with_90_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(10 * 1000);
+
+            execute_query(start, end, "AND id:many", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_90_percent_intersect_with_90_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(90 * 1000);
+
+            execute_query(start, end, "AND id:many", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_90_percent_intersect_with_10_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(90 * 1000);
+
+            execute_query(start, end, "AND id:few", &index)
+        });
+    }
+
+    #[bench]
+    fn bench_ip_range_hit_90_percent_intersect_with_1_percent(bench: &mut Bencher) {
+        let index = get_index_0_to_100();
+
+        bench.iter(|| {
+            let start = Ipv6Addr::from_u128(0);
+            let end = Ipv6Addr::from_u128(90 * 1000);
+
+            execute_query(start, end, "AND id:veryfew", &index)
+        });
+    }
+}
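The adaptive horizon described in the `IpRangeDocSet` comments doubles on every empty batch and resets on large seeks. Its cost profile is plain arithmetic, using the constants from the file above:

```rust
// Horizon growth while scanning a sparse range: 128, 256, 512, ...
// capped at 100_000. Covering n docs with no hits therefore costs
// O(log n) fast-field batch reads instead of n per-doc reads.
let (mut horizon, mut covered) = (128u32, 0u32);
let mut batches = 0;
while covered < 1_000_000 {
    covered += horizon;
    horizon = (horizon * 2).min(100_000);
    batches += 1;
}
assert!(batches < 25); // ~19 batches for a million empty docids
```

One edge worth noting: `bound_to_value_range` converts an `Excluded` bound by adding or subtracting 1 in `u128` space, which is safe for any address the query parser can produce but would wrap on `Excluded(u128::MAX)`; that case cannot be expressed as an IP string.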
@@ -115,7 +115,7 @@ mod tests {
     pub fn test_term_set_query() -> crate::Result<()> {
         let mut schema_builder = Schema::builder();
         let field1 = schema_builder.add_text_field("field1", TEXT);
-        let field2 = schema_builder.add_text_field("field1", TEXT);
+        let field2 = schema_builder.add_text_field("field2", TEXT);
         let schema = schema_builder.build();
         let index = Index::create_in_ram(schema);
         {
@@ -124,3 +124,70 @@ impl Query for TermQuery {
         visitor(&self.term, false);
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use std::net::{IpAddr, Ipv6Addr};
+    use std::str::FromStr;
+
+    use fastfield_codecs::MonotonicallyMappableToU128;
+
+    use crate::collector::{Count, TopDocs};
+    use crate::query::{Query, QueryParser, TermQuery};
+    use crate::schema::{IndexRecordOption, IntoIpv6Addr, Schema, INDEXED, STORED};
+    use crate::{doc, Index, Term};
+
+    #[test]
+    fn search_ip_test() {
+        let mut schema_builder = Schema::builder();
+        let ip_field = schema_builder.add_ip_addr_field("ip", INDEXED | STORED);
+        let schema = schema_builder.build();
+        let index = Index::create_in_ram(schema);
+        let ip_addr_1 = IpAddr::from_str("127.0.0.1").unwrap().into_ipv6_addr();
+        let ip_addr_2 = Ipv6Addr::from_u128(10);
+
+        {
+            let mut index_writer = index.writer(3_000_000).unwrap();
+            index_writer
+                .add_document(doc!(
+                    ip_field => ip_addr_1
+                ))
+                .unwrap();
+            index_writer
+                .add_document(doc!(
+                    ip_field => ip_addr_2
+                ))
+                .unwrap();
+
+            index_writer.commit().unwrap();
+        }
+        let reader = index.reader().unwrap();
+        let searcher = reader.searcher();
+
+        let assert_single_hit = |query| {
+            let (_top_docs, count) = searcher
+                .search(&query, &(TopDocs::with_limit(2), Count))
+                .unwrap();
+            assert_eq!(count, 1);
+        };
+        let query_from_text = |text: String| {
+            QueryParser::for_index(&index, vec![ip_field])
+                .parse_query(&text)
+                .unwrap()
+        };
+
+        let query_from_ip = |ip_addr| -> Box<dyn Query> {
+            Box::new(TermQuery::new(
+                Term::from_field_ip_addr(ip_field, ip_addr),
+                IndexRecordOption::Basic,
+            ))
+        };
+
+        assert_single_hit(query_from_ip(ip_addr_1));
+        assert_single_hit(query_from_ip(ip_addr_2));
+        assert_single_hit(query_from_text("127.0.0.1".to_string()));
+        assert_single_hit(query_from_text("\"127.0.0.1\"".to_string()));
+        assert_single_hit(query_from_text(format!("\"{}\"", ip_addr_1)));
+        assert_single_hit(query_from_text(format!("\"{}\"", ip_addr_2)));
+    }
+}
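The assertions above work for both address families because the `IntoIpv6Addr` conversion (defined in `ip_options.rs`, further down in this diff) maps IPv4 into the IPv4-mapped IPv6 range:

```rust
use std::net::{IpAddr, Ipv6Addr};
use std::str::FromStr;

// 127.0.0.1 becomes ::ffff:127.0.0.1 (std's to_ipv6_mapped), so v4 and v6
// addresses share one u128-ordered term and fast-field encoding.
let v4 = IpAddr::from_str("127.0.0.1").unwrap();
assert_eq!(
    v4.into_ipv6_addr(),
    Ipv6Addr::from_str("::ffff:127.0.0.1").unwrap()
);
```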
@@ -1,34 +1,105 @@
 use std::collections::{HashMap, HashSet};
 use std::io::{self, Read, Write};
-use std::mem;
+use std::net::Ipv6Addr;
+use std::sync::Arc;
+use std::{fmt, mem};

 use common::{BinarySerializable, VInt};
+use itertools::Either;
+use yoke::erased::ErasedArcCart;
+use yoke::Yoke;

 use super::*;
+use crate::schema::value::MaybeOwnedString;
 use crate::tokenizer::PreTokenizedString;
 use crate::DateTime;

+/// A group of FieldValue sharing an underlying storage
+///
+/// Or a single owned FieldValue.
+#[derive(Clone)]
+enum FieldValueGroup {
+    Single(FieldValue<'static>),
+    Group(Yoke<VecFieldValue<'static>, ErasedArcCart>),
+}
+
+// this NewType is required to make it possible to yoke a vec with non 'static inner values.
+#[derive(yoke::Yokeable, Clone)]
+struct VecFieldValue<'a>(Vec<FieldValue<'a>>);
+
+impl<'a> std::ops::Deref for VecFieldValue<'a> {
+    type Target = Vec<FieldValue<'a>>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.0
+    }
+}
+
+impl<'a> From<Vec<FieldValue<'a>>> for VecFieldValue<'a> {
+    fn from(field_values: Vec<FieldValue>) -> VecFieldValue {
+        VecFieldValue(field_values)
+    }
+}
+
+impl FieldValueGroup {
+    fn iter(&self) -> impl Iterator<Item = &FieldValue> {
+        match self {
+            FieldValueGroup::Single(field_value) => Either::Left(std::iter::once(field_value)),
+            FieldValueGroup::Group(field_values) => Either::Right(field_values.get().iter()),
+        }
+    }
+
+    fn count(&self) -> usize {
+        match self {
+            FieldValueGroup::Single(_) => 1,
+            FieldValueGroup::Group(field_values) => field_values.get().len(),
+        }
+    }
+}
+
+impl From<Vec<FieldValue<'static>>> for FieldValueGroup {
+    fn from(field_values: Vec<FieldValue<'static>>) -> FieldValueGroup {
+        FieldValueGroup::Group(
+            Yoke::new_always_owned(field_values.into())
+                .wrap_cart_in_arc()
+                .erase_arc_cart(),
+        )
+    }
+}
+
 /// Tantivy's Document is the object that can
 /// be indexed and then searched for.
 ///
 /// Documents are fundamentally a collection of unordered couples `(field, value)`.
 /// In this list, one field may appear more than once.
-#[derive(Clone, Debug, serde::Serialize, serde::Deserialize, Default)]
+#[derive(Clone, Default)]
+// TODO bring back Ser/De and Debug
+//#[derive(Clone, Debug, serde::Serialize, serde::Deserialize, Default)]
+//#[serde(bound(deserialize = "'static: 'de, 'de: 'static"))]
 pub struct Document {
-    field_values: Vec<FieldValue>,
+    field_values: Vec<FieldValueGroup>,
 }

-impl From<Vec<FieldValue>> for Document {
-    fn from(field_values: Vec<FieldValue>) -> Self {
+impl fmt::Debug for Document {
+    fn fmt(&self, _: &mut fmt::Formatter<'_>) -> fmt::Result {
+        todo!()
+    }
+}
+
+impl From<Vec<FieldValue<'static>>> for Document {
+    fn from(field_values: Vec<FieldValue<'static>>) -> Self {
+        let field_values = vec![field_values.into()];
         Document { field_values }
     }
 }
 impl PartialEq for Document {
     fn eq(&self, other: &Document) -> bool {
         // super slow, but only here for tests
-        let convert_to_comparable_map = |field_values: &[FieldValue]| {
+        let convert_to_comparable_map = |field_values| {
             let mut field_value_set: HashMap<Field, HashSet<String>> = Default::default();
-            for field_value in field_values.iter() {
+            for field_value in field_values {
+                // for some reason rustc fails to guess the type
+                let field_value: &FieldValue = field_value;
                 let json_val = serde_json::to_string(field_value.value()).unwrap();
                 field_value_set
                     .entry(field_value.field())
@@ -38,9 +109,9 @@ impl PartialEq for Document {
             field_value_set
         };
         let self_field_values: HashMap<Field, HashSet<String>> =
-            convert_to_comparable_map(&self.field_values);
+            convert_to_comparable_map(self.field_values());
         let other_field_values: HashMap<Field, HashSet<String>> =
-            convert_to_comparable_map(&other.field_values);
+            convert_to_comparable_map(other.field_values());
         self_field_values.eq(&other_field_values)
     }
 }
@@ -48,12 +119,13 @@ impl PartialEq for Document {
 impl Eq for Document {}

 impl IntoIterator for Document {
-    type Item = FieldValue;
+    type Item = FieldValue<'static>;

-    type IntoIter = std::vec::IntoIter<FieldValue>;
+    type IntoIter = std::vec::IntoIter<FieldValue<'static>>;

     fn into_iter(self) -> Self::IntoIter {
-        self.field_values.into_iter()
+        todo!()
+        // self.field_values.into_iter()
     }
 }

@@ -83,7 +155,7 @@ impl Document {

     /// Add a text field.
     pub fn add_text<S: ToString>(&mut self, field: Field, text: S) {
-        let value = Value::Str(text.to_string());
+        let value = Value::Str(MaybeOwnedString::from_string(text.to_string()));
         self.add_field_value(field, value);
     }

@@ -97,6 +169,11 @@ impl Document {
         self.add_field_value(field, value);
     }

+    /// Add an IP address field. Internally only Ipv6Addr is used.
+    pub fn add_ip_addr(&mut self, field: Field, value: Ipv6Addr) {
+        self.add_field_value(field, value);
+    }
+
     /// Add a i64 field
     pub fn add_i64(&mut self, field: Field, value: i64) {
         self.add_field_value(field, value);
@@ -132,15 +209,35 @@ impl Document {
     }

     /// Add a (field, value) to the document.
-    pub fn add_field_value<T: Into<Value>>(&mut self, field: Field, typed_val: T) {
+    pub fn add_field_value<T: Into<Value<'static>>>(&mut self, field: Field, typed_val: T) {
         let value = typed_val.into();
         let field_value = FieldValue { field, value };
-        self.field_values.push(field_value);
+        self.field_values.push(FieldValueGroup::Single(field_value));
     }

+    /// Add multiple borrowed values, also taking the container they're borrowing from
+    // TODO add a try_ variant?
+    pub fn add_borrowed_values<T, F>(&mut self, storage: T, f: F)
+    where
+        T: Send + Sync + 'static,
+        F: FnOnce(&T) -> Vec<FieldValue>,
+    {
+        let yoke =
+            Yoke::attach_to_cart(Arc::new(storage), |storage| f(storage).into()).erase_arc_cart();
+
+        self.field_values.push(FieldValueGroup::Group(yoke));
+    }
+
     /// field_values accessor
-    pub fn field_values(&self) -> &[FieldValue] {
-        &self.field_values
+    pub fn field_values(&self) -> impl Iterator<Item = &FieldValue> {
+        self.field_values.iter().flat_map(|group| group.iter())
+    }
+
+    /// Return the total number of values
+    ///
+    /// More efficient than calling `self.field_values().count()`
+    pub fn value_count(&self) -> usize {
+        self.field_values.iter().map(|group| group.count()).sum()
     }

     /// Sort and groups the field_values by field.
@@ -148,7 +245,7 @@ impl Document {
     /// The result of this method is not cached and is
     /// computed on the fly when this method is called.
     pub fn get_sorted_field_values(&self) -> Vec<(Field, Vec<&Value>)> {
-        let mut field_values: Vec<&FieldValue> = self.field_values().iter().collect();
+        let mut field_values: Vec<&FieldValue> = self.field_values().collect();
         field_values.sort_by_key(|field_value| field_value.field());

         let mut field_values_it = field_values.into_iter();
@@ -183,6 +280,7 @@ impl Document {
     pub fn get_all(&self, field: Field) -> impl Iterator<Item = &Value> {
         self.field_values
             .iter()
+            .flat_map(|group| group.iter())
             .filter(move |field_value| field_value.field() == field)
             .map(FieldValue::value)
     }
@@ -191,12 +289,41 @@ impl Document {
     pub fn get_first(&self, field: Field) -> Option<&Value> {
         self.get_all(field).next()
     }

+    /// Serializes stored field values.
+    pub fn serialize_stored<W: Write>(&self, schema: &Schema, writer: &mut W) -> io::Result<()> {
+        let stored_field_values = || {
+            self.field_values()
+                .filter(|field_value| schema.get_field_entry(field_value.field()).is_stored())
+        };
+        let num_field_values = stored_field_values().count();
+
+        VInt(num_field_values as u64).serialize(writer)?;
+        for field_value in stored_field_values() {
+            match field_value {
+                FieldValue {
+                    field,
+                    value: Value::PreTokStr(pre_tokenized_text),
+                } => {
+                    let field_value = FieldValue {
+                        field: *field,
+                        value: Value::Str(MaybeOwnedString::from_string(
+                            pre_tokenized_text.text.to_string(),
+                        )),
+                    };
+                    field_value.serialize(writer)?;
+                }
+                field_value => field_value.serialize(writer)?,
+            };
+        }
+        Ok(())
+    }
 }

 impl BinarySerializable for Document {
     fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
         let field_values = self.field_values();
-        VInt(field_values.len() as u64).serialize(writer)?;
+        VInt(self.value_count() as u64).serialize(writer)?;
         for field_value in field_values {
             field_value.serialize(writer)?;
         }
@@ -225,7 +352,7 @@ mod tests {
         let text_field = schema_builder.add_text_field("title", TEXT);
         let mut doc = Document::default();
         doc.add_text(text_field, "My title");
-        assert_eq!(doc.field_values().len(), 1);
+        assert_eq!(doc.value_count(), 1);
     }

     #[test]
@@ -239,7 +366,7 @@ mod tests {
             .clone(),
         );
         doc.add_text(Field::from_field_id(1), "hello");
-        assert_eq!(doc.field_values().len(), 2);
+        assert_eq!(doc.value_count(), 2);
         let mut payload: Vec<u8> = Vec::new();
         doc.serialize(&mut payload).unwrap();
         assert_eq!(payload.len(), 26);
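The `yoke`-based storage makes `Document::add_borrowed_values` the interesting new entry point: the caller moves one backing container into the document and builds `FieldValue`s that borrow from it, so a large payload is stored exactly once. A hedged sketch of its use; the borrowed `MaybeOwnedString` constructor and the `body_field` are assumptions, not shown in this diff:

```rust
// Sketch under assumptions: `body_field` exists in the schema and
// MaybeOwnedString can wrap a &str without copying.
let raw: String = std::fs::read_to_string("body.txt").unwrap();
doc.add_borrowed_values(raw, |raw| {
    vec![FieldValue {
        field: body_field,
        value: Value::Str(MaybeOwnedString::from(raw.as_str())), // assumed ctor
    }]
});
// The String now lives inside the document (behind an Arc cart); the
// FieldValue borrows it, avoiding a second copy of the payload.
```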
@@ -1,5 +1,6 @@
 use serde::{Deserialize, Serialize};

+use super::ip_options::IpAddrOptions;
 use crate::schema::bytes_options::BytesOptions;
 use crate::schema::{
     is_valid_field_name, DateOptions, FacetOptions, FieldType, JsonObjectOptions, NumericOptions,
@@ -60,6 +61,11 @@ impl FieldEntry {
         Self::new(field_name, FieldType::Date(date_options))
     }

+    /// Creates a new ip address field entry.
+    pub fn new_ip_addr(field_name: String, ip_options: IpAddrOptions) -> FieldEntry {
+        Self::new(field_name, FieldType::IpAddr(ip_options))
+    }
+
     /// Creates a field entry for a facet.
     pub fn new_facet(field_name: String, facet_options: FacetOptions) -> FieldEntry {
         Self::new(field_name, FieldType::Facet(facet_options))
@@ -114,6 +120,7 @@ impl FieldEntry {
             FieldType::Facet(ref options) => options.is_stored(),
             FieldType::Bytes(ref options) => options.is_stored(),
             FieldType::JsonObject(ref options) => options.is_stored(),
+            FieldType::IpAddr(ref options) => options.is_stored(),
         }
     }
 }
@@ -1,10 +1,15 @@
+use std::net::IpAddr;
+use std::str::FromStr;
+
 use serde::{Deserialize, Serialize};
 use serde_json::Value as JsonValue;
 use thiserror::Error;

-use super::Cardinality;
+use super::ip_options::IpAddrOptions;
+use super::{Cardinality, IntoIpv6Addr};
 use crate::schema::bytes_options::BytesOptions;
 use crate::schema::facet_options::FacetOptions;
+use crate::schema::value::MaybeOwnedString;
 use crate::schema::{
     DateOptions, Facet, IndexRecordOption, JsonObjectOptions, NumericOptions, TextFieldIndexing,
     TextOptions, Value,
@@ -62,9 +67,11 @@ pub enum Type {
     Bytes = b'b',
     /// Leaf in a Json object.
     Json = b'j',
+    /// IpAddr
+    IpAddr = b'p',
 }

-const ALL_TYPES: [Type; 9] = [
+const ALL_TYPES: [Type; 10] = [
     Type::Str,
     Type::U64,
     Type::I64,
@@ -74,6 +81,7 @@ const ALL_TYPES: [Type; 9] = [
     Type::Facet,
     Type::Bytes,
     Type::Json,
+    Type::IpAddr,
 ];

 impl Type {
@@ -100,6 +108,7 @@ impl Type {
             Type::Facet => "Facet",
             Type::Bytes => "Bytes",
             Type::Json => "Json",
+            Type::IpAddr => "IpAddr",
         }
     }

@@ -116,6 +125,7 @@ impl Type {
             b'h' => Some(Type::Facet),
             b'b' => Some(Type::Bytes),
             b'j' => Some(Type::Json),
+            b'p' => Some(Type::IpAddr),
             _ => None,
         }
     }
@@ -146,6 +156,8 @@ pub enum FieldType {
     Bytes(BytesOptions),
     /// Json object
     JsonObject(JsonObjectOptions),
+    /// IpAddr field
+    IpAddr(IpAddrOptions),
 }

 impl FieldType {
@@ -161,9 +173,15 @@ impl FieldType {
             FieldType::Facet(_) => Type::Facet,
             FieldType::Bytes(_) => Type::Bytes,
             FieldType::JsonObject(_) => Type::Json,
+            FieldType::IpAddr(_) => Type::IpAddr,
         }
     }

+    /// returns true if this is an ip address field
+    pub fn is_ip_addr(&self) -> bool {
+        matches!(self, FieldType::IpAddr(_))
+    }
+
     /// returns true if the field is indexed.
     pub fn is_indexed(&self) -> bool {
         match *self {
@@ -176,6 +194,7 @@ impl FieldType {
             FieldType::Facet(ref _facet_options) => true,
             FieldType::Bytes(ref bytes_options) => bytes_options.is_indexed(),
             FieldType::JsonObject(ref json_object_options) => json_object_options.is_indexed(),
+            FieldType::IpAddr(ref ip_addr_options) => ip_addr_options.is_indexed(),
         }
     }

@@ -210,6 +229,7 @@ impl FieldType {
             | FieldType::F64(ref int_options)
             | FieldType::Bool(ref int_options) => int_options.is_fast(),
             FieldType::Date(ref date_options) => date_options.is_fast(),
+            FieldType::IpAddr(ref ip_addr_options) => ip_addr_options.is_fast(),
             FieldType::Facet(_) => true,
             FieldType::JsonObject(_) => false,
         }
@@ -218,11 +238,11 @@ impl FieldType {
     /// returns true if the field is fast.
     pub fn fastfield_cardinality(&self) -> Option<Cardinality> {
         match *self {
-            FieldType::Bytes(ref bytes_options) if bytes_options.is_fast() => {
-                Some(Cardinality::SingleValue)
+            FieldType::Bytes(ref bytes_options) => {
+                bytes_options.is_fast().then_some(Cardinality::SingleValue)
             }
-            FieldType::Str(ref text_options) if text_options.is_fast() => {
-                Some(Cardinality::MultiValues)
+            FieldType::Str(ref text_options) => {
+                text_options.is_fast().then_some(Cardinality::MultiValues)
             }
             FieldType::U64(ref int_options)
             | FieldType::I64(ref int_options)
@@ -231,7 +251,7 @@ impl FieldType {
             FieldType::Date(ref date_options) => date_options.get_fastfield_cardinality(),
             FieldType::Facet(_) => Some(Cardinality::MultiValues),
             FieldType::JsonObject(_) => None,
-            _ => None,
+            FieldType::IpAddr(ref ip_addr_options) => ip_addr_options.get_fastfield_cardinality(),
         }
     }

@@ -250,6 +270,7 @@ impl FieldType {
             FieldType::Facet(_) => false,
             FieldType::Bytes(ref bytes_options) => bytes_options.fieldnorms(),
             FieldType::JsonObject(ref _json_object_options) => false,
+            FieldType::IpAddr(ref ip_addr_options) => ip_addr_options.fieldnorms(),
         }
     }

@@ -294,6 +315,13 @@ impl FieldType {
             FieldType::JsonObject(ref json_obj_options) => json_obj_options
                 .get_text_indexing_options()
                 .map(TextFieldIndexing::index_option),
+            FieldType::IpAddr(ref ip_addr_options) => {
+                if ip_addr_options.is_indexed() {
+                    Some(IndexRecordOption::Basic)
+                } else {
+                    None
+                }
+            }
         }
     }

@@ -302,7 +330,7 @@ impl FieldType {
     /// Tantivy will not try to cast values.
     /// For instance, If the json value is the integer `3` and the
     /// target field is a `Str`, this method will return an Error.
-    pub fn value_from_json(&self, json: JsonValue) -> Result<Value, ValueParsingError> {
+    pub fn value_from_json(&self, json: JsonValue) -> Result<Value<'static>, ValueParsingError> {
         match json {
             JsonValue::String(field_text) => {
                 match self {
@@ -314,7 +342,7 @@ impl FieldType {
                     })?;
                     Ok(DateTime::from_utc(dt_with_fixed_tz).into())
                 }
-                FieldType::Str(_) => Ok(Value::Str(field_text)),
+                FieldType::Str(_) => Ok(Value::Str(MaybeOwnedString::from_string(field_text))),
                 FieldType::U64(_) | FieldType::I64(_) | FieldType::F64(_) => {
                     Err(ValueParsingError::TypeError {
                         expected: "an integer",
@@ -333,6 +361,16 @@ impl FieldType {
                     expected: "a json object",
                     json: JsonValue::String(field_text),
                 }),
+                FieldType::IpAddr(_) => {
+                    let ip_addr: IpAddr = IpAddr::from_str(&field_text).map_err(|err| {
+                        ValueParsingError::ParseError {
+                            error: err.to_string(),
+                            json: JsonValue::String(field_text),
+                        }
+                    })?;
+
+                    Ok(Value::IpAddr(ip_addr.into_ipv6_addr()))
+                }
             }
         }
         JsonValue::Number(field_val_num) => match self {
@@ -380,6 +418,10 @@ impl FieldType {
                 expected: "a json object",
                 json: JsonValue::Number(field_val_num),
             }),
+            FieldType::IpAddr(_) => Err(ValueParsingError::TypeError {
+                expected: "a string with an ip addr",
+                json: JsonValue::Number(field_val_num),
+            }),
         },
         JsonValue::Object(json_map) => match self {
             FieldType::Str(_) => {
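The `fastfield_cardinality` rewrite above leans on `bool::then_some`, which collapses the old match-guard-plus-arm pattern into one expression:

```rust
// bool::then_some maps a flag to an Option in a single step,
// replacing `if flag { Some(x) } else { None }`.
assert!(true.then_some("fast").is_some());
assert!(false.then_some("fast").is_none());
```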
@@ -7,12 +7,13 @@ use crate::schema::{Field, Value};
 /// `FieldValue` holds together a `Field` and its `Value`.
 #[allow(missing_docs)]
 #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
-pub struct FieldValue {
+#[serde(bound(deserialize = "'a: 'de, 'de: 'a"))]
+pub struct FieldValue<'a> {
     pub field: Field,
-    pub value: Value,
+    pub value: Value<'a>,
 }

-impl FieldValue {
+impl<'a> FieldValue<'a> {
     /// Constructor
     pub fn new(field: Field, value: Value) -> FieldValue {
         FieldValue { field, value }
@@ -29,13 +30,13 @@ impl FieldValue {
     }
 }

-impl From<FieldValue> for Value {
-    fn from(field_value: FieldValue) -> Self {
+impl<'a> From<FieldValue<'a>> for Value<'a> {
+    fn from(field_value: FieldValue<'a>) -> Self {
         field_value.value
     }
 }

-impl BinarySerializable for FieldValue {
+impl<'a> BinarySerializable for FieldValue<'a> {
     fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
         self.field.serialize(writer)?;
         self.value.serialize(writer)
@@ -37,6 +37,8 @@ pub struct FastFlag;
///
/// Fast fields can be random-accessed rapidly. Fields useful for scoring, filtering
/// or collection should be marked as fast fields.
///
/// See [fast fields](`crate::fastfield`).
pub const FAST: SchemaFlagList<FastFlag, ()> = SchemaFlagList {
head: FastFlag,
tail: (),

@@ -1,3 +1,4 @@
use std::net::{IpAddr, Ipv6Addr};
use std::ops::BitOr;

use serde::{Deserialize, Serialize};
@@ -5,25 +6,52 @@ use serde::{Deserialize, Serialize};
use super::flags::{FastFlag, IndexedFlag, SchemaFlagList, StoredFlag};
use super::Cardinality;

/// Trait to convert into an Ipv6Addr.
pub trait IntoIpv6Addr {
/// Consumes the object and returns an Ipv6Addr.
fn into_ipv6_addr(self) -> Ipv6Addr;
}

impl IntoIpv6Addr for IpAddr {
fn into_ipv6_addr(self) -> Ipv6Addr {
match self {
IpAddr::V4(addr) => addr.to_ipv6_mapped(),
IpAddr::V6(addr) => addr,
}
}
}

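The conversion above relies on std's IPv4-mapped IPv6 representation. A minimal standalone sketch of the behavior, using only std (not this trait):

use std::net::{IpAddr, Ipv6Addr};

fn to_v6(ip: IpAddr) -> Ipv6Addr {
    match ip {
        // "127.0.0.1" becomes the mapped address "::ffff:127.0.0.1".
        IpAddr::V4(addr) => addr.to_ipv6_mapped(),
        // IPv6 addresses pass through unchanged.
        IpAddr::V6(addr) => addr,
    }
}

fn main() {
    let v4: IpAddr = "127.0.0.1".parse().unwrap();
    assert_eq!(to_v6(v4).to_string(), "::ffff:127.0.0.1");
    let v6: IpAddr = "::1".parse().unwrap();
    assert_eq!(to_v6(v6).to_string(), "::1");
}
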
/// Define how an ip field should be handled by tantivy.
#[derive(Clone, Debug, PartialEq, Eq, Serialize, Deserialize, Default)]
pub struct IpOptions {
pub struct IpAddrOptions {
#[serde(skip_serializing_if = "Option::is_none")]
fast: Option<Cardinality>,
stored: bool,
indexed: bool,
fieldnorms: bool,
}

impl IpOptions {
impl IpAddrOptions {
/// Returns true iff the value is a fast field.
pub fn is_fast(&self) -> bool {
self.fast.is_some()
}

/// Returns `true` if the json object should be stored.
/// Returns `true` if the ip address should be stored in the doc store.
pub fn is_stored(&self) -> bool {
self.stored
}

/// Returns true iff the value is indexed and therefore searchable.
pub fn is_indexed(&self) -> bool {
self.indexed
}

/// Returns true if and only if the value is normed.
pub fn fieldnorms(&self) -> bool {
self.fieldnorms
}

/// Returns the cardinality of the fastfield.
///
/// If the field has not been declared as a fastfield, then
@@ -32,6 +60,16 @@ impl IpOptions {
self.fast
}

/// Set the field as normed.
///
/// Setting an integer as normed will generate
/// the fieldnorm data for it.
#[must_use]
pub fn set_fieldnorms(mut self) -> Self {
self.fieldnorms = true;
self
}

/// Sets the field as stored
#[must_use]
pub fn set_stored(mut self) -> Self {
@@ -39,6 +77,19 @@ impl IpOptions {
self
}

/// Set the field as indexed.
///
/// Setting an ip address as indexed will generate
/// a posting list for each value taken by the ip address.
/// Ips are normalized to IpV6.
///
/// This is required for the field to be searchable.
#[must_use]
pub fn set_indexed(mut self) -> Self {
self.indexed = true;
self
}

/// Set the field as a fast field.
///
/// Fast fields are designed for random access.
@@ -52,52 +103,60 @@ impl IpOptions {
}
}

impl From<()> for IpOptions {
fn from(_: ()) -> IpOptions {
IpOptions::default()
impl From<()> for IpAddrOptions {
fn from(_: ()) -> IpAddrOptions {
IpAddrOptions::default()
}
}

impl From<FastFlag> for IpOptions {
impl From<FastFlag> for IpAddrOptions {
fn from(_: FastFlag) -> Self {
IpOptions {
IpAddrOptions {
fieldnorms: false,
indexed: false,
stored: false,
fast: Some(Cardinality::SingleValue),
}
}
}

impl From<StoredFlag> for IpOptions {
impl From<StoredFlag> for IpAddrOptions {
fn from(_: StoredFlag) -> Self {
IpOptions {
IpAddrOptions {
fieldnorms: false,
indexed: false,
stored: true,
fast: None,
}
}
}

impl From<IndexedFlag> for IpOptions {
impl From<IndexedFlag> for IpAddrOptions {
fn from(_: IndexedFlag) -> Self {
IpOptions {
IpAddrOptions {
fieldnorms: true,
indexed: true,
stored: false,
fast: None,
}
}
}

impl<T: Into<IpOptions>> BitOr<T> for IpOptions {
type Output = IpOptions;
impl<T: Into<IpAddrOptions>> BitOr<T> for IpAddrOptions {
type Output = IpAddrOptions;

fn bitor(self, other: T) -> IpOptions {
fn bitor(self, other: T) -> IpAddrOptions {
let other = other.into();
IpOptions {
IpAddrOptions {
fieldnorms: self.fieldnorms | other.fieldnorms,
indexed: self.indexed | other.indexed,
stored: self.stored | other.stored,
fast: self.fast.or(other.fast),
}
}
}

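Taken together with the flag conversions above, schema flags can be or-ed into the renamed options type. A sketch of the intended usage, assuming the `From<SchemaFlagList>` impl below forwards to these conversions and the re-exports in schema/mod.rs further down:

use tantivy::schema::{IpAddrOptions, FAST, STORED};

// FAST | STORED builds a SchemaFlagList, which converts into IpAddrOptions.
let opts: IpAddrOptions = (FAST | STORED).into();
assert!(opts.is_fast());
assert!(opts.is_stored());
assert!(!opts.is_indexed());
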
impl<Head, Tail> From<SchemaFlagList<Head, Tail>> for IpOptions
impl<Head, Tail> From<SchemaFlagList<Head, Tail>> for IpAddrOptions
where
Head: Clone,
Tail: Clone,

@@ -138,7 +138,7 @@ pub use self::field_type::{FieldType, Type};
pub use self::field_value::FieldValue;
pub use self::flags::{FAST, INDEXED, STORED};
pub use self::index_record_option::IndexRecordOption;
pub use self::ip_options::IpOptions;
pub use self::ip_options::{IntoIpv6Addr, IpAddrOptions};
pub use self::json_object_options::JsonObjectOptions;
pub use self::named_field_document::NamedFieldDocument;
pub use self::numeric_options::NumericOptions;

@@ -10,4 +10,5 @@ use crate::schema::Value;
/// A `NamedFieldDocument` is a simple representation of a document
/// as a `BTreeMap<String, Vec<Value>>`.
#[derive(Debug, Deserialize, Serialize)]
pub struct NamedFieldDocument(pub BTreeMap<String, Vec<Value>>);
#[serde(bound(deserialize = "'static: 'de, 'de: 'static"))]
pub struct NamedFieldDocument(pub BTreeMap<String, Vec<Value<'static>>>);

@@ -59,7 +59,7 @@ impl From<NumericOptionsDeser> for NumericOptions {
}

impl NumericOptions {
/// Returns true iff the value is stored.
/// Returns true iff the value is stored in the doc store.
pub fn is_stored(&self) -> bool {
self.stored
}

@@ -7,6 +7,7 @@ use serde::ser::SerializeSeq;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use serde_json::{self, Value as JsonValue};

use super::ip_options::IpAddrOptions;
use super::*;
use crate::schema::bytes_options::BytesOptions;
use crate::schema::field_type::ValueParsingError;
@@ -45,13 +46,9 @@ impl SchemaBuilder {
/// Adds a new u64 field.
/// Returns the associated field handle
///
/// # Caution
/// # Panics
///
/// Appending two fields with the same name
/// will result in the shadowing of the first
/// by the second one.
/// The first field will get a field id
/// but only the second one will be indexed
/// Panics when field already exists.
pub fn add_u64_field<T: Into<NumericOptions>>(
&mut self,
field_name_str: &str,
@@ -65,13 +62,9 @@ impl SchemaBuilder {
/// Adds a new i64 field.
/// Returns the associated field handle
///
/// # Caution
/// # Panics
///
/// Appending two fields with the same name
/// will result in the shadowing of the first
/// by the second one.
/// The first field will get a field id
/// but only the second one will be indexed
/// Panics when field already exists.
pub fn add_i64_field<T: Into<NumericOptions>>(
&mut self,
field_name_str: &str,
@@ -85,13 +78,9 @@ impl SchemaBuilder {
/// Adds a new f64 field.
/// Returns the associated field handle
///
/// # Caution
/// # Panics
///
/// Appending two fields with the same name
/// will result in the shadowing of the first
/// by the second one.
/// The first field will get a field id
/// but only the second one will be indexed
/// Panics when field already exists.
pub fn add_f64_field<T: Into<NumericOptions>>(
&mut self,
field_name_str: &str,
@@ -105,13 +94,9 @@ impl SchemaBuilder {
/// Adds a new bool field.
/// Returns the associated field handle
///
/// # Caution
/// # Panics
///
/// Appending two fields with the same name
/// will result in the shadowing of the first
/// by the second one.
/// The first field will get a field id
/// but only the second one will be indexed
/// Panics when field already exists.
pub fn add_bool_field<T: Into<NumericOptions>>(
&mut self,
field_name_str: &str,
@@ -127,13 +112,9 @@ impl SchemaBuilder {
/// Internally, Tantivy simply stores dates as i64 UTC timestamps,
/// while the user supplies DateTime values for convenience.
///
/// # Caution
/// # Panics
///
/// Appending two fields with the same name
/// will result in the shadowing of the first
/// by the second one.
/// The first field will get a field id
/// but only the second one will be indexed
/// Panics when field already exists.
pub fn add_date_field<T: Into<DateOptions>>(
&mut self,
field_name_str: &str,
@@ -144,16 +125,28 @@ impl SchemaBuilder {
self.add_field(field_entry)
}

/// Adds an ip field.
/// Returns the associated field handle.
///
/// # Panics
///
/// Panics when field already exists.
pub fn add_ip_addr_field<T: Into<IpAddrOptions>>(
&mut self,
field_name_str: &str,
field_options: T,
) -> Field {
let field_name = String::from(field_name_str);
let field_entry = FieldEntry::new_ip_addr(field_name, field_options.into());
self.add_field(field_entry)
}

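Usage of the new builder method, mirroring the tests further down in this diff:

let mut schema_builder = Schema::builder();
// Declare a fast, stored ip field; IPv4 input is accepted and normalized to IPv6.
schema_builder.add_ip_addr_field("ip", FAST | STORED);
let schema = schema_builder.build();
let _doc = schema.parse_document(r#"{"ip": "127.0.0.1"}"#).unwrap();
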
/// Adds a new text field.
/// Returns the associated field handle
///
/// # Caution
/// # Panics
///
/// Appending two fields with the same name
/// will result in the shadowing of the first
/// by the second one.
/// The first field will get a field id
/// but only the second one will be indexed
/// Panics when field already exists.
pub fn add_text_field<T: Into<TextOptions>>(
&mut self,
field_name_str: &str,
@@ -207,8 +200,10 @@ impl SchemaBuilder {
pub fn add_field(&mut self, field_entry: FieldEntry) -> Field {
let field = Field::from_field_id(self.fields.len() as u32);
let field_name = field_entry.name().to_string();
if let Some(_previous_value) = self.fields_map.insert(field_name, field) {
panic!("Field already exists in schema {}", field_entry.name());
};
self.fields.push(field_entry);
self.fields_map.insert(field_name, field);
field
}

@@ -313,7 +308,11 @@ impl Schema {
let mut field_map = BTreeMap::new();
for (field, field_values) in doc.get_sorted_field_values() {
let field_name = self.get_field_name(field);
let values: Vec<Value> = field_values.into_iter().cloned().collect();
let values: Vec<Value> = field_values
.into_iter()
.cloned()
.map(Value::into_owned)
.collect();
field_map.insert(field_name.to_string(), values);
}
NamedFieldDocument(field_map)
@@ -343,20 +342,21 @@ impl Schema {
if let Some(field) = self.get_field(&field_name) {
let field_entry = self.get_field_entry(field);
let field_type = field_entry.field_type();
// TODO rewrite this with shared allocation?
match json_value {
JsonValue::Array(json_items) => {
for json_item in json_items {
let value = field_type
.value_from_json(json_item)
.map_err(|e| DocParsingError::ValueError(field_name.clone(), e))?;
doc.add_field_value(field, value);
doc.add_field_value(field, value.into_owned());
}
}
_ => {
let value = field_type
.value_from_json(json_value)
.map_err(|e| DocParsingError::ValueError(field_name.clone(), e))?;
doc.add_field_value(field, value);
doc.add_field_value(field, value.into_owned());
}
}
}
@@ -598,12 +598,14 @@ mod tests {
schema_builder.add_text_field("title", TEXT);
schema_builder.add_text_field("author", STRING);
schema_builder.add_u64_field("count", count_options);
schema_builder.add_ip_addr_field("ip", FAST | STORED);
schema_builder.add_bool_field("is_read", is_read_options);
let schema = schema_builder.build();
let doc_json = r#"{
"title": "my title",
"author": "fulmicoton",
"count": 4,
"ip": "127.0.0.1",
"is_read": true
}"#;
let doc = schema.parse_document(doc_json).unwrap();
@@ -612,6 +614,39 @@ mod tests {
assert_eq!(doc, doc_serdeser);
}

#[test]
pub fn test_document_to_ipv4_json() {
let mut schema_builder = Schema::builder();
schema_builder.add_ip_addr_field("ip", FAST | STORED);
let schema = schema_builder.build();

// IpV4 loopback
let doc_json = r#"{
"ip": "127.0.0.1"
}"#;
let doc = schema.parse_document(doc_json).unwrap();
let value: serde_json::Value = serde_json::from_str(&schema.to_json(&doc)).unwrap();
assert_eq!(value["ip"][0], "127.0.0.1");

// Special case IpV6 loopback. We don't want to map that to IPv4
let doc_json = r#"{
"ip": "::1"
}"#;
let doc = schema.parse_document(doc_json).unwrap();

let value: serde_json::Value = serde_json::from_str(&schema.to_json(&doc)).unwrap();
assert_eq!(value["ip"][0], "::1");

// testing ip address of every router in the world
let doc_json = r#"{
"ip": "192.168.0.1"
}"#;
let doc = schema.parse_document(doc_json).unwrap();

let value: serde_json::Value = serde_json::from_str(&schema.to_json(&doc)).unwrap();
assert_eq!(value["ip"][0], "192.168.0.1");
}

#[test]
pub fn test_document_from_nameddoc() {
let mut schema_builder = Schema::builder();
@@ -676,7 +711,7 @@ mod tests {
let schema = schema_builder.build();
{
let doc = schema.parse_document("{}").unwrap();
assert!(doc.field_values().is_empty());
assert_eq!(doc.value_count(), 0);
}
{
let doc = schema

@@ -1,24 +1,15 @@
use std::convert::TryInto;
use std::hash::{Hash, Hasher};
use std::net::Ipv6Addr;
use std::{fmt, str};

use fastfield_codecs::MonotonicallyMappableToU128;

use super::Field;
use crate::fastfield::FastValue;
use crate::schema::{Facet, Type};
use crate::{DatePrecision, DateTime};

/// Size (in bytes) of the buffer of a fast value (u64, i64, f64, or date) term.
/// <field> + <type byte> + <value len>
///
/// - <field> is a big endian encoded u32 field id
/// - <type_byte>'s most significant bit expresses whether the term is a json term or not
/// The remaining 7 bits are used to encode the type of the value.
/// If this is a JSON term, the type is the type of the leaf of the json.
///
/// - <value> is, if this is not the json term, a binary representation specific to the type.
/// If it is a JSON Term, then it is prepended with the path that leads to this leaf value.
const FAST_VALUE_TERM_LEN: usize = 4 + 1 + 8;

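To make the layout above concrete, a test-style sketch using the constructors from this file (field id 1 and value 983 are arbitrary):

#[test]
fn fast_value_term_layout() {
    let field = Field::from_field_id(1);
    let term = Term::from_field_u64(field, 983u64);
    // 4-byte big-endian field id + 1-byte type code + 8-byte big-endian value.
    assert_eq!(term.as_slice().len(), FAST_VALUE_TERM_LEN); // 4 + 1 + 8 = 13
    assert_eq!(&term.as_slice()[0..4], &1u32.to_be_bytes());
}
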
/// Separates the different segments of
/// the json path.
pub const JSON_PATH_SEGMENT_SEP: u8 = 1u8;
@@ -36,24 +27,57 @@ pub const JSON_END_OF_PATH: u8 = 0u8;
pub struct Term<B = Vec<u8>>(B)
where B: AsRef<[u8]>;

impl AsMut<Vec<u8>> for Term {
fn as_mut(&mut self) -> &mut Vec<u8> {
&mut self.0
}
}
/// The number of bytes used as metadata by `Term`.
const TERM_METADATA_LENGTH: usize = 5;

impl Term {
pub(crate) fn new() -> Term {
Term(Vec::with_capacity(100))
pub(crate) fn with_capacity(capacity: usize) -> Term {
let mut data = Vec::with_capacity(TERM_METADATA_LENGTH + capacity);
data.resize(TERM_METADATA_LENGTH, 0u8);
Term(data)
}

pub(crate) fn with_type_and_field(typ: Type, field: Field) -> Term {
let mut term = Self::with_capacity(8);
term.set_field_and_type(field, typ);
term
}

fn with_bytes_and_field_and_payload(typ: Type, field: Field, bytes: &[u8]) -> Term {
let mut term = Self::with_capacity(bytes.len());
term.set_field_and_type(field, typ);
term.0.extend_from_slice(bytes);
term
}

fn from_fast_value<T: FastValue>(field: Field, val: &T) -> Term {
let mut term = Term(vec![0u8; FAST_VALUE_TERM_LEN]);
term.set_field(T::to_type(), field);
let mut term = Self::with_type_and_field(T::to_type(), field);
term.set_u64(val.to_u64());
term
}

/// Panics when the term is not empty... i.e. some value is set.
/// Use `clear_with_field_and_type` in that case.
///
/// Sets field and the type.
pub(crate) fn set_field_and_type(&mut self, field: Field, typ: Type) {
assert!(self.is_empty());
self.0[0..4].clone_from_slice(field.field_id().to_be_bytes().as_ref());
self.0[4] = typ.to_code();
}

/// Is empty if there are no value bytes.
pub fn is_empty(&self) -> bool {
self.0.len() == TERM_METADATA_LENGTH
}

/// Builds a term given a field, and a `Ipv6Addr`-value
pub fn from_field_ip_addr(field: Field, ip_addr: Ipv6Addr) -> Term {
let mut term = Self::with_type_and_field(Type::IpAddr, field);
term.set_ip_addr(ip_addr);
term
}

/// Builds a term given a field, and a `u64`-value
pub fn from_field_u64(field: Field, val: u64) -> Term {
Term::from_fast_value(field, &val)
@@ -82,31 +106,29 @@ impl Term {
/// Creates a `Term` given a facet.
pub fn from_facet(field: Field, facet: &Facet) -> Term {
let facet_encoded_str = facet.encoded_str();
Term::create_bytes_term(Type::Facet, field, facet_encoded_str.as_bytes())
Term::with_bytes_and_field_and_payload(Type::Facet, field, facet_encoded_str.as_bytes())
}

/// Builds a term given a field, and a string value
pub fn from_field_text(field: Field, text: &str) -> Term {
Term::create_bytes_term(Type::Str, field, text.as_bytes())
}

fn create_bytes_term(typ: Type, field: Field, bytes: &[u8]) -> Term {
let mut term = Term(vec![0u8; 5 + bytes.len()]);
term.set_field(typ, field);
term.0.extend_from_slice(bytes);
term
Term::with_bytes_and_field_and_payload(Type::Str, field, text.as_bytes())
}

/// Builds a term from bytes.
pub fn from_field_bytes(field: Field, bytes: &[u8]) -> Term {
Term::create_bytes_term(Type::Bytes, field, bytes)
Term::with_bytes_and_field_and_payload(Type::Bytes, field, bytes)
}

pub(crate) fn set_field(&mut self, typ: Type, field: Field) {
self.0.clear();
self.0
.extend_from_slice(field.field_id().to_be_bytes().as_ref());
self.0.push(typ.to_code());
/// Removes the value_bytes and sets the field and type code.
pub(crate) fn clear_with_field_and_type(&mut self, typ: Type, field: Field) {
self.truncate_value_bytes(0);
self.set_field_and_type(field, typ);
}

/// Removes the value_bytes and sets the type code.
pub fn clear_with_type(&mut self, typ: Type) {
self.truncate_value_bytes(0);
self.0[4] = typ.to_code();
}

/// Sets a u64 value in the term.
@@ -117,12 +139,6 @@ impl Term {
/// the natural order of the values.
pub fn set_u64(&mut self, val: u64) {
self.set_fast_value(val);
self.set_bytes(val.to_be_bytes().as_ref());
}

fn set_fast_value<T: FastValue>(&mut self, val: T) {
self.0.resize(FAST_VALUE_TERM_LEN, 0u8);
self.set_bytes(val.to_u64().to_be_bytes().as_ref());
}

/// Sets an `i64` value in the term.
@@ -145,9 +161,18 @@ impl Term {
self.set_fast_value(val);
}

fn set_fast_value<T: FastValue>(&mut self, val: T) {
self.set_bytes(val.to_u64().to_be_bytes().as_ref());
}

/// Sets an `Ipv6Addr` value in the term.
pub fn set_ip_addr(&mut self, val: Ipv6Addr) {
self.set_bytes(val.to_u128().to_be_bytes().as_ref());
}

/// Sets the value of a `Bytes` field.
pub fn set_bytes(&mut self, bytes: &[u8]) {
self.0.resize(5, 0u8);
self.truncate_value_bytes(0);
self.0.extend(bytes);
}

@@ -156,18 +181,22 @@ impl Term {
self.set_bytes(text.as_bytes());
}

/// Removes the value_bytes and sets the type code.
pub fn clear_with_type(&mut self, typ: Type) {
self.truncate(5);
self.0[4] = typ.to_code();
/// Truncates the value bytes of the term. Field and type stay the same.
pub fn truncate_value_bytes(&mut self, len: usize) {
self.0.truncate(len + TERM_METADATA_LENGTH);
}

/// Truncate the term right after the field and the type code.
pub fn truncate(&mut self, len: usize) {
self.0.truncate(len);
/// Returns the value bytes as mutable slice
pub fn value_bytes_mut(&mut self) -> &mut [u8] {
&mut self.0[TERM_METADATA_LENGTH..]
}

/// Truncate the term right after the field and the type code.
/// The length of the value bytes.
pub fn len_bytes(&self) -> usize {
self.0.len() - TERM_METADATA_LENGTH
}

/// Appends value bytes to the Term.
pub fn append_bytes(&mut self, bytes: &[u8]) {
self.0.extend_from_slice(bytes);
}
@@ -293,9 +322,6 @@ where B: AsRef<[u8]>
/// Returns `None` if the field is not of string type
/// or if the bytes are not valid utf-8.
pub fn as_str(&self) -> Option<&str> {
if self.as_slice().len() < 5 {
return None;
}
if self.typ() != Type::Str {
return None;
}
@@ -307,9 +333,6 @@ where B: AsRef<[u8]>
/// Returns `None` if the field is not of facet type
/// or if the bytes are not valid utf-8.
pub fn as_facet(&self) -> Option<Facet> {
if self.as_slice().len() < 5 {
return None;
}
if self.typ() != Type::Facet {
return None;
}
@@ -321,9 +344,6 @@ where B: AsRef<[u8]>
///
/// Returns `None` if the field is not of bytes type.
pub fn as_bytes(&self) -> Option<&[u8]> {
if self.as_slice().len() < 5 {
return None;
}
if self.typ() != Type::Bytes {
return None;
}
@@ -337,7 +357,7 @@ where B: AsRef<[u8]>
/// If the term is a u64, its value is encoded according
/// to `byteorder::LittleEndian`.
pub fn value_bytes(&self) -> &[u8] {
&self.0.as_ref()[5..]
&self.0.as_ref()[TERM_METADATA_LENGTH..]
}

/// Returns the underlying `&[u8]`.
@@ -415,6 +435,9 @@ fn debug_value_bytes(typ: Type, bytes: &[u8], f: &mut fmt::Formatter) -> fmt::Re
debug_value_bytes(typ, bytes, f)?;
}
}
Type::IpAddr => {
write!(f, "")?; // TODO change once we actually have IP address terms.
}
}
Ok(())
}
@@ -448,6 +471,18 @@ mod tests {
assert_eq!(term.as_str(), Some("test"))
}

/// Size (in bytes) of the buffer of a fast value (u64, i64, f64, or date) term.
/// <field> + <type byte> + <value len>
///
/// - <field> is a big endian encoded u32 field id
/// - <type_byte>'s most significant bit expresses whether the term is a json term or not
/// The remaining 7 bits are used to encode the type of the value.
/// If this is a JSON term, the type is the type of the leaf of the json.
///
/// - <value> is, if this is not the json term, a binary representation specific to the type.
/// If it is a JSON Term, then it is prepended with the path that leads to this leaf value.
const FAST_VALUE_TERM_LEN: usize = 4 + 1 + 8;

#[test]
pub fn test_term_u64() {
let mut schema_builder = Schema::builder();
@@ -455,7 +490,7 @@ mod tests {
let term = Term::from_field_u64(count_field, 983u64);
assert_eq!(term.field(), count_field);
assert_eq!(term.typ(), Type::U64);
assert_eq!(term.as_slice().len(), super::FAST_VALUE_TERM_LEN);
assert_eq!(term.as_slice().len(), FAST_VALUE_TERM_LEN);
assert_eq!(term.as_u64(), Some(983u64))
}

@@ -466,7 +501,7 @@ mod tests {
let term = Term::from_field_bool(bool_field, true);
assert_eq!(term.field(), bool_field);
assert_eq!(term.typ(), Type::Bool);
assert_eq!(term.as_slice().len(), super::FAST_VALUE_TERM_LEN);
assert_eq!(term.as_slice().len(), FAST_VALUE_TERM_LEN);
assert_eq!(term.as_bool(), Some(true))
}
}

@@ -1,5 +1,7 @@
use std::fmt;
use std::net::Ipv6Addr;

pub use not_safe::MaybeOwnedString;
use serde::de::Visitor;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use serde_json::Map;
@@ -11,9 +13,9 @@ use crate::DateTime;
/// Value represents the value of any field.
/// It is an enum over all of the possible field types.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
pub enum Value<'a> {
/// The str type is used for any text information.
Str(String),
Str(MaybeOwnedString<'a>),
/// Pre-tokenized str type,
PreTokStr(PreTokenizedString),
/// Unsigned 64-bits Integer `u64`
@@ -29,14 +31,38 @@ pub enum Value {
/// Facet
Facet(Facet),
/// Arbitrarily sized byte array
// TODO allow Cow<'a, [u8]>
Bytes(Vec<u8>),
/// Json object value.
// TODO allow Cow keys and borrowed values
JsonObject(serde_json::Map<String, serde_json::Value>),
/// IpV6 Address. Internally there is no IpV4, it needs to be converted to `Ipv6Addr`.
IpAddr(Ipv6Addr),
}

impl Eq for Value {}
impl<'a> Value<'a> {
/// Convert a borrowing [`Value`] to an owning one.
pub fn into_owned(self) -> Value<'static> {
use Value::*;
match self {
Str(val) => Str(MaybeOwnedString::from_string(val.into_string())),
PreTokStr(val) => PreTokStr(val),
U64(val) => U64(val),
I64(val) => I64(val),
F64(val) => F64(val),
Bool(val) => Bool(val),
Date(val) => Date(val),
Facet(val) => Facet(val),
Bytes(val) => Bytes(val),
JsonObject(val) => JsonObject(val),
IpAddr(val) => IpAddr(val),
}
}
}

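The borrowed-to-owned conversion in action; a sketch using the `From` impls further down in this file:

let borrowed: Value = Value::from("hello");        // Value<'a> borrowing the &str
let owned: Value<'static> = borrowed.into_owned(); // detaches from the source buffer
assert_eq!(owned.as_text(), Some("hello"));
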
impl Serialize for Value {
impl<'a> Eq for Value<'a> {}

impl<'a> Serialize for Value<'a> {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer {
match *self {
@@ -48,19 +74,27 @@ impl Serialize for Value {
Value::Bool(b) => serializer.serialize_bool(b),
Value::Date(ref date) => time::serde::rfc3339::serialize(&date.into_utc(), serializer),
Value::Facet(ref facet) => facet.serialize(serializer),
Value::Bytes(ref bytes) => serializer.serialize_bytes(bytes),
Value::Bytes(ref bytes) => serializer.serialize_str(&base64::encode(bytes)),
Value::JsonObject(ref obj) => obj.serialize(serializer),
Value::IpAddr(ref obj) => {
// Ensure IpV4 addresses get serialized as IpV4, but excluding IpV6 loopback.
if let Some(ip_v4) = obj.to_ipv4_mapped() {
ip_v4.serialize(serializer)
} else {
obj.serialize(serializer)
}
}
}
}
}

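The observable effect of the IpV4-mapped special case, as a hedged sketch (assuming serde_json and std's `to_ipv6_mapped`/`to_ipv4_mapped`):

use std::net::{Ipv4Addr, Ipv6Addr};

// A mapped IPv4 address serializes back to dotted-quad form...
let v4 = Value::IpAddr(Ipv4Addr::new(127, 0, 0, 1).to_ipv6_mapped());
assert_eq!(serde_json::to_string(&v4).unwrap(), r#""127.0.0.1""#);
// ...while the IPv6 loopback `::1` is not IPv4-mapped and stays IPv6.
let v6 = Value::IpAddr(Ipv6Addr::LOCALHOST);
assert_eq!(serde_json::to_string(&v6).unwrap(), r#""::1""#);
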
impl<'de> Deserialize<'de> for Value {
impl<'de> Deserialize<'de> for Value<'de> {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de> {
struct ValueVisitor;

impl<'de> Visitor<'de> for ValueVisitor {
type Value = Value;
type Value = Value<'de>;

fn expecting(&self, formatter: &mut fmt::Formatter<'_>) -> fmt::Result {
formatter.write_str("a string or u32")
@@ -82,12 +116,13 @@ impl<'de> Deserialize<'de> for Value {
Ok(Value::Bool(v))
}

// TODO add visit_borrowed_str
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E> {
Ok(Value::Str(v.to_owned()))
Ok(Value::Str(MaybeOwnedString::from_string(v.to_owned())))
}

fn visit_string<E>(self, v: String) -> Result<Self::Value, E> {
Ok(Value::Str(v))
Ok(Value::Str(MaybeOwnedString::from_string(v)))
}
}

@@ -95,7 +130,7 @@ impl<'de> Deserialize<'de> for Value {
}
}

impl Value {
impl<'a> Value<'a> {
/// Returns the text value, provided the value is of the `Str` type.
/// (Returns `None` if the value is not of the `Str` type).
pub fn as_text(&self) -> Option<&str> {
@@ -201,82 +236,99 @@ impl Value {
None
}
}
}

impl From<String> for Value {
fn from(s: String) -> Value {
Value::Str(s)
/// Returns the ip addr, provided the value is of the `Ip` type.
/// (Returns None if the value is not of the `Ip` type)
pub fn as_ip_addr(&self) -> Option<Ipv6Addr> {
if let Value::IpAddr(val) = self {
Some(*val)
} else {
None
}
}
}

impl From<u64> for Value {
fn from(v: u64) -> Value {
impl From<String> for Value<'static> {
fn from(s: String) -> Value<'static> {
Value::Str(MaybeOwnedString::from_string(s))
}
}

impl From<Ipv6Addr> for Value<'static> {
fn from(v: Ipv6Addr) -> Value<'static> {
Value::IpAddr(v)
}
}

impl From<u64> for Value<'static> {
fn from(v: u64) -> Value<'static> {
Value::U64(v)
}
}

impl From<i64> for Value {
fn from(v: i64) -> Value {
impl From<i64> for Value<'static> {
fn from(v: i64) -> Value<'static> {
Value::I64(v)
}
}

impl From<f64> for Value {
fn from(v: f64) -> Value {
impl From<f64> for Value<'static> {
fn from(v: f64) -> Value<'static> {
Value::F64(v)
}
}

impl From<bool> for Value {
impl From<bool> for Value<'static> {
fn from(b: bool) -> Self {
Value::Bool(b)
}
}

impl From<DateTime> for Value {
fn from(dt: DateTime) -> Value {
impl From<DateTime> for Value<'static> {
fn from(dt: DateTime) -> Value<'static> {
Value::Date(dt)
}
}

impl<'a> From<&'a str> for Value {
fn from(s: &'a str) -> Value {
Value::Str(s.to_string())
impl<'a> From<&'a str> for Value<'a> {
fn from(s: &'a str) -> Value<'a> {
Value::Str(MaybeOwnedString::from_str(s))
}
}

impl<'a> From<&'a [u8]> for Value {
fn from(bytes: &'a [u8]) -> Value {
// TODO change lifetime to 'a
impl<'a> From<&'a [u8]> for Value<'static> {
fn from(bytes: &'a [u8]) -> Value<'static> {
Value::Bytes(bytes.to_vec())
}
}

impl From<Facet> for Value {
fn from(facet: Facet) -> Value {
impl From<Facet> for Value<'static> {
fn from(facet: Facet) -> Value<'static> {
Value::Facet(facet)
}
}

impl From<Vec<u8>> for Value {
fn from(bytes: Vec<u8>) -> Value {
impl From<Vec<u8>> for Value<'static> {
fn from(bytes: Vec<u8>) -> Value<'static> {
Value::Bytes(bytes)
}
}

impl From<PreTokenizedString> for Value {
fn from(pretokenized_string: PreTokenizedString) -> Value {
impl From<PreTokenizedString> for Value<'static> {
fn from(pretokenized_string: PreTokenizedString) -> Value<'static> {
Value::PreTokStr(pretokenized_string)
}
}

impl From<serde_json::Map<String, serde_json::Value>> for Value {
fn from(json_object: serde_json::Map<String, serde_json::Value>) -> Value {
impl From<serde_json::Map<String, serde_json::Value>> for Value<'static> {
fn from(json_object: serde_json::Map<String, serde_json::Value>) -> Value<'static> {
Value::JsonObject(json_object)
}
}

impl From<serde_json::Value> for Value {
fn from(json_value: serde_json::Value) -> Value {
impl From<serde_json::Value> for Value<'static> {
fn from(json_value: serde_json::Value) -> Value<'static> {
match json_value {
serde_json::Value::Object(json_object) => Value::JsonObject(json_object),
_ => {
@@ -288,10 +340,12 @@ impl From<serde_json::Value> for Value {

mod binary_serialize {
use std::io::{self, Read, Write};
use std::net::Ipv6Addr;

use common::{f64_to_u64, u64_to_f64, BinarySerializable};
use fastfield_codecs::MonotonicallyMappableToU128;

use super::Value;
use super::{MaybeOwnedString, Value};
use crate::schema::Facet;
use crate::tokenizer::PreTokenizedString;
use crate::DateTime;
@@ -306,17 +360,19 @@ mod binary_serialize {
const EXT_CODE: u8 = 7;
const JSON_OBJ_CODE: u8 = 8;
const BOOL_CODE: u8 = 9;
const IP_CODE: u8 = 10;

// extended types

const TOK_STR_CODE: u8 = 0;

impl BinarySerializable for Value {
impl<'a> BinarySerializable for Value<'a> {
fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
match *self {
Value::Str(ref text) => {
TEXT_CODE.serialize(writer)?;
text.serialize(writer)
// TODO impl trait for MaybeOwnedString
text.as_str().to_owned().serialize(writer)
}
Value::PreTokStr(ref tok_str) => {
EXT_CODE.serialize(writer)?;
@@ -366,6 +422,10 @@ mod binary_serialize {
serde_json::to_writer(writer, &map)?;
Ok(())
}
Value::IpAddr(ref ip) => {
IP_CODE.serialize(writer)?;
ip.to_u128().serialize(writer)
}
}
}

@@ -374,7 +434,7 @@ mod binary_serialize {
match type_code {
TEXT_CODE => {
let text = String::deserialize(reader)?;
Ok(Value::Str(text))
Ok(Value::Str(MaybeOwnedString::from_string(text)))
}
U64_CODE => {
let value = u64::deserialize(reader)?;
@@ -436,6 +496,11 @@ mod binary_serialize {
let json_map = <serde_json::Map::<String, serde_json::Value> as serde::Deserialize>::deserialize(&mut de)?;
Ok(Value::JsonObject(json_map))
}
IP_CODE => {
let value = u128::deserialize(reader)?;
Ok(Value::IpAddr(Ipv6Addr::from_u128(value)))
}

_ => Err(io::Error::new(
io::ErrorKind::InvalidData,
format!("No field type is associated with code {:?}", type_code),
@@ -448,9 +513,52 @@ mod binary_serialize {
#[cfg(test)]
mod tests {
use super::Value;
use crate::schema::{BytesOptions, Schema};
use crate::time::format_description::well_known::Rfc3339;
use crate::time::OffsetDateTime;
use crate::DateTime;
use crate::{DateTime, Document};

#[test]
fn test_parse_bytes_doc() {
let mut schema_builder = Schema::builder();
let bytes_options = BytesOptions::default();
let bytes_field = schema_builder.add_bytes_field("my_bytes", bytes_options);
let schema = schema_builder.build();
let mut doc = Document::default();
doc.add_bytes(bytes_field, "this is a test".as_bytes());
let json_string = schema.to_json(&doc);
assert_eq!(json_string, r#"{"my_bytes":["dGhpcyBpcyBhIHRlc3Q="]}"#);
}

#[test]
fn test_parse_empty_bytes_doc() {
let mut schema_builder = Schema::builder();
let bytes_options = BytesOptions::default();
let bytes_field = schema_builder.add_bytes_field("my_bytes", bytes_options);
let schema = schema_builder.build();
let mut doc = Document::default();
doc.add_bytes(bytes_field, "".as_bytes());
let json_string = schema.to_json(&doc);
assert_eq!(json_string, r#"{"my_bytes":[""]}"#);
}

#[test]
fn test_parse_many_bytes_doc() {
let mut schema_builder = Schema::builder();
let bytes_options = BytesOptions::default();
let bytes_field = schema_builder.add_bytes_field("my_bytes", bytes_options);
let schema = schema_builder.build();
let mut doc = Document::default();
doc.add_bytes(
bytes_field,
"A bigger test I guess\nspanning on multiple lines\nhoping this will work".as_bytes(),
);
let json_string = schema.to_json(&doc);
assert_eq!(
json_string,
r#"{"my_bytes":["QSBiaWdnZXIgdGVzdCBJIGd1ZXNzCnNwYW5uaW5nIG9uIG11bHRpcGxlIGxpbmVzCmhvcGluZyB0aGlzIHdpbGwgd29yaw=="]}"#
);
}

#[test]
fn test_serialize_date() {
@@ -468,3 +576,104 @@ mod tests {
assert_eq!(serialized_value_json, r#""1996-12-20T01:39:57Z""#);
}
}

mod not_safe {
use std::ops::Deref;

union Ref<'a, T: ?Sized> {
shared: &'a T,
uniq: &'a mut T,
}

pub struct MaybeOwnedString<'a> {
string: Ref<'a, str>,
capacity: usize,
}

impl<'a> MaybeOwnedString<'a> {
pub fn from_str(string: &'a str) -> MaybeOwnedString<'a> {
MaybeOwnedString {
string: Ref { shared: string },
capacity: 0,
}
}

pub fn from_string(mut string: String) -> MaybeOwnedString<'static> {
string.shrink_to_fit(); // <= actually important for safety, todo use the Vec .as_ptr instead

let mut s = std::mem::ManuallyDrop::new(string);
let ptr = s.as_mut_ptr();
let len = s.len();
let capacity = s.capacity();

let string = unsafe {
std::str::from_utf8_unchecked_mut(std::slice::from_raw_parts_mut(ptr, len))
};
MaybeOwnedString {
string: Ref { uniq: string },
capacity,
}
}

pub fn into_string(mut self) -> String {
if self.capacity != 0 {
let capacity = self.capacity;
// Zero the capacity before reconstructing the String, so that
// `Drop` does not free the same buffer a second time.
self.capacity = 0;
let string = unsafe { &mut self.string.uniq };
unsafe {
return String::from_raw_parts(string.as_mut_ptr(), string.len(), capacity);
};
}
self.deref().to_owned()
}

pub fn as_str(&self) -> &str {
self.deref()
}
}

impl<'a> Deref for MaybeOwnedString<'a> {
type Target = str;

#[inline]
fn deref(&self) -> &str {
unsafe { self.string.shared }
}
}

impl<'a> Drop for MaybeOwnedString<'a> {
fn drop(&mut self) {
// if capacity is 0, either it's an empty String so there is no dealloc to do, or it's
// borrowed
if self.capacity != 0 {
let string = unsafe { &mut self.string.uniq };
unsafe { String::from_raw_parts(string.as_mut_ptr(), self.len(), self.capacity) };
}
}
}

impl<'a> Clone for MaybeOwnedString<'a> {
fn clone(&self) -> Self {
if self.capacity == 0 {
MaybeOwnedString {
string: Ref {
shared: unsafe { self.string.shared },
},
capacity: 0,
}
} else {
MaybeOwnedString::from_string(self.deref().to_owned())
}
}
}

impl<'a> std::fmt::Debug for MaybeOwnedString<'a> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.deref())
}
}

impl<'a> PartialEq for MaybeOwnedString<'a> {
fn eq(&self, other: &Self) -> bool {
self.deref() == other.deref()
}
}
}

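For context, this `not_safe` module hand-rolls a borrow-or-own string where `capacity == 0` marks the borrowed case. The same contract can be expressed safely with `Cow<'a, str>`; the union version presumably trades that for a smaller, branch-free representation. A safe sketch of the equivalent API surface:

use std::borrow::Cow;

struct SafeMaybeOwned<'a>(Cow<'a, str>);

impl<'a> SafeMaybeOwned<'a> {
    fn from_str(s: &'a str) -> Self { Self(Cow::Borrowed(s)) }          // no allocation
    fn from_string(s: String) -> SafeMaybeOwned<'static> { SafeMaybeOwned(Cow::Owned(s)) }
    fn into_string(self) -> String { self.0.into_owned() }              // clones only if borrowed
    fn as_str(&self) -> &str { &self.0 }
}
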
@@ -1,7 +1,7 @@
use std::io;
use std::ops::Range;

use common::VInt;
use common::{read_u32_vint, VInt};

use crate::store::index::{Checkpoint, CHECKPOINT_PERIOD};
use crate::DocId;
@@ -85,15 +85,15 @@ impl CheckpointBlock {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, ""));
}
self.checkpoints.clear();
let len = VInt::deserialize_u64(data)? as usize;
let len = read_u32_vint(data);
if len == 0 {
return Ok(());
}
let mut doc = VInt::deserialize_u64(data)? as DocId;
let mut start_offset = VInt::deserialize_u64(data)? as usize;
let mut doc = read_u32_vint(data);
let mut start_offset = read_u32_vint(data) as usize;
for _ in 0..len {
let num_docs = VInt::deserialize_u64(data)? as DocId;
let block_num_bytes = VInt::deserialize_u64(data)? as usize;
let num_docs = read_u32_vint(data);
let block_num_bytes = read_u32_vint(data) as usize;
self.checkpoints.push(Checkpoint {
doc_range: doc..doc + num_docs,
byte_range: start_offset..start_offset + block_num_bytes,

@@ -96,7 +96,7 @@ pub mod tests {
let mut doc = Document::default();
doc.add_field_value(field_body, LOREM.to_string());
doc.add_field_value(field_title, format!("Doc {i}"));
store_writer.store(&doc).unwrap();
store_writer.store(&doc, &schema).unwrap();
}
store_writer.close().unwrap();
}

@@ -1,11 +1,11 @@
use std::io::{self, Write};
use std::io;

use common::BinarySerializable;

use super::compressors::Compressor;
use super::StoreReader;
use crate::directory::WritePtr;
use crate::schema::Document;
use crate::schema::{Document, Schema};
use crate::store::store_compressor::BlockCompressor;
use crate::DocId;

@@ -20,7 +20,6 @@ pub struct StoreWriter {
compressor: Compressor,
block_size: usize,
num_docs_in_current_block: DocId,
intermediary_buffer: Vec<u8>,
current_block: Vec<u8>,
doc_pos: Vec<u32>,
block_compressor: BlockCompressor,
@@ -42,7 +41,6 @@ impl StoreWriter {
compressor,
block_size,
num_docs_in_current_block: 0,
intermediary_buffer: Vec::new(),
doc_pos: Vec::new(),
current_block: Vec::new(),
block_compressor,
@@ -55,9 +53,7 @@ impl StoreWriter {

/// The memory used (inclusive of children)
pub fn mem_usage(&self) -> usize {
self.intermediary_buffer.capacity()
+ self.current_block.capacity()
+ self.doc_pos.capacity() * std::mem::size_of::<u32>()
self.current_block.capacity() + self.doc_pos.capacity() * std::mem::size_of::<u32>()
}

/// Checks if the current block is full, and if so, compresses and flushes it.
@@ -99,15 +95,9 @@ impl StoreWriter {
///
/// The document id is implicitly the current number
/// of documents.
pub fn store(&mut self, stored_document: &Document) -> io::Result<()> {
self.intermediary_buffer.clear();
stored_document.serialize(&mut self.intermediary_buffer)?;
// calling store bytes would be preferable for code reuse, but then we can't use
// intermediary_buffer due to the borrow checker
// a new buffer costs ~1% indexing performance
pub fn store(&mut self, document: &Document, schema: &Schema) -> io::Result<()> {
self.doc_pos.push(self.current_block.len() as u32);
self.current_block
.write_all(&self.intermediary_buffer[..])?;
document.serialize_stored(schema, &mut self.current_block)?;
self.num_docs_in_current_block += 1;
self.check_flush_block()?;
Ok(())

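The shape of the resulting block layout, as a simplified standalone model (the payload bytes stand in for `serialize_stored` output):

let mut current_block: Vec<u8> = Vec::new();
let mut doc_pos: Vec<u32> = Vec::new();
for payload in [&b"doc0"[..], b"doc1!", b"doc2"] {
    // Each document records its start offset, then serializes in place,
    // skipping the old intermediary copy.
    doc_pos.push(current_block.len() as u32);
    current_block.extend_from_slice(payload);
}
assert_eq!(doc_pos, vec![0, 4, 9]);
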
@@ -13,7 +13,7 @@ pub struct SSTableIndex {

impl SSTableIndex {
pub(crate) fn load(data: &[u8]) -> Result<SSTableIndex, DataCorruption> {
serde_cbor::de::from_slice(data)
ciborium::de::from_reader(data)
.map_err(|_| DataCorruption::comment_only("SSTable index is corrupted"))
}

@@ -85,9 +85,9 @@ impl SSTableIndexBuilder {
})
}

pub fn serialize(&self, wrt: &mut dyn io::Write) -> io::Result<()> {
serde_cbor::ser::to_writer(wrt, &self.index).unwrap();
Ok(())
pub fn serialize<W: std::io::Write>(&self, wrt: W) -> io::Result<()> {
ciborium::ser::into_writer(&self.index, wrt)
.map_err(|err| io::Error::new(io::ErrorKind::Other, err))
}
}

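A round trip with the new ciborium dependency, sketched with a hypothetical serde-derived stand-in for the real index type:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct BlockMeta { last_key: Vec<u8>, byte_offset: u64 }

fn main() {
    let meta = BlockMeta { last_key: b"zebra".to_vec(), byte_offset: 4096 };
    let mut bytes = Vec::new();
    // ciborium reports its own error type, hence the io::Error wrapping above.
    ciborium::ser::into_writer(&meta, &mut bytes).unwrap();
    let decoded: BlockMeta = ciborium::de::from_reader(bytes.as_slice()).unwrap();
    assert_eq!(decoded, meta);
}
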
@@ -24,6 +24,8 @@ impl SSTable for TermInfoSSTable {
type Reader = TermInfoReader;
type Writer = TermInfoWriter;
}

/// Builder for the new term dictionary.
pub struct TermDictionaryBuilder<W: io::Write> {
sstable_writer: Writer<W, TermInfoWriter>,
}
@@ -138,6 +140,7 @@ impl TermDictionary {
})
}

/// Creates a term dictionary from the supplied bytes.
pub fn from_bytes(owned_bytes: OwnedBytes) -> crate::Result<TermDictionary> {
TermDictionary::open(FileSlice::new(Arc::new(owned_bytes)))
}
@@ -229,19 +232,19 @@ impl TermDictionary {
Ok(None)
}

// Returns a range builder, to stream all of the terms
// within an interval.
/// Returns a range builder, to stream all of the terms
/// within an interval.
pub fn range(&self) -> TermStreamerBuilder<'_> {
TermStreamerBuilder::new(self, AlwaysMatch)
}

// A stream of all the sorted terms.
/// A stream of all the sorted terms.
pub fn stream(&self) -> io::Result<TermStreamer<'_>> {
self.range().into_stream()
}

// Returns a search builder, to stream all of the terms
// within the Automaton
/// Returns a search builder, to stream all of the terms
/// within the Automaton
pub fn search<'a, A: Automaton + 'a>(&'a self, automaton: A) -> TermStreamerBuilder<'a, A>
where A::State: Clone {
TermStreamerBuilder::<A>::new(self, automaton)

@@ -126,6 +126,7 @@ mod ngram_tokenizer;
mod raw_tokenizer;
mod remove_long;
mod simple_tokenizer;
mod split_compound_words;
mod stemmer;
mod stop_word_filter;
mod tokenized_string;
@@ -141,6 +142,7 @@ pub use self::ngram_tokenizer::NgramTokenizer;
pub use self::raw_tokenizer::RawTokenizer;
pub use self::remove_long::RemoveLongFilter;
pub use self::simple_tokenizer::SimpleTokenizer;
pub use self::split_compound_words::SplitCompoundWords;
pub use self::stemmer::{Language, Stemmer};
pub use self::stop_word_filter::StopWordFilter;
pub use self::tokenized_string::{PreTokenizedStream, PreTokenizedString};

src/tokenizer/split_compound_words.rs (new file, 252 lines)
@@ -0,0 +1,252 @@
use std::sync::Arc;
|
||||
|
||||
use aho_corasick::{AhoCorasick, AhoCorasickBuilder, MatchKind, StateID};
|
||||
|
||||
use super::{BoxTokenStream, Token, TokenFilter, TokenStream};
|
||||
|
||||
/// A [`TokenFilter`] which splits compound words into their parts
|
||||
/// based on a given dictionary.
|
||||
///
|
||||
/// Words only will be split if they can be fully decomposed into
|
||||
/// consecutive matches into the given dictionary.
|
||||
///
|
||||
/// This is mostly useful to split [compound nouns][compound] common to many
|
||||
/// Germanic languages into their constituents.
|
||||
///
|
||||
/// # Example
|
||||
///
|
||||
/// The quality of the dictionary determines the quality of the splits,
|
||||
/// e.g. the missing stem "back" of "backen" implies that "brotbackautomat"
|
||||
/// is not split in the following example.
|
||||
///
|
||||
/// ```rust
|
||||
/// use tantivy::tokenizer::{SimpleTokenizer, SplitCompoundWords, TextAnalyzer};
|
||||
///
|
||||
/// let tokenizer =
|
||||
/// TextAnalyzer::from(SimpleTokenizer).filter(SplitCompoundWords::from_dictionary([
|
||||
/// "dampf", "schiff", "fahrt", "brot", "backen", "automat",
|
||||
/// ]));
|
||||
///
|
||||
/// let mut stream = tokenizer.token_stream("dampfschifffahrt");
|
||||
/// assert_eq!(stream.next().unwrap().text, "dampf");
|
||||
/// assert_eq!(stream.next().unwrap().text, "schiff");
|
||||
/// assert_eq!(stream.next().unwrap().text, "fahrt");
|
||||
/// assert_eq!(stream.next(), None);
|
||||
///
|
||||
/// let mut stream = tokenizer.token_stream("brotbackautomat");
|
||||
/// assert_eq!(stream.next().unwrap().text, "brotbackautomat");
|
||||
/// assert_eq!(stream.next(), None);
|
||||
/// ```
|
||||
///
|
||||
/// [compound]: https://en.wikipedia.org/wiki/Compound_(linguistics)
|
||||
#[derive(Clone)]
|
||||
pub struct SplitCompoundWords<S: StateID> {
|
||||
dict: Arc<AhoCorasick<S>>,
|
||||
}
|
||||
|
||||
impl SplitCompoundWords<usize> {
|
||||
/// Create a filter from a given dictionary.
|
||||
///
|
||||
/// The dictionary will be used to construct an [`AhoCorasick`] automaton
|
||||
/// with reasonable defaults. See [`from_automaton`][Self::from_automaton] if
|
||||
/// more control over its construction is required.
|
||||
pub fn from_dictionary<I, P>(dict: I) -> Self
|
||||
where
|
||||
I: IntoIterator<Item = P>,
|
||||
P: AsRef<[u8]>,
|
||||
{
|
||||
let dict = AhoCorasickBuilder::new()
|
||||
.match_kind(MatchKind::LeftmostLongest)
|
||||
.build(dict);
|
||||
|
||||
Self::from_automaton(dict)
|
||||
}
|
||||
}
|
||||
|
||||
impl<S: StateID> SplitCompoundWords<S> {
|
||||
/// Create a filter from a given automaton.
|
||||
///
|
||||
/// The automaton should use one of the leftmost-first match kinds
|
||||
/// and it should not be anchored.
|
||||
pub fn from_automaton(dict: AhoCorasick<S>) -> Self {
|
||||
Self {
|
||||
dict: Arc::new(dict),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl<S: StateID + Send + Sync + 'static> TokenFilter for SplitCompoundWords<S> {
|
||||
fn transform<'a>(&self, stream: BoxTokenStream<'a>) -> BoxTokenStream<'a> {
|
||||
BoxTokenStream::from(SplitCompoundWordsTokenStream {
|
||||
dict: self.dict.clone(),
|
||||
tail: stream,
|
||||
cuts: Vec::new(),
|
||||
parts: Vec::new(),
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
struct SplitCompoundWordsTokenStream<'a, S: StateID> {
|
||||
dict: Arc<AhoCorasick<S>>,
|
||||
tail: BoxTokenStream<'a>,
|
||||
cuts: Vec<usize>,
|
||||
parts: Vec<Token>,
|
||||
}
|
||||
|
||||
impl<'a, S: StateID> SplitCompoundWordsTokenStream<'a, S> {
|
||||
// Will use `self.cuts` to fill `self.parts` if `self.tail.token()`
|
||||
// can fully be split into consecutive matches against `self.dict`.
|
||||
fn split(&mut self) {
|
||||
let token = self.tail.token();
|
||||
let mut text = token.text.as_str();
|
||||
|
||||
self.cuts.clear();
|
||||
let mut pos = 0;
|
||||
|
||||
for match_ in self.dict.find_iter(text) {
|
||||
if pos != match_.start() {
|
||||
break;
|
||||
}
|
||||
|
||||
self.cuts.push(pos);
|
||||
pos = match_.end();
|
||||
}
|
||||
|
||||
if pos == token.text.len() {
|
||||
// Fill `self.parts` in reverse order,
|
||||
// so that `self.parts.pop()` yields
|
||||
// the tokens in their original order.
|
||||
for pos in self.cuts.iter().rev() {
|
||||
let (head, tail) = text.split_at(*pos);
|
||||
|
||||
text = head;
|
||||
self.parts.push(Token {
|
||||
text: tail.to_owned(),
|
||||
..*token
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a, S: StateID> TokenStream for SplitCompoundWordsTokenStream<'a, S> {
|
||||
fn advance(&mut self) -> bool {
|
||||
self.parts.pop();
|
||||
|
||||
if !self.parts.is_empty() {
|
||||
return true;
|
||||
}
|
||||
|
||||
if !self.tail.advance() {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Will yield either `self.parts.last()` or
|
||||
// `self.tail.token()` if it could not be split.
|
||||
self.split();
|
||||
true
|
||||
}
|
||||
|
||||
fn token(&self) -> &Token {
|
||||
self.parts.last().unwrap_or_else(|| self.tail.token())
|
||||
}
|
||||
|
||||
fn token_mut(&mut self) -> &mut Token {
|
||||
self.parts
|
||||
.last_mut()
|
||||
.unwrap_or_else(|| self.tail.token_mut())
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
mod tests {
    use super::*;
    use crate::tokenizer::{SimpleTokenizer, TextAnalyzer};

    #[test]
    fn splitting_compound_words_works() {
        let tokenizer = TextAnalyzer::from(SimpleTokenizer)
            .filter(SplitCompoundWords::from_dictionary(["foo", "bar"]));

        {
            let mut stream = tokenizer.token_stream("");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foo bar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foobar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foobarbaz");
            assert_eq!(stream.next().unwrap().text, "foobarbaz");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("baz foobar qux");
            assert_eq!(stream.next().unwrap().text, "baz");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "qux");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foobar foobar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foobar foo bar foobar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foobazbar foo bar foobar");
            assert_eq!(stream.next().unwrap().text, "foobazbar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("foobar qux foobar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "qux");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next(), None);
        }

        {
            let mut stream = tokenizer.token_stream("barfoo");
            assert_eq!(stream.next().unwrap().text, "bar");
            assert_eq!(stream.next().unwrap().text, "foo");
            assert_eq!(stream.next(), None);
        }
    }
}
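Beyond the synthetic "foo"/"bar" tests, typical use pairs the filter with a dictionary of compound constituents for the target language. A hedged usage sketch, assuming the filter is re-exported as `tantivy::tokenizer::SplitCompoundWords` like the other filters (the German word list is illustrative):

use tantivy::tokenizer::{SimpleTokenizer, SplitCompoundWords, TextAnalyzer};

fn main() {
    let tokenizer = TextAnalyzer::from(SimpleTokenizer)
        .filter(SplitCompoundWords::from_dictionary(["dampf", "schiff", "fahrt"]));

    // Splits only because the matches cover the whole token:
    // prints "dampf", "schiff", "fahrt".
    let mut stream = tokenizer.token_stream("dampfschifffahrt");
    while stream.advance() {
        println!("{}", stream.token().text);
    }
}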
@@ -1,3 +1,6 @@
+use std::borrow::Cow;
+use std::mem;
+
 use rust_stemmers::{self, Algorithm};
 use serde::{Deserialize, Serialize};
@@ -84,6 +87,7 @@ impl TokenFilter for Stemmer {
         BoxTokenStream::from(StemmerTokenStream {
             tail: token_stream,
             stemmer: inner_stemmer,
+            buffer: String::new(),
         })
     }
 }
@@ -91,6 +95,7 @@ impl TokenFilter for Stemmer {
 pub struct StemmerTokenStream<'a> {
     tail: BoxTokenStream<'a>,
     stemmer: rust_stemmers::Stemmer,
+    buffer: String,
 }
 
 impl<'a> TokenStream for StemmerTokenStream<'a> {
@@ -98,10 +103,16 @@ impl<'a> TokenStream for StemmerTokenStream<'a> {
         if !self.tail.advance() {
             return false;
         }
-        // TODO remove allocation
-        let stemmed_str: String = self.stemmer.stem(&self.token().text).into_owned();
-        self.token_mut().text.clear();
-        self.token_mut().text.push_str(&stemmed_str);
+        let token = self.tail.token_mut();
+        let stemmed_str = self.stemmer.stem(&token.text);
+        match stemmed_str {
+            Cow::Owned(stemmed_str) => token.text = stemmed_str,
+            Cow::Borrowed(stemmed_str) => {
+                self.buffer.clear();
+                self.buffer.push_str(stemmed_str);
+                mem::swap(&mut token.text, &mut self.buffer);
+            }
+        }
         true
     }
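The rewritten `advance` drops one `String` allocation per token: `rust_stemmers::Stemmer::stem` returns a `Cow<str>`, so a borrowed result (often just a prefix of the input) is copied into a reusable scratch buffer that is then swapped with the token text, letting both strings keep their capacity. A self-contained sketch of the pattern using only the standard library; `strip_plural` and `apply_in_place` are hypothetical stand-ins for the stemmer and the token stream:

use std::borrow::Cow;
use std::mem;

// Hypothetical stand-in for `Stemmer::stem`: returns a borrowed
// prefix when it can, and only allocates when it must.
fn strip_plural(s: &str) -> Cow<'_, str> {
    match s.strip_suffix('s') {
        Some(stem) => Cow::Borrowed(stem),
        // Imagine a rewriting rule here that has to allocate.
        None => Cow::Owned(s.to_owned()),
    }
}

/// Rewrites `text` in place, reusing `buffer` so the borrowed case
/// shuffles bytes around instead of allocating a fresh `String`.
fn apply_in_place<F>(text: &mut String, buffer: &mut String, f: F)
where
    F: for<'a> Fn(&'a str) -> Cow<'a, str>,
{
    let result = f(text.as_str());
    match result {
        Cow::Owned(owned) => *text = owned,
        Cow::Borrowed(borrowed) => {
            buffer.clear();
            buffer.push_str(borrowed);
            // Swap rather than copy back: both strings keep their capacity.
            mem::swap(text, buffer);
        }
    }
}

fn main() {
    let mut buffer = String::new(); // lives across tokens, like `StemmerTokenStream::buffer`
    let mut text = String::from("tokens");
    apply_in_place(&mut text, &mut buffer, strip_plural);
    assert_eq!(text, "token");
}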
@@ -10,28 +10,21 @@
 //! assert_eq!(stream.next().unwrap().text, "crafty");
 //! assert!(stream.next().is_none());
 //! ```
-use std::collections::HashSet;
-use std::hash::BuildHasherDefault;
-
-use fnv::FnvHasher;
+use rustc_hash::FxHashSet;
 
 use super::{Token, TokenFilter, TokenStream};
 use crate::tokenizer::BoxTokenStream;
 
-// configure our hashers for SPEED
-type StopWordHasher = BuildHasherDefault<FnvHasher>;
-type StopWordHashSet = HashSet<String, StopWordHasher>;
-
 /// `TokenFilter` that removes stop words from a token stream
 #[derive(Clone)]
 pub struct StopWordFilter {
-    words: StopWordHashSet,
+    words: FxHashSet<String>,
 }
 
 impl StopWordFilter {
     /// Creates a `StopWordFilter` given a list of words to remove
     pub fn remove(words: Vec<String>) -> StopWordFilter {
-        let mut set = StopWordHashSet::default();
+        let mut set = FxHashSet::default();
 
         for word in words {
            set.insert(word);
@@ -52,7 +45,7 @@ impl StopWordFilter {
 }
 
 pub struct StopWordFilterStream<'a> {
-    words: StopWordHashSet,
+    words: FxHashSet<String>,
     tail: BoxTokenStream<'a>,
 }
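These hunks replace the hand-rolled FNV-backed `HashSet` alias with `rustc_hash::FxHashSet`, which packages the same idea (a fast, non-cryptographic hasher for short keys) behind a single type. For reference, it behaves like any `std` `HashSet`; a minimal sketch of the build-once, probe-per-token pattern, assuming the `rustc-hash` crate as a dependency:

use rustc_hash::FxHashSet;

fn main() {
    // Build the set once, as `StopWordFilter::remove` does.
    let mut words: FxHashSet<String> = FxHashSet::default();
    for word in ["the", "a", "of"] {
        words.insert(word.to_owned());
    }

    // Per-token check: keep only tokens that are not stop words.
    let tokens = ["the", "old", "man", "of", "the", "sea"];
    let kept: Vec<&str> = tokens
        .iter()
        .copied()
        .filter(|token| !words.contains(*token))
        .collect();
    assert_eq!(kept, ["old", "man", "sea"]);
}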