Fixing bench compilation

fmt
2026-02-20 14:50:38 +00:00 · 2019-10-04 16:36:17 +09:00 · 2019-10-02 09:50:20 +09:00
22 changed files with 123 additions and 265 deletions
--- a/README.md
+++ b/README.md
@@ -21,9 +21,9 @@
 [![Become a patron](https://c5.patreon.com/external/logo/become_a_patron_button.png)](https://www.patreon.com/fulmicoton)


-**Tantivy** is a **full text search engine library** written in Rust.
+**Tantivy** is a **full text search engine library** written in rust.

-It is closer to [Apache Lucene](https://lucene.apache.org/) than to [Elasticsearch](https://www.elastic.co/products/elasticsearch) or [Apache Solr](https://lucene.apache.org/solr/) in the sense it is not
+It is closer to [Apache Lucene](https://lucene.apache.org/) than to [Elasticsearch](https://www.elastic.co/products/elasticsearch) and [Apache Solr](https://lucene.apache.org/solr/) in the sense it is not
 an off-the-shelf search engine server, but rather a crate that can be used
 to build such a search engine.

@@ -31,7 +31,7 @@ Tantivy is, in fact, strongly inspired by Lucene's design.

 # Benchmark

-Tantivy is typically faster than Lucene, but the results depend on 
+Tantivy is typically faster than Lucene, but the results will depend on 
 the nature of the queries in your workload.

 The following [benchmark](https://tantivy-search.github.io/bench/) break downs 
@@ -40,19 +40,19 @@ performance for different type of queries / collection.
 # Features

 - Full-text search
-- Configurable tokenizer (stemming available for 17 Latin languages with third party support for Chinese ([tantivy-jieba](https://crates.io/crates/tantivy-jieba) and [cang-jie](https://crates.io/crates/cang-jie)) and [Japanese](https://crates.io/crates/tantivy-tokenizer-tiny-segmenter))
+- Configurable tokenizer. (stemming available for 17 latin languages. Third party support for Chinese ([tantivy-jieba](https://crates.io/crates/tantivy-jieba) and [cang-jie](https://crates.io/crates/cang-jie)) and [Japanese](https://crates.io/crates/tantivy-tokenizer-tiny-segmenter)
 - Fast (check out the :racehorse: :sparkles: [benchmark](https://tantivy-search.github.io/bench/) :sparkles: :racehorse:)
 - Tiny startup time (<10ms), perfect for command line tools
-- BM25 scoring (the same as Lucene)
-- Natural query language (e.g. `(michael AND jackson) OR "king of pop"`)
-- Phrase queries search (e.g. `"michael jackson"`)
+- BM25 scoring (the same as lucene)
+- Natural query language `(michael AND jackson) OR "king of pop"`
+- Phrase queries search (`"michael jackson"`)
 - Incremental indexing
 - Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
 - Mmap directory
-- SIMD integer compression when the platform/CPU includes the SSE2 instruction set
-- Single valued and multivalued u64, i64, and f64 fast fields (equivalent of doc values in Lucene)
+- SIMD integer compression when the platform/CPU includes the SSE2 instruction set.
+- Single valued and multivalued u64, i64 and f64 fast fields (equivalent of doc values in Lucene)
 - `&[u8]` fast fields
-- Text, i64, u64, f64, dates, and hierarchical facet fields
+- Text, i64, u64, f64, dates and hierarchical facet fields
 - LZ4 compressed document store
 - Range queries
 - Faceted search
@@ -61,42 +61,43 @@ performance for different type of queries / collection.

 # Non-features

-- Distributed search is out of the scope of Tantivy. That being said, Tantivy is a
+- Distributed search is out of the scope of tantivy. That being said, tantivy is meant as a
 library upon which one could build a distributed search. Serializable/mergeable collector state for instance, 
-are within the scope of Tantivy.
+are within the scope of tantivy.

 # Supported OS and compiler

-Tantivy works on stable Rust (>= 1.27) and supports Linux, MacOS, and Windows.
+Tantivy works on stable rust (>= 1.27) and supports Linux, MacOS and Windows.

 # Getting started

-- [Tantivy's simple search example](https://tantivy-search.github.io/examples/basic_search.html)
-- [tantivy-cli and its tutorial](https://github.com/tantivy-search/tantivy-cli) - `tantivy-cli` is an actual command line interface that makes it easy for you to create a search engine,
-index documents, and search via the CLI or a small server with a REST API.
-It walks you through getting a wikipedia search engine up and running in a few minutes.
-- [Reference doc for the last released version](https://docs.rs/tantivy/)
+- [tantivy's simple search example](https://tantivy-search.github.io/examples/basic_search.html)
+- [tantivy-cli and its tutorial](https://github.com/tantivy-search/tantivy-cli).
+`tantivy-cli` is an actual command line interface that makes it easy for you to create a search engine,
+index documents and search via the CLI or a small server with a REST API.
+It will walk you through getting a wikipedia search engine up and running in a few minutes.
+- [reference doc for the last released version](https://docs.rs/tantivy/)

 # How can I support this project?

 There are many ways to support this project. 

-- Use Tantivy and tell us about your experience on [Gitter](https://gitter.im/tantivy-search/tantivy) or by email (paul.masurel@gmail.com)
+- Use tantivy and tell us about your experience on [gitter](https://gitter.im/tantivy-search/tantivy) or by email (paul.masurel@gmail.com)
 - Report bugs
 - Write a blog post
 - Help with documentation by asking questions or submitting PRs
-- Contribute code (you can join [our Gitter](https://gitter.im/tantivy-search/tantivy))
-- Talk about Tantivy around you
+- Contribute code (you can join [our gitter](https://gitter.im/tantivy-search/tantivy) )
+- Talk about tantivy around you
 - Drop a word on on [![Say Thanks!](https://img.shields.io/badge/Say%20Thanks-!-1EAEDB.svg)](https://saythanks.io/to/fulmicoton) or even [![Become a patron](https://c5.patreon.com/external/logo/become_a_patron_button.png)](https://www.patreon.com/fulmicoton)

 # Contributing code

-We use the GitHub Pull Request workflow: reference a GitHub ticket and/or include a comprehensive commit message when opening a PR.
+We use the GitHub Pull Request workflow - reference a GitHub ticket and/or include a comprehensive commit message when opening a PR.

 ## Clone and build locally

-Tantivy compiles on stable Rust but requires `Rust >= 1.27`.
-To check out and run tests, you can simply run:
+Tantivy compiles on stable rust but requires `Rust >= 1.27`.
+To check out and run tests, you can simply run :

 ```bash
    git clone https://github.com/tantivy-search/tantivy.git
@@ -107,7 +108,7 @@ To check out and run tests, you can simply run:
 ## Run tests

 Some tests will not run with just `cargo test` because of `fail-rs`.
-To run the tests exhaustively, run `./run-tests.sh`.
+To run the tests exhaustively, run `./run-tests.sh`

 ## Debug

@@ -115,13 +116,13 @@ You might find it useful to step through the programme with a debugger.

 ### A failing test

-Make sure you haven't run `cargo clean` after the most recent `cargo test` or `cargo build` to guarantee that the `target/` directory exists. Use this bash script to find the name of the most recent debug build of Tantivy and run it under `rust-gdb`:
+Make sure you haven't run `cargo clean` after the most recent `cargo test` or `cargo build` to guarantee that `target/` dir exists. Use this bash script to find the most name of the most recent debug build of tantivy and run it under rust-gdb.

 ```bash
 find target/debug/ -maxdepth 1 -executable -type f -name "tantivy*" -printf '%TY-%Tm-%Td %TT %p\n' | sort -r | cut -d " " -f 3 | xargs -I RECENT_DBG_TANTIVY rust-gdb RECENT_DBG_TANTIVY
 ```

-Now that you are in `rust-gdb`, you can set breakpoints on lines and methods that match your source code and run the debug executable with flags that you normally pass to `cargo test` like this:
+Now that you are in rust-gdb, you can set breakpoints on lines and methods that match your source-code and run the debug executable with flags that you normally pass to `cargo test` to like this

 ```bash
 $gdb run --test-threads 1 --test $NAME_OF_TEST
@@ -129,7 +130,7 @@ $gdb run --test-threads 1 --test $NAME_OF_TEST

 ### An example

-By default, `rustc` compiles everything in the `examples/` directory in debug mode. This makes it easy for you to make examples to reproduce bugs:
+By default, rustc compiles everything in the `examples/` dir in debug mode. This makes it easy for you to make examples to reproduce bugs.

 ```bash
 rust-gdb target/debug/examples/$EXAMPLE_NAME
--- a/query-grammar/src/occur.rs
+++ b/query-grammar/src/occur.rs
@@ -2,7 +2,7 @@ use std::fmt;
 use std::fmt::Write;

 /// Defines whether a term in a query must be present,
-/// should be present or must be not present.
+/// should be present or must not be present.
 #[derive(Debug, Clone, Hash, Copy, Eq, PartialEq)]
 pub enum Occur {
    /// For a given document to be considered for scoring,
--- a/src/collector/facet_collector.rs
+++ b/src/collector/facet_collector.rs
@@ -515,7 +515,7 @@ mod tests {
    #[should_panic(expected = "Tried to add a facet which is a descendant of \
                               an already added facet.")]
    fn test_misused_facet_collector() {
-        let mut facet_collector = FacetCollector::for_field(Field::from_field_id(0));
+        let mut facet_collector = FacetCollector::for_field(Field(0));
        facet_collector.add_facet(Facet::from("/country"));
        facet_collector.add_facet(Facet::from("/country/europe"));
    }
@@ -546,7 +546,7 @@ mod tests {

    #[test]
    fn test_non_used_facet_collector() {
-        let mut facet_collector = FacetCollector::for_field(Field::from_field_id(0));
+        let mut facet_collector = FacetCollector::for_field(Field(0));
        facet_collector.add_facet(Facet::from("/country"));
        facet_collector.add_facet(Facet::from("/countryeurope"));
    }
--- a/src/collector/top_score_collector.rs
+++ b/src/collector/top_score_collector.rs
@@ -551,7 +551,7 @@ mod tests {
            ));
        });
        let searcher = index.reader().unwrap().searcher();
-        let top_collector = TopDocs::with_limit(4).order_by_u64_field(Field::from_field_id(2));
+        let top_collector = TopDocs::with_limit(4).order_by_u64_field(Field(2));
        let segment_reader = searcher.segment_reader(0u32);
        top_collector
            .for_segment(0, segment_reader)
--- a/src/common/composite_file.rs
+++ b/src/common/composite_file.rs
@@ -199,13 +199,13 @@ mod test {
            let w = directory.open_write(path).unwrap();
            let mut composite_write = CompositeWrite::wrap(w);
            {
-                let mut write_0 = composite_write.for_field(Field::from_field_id(0u32));
+                let mut write_0 = composite_write.for_field(Field(0u32));
                VInt(32431123u64).serialize(&mut write_0).unwrap();
                write_0.flush().unwrap();
            }

            {
-                let mut write_4 = composite_write.for_field(Field::from_field_id(4u32));
+                let mut write_4 = composite_write.for_field(Field(4u32));
                VInt(2).serialize(&mut write_4).unwrap();
                write_4.flush().unwrap();
            }
@@ -215,18 +215,14 @@ mod test {
            let r = directory.open_read(path).unwrap();
            let composite_file = CompositeFile::open(&r).unwrap();
            {
-                let file0 = composite_file
-                    .open_read(Field::from_field_id(0u32))
-                    .unwrap();
+                let file0 = composite_file.open_read(Field(0u32)).unwrap();
                let mut file0_buf = file0.as_slice();
                let payload_0 = VInt::deserialize(&mut file0_buf).unwrap().0;
                assert_eq!(file0_buf.len(), 0);
                assert_eq!(payload_0, 32431123u64);
            }
            {
-                let file4 = composite_file
-                    .open_read(Field::from_field_id(4u32))
-                    .unwrap();
+                let file4 = composite_file.open_read(Field(4u32)).unwrap();
                let mut file4_buf = file4.as_slice();
                let payload_4 = VInt::deserialize(&mut file4_buf).unwrap().0;
                assert_eq!(file4_buf.len(), 0);
--- a/src/fastfield/readers.rs
+++ b/src/fastfield/readers.rs
@@ -59,7 +59,8 @@ impl FastFieldReaders {
            fast_bytes: Default::default(),
            fast_fields_composite: fast_fields_composite.clone(),
        };
-        for (field, field_entry) in schema.fields() {
+        for (field_id, field_entry) in schema.fields().iter().enumerate() {
+            let field = Field(field_id as u32);
            let field_type = field_entry.field_type();
            if field_type == &FieldType::Bytes {
                let idx_reader = fast_fields_composite
--- a/src/fastfield/writer.rs
+++ b/src/fastfield/writer.rs
@@ -24,7 +24,8 @@ impl FastFieldsWriter {
        let mut multi_values_writers = Vec::new();
        let mut bytes_value_writers = Vec::new();

-        for (field, field_entry) in schema.fields() {
+        for (field_id, field_entry) in schema.fields().iter().enumerate() {
+            let field = Field(field_id as u32);
            let default_value = match *field_entry.field_type() {
                FieldType::I64(_) => common::i64_to_u64(0i64),
                FieldType::F64(_) => common::f64_to_u64(0.0f64),
--- a/src/fieldnorm/writer.rs
+++ b/src/fieldnorm/writer.rs
@@ -22,14 +22,11 @@ impl FieldNormsWriter {
    pub(crate) fn fields_with_fieldnorm(schema: &Schema) -> Vec<Field> {
        schema
            .fields()
-            .filter_map(|(field, field_entry)| {
-                if field_entry.is_indexed() {
-                    Some(field)
-                } else {
-                    None
-                }
-            })
-            .collect::<Vec<_>>()
+            .iter()
+            .enumerate()
+            .filter(|&(_, field_entry)| field_entry.is_indexed())
+            .map(|(field, _)| Field(field as u32))
+            .collect::<Vec<Field>>()
    }

    /// Initialize with state for tracking the field norm fields
@@ -38,7 +35,7 @@ impl FieldNormsWriter {
        let fields = FieldNormsWriter::fields_with_fieldnorm(schema);
        let max_field = fields
            .iter()
-            .map(Field::field_id)
+            .map(|field| field.0)
            .max()
            .map(|max_field_id| max_field_id as usize + 1)
            .unwrap_or(0);
@@ -53,8 +50,8 @@ impl FieldNormsWriter {
    ///
    /// Will extend with 0-bytes for documents that have not been seen.
    pub fn fill_up_to_max_doc(&mut self, max_doc: DocId) {
-        for field in self.fields.iter() {
-            self.fieldnorms_buffer[field.field_id() as usize].resize(max_doc as usize, 0u8);
+        for &field in self.fields.iter() {
+            self.fieldnorms_buffer[field.0 as usize].resize(max_doc as usize, 0u8);
        }
    }

@@ -67,7 +64,7 @@ impl FieldNormsWriter {
    /// * field     - the field being set
    /// * fieldnorm - the number of terms present in document `doc` in field `field`
    pub fn record(&mut self, doc: DocId, field: Field, fieldnorm: u32) {
-        let fieldnorm_buffer: &mut Vec<u8> = &mut self.fieldnorms_buffer[field.field_id() as usize];
+        let fieldnorm_buffer: &mut Vec<u8> = &mut self.fieldnorms_buffer[field.0 as usize];
        assert!(
            fieldnorm_buffer.len() <= doc as usize,
            "Cannot register a given fieldnorm twice"
@@ -80,7 +77,7 @@ impl FieldNormsWriter {
    /// Serialize the seen fieldnorm values to the serializer for all fields.
    pub fn serialize(&self, fieldnorms_serializer: &mut FieldNormsSerializer) -> io::Result<()> {
        for &field in self.fields.iter() {
-            let fieldnorm_values: &[u8] = &self.fieldnorms_buffer[field.field_id() as usize][..];
+            let fieldnorm_values: &[u8] = &self.fieldnorms_buffer[field.0 as usize][..];
            fieldnorms_serializer.serialize_field(field, fieldnorm_values)?;
        }
        Ok(())
--- a/src/indexer/delete_queue.rs
+++ b/src/indexer/delete_queue.rs
@@ -258,7 +258,7 @@ mod tests {
        let delete_queue = DeleteQueue::new();

        let make_op = |i: usize| {
-            let field = Field::from_field_id(1u32);
+            let field = Field(1u32);
            DeleteOperation {
                opstamp: i as u64,
                term: Term::from_field_u64(field, i as u64),
--- a/src/indexer/merger.rs
+++ b/src/indexer/merger.rs
@@ -190,7 +190,8 @@ impl IndexMerger {
        fast_field_serializer: &mut FastFieldSerializer,
        mut term_ord_mappings: HashMap<Field, TermOrdinalMapping>,
    ) -> Result<()> {
-        for (field, field_entry) in self.schema.fields() {
+        for (field_id, field_entry) in self.schema.fields().iter().enumerate() {
+            let field = Field(field_id as u32);
            let field_type = field_entry.field_type();
            match *field_type {
                FieldType::HierarchicalFacet => {
@@ -648,12 +649,15 @@ impl IndexMerger {
        serializer: &mut InvertedIndexSerializer,
    ) -> Result<HashMap<Field, TermOrdinalMapping>> {
        let mut term_ordinal_mappings = HashMap::new();
-        for (field, field_entry) in self.schema.fields() {
+        for (field_ord, field_entry) in self.schema.fields().iter().enumerate() {
            if field_entry.is_indexed() {
-                if let Some(term_ordinal_mapping) =
-                    self.write_postings_for_field(field, field_entry.field_type(), serializer)?
-                {
-                    term_ordinal_mappings.insert(field, term_ordinal_mapping);
+                let indexed_field = Field(field_ord as u32);
+                if let Some(term_ordinal_mapping) = self.write_postings_for_field(
+                    indexed_field,
+                    field_entry.field_type(),
+                    serializer,
+                )? {
+                    term_ordinal_mappings.insert(indexed_field, term_ordinal_mapping);
                }
            }
        }
--- a/src/indexer/segment_writer.rs
+++ b/src/indexer/segment_writer.rs
@@ -6,11 +6,11 @@ use crate::fieldnorm::FieldNormsWriter;
 use crate::indexer::segment_serializer::SegmentSerializer;
 use crate::postings::compute_table_size;
 use crate::postings::MultiFieldPostingsWriter;
+use crate::schema::FieldEntry;
 use crate::schema::FieldType;
 use crate::schema::Schema;
 use crate::schema::Term;
 use crate::schema::Value;
-use crate::schema::{Field, FieldEntry};
 use crate::tokenizer::BoxedTokenizer;
 use crate::tokenizer::FacetTokenizer;
 use crate::tokenizer::{TokenStream, Tokenizer};
@@ -70,10 +70,12 @@ impl SegmentWriter {
        let table_num_bits = initial_table_size(memory_budget)?;
        let segment_serializer = SegmentSerializer::for_segment(&mut segment)?;
        let multifield_postings = MultiFieldPostingsWriter::new(schema, table_num_bits);
-        let tokenizers = schema
-            .fields()
-            .map(
-                |(_, field_entry): (Field, &FieldEntry)| match field_entry.field_type() {
+        let tokenizers =
+            schema
+                .fields()
+                .iter()
+                .map(FieldEntry::field_type)
+                .map(|field_type| match *field_type {
                    FieldType::Str(ref text_options) => text_options
                        .get_indexing_options()
                        .and_then(|text_index_option| {
@@ -81,9 +83,8 @@ impl SegmentWriter {
                            segment.index().tokenizers().get(tokenizer_name)
                        }),
                    _ => None,
-                },
-            )
-            .collect();
+                })
+                .collect();
        Ok(SegmentWriter {
            max_doc: 0,
            multifield_postings,
@@ -159,7 +160,7 @@ impl SegmentWriter {
                }
                FieldType::Str(_) => {
                    let num_tokens = if let Some(ref mut tokenizer) =
-                        self.tokenizers[field.field_id() as usize]
+                        self.tokenizers[field.0 as usize]
                    {
                        let texts: Vec<&str> = field_values
                            .iter()
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -212,13 +212,15 @@ pub type Score = f32;
 pub type SegmentLocalId = u32;

 impl DocAddress {
-    /// Return the segment ordinal id that identifies the segment
-    /// hosting the document in the `Searcher` it is called from.
+    /// Return the segment ordinal.
+    /// The segment ordinal is an id identifying the segment
+    /// hosting the document. It is only meaningful, in the context
+    /// of a searcher.
    pub fn segment_ord(self) -> SegmentLocalId {
        self.0
    }

-    /// Return the segment-local `DocId`
+    /// Return the segment local `DocId`
    pub fn doc(self) -> DocId {
        self.1
    }
@@ -227,11 +229,11 @@ impl DocAddress {
 /// `DocAddress` contains all the necessary information
 /// to identify a document given a `Searcher` object.
 ///
-/// It consists of an id identifying its segment, and
-/// a segment-local `DocId`.
+/// It consists in an id identifying its segment, and
+/// its segment-local `DocId`.
 ///
 /// The id used for the segment is actually an ordinal
-/// in the list of `Segment`s held by a `Searcher`.
+/// in the list of segment hold by a `Searcher`.
 #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
 pub struct DocAddress(pub SegmentLocalId, pub DocId);

--- a/src/postings/mod.rs
+++ b/src/postings/mod.rs
@@ -356,9 +356,9 @@ pub mod tests {

    #[test]
    fn test_skip_next() {
-        let term_0 = Term::from_field_u64(Field::from_field_id(0), 0);
-        let term_1 = Term::from_field_u64(Field::from_field_id(0), 1);
-        let term_2 = Term::from_field_u64(Field::from_field_id(0), 2);
+        let term_0 = Term::from_field_u64(Field(0), 0);
+        let term_1 = Term::from_field_u64(Field(0), 1);
+        let term_2 = Term::from_field_u64(Field(0), 2);

        let num_docs = 300u32;

@@ -511,19 +511,19 @@ pub mod tests {
    }

    pub static TERM_A: Lazy<Term> = Lazy::new(|| {
-        let field = Field::from_field_id(0);
+        let field = Field(0);
        Term::from_field_text(field, "a")
    });
    pub static TERM_B: Lazy<Term> = Lazy::new(|| {
-        let field = Field::from_field_id(0);
+        let field = Field(0);
        Term::from_field_text(field, "b")
    });
    pub static TERM_C: Lazy<Term> = Lazy::new(|| {
-        let field = Field::from_field_id(0);
+        let field = Field(0);
        Term::from_field_text(field, "c")
    });
    pub static TERM_D: Lazy<Term> = Lazy::new(|| {
-        let field = Field::from_field_id(0);
+        let field = Field(0);
        Term::from_field_text(field, "d")
    });

--- a/src/postings/postings_writer.rs
+++ b/src/postings/postings_writer.rs
@@ -61,12 +61,12 @@ fn make_field_partition(
        .iter()
        .map(|(key, _, _)| Term::wrap(key).field())
        .enumerate();
-    let mut prev_field_opt = None;
+    let mut prev_field = Field(u32::max_value());
    let mut fields = vec![];
    let mut offsets = vec![];
    for (offset, field) in term_offsets_it {
-        if Some(field) != prev_field_opt {
-            prev_field_opt = Some(field);
+        if field != prev_field {
+            prev_field = field;
            fields.push(field);
            offsets.push(offset);
        }
@@ -86,7 +86,8 @@ impl MultiFieldPostingsWriter {
        let term_index = TermHashMap::new(table_bits);
        let per_field_postings_writers: Vec<_> = schema
            .fields()
-            .map(|(_, field_entry)| posting_from_field_entry(field_entry))
+            .iter()
+            .map(|field_entry| posting_from_field_entry(field_entry))
            .collect();
        MultiFieldPostingsWriter {
            heap: MemoryArena::new(),
@@ -106,8 +107,7 @@ impl MultiFieldPostingsWriter {
        field: Field,
        token_stream: &mut dyn TokenStream,
    ) -> u32 {
-        let postings_writer =
-            self.per_field_postings_writers[field.field_id() as usize].deref_mut();
+        let postings_writer = self.per_field_postings_writers[field.0 as usize].deref_mut();
        postings_writer.index_text(
            &mut self.term_index,
            doc,
@@ -118,8 +118,7 @@ impl MultiFieldPostingsWriter {
    }

    pub fn subscribe(&mut self, doc: DocId, term: &Term) -> UnorderedTermId {
-        let postings_writer =
-            self.per_field_postings_writers[term.field().field_id() as usize].deref_mut();
+        let postings_writer = self.per_field_postings_writers[term.field().0 as usize].deref_mut();
        postings_writer.subscribe(&mut self.term_index, doc, 0u32, term, &mut self.heap)
    }

@@ -161,7 +160,7 @@ impl MultiFieldPostingsWriter {
                FieldType::Bytes => {}
            }

-            let postings_writer = &self.per_field_postings_writers[field.field_id() as usize];
+            let postings_writer = &self.per_field_postings_writers[field.0 as usize];
            let mut field_serializer =
                serializer.new_field(field, postings_writer.total_num_tokens())?;
            postings_writer.serialize(
--- a/src/query/boolean_query/boolean_query.rs
+++ b/src/query/boolean_query/boolean_query.rs
@@ -9,8 +9,7 @@ use crate::Result;
 use crate::Searcher;
 use std::collections::BTreeSet;

-/// The boolean query returns a set of documents
-/// that matches the Boolean combination of constituent subqueries.
+/// The boolean query combines a set of queries
 ///
 /// The documents matched by the boolean query are
 /// those which
@@ -20,113 +19,6 @@ use std::collections::BTreeSet;
 /// `MustNot` occurence.
 /// * match at least one of the subqueries that is not
 /// a `MustNot` occurence.
-///
-///
-/// You can combine other query types and their `Occur`ances into one `BooleanQuery`
-///
-/// ```rust
-///use tantivy::collector::Count;
-///use tantivy::doc;
-///use tantivy::query::{BooleanQuery, Occur, PhraseQuery, Query, TermQuery};
-///use tantivy::schema::{IndexRecordOption, Schema, TEXT};
-///use tantivy::Term;
-///use tantivy::{Index, Result};
-///
-///fn main() -> Result<()> {
-///    let mut schema_builder = Schema::builder();
-///    let title = schema_builder.add_text_field("title", TEXT);
-///    let body = schema_builder.add_text_field("body", TEXT);
-///    let schema = schema_builder.build();
-///    let index = Index::create_in_ram(schema);
-///    {
-///        let mut index_writer = index.writer(3_000_000)?;
-///        index_writer.add_document(doc!(
-///            title => "The Name of the Wind",
-///        ));
-///        index_writer.add_document(doc!(
-///            title => "The Diary of Muadib",
-///        ));
-///        index_writer.add_document(doc!(
-///            title => "A Dairy Cow",
-///            body => "hidden",
-///        ));
-///        index_writer.add_document(doc!(
-///            title => "A Dairy Cow",
-///            body => "found",
-///        ));
-///        index_writer.add_document(doc!(
-///            title => "The Diary of a Young Girl",
-///        ));
-///        index_writer.commit().unwrap();
-///    }
-///
-///    let reader = index.reader()?;
-///    let searcher = reader.searcher();
-///
-///    // Make TermQuery's for "girl" and "diary" in the title
-///    let girl_term_query: Box<dyn Query> = Box::new(TermQuery::new(
-///        Term::from_field_text(title, "girl"),
-///        IndexRecordOption::Basic,
-///    ));
-///    let diary_term_query: Box<dyn Query> = Box::new(TermQuery::new(
-///        Term::from_field_text(title, "diary"),
-///        IndexRecordOption::Basic,
-///    ));
-///    // A TermQuery with "found" in the body
-///    let body_term_query: Box<dyn Query> = Box::new(TermQuery::new(
-///        Term::from_field_text(body, "found"),
-///        IndexRecordOption::Basic,
-///    ));
-///    // TermQuery "diary" must and "girl" must not be present
-///    let queries_with_occurs1 = vec![
-///        (Occur::Must, diary_term_query.box_clone()),
-///        (Occur::MustNot, girl_term_query),
-///    ];
-///    // Make a BooleanQuery equivalent to
-///    // title:+diary title:-girl
-///    let diary_must_and_girl_mustnot = BooleanQuery::from(queries_with_occurs1);
-///    let count1 = searcher.search(&diary_must_and_girl_mustnot, &Count)?;
-///    assert_eq!(count1, 1);
-///
-///    // TermQuery for "cow" in the title
-///    let cow_term_query: Box<dyn Query> = Box::new(TermQuery::new(
-///        Term::from_field_text(title, "cow"),
-///        IndexRecordOption::Basic,
-///    ));
-///    // "title:diary OR title:cow"
-///    let title_diary_or_cow = BooleanQuery::from(vec![
-///        (Occur::Should, diary_term_query.box_clone()),
-///        (Occur::Should, cow_term_query),
-///    ]);
-///    let count2 = searcher.search(&title_diary_or_cow, &Count)?;
-///    assert_eq!(count2, 4);
-///
-///    // Make a `PhraseQuery` from a vector of `Term`s
-///    let phrase_query: Box<dyn Query> = Box::new(PhraseQuery::new(vec![
-///        Term::from_field_text(title, "dairy"),
-///        Term::from_field_text(title, "cow"),
-///    ]));
-///    // You can combine subqueries of different types into 1 BooleanQuery:
-///    // `TermQuery` and `PhraseQuery`
-///    // "title:diary OR "dairy cow"
-///    let term_of_phrase_query = BooleanQuery::from(vec![
-///        (Occur::Should, diary_term_query.box_clone()),
-///        (Occur::Should, phrase_query.box_clone()),
-///    ]);
-///    let count3 = searcher.search(&term_of_phrase_query, &Count)?;
-///    assert_eq!(count3, 4);
-///
-///    // You can nest one BooleanQuery inside another
-///    // body:found AND ("title:diary OR "dairy cow")
-///    let nested_query = BooleanQuery::from(vec![
-///        (Occur::Must, body_term_query),
-///        (Occur::Must, Box::new(term_of_phrase_query))
-///    ]);
-///    let count4 = searcher.search(&nested_query, &Count)?;
-///    assert_eq!(count4, 1);
-///    Ok(())
-///}
-/// ```
 #[derive(Debug)]
 pub struct BooleanQuery {
    subqueries: Vec<(Occur, Box<dyn Query>)>,
--- a/src/query/phrase_query/phrase_query.rs
+++ b/src/query/phrase_query/phrase_query.rs
@@ -40,7 +40,7 @@ impl PhraseQuery {
        PhraseQuery::new_with_offset(terms_with_offset)
    }

-    /// Creates a new `PhraseQuery` given a list of terms and their offsets.
+    /// Creates a new `PhraseQuery` given a list of terms and there offsets.
    ///
    /// Can be used to provide custom offset for each term.
    pub fn new_with_offset(mut terms: Vec<(usize, Term)>) -> PhraseQuery {
@@ -73,7 +73,7 @@ impl PhraseQuery {
            .collect::<Vec<Term>>()
    }

-    /// Returns the `PhraseWeight` for the given phrase query given a specific `searcher`.
+    /// Returns the `PhraseWeight` for the given phrase query given a specific `searcher`.  
    ///
    /// This function is the same as `.weight(...)` except it returns
    /// a specialized type `PhraseWeight` instead of a Boxed trait.
--- a/src/query/query_parser/query_parser.rs
+++ b/src/query/query_parser/query_parser.rs
@@ -674,19 +674,13 @@ mod test {

        test_parse_query_to_logical_ast_helper(
            "signed:-2324",
-            &format!(
-                "{:?}",
-                Term::from_field_i64(Field::from_field_id(2u32), -2324)
-            ),
+            &format!("{:?}", Term::from_field_i64(Field(2u32), -2324)),
            false,
        );

        test_parse_query_to_logical_ast_helper(
            "float:2.5",
-            &format!(
-                "{:?}",
-                Term::from_field_f64(Field::from_field_id(10u32), 2.5)
-            ),
+            &format!("{:?}", Term::from_field_f64(Field(10u32), 2.5)),
            false,
        );
    }
--- a/src/query/term_query/mod.rs
+++ b/src/query/term_query/mod.rs
@@ -118,7 +118,7 @@ mod tests {
    #[test]
    fn test_term_query_debug() {
        let term_query = TermQuery::new(
-            Term::from_field_text(Field::from_field_id(1), "hello"),
+            Term::from_field_text(Field(1), "hello"),
            IndexRecordOption::WithFreqs,
        );
        assert_eq!(
--- a/src/schema/field.rs
+++ b/src/schema/field.rs
@@ -3,22 +3,14 @@ use std::io;
 use std::io::Read;
 use std::io::Write;

-/// `Field` is represented by an unsigned 32-bit integer type
-/// The schema holds the mapping between field names and `Field` objects.
+/// `Field` is actually a `u8` identifying a `Field`
+/// The schema is in charge of holding mapping between field names
+/// to `Field` objects.
+///
+/// Because the field id is a `u8`, tantivy can only have at most `255` fields.
+/// Value 255 is reserved.
 #[derive(Copy, Clone, Debug, PartialEq, PartialOrd, Eq, Ord, Hash, Serialize, Deserialize)]
-pub struct Field(u32);
-
-impl Field {
-    /// Create a new field object for the given FieldId.
-    pub fn from_field_id(field_id: u32) -> Field {
-        Field(field_id)
-    }
-
-    /// Returns a u32 identifying uniquely a field within a schema.
-    pub fn field_id(&self) -> u32 {
-        self.0
-    }
-}
+pub struct Field(pub u32);

 impl BinarySerializable for Field {
    fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
--- a/src/schema/schema.rs
+++ b/src/schema/schema.rs
@@ -167,7 +167,7 @@ impl SchemaBuilder {

    /// Adds a field entry to the schema in build.
    fn add_field(&mut self, field_entry: FieldEntry) -> Field {
-        let field = Field::from_field_id(self.fields.len() as u32);
+        let field = Field(self.fields.len() as u32);
        let field_name = field_entry.name().to_string();
        self.fields.push(field_entry);
        self.fields_map.insert(field_name, field);
@@ -223,7 +223,7 @@ pub struct Schema(Arc<InnerSchema>);
 impl Schema {
    /// Return the `FieldEntry` associated to a `Field`.
    pub fn get_field_entry(&self, field: Field) -> &FieldEntry {
-        &self.0.fields[field.field_id() as usize]
+        &self.0.fields[field.0 as usize]
    }

    /// Return the field name for a given `Field`.
@@ -232,12 +232,8 @@ impl Schema {
    }

    /// Return the list of all the `Field`s.
-    pub fn fields(&self) -> impl Iterator<Item = (Field, &FieldEntry)> {
-        self.0
-            .fields
-            .iter()
-            .enumerate()
-            .map(|(field_id, field_entry)| (Field::from_field_id(field_id as u32), field_entry))
+    pub fn fields(&self) -> &[FieldEntry] {
+        &self.0.fields
    }

    /// Creates a new builder.
@@ -489,32 +485,13 @@ mod tests {

        let schema: Schema = serde_json::from_str(expected).unwrap();

-        let mut fields = schema.fields();
-        {
-            let (field, field_entry) = fields.next().unwrap();
-            assert_eq!("title", field_entry.name());
-            assert_eq!(0, field.field_id());
-        }
-        {
-            let (field, field_entry) = fields.next().unwrap();
-            assert_eq!("author", field_entry.name());
-            assert_eq!(1, field.field_id());
-        }
-        {
-            let (field, field_entry) = fields.next().unwrap();
-            assert_eq!("count", field_entry.name());
-            assert_eq!(2, field.field_id());
-        }
-        {
-            let (field, field_entry) = fields.next().unwrap();
-            assert_eq!("popularity", field_entry.name());
-            assert_eq!(3, field.field_id());
-        }
-        {
-            let (field, field_entry) = fields.next().unwrap();
-            assert_eq!("score", field_entry.name());
-            assert_eq!(4, field.field_id());
-        }
+        let mut fields = schema.fields().iter();
+
+        assert_eq!("title", fields.next().unwrap().name());
+        assert_eq!("author", fields.next().unwrap().name());
+        assert_eq!("count", fields.next().unwrap().name());
+        assert_eq!("popularity", fields.next().unwrap().name());
+        assert_eq!("score", fields.next().unwrap().name());
        assert!(fields.next().is_none());
    }

--- a/src/schema/term.rs
+++ b/src/schema/term.rs
@@ -105,7 +105,7 @@ impl Term {
        if self.0.len() < 4 {
            self.0.resize(4, 0u8);
        }
-        BigEndian::write_u32(&mut self.0[0..4], field.field_id());
+        BigEndian::write_u32(&mut self.0[0..4], field.0);
    }

    /// Sets a u64 value in the term.
@@ -157,7 +157,7 @@ where

    /// Returns the field.
    pub fn field(&self) -> Field {
-        Field::from_field_id(BigEndian::read_u32(&self.0.as_ref()[..4]))
+        Field(BigEndian::read_u32(&self.0.as_ref()[..4]))
    }

    /// Returns the `u64` value stored in a term.
@@ -227,7 +227,7 @@ impl fmt::Debug for Term {
        write!(
            f,
            "Term(field={},bytes={:?})",
-            self.field().field_id(),
+            self.field().0,
            self.value_bytes()
        )
    }
--- a/tests/failpoints/mod.rs
+++ b/tests/failpoints/mod.rs
@@ -1,4 +1,5 @@
 use fail;
+use std::io::Write;
 use std::path::Path;
 use tantivy::directory::{Directory, ManagedDirectory, RAMDirectory, TerminatingWrite};
 use tantivy::doc;
Author	SHA1	Message	Date
Paul Masurel	9fd23f3abf	Fixing bench compilation	2019-10-04 16:36:17 +09:00
Paul Masurel	c030990d00	fmt	2019-10-02 09:50:20 +09:00
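
The `src/schema/field.rs` and `src/schema/term.rs` hunks above switch from the `Field::from_field_id(..)` / `.field_id()` accessors to a tuple struct with a public `u32` (`pub struct Field(pub u32);`). The sketch below shows how that form is built, indexed, and stored in a term's 4-byte big-endian header. It is a minimal stand-in, not tantivy's code: `set_field` and `read_field` are hypothetical helper names, and the standard library's `to_be_bytes` / `from_be_bytes` are used in place of the `byteorder` crate's `BigEndian::write_u32` / `read_u32` that the diff uses.

```rust
/// Stand-in for tantivy's `Field` as it appears after this change:
/// a tuple struct wrapping a public `u32`.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Field(pub u32);

/// Hypothetical helper: store the field id in the first 4 bytes of `buf`,
/// big-endian, growing the buffer to 4 bytes if it is shorter.
pub fn set_field(buf: &mut Vec<u8>, field: Field) {
    if buf.len() < 4 {
        buf.resize(4, 0u8);
    }
    buf[0..4].copy_from_slice(&field.0.to_be_bytes());
}

/// Hypothetical helper: read the field id back from the 4-byte header.
/// Panics if `buf` holds fewer than 4 bytes.
pub fn read_field(buf: &[u8]) -> Field {
    let mut header = [0u8; 4];
    header.copy_from_slice(&buf[..4]);
    Field(u32::from_be_bytes(header))
}

fn main() {
    // Construct a field directly, as in `Field(2u32)` in the updated tests.
    let field = Field(2);

    // Round-trip the id through the term header.
    let mut buf = Vec::new();
    set_field(&mut buf, field);
    assert_eq!(read_field(&buf), field);

    // Index a schema-like slice with `field.0 as usize`.
    let names = ["title", "author", "count"];
    println!("{}", names[field.0 as usize]);
}
```

Making the inner value public removes two trivial methods, at the cost of letting any caller construct a `Field` that does not correspond to an entry in the schema.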