First stab at #396

Removed use
Added fundamentals
2026-01-03 07:42:54 +00:00 · 2018-08-28 10:01:19 +09:00 · 2018-08-28 09:54:04 +09:00 · 2018-08-28 08:09:27 +09:00 · 2018-08-27 09:49:49 +09:00 · 2018-08-23 08:59:11 +09:00
153 changed files with 5869 additions and 4678 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,19 @@
+---
+name: Bug report
+about: Create a report to help us improve
+
+---
+
+**Describe the bug**
+- What did you do?
+- What happened?
+- What was expected?
+
+**Which version of tantivy are you using?**
+If "master", ideally give the specific sha1 revision.
+
+**To Reproduce**
+
+If your bug is deterministic, can you provide minimal code that reproduces it?
+Some bugs are not deterministic. Can you describe precisely the context in which it happened?
+If this is possible, can you share your code?
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,14 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**[Optional] describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
--- a/.github/ISSUE_TEMPLATE/question.md
+++ b/.github/ISSUE_TEMPLATE/question.md
@@ -0,0 +1,7 @@
+---
+name: Question
+about: Ask any question about tantivy's usage...
+
+---
+
+Try to be specific about your use case...
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,14 +1,17 @@
+# Based on the "trust" template v0.1.2
+# https://github.com/japaric/trust/tree/v0.1.2
+
+dist: trusty
 language: rust
+services: docker
 sudo: required
-cache: cargo
-rust:
-  - nightly
+
 env:
  global:
-    - CC=gcc-4.8
-    - CXX=g++-4.8
+    - CRATE_NAME=tantivy
    - TRAVIS_CARGO_NIGHTLY_FEATURE=""
    - secure: eC8HjTi1wgRVCsMAeXEXt8Ckr0YBSGOEnQkkW4/Nde/OZ9jJjz2nmP1ELQlDE7+czHub2QvYtDMG0parcHZDx/Kus0yvyn08y3g2rhGIiE7y8OCvQm1Mybu2D/p7enm6shXquQ6Z5KRfRq+18mHy80wy9ABMA/ukEZdvnfQ76/Een8/Lb0eHaDoXDXn3PqLVtByvSfQQ7OhS60dEScu8PWZ6/l1057P5NpdWbMExBE7Ro4zYXNhkJeGZx0nP/Bd4Jjdt1XfPzMEybV6NZ5xsTILUBFTmOOt603IsqKGov089NExqxYu5bD3K+S4MzF1Nd6VhomNPJqLDCfhlymJCUj5n5Ku4yidlhQbM4Ej9nGrBalJnhcjBjPua5tmMF2WCxP9muKn/2tIOu1/+wc0vMf9Yd3wKIkf5+FtUxCgs2O+NslWvmOMAMI/yD25m7hb4t1IwE/4Bk+GVcWJRWXbo0/m6ZUHzRzdjUY2a1qvw7C9udzdhg7gcnXwsKrSWi2NjMiIVw86l+Zim0nLpKIN41sxZHLaFRG63Ki8zQ/481LGn32awJ6i3sizKS0WD+N1DfR2qYMrwYHaMN0uR0OFXYTJkFvTFttAeUY3EKmRKAuMhmO2YRdSr4/j/G5E9HMc1gSGJj6PxgpQU7EpvxRsmoVAEJr0mszmOj9icGHep/FM=
+
 addons:
  apt:
    sources:
@@ -22,16 +25,56 @@ addons:
      - libdw-dev
      - binutils-dev
      - cmake
+
+matrix:
+  include:
+    # Android
+    - env: TARGET=aarch64-linux-android DISABLE_TESTS=1
+    #- env: TARGET=arm-linux-androideabi DISABLE_TESTS=1
+    #- env: TARGET=armv7-linux-androideabi DISABLE_TESTS=1
+    #- env: TARGET=i686-linux-android DISABLE_TESTS=1
+    #- env: TARGET=x86_64-linux-android DISABLE_TESTS=1
+
+    # Linux
+    #- env: TARGET=aarch64-unknown-linux-gnu
+    #- env: TARGET=i686-unknown-linux-gnu
+    - env: TARGET=x86_64-unknown-linux-gnu CODECOV=1
+    # - env: TARGET=x86_64-unknown-linux-musl CODECOV=1
+
+    # OSX
+    - env: TARGET=x86_64-apple-darwin
+      os: osx
+
+before_install:
+  - set -e
+  - rustup self update
+
+install:
+  - sh ci/install.sh
+  - source ~/.cargo/env || true
+
 before_script:
  - export PATH=$HOME/.cargo/bin:$PATH
  - cargo install cargo-update || echo "cargo-update already installed"
  - cargo install cargo-travis || echo "cargo-travis already installed"
+
 script:
-  - cargo build
-  - cargo test
-  - cargo test -- --ignored
-  - cargo run --example simple_search
-  - cargo doc
-after_success:
-  - cargo coveralls --exclude-pattern src/functional_test.rs
-  - cargo doc-upload
+  - bash ci/script.sh
+
+before_deploy:
+  - sh ci/before_deploy.sh
+
+cache: cargo
+before_cache:
+  # Travis can't cache files that are not readable by "others"
+  - chmod -R a+r $HOME/.cargo
+
+#branches:
+#  only:
+#    # release tags
+#    - /^v\d+\.\d+\.\d+.*$/
+#    - master
+
+notifications:
+  email:
+    on_success: never
--- a/AUTHORS
+++ b/AUTHORS
@@ -0,0 +1,11 @@
+# This is the list of authors of tantivy for copyright purposes.
+Paul Masurel
+Laurentiu Nicola
+Dru Sellers
+Ashley Mannix
+Michael J. Curry
+Jason Wolfe
+# As an employee of Google I am required to add Google LLC
+# in the list of authors, but this project is not affiliated to Google
+# in any other way.
+Google LLC 
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,14 +1,42 @@
+
+Tantivy 0.7
+=====================
+- Skip data for doc ids and positions (@fulmicoton),
+  greatly improving performance
+- Tantivy errors now rely on the `failure` crate (@drusellers)
+
+
+Tantivy 0.6.1
+=========================
+- Bugfix #324. GC was removing files that were still in use
+- Added support for parsing AllQuery and RangeQuery via QueryParser
+    - AllQuery: `*`
+    - RangeQuery:
+        - Inclusive `field:[startIncl to endIncl]`
+        - Exclusive `field:{startExcl to endExcl}`
+        - Mixed `field:[startIncl to endExcl}` and vice versa
+        - Unbounded `field:[start to *]`, `field:[* to end]`
+ 
+
 Tantivy 0.6
 ==========================
- Removed C code. Tantivy is now pure Rust.
- BM25
- Approximate field norms encoded over 1 byte.
- Compiles on stable rust
+
+Special thanks to @drusellers and @jason-wolfe for their contributions
+to this release!
+
+- Removed C code. Tantivy is now pure Rust. (@pmasurel)
+- BM25 (@pmasurel)
+- Approximate field norms encoded over 1 byte. (@pmasurel)
+- Compiles on stable rust (@pmasurel)
 - Add &[u8] fastfield for associating arbitrary bytes to each document (@jason-wolfe) (#270)
    - Completely uncompressed
    - Internally: One u64 fast field for indexes, one fast field for the bytes themselves.
 - Add NGram token support (@drusellers)
 - Add Stopword Filter support (@drusellers)
+- Add a FuzzyTermQuery (@drusellers)
+- Add a RegexQuery (@drusellers)
+- Various performance improvements (@pmasurel)
+

 Tantivy 0.5.2
 ===========================
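The range syntax listed in the 0.6.1 changelog entry can be illustrated with a small standalone parser. This is only a sketch under stated assumptions: `parse_range` and `make_bound` are hypothetical helpers written for this note, not tantivy's `QueryParser`, and they only handle the bracketed part of the query (the `field:` prefix is assumed to have been stripped already).

```rust
use std::ops::Bound;

/// Hypothetical helper: `*` means "no bound", anything else is a term.
fn make_bound(term: &str, inclusive: bool) -> Bound<String> {
    if term == "*" {
        Bound::Unbounded
    } else if inclusive {
        Bound::Included(term.to_string())
    } else {
        Bound::Excluded(term.to_string())
    }
}

/// Hypothetical helper: parses `[a to b]`, `{a to b}`, `[a to b}`, `[a to *]`...
/// Square brackets are inclusive, curly braces are exclusive.
pub fn parse_range(input: &str) -> Option<(Bound<String>, Bound<String>)> {
    let mut chars = input.trim().chars();
    // Peel off the opening and closing delimiters.
    let open = chars.next()?;
    let close = chars.next_back()?;
    let inner = chars.as_str();

    let lower_inclusive = match open {
        '[' => true,
        '{' => false,
        _ => return None,
    };
    let upper_inclusive = match close {
        ']' => true,
        '}' => false,
        _ => return None,
    };

    // Exactly two terms separated by the lowercase keyword ` to `.
    let mut parts = inner.split(" to ");
    let start = parts.next()?.trim();
    let end = parts.next()?.trim();
    if parts.next().is_some() || start.is_empty() || end.is_empty() {
        return None;
    }
    Some((
        make_bound(start, lower_inclusive),
        make_bound(end, upper_inclusive),
    ))
}
```

Each delimiter decides the inclusiveness of its own side, which is why the mixed forms need no special casing.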
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -1,10 +1,10 @@
 [package]
 name = "tantivy"
-version = "0.6.0-dev"
+version = "0.7.0-dev"
 authors = ["Paul Masurel <paul.masurel@gmail.com>"]
 license = "MIT"
 categories = ["database-implementations", "data-structures"]
-description = """Tantivy is a search engine library."""
+description = """Search engine library"""
 documentation = "https://tantivy-search.github.io/tantivy/tantivy/index.html"
 homepage = "https://github.com/tantivy-search/tantivy"
 repository = "https://github.com/tantivy-search/tantivy"
@@ -14,43 +14,46 @@ keywords = ["search", "information", "retrieval"]
 [dependencies]
 base64 = "0.9.1"
 byteorder = "1.0"
-lazy_static = "0.2.1"
+lazy_static = "1"
 tinysegmenter = "0.1.0"
-regex = "0.2"
+regex = "1.0"
 fst = {version="0.3", default-features=false}
-atomicwrites = {version="0.1", optional=true}
-tempfile = "2.1"
-log = "0.3.6"
-combine = "2.2"
+fst-regex = { version="0.2" }
+lz4 = {version="1.20", optional=true}
+snap = {version="0.2"}
+atomicwrites = {version="0.2.2", optional=true}
+tempfile = "3.0"
+log = "0.4"
+combine = "3"
 tempdir = "0.3"
 serde = "1.0"
 serde_derive = "1.0"
 serde_json = "1.0"
 num_cpus = "1.2"
-itertools = "0.5.9"
+itertools = "0.7"
 levenshtein_automata = {version="0.1", features=["fst_automaton"]}
-lz4 = "1.20"
-bit-set = "0.4.0"
+bit-set = "0.5"
 uuid = { version = "0.6", features = ["v4", "serde"] }
-chan = "0.1"
-crossbeam = "0.3"
+crossbeam = "0.4"
+crossbeam-channel = "0.2"
 futures = "0.1"
 futures-cpupool = "0.1"
-error-chain = "0.8"
-owning_ref = "0.3"
+owning_ref = "0.4"
 stable_deref_trait = "1.0.0"
-rust-stemmers = "0.1.0"
+rust-stemmers = "1"
 downcast = { version="0.9" }
 matches = "0.1"
-bitpacking = "0.4"
+bitpacking = "0.5"
+census = "0.1"
 fnv = "1.0.6"
+owned-read = "0.4"
+failure = "0.1"

 [target.'cfg(windows)'.dependencies]
 winapi = "0.2"

 [dev-dependencies]
-rand = "0.3"
-env_logger = "0.4"
+rand = "0.5"

 [profile.release]
 opt-level = 3
@@ -60,16 +63,9 @@ debug-assertions = false

 [features]
 default = ["mmap"]
-simd = ["bitpacking/simd"]
 mmap = ["fst/mmap", "atomicwrites"]
-unstable = ["simd"]
+lz4-compression = ["lz4"]

 [badges]
 travis-ci = { repository = "tantivy-search/tantivy" }

-[[example]]
-name = "simple_search"
-required-features = ["mmap"]
-
-[[example]]
-name = "custom_tokenizer"
--- a/LICENSE
+++ b/LICENSE
@@ -1,4 +1,4 @@
-Copyright (c) 2018 by Paul Masurel, Google LLC
+Copyright (c) 2018 by the project authors, as listed in the AUTHORS file. 

 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

--- a/README.md
+++ b/README.md
@@ -1,39 +1,66 @@
-![Tantivy](https://tantivy-search.github.io/logo/tantivy-logo.png)

 [![Build Status](https://travis-ci.org/tantivy-search/tantivy.svg?branch=master)](https://travis-ci.org/tantivy-search/tantivy)
-[![Coverage Status](https://coveralls.io/repos/github/tantivy-search/tantivy/badge.svg?branch=master&refresh1)](https://coveralls.io/github/tantivy-search/tantivy?branch=master)
+[![codecov](https://codecov.io/gh/tantivy-search/tantivy/branch/master/graph/badge.svg)](https://codecov.io/gh/tantivy-search/tantivy)
 [![Join the chat at https://gitter.im/tantivy-search/tantivy](https://badges.gitter.im/tantivy-search/tantivy.svg)](https://gitter.im/tantivy-search/tantivy?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-[![Build status](https://ci.appveyor.com/api/projects/status/r7nb13kj23u8m9pj?svg=true)](https://ci.appveyor.com/project/fulmicoton/tantivy)
+[![Build status](https://ci.appveyor.com/api/projects/status/r7nb13kj23u8m9pj/branch/master?svg=true)](https://ci.appveyor.com/project/fulmicoton/tantivy/branch/master)
+
+![Tantivy](https://tantivy-search.github.io/logo/tantivy-logo.png)
+
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/0)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/0)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/1)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/1)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/2)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/2)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/3)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/3)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/4)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/4)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/5)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/5)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/6)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/6)
+[![](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/images/7)](https://sourcerer.io/fame/fulmicoton/tantivy-search/tantivy/links/7)
+
+

 **Tantivy** is a **full text search engine library** written in rust.

-It is strongly inspired by Lucene's design.
+It is closer to Lucene than to Elasticsearch and Solr in the sense that it is not
+an off-the-shelf search engine server, but rather a crate that can be used
+to build such a search engine.
+
+Tantivy is, in fact, strongly inspired by Lucene's design.

 # Features

+- Full-text search
+- Fast (check out the :racehorse: :sparkles: [benchmark](https://tantivy-search.github.io/bench/) :sparkles: :racehorse:)
 - Tiny startup time (<10ms), perfect for command line tools
- tf-idf scoring
- Basic query language
- Phrase queries
+- BM25 scoring (the same as Lucene)
+- Basic query language (`+michael +jackson`)
+- Phrase queries (`"michael jackson"`)
 - Incremental indexing
 - Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
 - Mmap directory
- optional SIMD integer compression
+- SIMD integer compression when the platform/CPU includes the SSE2 instruction set.
 - Single valued and multivalued u64 and i64 fast fields (equivalent of doc values in Lucene)
+- `&[u8]` fast fields
 - LZ4 compressed document store
 - Range queries
- Faceting
- configurable indexing (optional term frequency and position indexing
+- Faceted search
+- Configurable indexing (optional term frequency and position indexing)
 - Cheesy logo with a horse

-Tantivy supports Linux, MacOS and Windows.
+# Non-features

+- Distributed search is not, and will not be, in the scope of tantivy.
+
+
+# Supported OS and compiler
+
+Tantivy works on stable rust (>= 1.27) and supports Linux, MacOS and Windows.

 # Getting started

- [tantivy's usage example](http://fulmicoton.com/tantivy-examples/simple_search.html)
+- [tantivy's simple search example](http://fulmicoton.com/tantivy-examples/simple_search.html)
 - [tantivy-cli and its tutorial](https://github.com/tantivy-search/tantivy-cli).
+`tantivy-cli` is an actual command line interface that makes it easy for you to create a search engine,
+index documents and search via the CLI or a small server with a REST API.
 It will walk you through getting a wikipedia search engine up and running in a few minutes.
 - [reference doc]
    - [For the last released version](https://docs.rs/tantivy/)
@@ -43,40 +70,14 @@ It will walk you through getting a wikipedia search engine up and running in a f

 ## Development

-Tantivy now compiles on stable rust.
-To check out and run test, you can simply run :
+Tantivy compiles on stable rust but requires `Rust >= 1.27`.
+To check out and run tests, you can simply run :

    git clone git@github.com:tantivy-search/tantivy.git
    cd tantivy
    cargo build


-## Note on release build and performance
-
-If your project depends on `tantivy`, for better performance, make sure to enable
-`sse3` instructions using a RUSTFLAGS. (This instruction set is likely to
-be available on most `x86_64` CPUs you will encounter).
-
-For instance,
-
-    RUSTFLAGS='-C target-feature=+sse3'
-
-Or, if you are targetting a specific cpu
-
-    RUSTFLAGS='-C target-cpu=native' build --release
-
-Regardless of the flags you pass, by default `tantivy` will contain `SSE3` instructions.
-If you want to disable those, you can run the following command :
-
-    cargo build --no-default-features
-
-Alternatively, if you are trying to compile `tantivy` without simd compression,
-you can disable this functionality. In this case, this submodule is not required
-and you can compile tantivy by using the `--no-default-features` flag.
-
-    cargo build --no-default-features
-
-
 # Contribute

-Send me an email (paul.masurel at gmail.com) if you want to contribute to tantivy.
+Send me an email (paul.masurel at gmail.com) if you want to contribute to tantivy.
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -4,11 +4,8 @@
 os: Visual Studio 2015
 environment:
  matrix:
-    - channel: nightly
+    - channel: stable
      target: x86_64-pc-windows-msvc
-    - channel: nightly
-      target: x86_64-pc-windows-gnu
-      msys_bits: 64

 install:
  - appveyor DownloadFile https://win.rustup.rs/ -FileName rustup-init.exe
@@ -22,4 +19,4 @@ build: false

 test_script:
  - REM SET RUST_LOG=tantivy,test & cargo test --verbose
-  - REM SET RUST_BACKTRACE=1 & cargo run --example simple_search
+  - REM SET RUST_BACKTRACE=1 & cargo build --examples
--- a/ci/before_deploy.ps1
+++ b/ci/before_deploy.ps1
@@ -0,0 +1,23 @@
+# This script takes care of packaging the build artifacts that will go in the
+# release zipfile
+
+$SRC_DIR = $PWD.Path
+$STAGE = [System.Guid]::NewGuid().ToString()
+
+Set-Location $ENV:Temp
+New-Item -Type Directory -Name $STAGE
+Set-Location $STAGE
+
+$ZIP = "$SRC_DIR\$($Env:CRATE_NAME)-$($Env:APPVEYOR_REPO_TAG_NAME)-$($Env:TARGET).zip"
+
+# TODO Update this to package the right artifacts
+Copy-Item "$SRC_DIR\target\$($Env:TARGET)\release\hello.exe" '.\'
+
+7z a "$ZIP" *
+
+Push-AppveyorArtifact "$ZIP"
+
+Remove-Item *.* -Force
+Set-Location ..
+Remove-Item $STAGE
+Set-Location $SRC_DIR
--- a/ci/before_deploy.sh
+++ b/ci/before_deploy.sh
@@ -0,0 +1,33 @@
+# This script takes care of building your crate and packaging it for release
+
+set -ex
+
+main() {
+    local src=$(pwd) \
+          stage=
+
+    case $TRAVIS_OS_NAME in
+        linux)
+            stage=$(mktemp -d)
+            ;;
+        osx)
+            stage=$(mktemp -d -t tmp)
+            ;;
+    esac
+
+    test -f Cargo.lock || cargo generate-lockfile
+
+    # TODO Update this to build the artifacts that matter to you
+    cross rustc --bin hello --target $TARGET --release -- -C lto
+
+    # TODO Update this to package the right artifacts
+    cp target/$TARGET/release/hello $stage/
+
+    cd $stage
+    tar czf $src/$CRATE_NAME-$TRAVIS_TAG-$TARGET.tar.gz *
+    cd $src
+
+    rm -rf $stage
+}
+
+main
--- a/ci/install.sh
+++ b/ci/install.sh
@@ -0,0 +1,47 @@
+set -ex
+
+main() {
+    local target=
+    if [ $TRAVIS_OS_NAME = linux ]; then
+        target=x86_64-unknown-linux-musl
+        sort=sort
+    else
+        target=x86_64-apple-darwin
+        sort=gsort  # for `sort --version-sort`, from brew's coreutils.
+    fi
+
+    # Builds for iOS are done on OSX, but require the specific target to be
+    # installed.
+    case $TARGET in
+        aarch64-apple-ios)
+            rustup target install aarch64-apple-ios
+            ;;
+        armv7-apple-ios)
+            rustup target install armv7-apple-ios
+            ;;
+        armv7s-apple-ios)
+            rustup target install armv7s-apple-ios
+            ;;
+        i386-apple-ios)
+            rustup target install i386-apple-ios
+            ;;
+        x86_64-apple-ios)
+            rustup target install x86_64-apple-ios
+            ;;
+    esac
+
+    # This fetches latest stable release
+    local tag=$(git ls-remote --tags --refs --exit-code https://github.com/japaric/cross \
+                       | cut -d/ -f3 \
+                       | grep -E '^v[0.1.0-9.]+$' \
+                       | $sort --version-sort \
+                       | tail -n1)
+    curl -LSfs https://japaric.github.io/trust/install.sh | \
+        sh -s -- \
+           --force \
+           --git japaric/cross \
+           --tag $tag \
+           --target $target
+}
+
+main
--- a/ci/script.sh
+++ b/ci/script.sh
@@ -0,0 +1,30 @@
+#!/usr/bin/env bash
+
+# This script takes care of testing your crate
+
+set -ex
+
+main() {
+    if [ ! -z $CODECOV ]; then
+        echo "Codecov"
+        cargo build --verbose && cargo coverage --verbose && bash <(curl -s https://codecov.io/bash) -s target/kcov
+    else
+        echo "Build"
+        cross build --target $TARGET
+        cross build --target $TARGET --release
+        if [ ! -z $DISABLE_TESTS ]; then
+            return
+        fi
+        echo "Test"
+        cross test --target $TARGET
+    fi
+    for example in $(ls examples/*.rs)
+    do
+        cargo run --example  $(basename $example .rs)
+    done
+}
+
+# we don't run the "test phase" when doing deploys
+if [ -z $TRAVIS_TAG ]; then
+    main
+fi
--- a/doc/.gitignore
+++ b/doc/.gitignore
@@ -0,0 +1 @@
+book
--- a/doc/book.toml
+++ b/doc/book.toml
@@ -0,0 +1,5 @@
+[book]
+authors = ["Paul Masurel"]
+multilingual = false
+src = "src"
+title = "Tantivy, the user guide"
--- a/doc/src/SUMMARY.md
+++ b/doc/src/SUMMARY.md
@@ -0,0 +1,14 @@
+# Summary
+
+
+
+[Foreword](./avant-propos.md)
+
+- [Segments](./basis.md)
+- [Defining your schema](./schema.md)
+- [Faceting](./facetting.md)
+- [Inner workings](./innerworkings.md)
+  - [Inverted index](./inverted_index.md)
+
+[Frequently Asked Questions](./faq.md)
+[Examples](./examples.md)
--- a/doc/src/avant-propos.md
+++ b/doc/src/avant-propos.md
@@ -0,0 +1,31 @@
+# Foreword, what is the scope of tantivy?
+
+> Tantivy is a **search** engine **library** for Rust.
+
+If you are familiar with Lucene, tantivy is heavily inspired by Lucene's design and
+they both have the same scope and target users.
+
+If you are not familiar with Lucene, let's break down our little tagline.
+
+- **Search** here means full-text search: fundamentally, tantivy is here to help you
+efficiently identify which documents in your corpus match a given query.
+But modern search UIs are so much more: text processing, faceting, autocomplete, fuzzy search, good
+relevancy, collapsing, highlighting, spatial search.
+
+  While some of these features are not available in tantivy yet, all of these are relevant
+  feature requests. Tantivy's objective is to offer a solid toolbox to create the best search
+  experience. But keep in mind this is just a toolbox.
+  Which brings us to the second keyword...
+
+- **Library** means that you will have to write code. tantivy is not an *all-in-one* server solution.
+  
+  Sometimes a functionality will not be available in tantivy because it is too specific to your use case. By design, tantivy should make it possible to extend
+  the available set of features using the existing rock-solid data structures.
+
+   Most frequently this will mean writing your own `Collector`, your own `Scorer` or your own
+   `Tokenizer/TokenFilter`... But some of your requirements may also be related to
+   architecture or operations. For instance, you may want to build a large corpus on Hadoop,
+   fine-tune the merge policy to keep your index sharded in a time-wise fashion, or you may want
+   to convert an existing index from a different format.
+   
+   Tantivy exposes its API to do all of these things.
--- a/doc/src/basis.md
+++ b/doc/src/basis.md
@@ -0,0 +1,48 @@
+# Anatomy of an index
+
+## Straight from disk
+
+By default, tantivy accesses its data using its `MMapDirectory`. 
+While this design has some downsides, this greatly simplifies the source code of tantivy, 
+and entirely delegates the caching to the OS.
+
+`tantivy` works entirely (or almost) by directly reading the data structures as they are laid out on disk.
+As a result, the act of opening an index does not involve loading different data structures
+from the disk into random access memory : starting a process, opening an index, and performing a query 
+can typically be done in a matter of milliseconds. 
+
+This is an interesting property for a command line search engine, or for some multi-tenant log search engine.
+Spawning a new process for each new query can be a perfectly sensible solution in some use cases.
+
+In later chapters, we will discuss tantivy's inverted index data layout.
+One key takeaway is that to achieve great performance, search indexes are extremely compact.
+Of course this is crucial to reduce IO, and to ensure that as much of our index as possible can sit in RAM.
+
+Also, whenever possible the data is accessed sequentially. Of course, this is an amazing property when tantivy needs to access
+the data from your spinning hard disk, but this is also a great property when working with `SSD` or `RAM`, 
+as it makes our read patterns very predictable for the CPU.
+
+
+## Segments, and the log method
+
+This kind of compact layout comes at a cost: it prevents our data structures from being dynamic.
+In fact, a trait called `Directory` is in charge of abstracting all of tantivy's data access
+and its API does not even allow editing these files once they are written.
+
+To allow the addition and deletion of documents, and create the illusion that
+your index is dynamic, tantivy uses a common database trick sometimes
+referred to as the *log method*.
+
+Let's forget about deletes for a moment. As you add documents, these documents are processed and stored in
+a dedicated data structure, in a `RAM` buffer. This data structure is designed to be dynamic but
+cannot be accessed for search. As you add documents, this buffer will reach its capacity and tantivy will
+transparently stop adding documents to it and start converting this data structure to its final
+read-only format on disk. Once written, a brand new empty buffer is available to resume adding documents.
+
+The resulting chunk of index obtained after this serialization is called a `Segment`.
+
+> A segment is a self-contained atomic piece of index. It is identified with a UUID, and all of its files
+are identified using the naming scheme : `<UUID>.*`.
+
+
+> A tantivy `Index` is a collection of `Segments`.
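The log method described in `doc/src/basis.md` can be sketched with a toy in-memory model. Everything below is an assumption made for illustration: `LogIndex` and its methods are hypothetical names, `Segment` here is a stand-in and not tantivy's type, a counter replaces the UUID, and a `Vec` replaces the on-disk format. The point is only to show a dynamic buffer that is invisible to search being frozen into immutable segments.

```rust
/// Hypothetical stand-in for an immutable, searchable chunk of index.
#[derive(Debug)]
pub struct Segment {
    /// Stand-in for the UUID that names a real segment's files.
    pub id: u64,
    /// Documents frozen into this segment. Never modified after creation.
    pub docs: Vec<String>,
}

/// Toy model of the log method: one dynamic buffer, many frozen segments.
pub struct LogIndex {
    buffer: Vec<String>,
    capacity: usize,
    segments: Vec<Segment>,
    next_id: u64,
}

impl LogIndex {
    pub fn new(capacity: usize) -> LogIndex {
        assert!(capacity > 0, "the buffer must hold at least one document");
        LogIndex {
            buffer: Vec::new(),
            capacity,
            segments: Vec::new(),
            next_id: 0,
        }
    }

    /// Adds a document to the buffer, flushing once capacity is reached.
    pub fn add_document(&mut self, doc: &str) {
        self.buffer.push(doc.to_string());
        if self.buffer.len() >= self.capacity {
            self.flush();
        }
    }

    /// Freezes the buffer into a new segment and starts an empty buffer.
    pub fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let docs = std::mem::replace(&mut self.buffer, Vec::new());
        self.segments.push(Segment {
            id: self.next_id,
            docs,
        });
        self.next_id += 1;
    }

    /// Returns `(segment id, position in segment)` for documents containing
    /// `word`. Only flushed segments are visible, never the buffer.
    pub fn search(&self, word: &str) -> Vec<(u64, usize)> {
        let mut hits = Vec::new();
        for segment in &self.segments {
            for (position, doc) in segment.docs.iter().enumerate() {
                if doc.split_whitespace().any(|token| token == word) {
                    hits.push((segment.id, position));
                }
            }
        }
        hits
    }

    pub fn num_segments(&self) -> usize {
        self.segments.len()
    }
}
```

With a capacity of 2, the first added document stays invisible to `search` until a second one triggers the flush. That is the trade-off the chapter describes: writes stay cheap and segments stay read-only, at the price of a delay before new documents become searchable.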
--- a/doc/src/examples.md
+++ b/doc/src/examples.md
@@ -0,0 +1 @@
+# Examples
--- a/doc/src/facetting.md
+++ b/doc/src/facetting.md
@@ -0,0 +1,5 @@
+# Faceting
+
+wewew
+
+## weeewe
--- a/doc/src/faq.md
+++ b/doc/src/faq.md
--- a/doc/src/innerworkings.md
+++ b/doc/src/innerworkings.md
@@ -0,0 +1 @@
+# Innerworkings
--- a/doc/src/inverted_index.md
+++ b/doc/src/inverted_index.md
@@ -0,0 +1 @@
+# Inverted index
--- a/doc/src/schema.md
+++ b/doc/src/schema.md
@@ -0,0 +1 @@
+# Defining your schema
--- a/examples/simple_search.rs
+++ b/examples/simple_search.rs
@@ -1,26 +1,32 @@
-extern crate tantivy;
+// # Basic Example
+//
+// This example covers the basic functionalities of
+// tantivy.
+//
+// We will :
+// - define our schema
+// - create an index in a directory
+// - index a few documents in our index
+// - search for the best documents matching "sea whale"
+// - retrieve the best document's original content.
+
+
 extern crate tempdir;

+// ---
+// Importing tantivy...
 #[macro_use]
-extern crate serde_json;
-
-use std::path::Path;
+extern crate tantivy;
 use tantivy::collector::TopCollector;
 use tantivy::query::QueryParser;
 use tantivy::schema::*;
 use tantivy::Index;
-use tempdir::TempDir;

-fn main() {
+fn main() -> tantivy::Result<()> {
    // Let's create a temporary directory for the
    // sake of this example
-    if let Ok(dir) = TempDir::new("tantivy_example_dir") {
-        run_example(dir.path()).unwrap();
-        dir.close().unwrap();
-    }
-}
+    let index_path = TempDir::new("tantivy_example_dir")?;

-fn run_example(index_path: &Path) -> tantivy::Result<()> {
    // # Defining the schema
    //
    // The Tantivy index requires a very strict schema.
@@ -35,7 +41,7 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    // We want full-text search for it, and we also want
    // to be able to retrieve the document after the search.
    //
-    // TEXT | STORED is some syntactic sugar to describe
+    // `TEXT | STORED` is some syntactic sugar to describe
    // that.
    //
    // `TEXT` means the field should be tokenized and indexed,
@@ -64,21 +70,22 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    //
    // This will actually just save a meta.json
    // with our schema in the directory.
-    let index = Index::create(index_path, schema.clone())?;
+    let index = Index::create_in_dir(&index_path, schema.clone())?;

    // To insert document we need an index writer.
    // There must be only one writer at a time.
    // This single `IndexWriter` is already
    // multithreaded.
    //
-    // Here we use a buffer of 50MB per thread. Using a bigger
-    // heap for the indexer can increase its throughput.
+    // Here we give tantivy a budget of `50MB`.
+    // Using a bigger heap for the indexer may increase
+    // throughput, but 50 MB is already plenty.
    let mut index_writer = index.writer(50_000_000)?;

    // Let's index our documents!
    // We first need a handle on the title and the body field.

-    // ### Create a document "manually".
+    // ### Adding documents
    //
    // We can create a document manually, by setting the fields
    // one by one in a Document object.
@@ -96,15 +103,11 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    // ... and add it to the `IndexWriter`.
    index_writer.add_document(old_man_doc);

-    // ### Create a document directly from json.
-    //
-    // Alternatively, we can use our schema to parse a
-    // document object directly from json.
-    // The document is a string, but we use the `json` macro
-    // from `serde_json` for the convenience of multi-line support.
-    let json = json!({
-       "title": "Of Mice and Men",
-       "body": "A few miles south of Soledad, the Salinas River drops in close to the hillside \
+    // For convenience, tantivy also comes with a macro to
+    // reduce the boilerplate above.
+    index_writer.add_document(doc!(
+        title => "Of Mice and Men",
+        body => "A few miles south of Soledad, the Salinas River drops in close to the hillside \
                bank and runs deep and green. The water is warm too, for it has slipped twinkling \
                over the yellow sands in the sunlight before reaching the narrow pool. On one \
                side of the river the golden foothill slopes curve up to the strong and rocky \
@@ -112,30 +115,35 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
                fresh and green with every spring, carrying in their lower leaf junctures the \
                debris of the winter’s flooding; and sycamores with mottled, white, recumbent \
                limbs and branches that arch over the pool"
-    });
-    let mice_and_men_doc = schema.parse_document(&json.to_string())?;
+    ));

-    index_writer.add_document(mice_and_men_doc);
+    index_writer.add_document(doc!(
+        title => "Of Mice and Men",
+        body => "A few miles south of Soledad, the Salinas River drops in close to the hillside \
+                bank and runs deep and green. The water is warm too, for it has slipped twinkling \
+                over the yellow sands in the sunlight before reaching the narrow pool. On one \
+                side of the river the golden foothill slopes curve up to the strong and rocky \
+                Gabilan Mountains, but on the valley side the water is lined with trees—willows \
+                fresh and green with every spring, carrying in their lower leaf junctures the \
+                debris of the winter’s flooding; and sycamores with mottled, white, recumbent \
+                limbs and branches that arch over the pool"
+    ));

-    // Multi-valued field are allowed, they are
-    // expressed in JSON by an array.
-    // The following document has two titles.
-    let json = json!({
-       "title": ["Frankenstein", "The Modern Prometheus"],
-       "body": "You will rejoice to hear that no disaster has accompanied the commencement of an \
+    // Multivalued fields just need to be repeated.
+    index_writer.add_document(doc!(
+       title => "Frankenstein",
+       title => "The Modern Prometheus",
+       body => "You will rejoice to hear that no disaster has accompanied the commencement of an \
                enterprise which you have regarded with such evil forebodings.  I arrived here \
                yesterday, and my first task is to assure my dear sister of my welfare and \
                increasing confidence in the success of my undertaking."
-    });
-    let frankenstein_doc = schema.parse_document(&json.to_string())?;
-
-    index_writer.add_document(frankenstein_doc);
+    ));

    // This is an example, so we will only index 3 documents
    // here. You can check out tantivy's tutorial to index
    // the English wikipedia. Tantivy's indexing is rather fast.
    // Indexing 5 million articles of the English wikipedia takes
-    // around 4 minutes on my computer!
+    // around 3 minutes on my computer!

    // ### Committing
    //
@@ -160,17 +168,29 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {

    // # Searching
    //
+    // ### Searcher
+    //
    // Let's search our index. Start by reloading
    // searchers in the index. This should be done
-    // after every commit().
+    // after every `commit()`.
    index.load_searchers()?;

-    // Afterwards create one (or more) searchers.
+    // We now need to acquire a searcher.
+    // Some search experience might require more than
+    // one query.
    //
-    // You should create a searcher
-    // every time you start a "search query".
+    // The searcher ensures that we get to work
+    // with a consistent version of the index.
+    //
+    // Acquiring a `searcher` is very cheap.
+    //
+    // You should acquire a searcher every time you
+    // start processing a request
+    // and release it right after your query is finished.
    let searcher = index.searcher();

+    // ### Query
+
    // The query parser can interpret human queries.
    // Here, if the user does not specify which
    // field they want to search, tantivy will search
@@ -215,11 +235,9 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
        println!("{}", schema.to_json(&retrieved_doc));
    }

-    // Wait for indexing and merging threads to shut down.
-    // Usually this isn't needed, but in `main` we try to
-    // delete the temporary directory and that fails on
-    // Windows if the files are still open.
-    index_writer.wait_merging_threads()?;

    Ok(())
 }
+
+
+use tempdir::TempDir;
--- a/examples/custom_tokenizer.rs
+++ b/examples/custom_tokenizer.rs
@@ -1,27 +1,19 @@
-extern crate tantivy;
-extern crate tempdir;
+// # Defining a tokenizer pipeline
+//
+// In this example, we'll see how to define a tokenizer pipeline
+// by chaining a series of `TokenFilter`s.
+

 #[macro_use]
-extern crate serde_json;
-
-use std::path::Path;
+extern crate tantivy;
 use tantivy::collector::TopCollector;
 use tantivy::query::QueryParser;
 use tantivy::schema::*;
 use tantivy::tokenizer::NgramTokenizer;
 use tantivy::Index;
-use tempdir::TempDir;

-fn main() {
-    // Let's create a temporary directory for the
-    // sake of this example
-    if let Ok(dir) = TempDir::new("tantivy_token_example_dir") {
-        run_example(dir.path()).unwrap();
-        dir.close().unwrap();
-    }
-}

-fn run_example(index_path: &Path) -> tantivy::Result<()> {
+fn main() -> tantivy::Result<()> {
    // # Defining the schema
    //
    // The Tantivy index requires a very strict schema.
@@ -42,7 +34,7 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    let text_options = TextOptions::default()
        .set_indexing_options(text_field_indexing)
        .set_stored();
-    schema_builder.add_text_field("title", text_options);
+    let title = schema_builder.add_text_field("title", text_options);

    // Our second field is body.
    // We want full-text search for it, but we do not
@@ -51,17 +43,17 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    //
    // We can make our index lighter and
    // by omitting `STORED` flag.
-    schema_builder.add_text_field("body", TEXT);
+    let body = schema_builder.add_text_field("body", TEXT);

    let schema = schema_builder.build();

    // # Indexing documents
    //
    // Let's create a brand new index.
-    //
-    // This will actually just save a meta.json
-    // with our schema in the directory.
-    let index = Index::create(index_path, schema.clone())?;
+    // To simplify we will work entirely in RAM.
+    // This is not what you want in reality, but it is very useful
+    // for your unit tests... Or this example.
+    let index = Index::create_in_ram(schema.clone());

    // here we are registering our custome tokenizer
    // this will store tokens of 3 characters each
@@ -77,101 +69,32 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    // Here we use a buffer of 50MB per thread. Using a bigger
    // heap for the indexer can increase its throughput.
    let mut index_writer = index.writer(50_000_000)?;
-
-    // Let's index our documents!
-    // We first need a handle on the title and the body field.
-
-    // ### Create a document "manually".
-    //
-    // We can create a document manually, by setting the fields
-    // one by one in a Document object.
-    let title = schema.get_field("title").unwrap();
-    let body = schema.get_field("body").unwrap();
-
-    let mut old_man_doc = Document::default();
-    old_man_doc.add_text(title, "The Old Man and the Sea");
-    old_man_doc.add_text(
-        body,
-        "He was an old man who fished alone in a skiff in the Gulf Stream and \
-         he had gone eighty-four days now without taking a fish.",
-    );
-
-    // ... and add it to the `IndexWriter`.
-    index_writer.add_document(old_man_doc);
-
-    // ### Create a document directly from json.
-    //
-    // Alternatively, we can use our schema to parse a
-    // document object directly from json.
-    // The document is a string, but we use the `json` macro
-    // from `serde_json` for the convenience of multi-line support.
-    let json = json!({
-       "title": "Of Mice and Men",
-       "body": "A few miles south of Soledad, the Salinas River drops in close to the hillside \
-                bank and runs deep and green. The water is warm too, for it has slipped twinkling \
-                over the yellow sands in the sunlight before reaching the narrow pool. On one \
-                side of the river the golden foothill slopes curve up to the strong and rocky \
-                Gabilan Mountains, but on the valley side the water is lined with trees—willows \
-                fresh and green with every spring, carrying in their lower leaf junctures the \
-                debris of the winter’s flooding; and sycamores with mottled, white, recumbent \
-                limbs and branches that arch over the pool"
-    });
-    let mice_and_men_doc = schema.parse_document(&json.to_string())?;
-
-    index_writer.add_document(mice_and_men_doc);
-
-    // Multi-valued field are allowed, they are
-    // expressed in JSON by an array.
-    // The following document has two titles.
-    let json = json!({
-       "title": ["Frankenstein", "The Modern Prometheus"],
-       "body": "You will rejoice to hear that no disaster has accompanied the commencement of an \
-                enterprise which you have regarded with such evil forebodings.  I arrived here \
-                yesterday, and my first task is to assure my dear sister of my welfare and \
-                increasing confidence in the success of my undertaking."
-    });
-    let frankenstein_doc = schema.parse_document(&json.to_string())?;
-
-    index_writer.add_document(frankenstein_doc);
-
-    // This is an example, so we will only index 3 documents
-    // here. You can check out tantivy's tutorial to index
-    // the English wikipedia. Tantivy's indexing is rather fast.
-    // Indexing 5 million articles of the English wikipedia takes
-    // around 4 minutes on my computer!
-
-    // ### Committing
-    //
-    // At this point our documents are not searchable.
-    //
-    //
-    // We need to call .commit() explicitly to force the
-    // index_writer to finish processing the documents in the queue,
-    // flush the current index to the disk, and advertise
-    // the existence of new documents.
-    //
-    // This call is blocking.
+    index_writer.add_document(doc!(
+        title => "The Old Man and the Sea",
+        body => "He was an old man who fished alone in a skiff in the Gulf Stream and \
+         he had gone eighty-four days now without taking a fish."
+    ));
+    index_writer.add_document(doc!(
+       title => "Of Mice and Men",
+       body => r#"A few miles south of Soledad, the Salinas River drops in close to the hillside
+                bank and runs deep and green. The water is warm too, for it has slipped twinkling
+                over the yellow sands in the sunlight before reaching the narrow pool. On one
+                side of the river the golden foothill slopes curve up to the strong and rocky
+                Gabilan Mountains, but on the valley side the water is lined with trees—willows
+                fresh and green with every spring, carrying in their lower leaf junctures the
+                debris of the winter’s flooding; and sycamores with mottled, white, recumbent
+                limbs and branches that arch over the pool"#
+    ));
+    index_writer.add_document(doc!(
+        title => "Frankenstein",
+        body => r#"You will rejoice to hear that no disaster has accompanied the commencement of an
+                enterprise which you have regarded with such evil forebodings.  I arrived here
+                yesterday, and my first task is to assure my dear sister of my welfare and
+                increasing confidence in the success of my undertaking."#
+    ));
    index_writer.commit()?;
-
-    // If `.commit()` returns correctly, then all of the
-    // documents that have been added are guaranteed to be
-    // persistently indexed.
-    //
-    // In the scenario of a crash or a power failure,
-    // tantivy behaves as if has rolled back to its last
-    // commit.
-
-    // # Searching
-    //
-    // Let's search our index. Start by reloading
-    // searchers in the index. This should be done
-    // after every commit().
    index.load_searchers()?;

-    // Afterwards create one (or more) searchers.
-    //
-    // You should create a searcher
-    // every time you start a "search query".
    let searcher = index.searcher();

    // The query parser can interpret human queries.
@@ -183,44 +106,14 @@ fn run_example(index_path: &Path) -> tantivy::Result<()> {
    // here we want to get a hit on the 'ken' in Frankenstein
    let query = query_parser.parse_query("ken")?;

-    // A query defines a set of documents, as
-    // well as the way they should be scored.
-    //
-    // A query created by the query parser is scored according
-    // to a metric called Tf-Idf, and will consider
-    // any document matching at least one of our terms.
-
-    // ### Collectors
-    //
-    // We are not interested in all of the documents but
-    // only in the top 10. Keeping track of our top 10 best documents
-    // is the role of the TopCollector.
    let mut top_collector = TopCollector::with_limit(10);
-
-    // We can now perform our query.
    searcher.search(&*query, &mut top_collector)?;

-    // Our top collector now contains the 10
-    // most relevant doc ids...
    let doc_addresses = top_collector.docs();
-
-    // The actual documents still need to be
-    // retrieved from Tantivy's store.
-    //
-    // Since the body field was not configured as stored,
-    // the document returned will only contain
-    // a title.
-
    for doc_address in doc_addresses {
        let retrieved_doc = searcher.doc(&doc_address)?;
        println!("{}", schema.to_json(&retrieved_doc));
    }

-    // Wait for indexing and merging threads to shut down.
-    // Usually this isn't needed, but in `main` we try to
-    // delete the temporary directory and that fails on
-    // Windows if the files are still open.
-    index_writer.wait_merging_threads()?;
-
    Ok(())
 }
--- a/examples/deleting_updating_documents.rs
+++ b/examples/deleting_updating_documents.rs
@@ -0,0 +1,146 @@
+// # Deleting and Updating (?) documents
+//
+// This example explains how to delete and update documents.
+// In fact, there is no such thing as an update in tantivy.
+//
+// To update a document, you need to delete a document and then reinsert
+// its new version.
+//
+// ---
+// Importing tantivy...
+#[macro_use]
+extern crate tantivy;
+use tantivy::collector::TopCollector;
+use tantivy::schema::*;
+use tantivy::Index;
+use tantivy::query::TermQuery;
+
+
+// A simple helper function to fetch a single document
+// given its id from our index.
+// It will be helpful to check our work.
+fn extract_doc_given_isbn(index: &Index, isbn_term: &Term) -> tantivy::Result<Option<Document>> {
+    let searcher = index.searcher();
+
+    // This is the simplest query you can think of.
+    // It matches all of the documents containing a specific term.
+    //
+    // The second argument tells tantivy that we don't care about decoding positions
+    // or term frequencies.
+    let term_query = TermQuery::new(isbn_term.clone(), IndexRecordOption::Basic);
+    let mut top_collector = TopCollector::with_limit(1);
+    searcher.search(&term_query, &mut top_collector)?;
+
+    if let Some(doc_address) = top_collector.docs().first() {
+        let doc = searcher.doc(doc_address)?;
+        Ok(Some(doc))
+    } else {
+        // no doc matching this ID.
+        Ok(None)
+    }
+}
+
+fn main() -> tantivy::Result<()> {
+
+    // # Defining the schema
+    //
+    // Check out the *basic_search* example if this makes
+    // little sense to you.
+    let mut schema_builder = SchemaBuilder::default();
+
+    // Tantivy does not really have a notion of primary id.
+    // This may change in the future.
+    //
+    // Still, we can create an `isbn` field and use it as an id. This
+    // field can be `u64` or a `text`, depending on your use case.
+    // It just needs to be indexed.
+    //
+    // If it is `text`, let's make sure to keep it `raw` and let's avoid
+    // running any text processing on it.
+    // This is done by associating this field to the tokenizer named `raw`.
+    // Rather than building our [`TextOptions`](//docs.rs/tantivy/~0/tantivy/schema/struct.TextOptions.html) manually,
+    // we use the `STRING` shortcut. `STRING` stands for indexed (without term frequency or positions)
+    // and untokenized.
+    //
+    // Because we also want to be able to see this `id` in our returned documents,
+    // we also mark the field as stored.
+    let isbn = schema_builder.add_text_field("isbn", STRING | STORED);
+    let title = schema_builder.add_text_field("title", TEXT | STORED);
+    let schema = schema_builder.build();
+
+    let index = Index::create_in_ram(schema.clone());
+
+    let mut index_writer = index.writer(50_000_000)?;
+
+    // Let's add a couple of documents, for the sake of the example.
+    let mut old_man_doc = Document::default();
+    old_man_doc.add_text(title, "The Old Man and the Sea");
+    index_writer.add_document(doc!(
+        isbn => "978-0099908401",
+        title => "The Old Man and the Sea"
+    ));
+    index_writer.add_document(doc!(
+        isbn => "978-0140177398",
+        title => "Of Mice and Men",
+    ));
+    index_writer.add_document(doc!(
+       title => "Frankentein", //< Oops there is a typo here.
+       isbn => "978-9176370711",
+    ));
+    index_writer.commit()?;
+    index.load_searchers()?;
+
+    let frankenstein_isbn = Term::from_field_text(isbn, "978-9176370711");
+
+    // Oops, our Frankenstein doc seems misspelled.
+    let frankenstein_doc_misspelled = extract_doc_given_isbn(&index, &frankenstein_isbn)?.unwrap();
+    assert_eq!(
+        schema.to_json(&frankenstein_doc_misspelled),
+        r#"{"isbn":["978-9176370711"],"title":["Frankentein"]}"#,
+    );
+
+    // # Update = Delete + Insert
+    //
+    // Here we want to fix the typo in the `Frankenstein` book.
+    //
+    // Tantivy does not handle updates directly, we need to delete
+    // and reinsert the document.
+    //
+    // This can be complicated as it means you need to have access
+    // to the entire document. It is good practice to integrate tantivy
+    // with a key value store for this reason.
+    //
+    // To remove a document, we just call `delete_term`
+    // on its id.
+    //
+    // Note that `tantivy` does nothing to enforce the idea that
+    // there is only one document associated to this id.
+    //
+    // Also you might have noticed that we apply the delete before
+    // having committed. This does not matter really...
+    index_writer.delete_term(frankenstein_isbn.clone());
+
+    // We now need to reinsert our document without the typo.
+    index_writer.add_document(doc!(
+       title => "Frankenstein",
+       isbn => "978-9176370711",
+    ));
+
+
+    // You are guaranteed that your clients will only observe your index in
+    // the state it was in after a commit.
+    // In this example, your search engine will at no point be missing the *Frankenstein* document.
+    // Everything happened as if the document was updated.
+    index_writer.commit()?;
+    // We reload our searcher to make our change available to clients.
+    index.load_searchers()?;
+
+    // No more typo!
+    let frankenstein_new_doc = extract_doc_given_isbn(&index, &frankenstein_isbn)?.unwrap();
+    assert_eq!(
+        schema.to_json(&frankenstein_new_doc),
+        r#"{"isbn":["978-9176370711"],"title":["Frankenstein"]}"#,
+    );
+
+    Ok(())
+}
--- a/examples/faceted_search.rs
+++ b/examples/faceted_search.rs
@@ -0,0 +1,81 @@
+// # Faceted Search
+//
+// This example covers the basics of faceted search
+// in tantivy.
+//
+// We will :
+// - define our schema, including a facet field
+// - create an index in a directory
+// - index a few documents in our index
+// - collect the facet counts under the `/pools` facet
+// - check the counts we get back.
+
+extern crate tempdir;
+
+// ---
+// Importing tantivy...
+#[macro_use]
+extern crate tantivy;
+use tantivy::collector::FacetCollector;
+use tantivy::query::AllQuery;
+use tantivy::schema::*;
+use tantivy::Index;
+
+fn main() -> tantivy::Result<()> {
+  // Let's create a temporary directory for the
+  // sake of this example
+  let index_path = TempDir::new("tantivy_facet_example_dir")?;
+  let mut schema_builder = SchemaBuilder::default();
+
+  schema_builder.add_text_field("name", TEXT | STORED);
+
+  // this is our faceted field
+  schema_builder.add_facet_field("tags");
+
+  let schema = schema_builder.build();
+
+  let index = Index::create_in_dir(&index_path, schema.clone())?;
+
+  let mut index_writer = index.writer(50_000_000)?;
+
+  let name = schema.get_field("name").unwrap();
+  let tags = schema.get_field("tags").unwrap();
+
+  // For convenience, tantivy comes with a `doc!` macro
+  // to build documents with little boilerplate.
+  index_writer.add_document(doc!(
+        name => "the ditch",
+        tags => Facet::from("/pools/north")
+    ));
+
+  index_writer.add_document(doc!(
+        name => "little stacey",
+        tags => Facet::from("/pools/south")
+    ));
+
+  index_writer.commit()?;
+
+  index.load_searchers()?;
+
+  let searcher = index.searcher();
+
+  let mut facet_collector = FacetCollector::for_field(tags);
+  facet_collector.add_facet("/pools");
+
+  searcher.search(&AllQuery, &mut facet_collector)?;
+
+  let counts = facet_collector.harvest();
+  // This lists all of the facet counts
+  let facets: Vec<(&Facet, u64)> = counts.get("/pools").collect();
+  assert_eq!(
+    facets,
+    vec![
+      (&Facet::from("/pools/north"), 1),
+      (&Facet::from("/pools/south"), 1)
+    ]
+  );
+
+  Ok(())
+}
+
+use tempdir::TempDir;
--- a/examples/generate_html.sh
+++ b/examples/generate_html.sh
@@ -1,2 +0,0 @@
-#!/bin/bash
-docco simple_search.rs -o html
--- a/examples/html/docco.css
+++ b/examples/html/docco.css
@@ -1,518 +0,0 @@
-/*--------------------- Typography ----------------------------*/
-
-@font-face {
-    font-family: 'aller-light';
-    src: url('public/fonts/aller-light.eot');
-    src: url('public/fonts/aller-light.eot?#iefix') format('embedded-opentype'),
-         url('public/fonts/aller-light.woff') format('woff'),
-         url('public/fonts/aller-light.ttf') format('truetype');
-    font-weight: normal;
-    font-style: normal;
-}
-
-@font-face {
-    font-family: 'aller-bold';
-    src: url('public/fonts/aller-bold.eot');
-    src: url('public/fonts/aller-bold.eot?#iefix') format('embedded-opentype'),
-         url('public/fonts/aller-bold.woff') format('woff'),
-         url('public/fonts/aller-bold.ttf') format('truetype');
-    font-weight: normal;
-    font-style: normal;
-}
-
-@font-face {
-    font-family: 'roboto-black';
-    src: url('public/fonts/roboto-black.eot');
-    src: url('public/fonts/roboto-black.eot?#iefix') format('embedded-opentype'),
-         url('public/fonts/roboto-black.woff') format('woff'),
-         url('public/fonts/roboto-black.ttf') format('truetype');
-    font-weight: normal;
-    font-style: normal;
-}
-
-/*--------------------- Layout ----------------------------*/
-html { height: 100%; }
-body {
-  font-family: "aller-light";
-  font-size: 14px;
-  line-height: 18px;
-  color: #30404f;
-  margin: 0; padding: 0;
-  height:100%;
-}
-#container { min-height: 100%; }
-
-a {
-  color: #000;
-}
-
-b, strong {
-  font-weight: normal;
-  font-family: "aller-bold";
-}
-
-p {
-  margin: 15px 0 0px;
-}
-  .annotation ul, .annotation ol {
-    margin: 25px 0;
-  }
-    .annotation ul li, .annotation ol li {
-      font-size: 14px;
-      line-height: 18px;
-      margin: 10px 0;
-    }
-
-h1, h2, h3, h4, h5, h6 {
-  color: #112233;
-  line-height: 1em;
-  font-weight: normal;
-  font-family: "roboto-black";
-  text-transform: uppercase;
-  margin: 30px 0 15px 0;
-}
-
-h1 {
-  margin-top: 40px;
-}
-h2 {
-  font-size: 1.26em;
-}
-
-hr {
-  border: 0;
-  background: 1px #ddd;
-  height: 1px;
-  margin: 20px 0;
-}
-
-pre, tt, code {
-  font-size: 12px; line-height: 16px;
-  font-family: Menlo, Monaco, Consolas, "Lucida Console", monospace;
-  margin: 0; padding: 0;
-}
-  .annotation pre {
-    display: block;
-    margin: 0;
-    padding: 7px 10px;
-    background: #fcfcfc;
-    -moz-box-shadow:    inset 0 0 10px rgba(0,0,0,0.1);
-    -webkit-box-shadow: inset 0 0 10px rgba(0,0,0,0.1);
-    box-shadow:         inset 0 0 10px rgba(0,0,0,0.1);
-    overflow-x: auto;
-  }
-    .annotation pre code {
-      border: 0;
-      padding: 0;
-      background: transparent;
-    }
-
-
-blockquote {
-  border-left: 5px solid #ccc;
-  margin: 0;
-  padding: 1px 0 1px 1em;
-}
-  .sections blockquote p {
-    font-family: Menlo, Consolas, Monaco, monospace;
-    font-size: 12px; line-height: 16px;
-    color: #999;
-    margin: 10px 0 0;
-    white-space: pre-wrap;
-  }
-
-ul.sections {
-  list-style: none;
-  padding:0 0 5px 0;;
-  margin:0;
-}
-
-/*
-  Force border-box so that % widths fit the parent
-  container without overlap because of margin/padding.
-
-  More Info : http://www.quirksmode.org/css/box.html
-*/
-ul.sections > li > div {
-  -moz-box-sizing: border-box;    /* firefox */
-  -ms-box-sizing: border-box;     /* ie */
-  -webkit-box-sizing: border-box; /* webkit */
-  -khtml-box-sizing: border-box;  /* konqueror */
-  box-sizing: border-box;         /* css3 */
-}
-
-
-/*---------------------- Jump Page -----------------------------*/
-#jump_to, #jump_page {
-  margin: 0;
-  background: white;
-  -webkit-box-shadow: 0 0 25px #777; -moz-box-shadow: 0 0 25px #777;
-  -webkit-border-bottom-left-radius: 5px; -moz-border-radius-bottomleft: 5px;
-  font: 16px Arial;
-  cursor: pointer;
-  text-align: right;
-  list-style: none;
-}
-
-#jump_to a {
-  text-decoration: none;
-}
-
-#jump_to a.large {
-  display: none;
-}
-#jump_to a.small {
-  font-size: 22px;
-  font-weight: bold;
-  color: #676767;
-}
-
-#jump_to, #jump_wrapper {
-  position: fixed;
-  right: 0; top: 0;
-  padding: 10px 15px;
-  margin:0;
-}
-
-#jump_wrapper {
-  display: none;
-  padding:0;
-}
-
-#jump_to:hover #jump_wrapper {
-  display: block;
-}
-
-#jump_page_wrapper{
-  position: fixed;
-  right: 0;
-  top: 0;
-  bottom: 0;
-}
-
-#jump_page {
-  padding: 5px 0 3px;
-  margin: 0 0 25px 25px;
-  max-height: 100%;
-  overflow: auto;
-}
-
-#jump_page .source {
-  display: block;
-  padding: 15px;
-  text-decoration: none;
-  border-top: 1px solid #eee;
-}
-
-#jump_page .source:hover {
-  background: #f5f5ff;
-}
-
-#jump_page .source:first-child {
-}
-
-/*---------------------- Low resolutions (> 320px) ---------------------*/
-@media only screen and (min-width: 320px) {
-  .pilwrap { display: none; }
-
-  ul.sections > li > div {
-    display: block;
-    padding:5px 10px 0 10px;
-  }
-
-  ul.sections > li > div.annotation ul, ul.sections > li > div.annotation ol {
-    padding-left: 30px;
-  }
-
-  ul.sections > li > div.content {
-    overflow-x:auto;
-    -webkit-box-shadow: inset 0 0 5px #e5e5ee;
-    box-shadow: inset 0 0 5px #e5e5ee;
-    border: 1px solid #dedede;
-    margin:5px 10px 5px 10px;
-    padding-bottom: 5px;
-  }
-
-  ul.sections > li > div.annotation pre {
-    margin: 7px 0 7px;
-    padding-left: 15px;
-  }
-
-  ul.sections > li > div.annotation p tt, .annotation code {
-    background: #f8f8ff;
-    border: 1px solid #dedede;
-    font-size: 12px;
-    padding: 0 0.2em;
-  }
-}
-
-/*----------------------  (> 481px) ---------------------*/
-@media only screen and (min-width: 481px) {
-  #container {
-    position: relative;
-  }
-  body {
-    background-color: #F5F5FF;
-    font-size: 15px;
-    line-height: 21px;
-  }
-  pre, tt, code {
-    line-height: 18px;
-  }
-  p, ul, ol {
-    margin: 0 0 15px;
-  }
-
-
-  #jump_to {
-    padding: 5px 10px;
-  }
-  #jump_wrapper {
-    padding: 0;
-  }
-  #jump_to, #jump_page {
-    font: 10px Arial;
-    text-transform: uppercase;
-  }
-  #jump_page .source {
-    padding: 5px 10px;
-  }
-  #jump_to a.large {
-    display: inline-block;
-  }
-  #jump_to a.small {
-    display: none;
-  }
-
-
-
-  #background {
-    position: absolute;
-    top: 0; bottom: 0;
-    width: 350px;
-    background: #fff;
-    border-right: 1px solid #e5e5ee;
-    z-index: -1;
-  }
-
-  ul.sections > li > div.annotation ul, ul.sections > li > div.annotation ol {
-    padding-left: 40px;
-  }
-
-  ul.sections > li {
-    white-space: nowrap;
-  }
-
-  ul.sections > li > div {
-    display: inline-block;
-  }
-
-  ul.sections > li > div.annotation {
-    max-width: 350px;
-    min-width: 350px;
-    min-height: 5px;
-    padding: 13px;
-    overflow-x: hidden;
-    white-space: normal;
-    vertical-align: top;
-    text-align: left;
-  }
-  ul.sections > li > div.annotation pre {
-    margin: 15px 0 15px;
-    padding-left: 15px;
-  }
-
-  ul.sections > li > div.content {
-    padding: 13px;
-    vertical-align: top;
-    border: none;
-    -webkit-box-shadow: none;
-    box-shadow: none;
-  }
-
-  .pilwrap {
-    position: relative;
-    display: inline;
-  }
-
-  .pilcrow {
-    font: 12px Arial;
-    text-decoration: none;
-    color: #454545;
-    position: absolute;
-    top: 3px; left: -20px;
-    padding: 1px 2px;
-    opacity: 0;
-    -webkit-transition: opacity 0.2s linear;
-  }
-    .for-h1 .pilcrow {
-      top: 47px;
-    }
-    .for-h2 .pilcrow, .for-h3 .pilcrow, .for-h4 .pilcrow {
-      top: 35px;
-    }
-
-  ul.sections > li > div.annotation:hover .pilcrow {
-    opacity: 1;
-  }
-}
-
-/*---------------------- (> 1025px) ---------------------*/
-@media only screen and (min-width: 1025px) {
-
-  body {
-    font-size: 16px;
-    line-height: 24px;
-  }
-
-  #background {
-    width: 525px;
-  }
-  ul.sections > li > div.annotation {
-    max-width: 525px;
-    min-width: 525px;
-    padding: 10px 25px 1px 50px;
-  }
-  ul.sections > li > div.content {
-    padding: 9px 15px 16px 25px;
-  }
-}
-
-/*---------------------- Syntax Highlighting -----------------------------*/
-
-td.linenos { background-color: #f0f0f0; padding-right: 10px; }
-span.lineno { background-color: #f0f0f0; padding: 0 5px 0 5px; }
-/*
-
-github.com style (c) Vasily Polovnyov <vast@whiteants.net>
-
-*/
-
-pre code {
-  display: block; padding: 0.5em;
-  color: #000;
-  background: #f8f8ff
-}
-
-pre .hljs-comment,
-pre .hljs-template_comment,
-pre .hljs-diff .hljs-header,
-pre .hljs-javadoc {
-  color: #408080;
-  font-style: italic
-}
-
-pre .hljs-keyword,
-pre .hljs-assignment,
-pre .hljs-literal,
-pre .hljs-css .hljs-rule .hljs-keyword,
-pre .hljs-winutils,
-pre .hljs-javascript .hljs-title,
-pre .hljs-lisp .hljs-title,
-pre .hljs-subst {
-  color: #954121;
-  /*font-weight: bold*/
-}
-
-pre .hljs-number,
-pre .hljs-hexcolor {
-  color: #40a070
-}
-
-pre .hljs-string,
-pre .hljs-tag .hljs-value,
-pre .hljs-phpdoc,
-pre .hljs-tex .hljs-formula {
-  color: #219161;
-}
-
-pre .hljs-title,
-pre .hljs-id {
-  color: #19469D;
-}
-pre .hljs-params {
-  color: #00F;
-}
-
-pre .hljs-javascript .hljs-title,
-pre .hljs-lisp .hljs-title,
-pre .hljs-subst {
-  font-weight: normal
-}
-
-pre .hljs-class .hljs-title,
-pre .hljs-haskell .hljs-label,
-pre .hljs-tex .hljs-command {
-  color: #458;
-  font-weight: bold
-}
-
-pre .hljs-tag,
-pre .hljs-tag .hljs-title,
-pre .hljs-rules .hljs-property,
-pre .hljs-django .hljs-tag .hljs-keyword {
-  color: #000080;
-  font-weight: normal
-}
-
-pre .hljs-attribute,
-pre .hljs-variable,
-pre .hljs-instancevar,
-pre .hljs-lisp .hljs-body {
-  color: #008080
-}
-
-pre .hljs-regexp {
-  color: #B68
-}
-
-pre .hljs-class {
-  color: #458;
-  font-weight: bold
-}
-
-pre .hljs-symbol,
-pre .hljs-ruby .hljs-symbol .hljs-string,
-pre .hljs-ruby .hljs-symbol .hljs-keyword,
-pre .hljs-ruby .hljs-symbol .hljs-keymethods,
-pre .hljs-lisp .hljs-keyword,
-pre .hljs-tex .hljs-special,
-pre .hljs-input_number {
-  color: #990073
-}
-
-pre .hljs-builtin,
-pre .hljs-constructor,
-pre .hljs-built_in,
-pre .hljs-lisp .hljs-title {
-  color: #0086b3
-}
-
-pre .hljs-preprocessor,
-pre .hljs-pi,
-pre .hljs-doctype,
-pre .hljs-shebang,
-pre .hljs-cdata {
-  color: #999;
-  font-weight: bold
-}
-
-pre .hljs-deletion {
-  background: #fdd
-}
-
-pre .hljs-addition {
-  background: #dfd
-}
-
-pre .hljs-diff .hljs-change {
-  background: #0086b3
-}
-
-pre .hljs-chunk {
-  color: #aaa
-}
-
-pre .hljs-tex .hljs-formula {
-  opacity: 0.5;
-}
--- a/examples/html/public/fonts/aller-bold.eot
+++ b/examples/html/public/fonts/aller-bold.eot
--- a/examples/html/public/fonts/aller-bold.ttf
+++ b/examples/html/public/fonts/aller-bold.ttf
--- a/examples/html/public/fonts/aller-bold.woff
+++ b/examples/html/public/fonts/aller-bold.woff
--- a/examples/html/public/fonts/aller-light.eot
+++ b/examples/html/public/fonts/aller-light.eot
--- a/examples/html/public/fonts/aller-light.ttf
+++ b/examples/html/public/fonts/aller-light.ttf
--- a/examples/html/public/fonts/aller-light.woff
+++ b/examples/html/public/fonts/aller-light.woff
--- a/examples/html/public/fonts/fleurons.eot
+++ b/examples/html/public/fonts/fleurons.eot
--- a/examples/html/public/fonts/fleurons.ttf
+++ b/examples/html/public/fonts/fleurons.ttf
--- a/examples/html/public/fonts/fleurons.woff
+++ b/examples/html/public/fonts/fleurons.woff
--- a/examples/html/public/fonts/roboto-black.eot
+++ b/examples/html/public/fonts/roboto-black.eot
--- a/examples/html/public/fonts/roboto-black.ttf
+++ b/examples/html/public/fonts/roboto-black.ttf
--- a/examples/html/public/fonts/roboto-black.woff
+++ b/examples/html/public/fonts/roboto-black.woff
--- a/examples/html/public/images/gray.png
+++ b/examples/html/public/images/gray.png
--- a/examples/html/public/stylesheets/normalize.css
+++ b/examples/html/public/stylesheets/normalize.css
@@ -1,375 +0,0 @@
-/*! normalize.css v2.0.1 | MIT License | git.io/normalize */
-
-/* ==========================================================================
-   HTML5 display definitions
-   ========================================================================== */
-
-/*
- * Corrects `block` display not defined in IE 8/9.
- */
-
-article,
-aside,
-details,
-figcaption,
-figure,
-footer,
-header,
-hgroup,
-nav,
-section,
-summary {
-    display: block;
-}
-
-/*
- * Corrects `inline-block` display not defined in IE 8/9.
- */
-
-audio,
-canvas,
-video {
-    display: inline-block;
-}
-
-/*
- * Prevents modern browsers from displaying `audio` without controls.
- * Remove excess height in iOS 5 devices.
- */
-
-audio:not([controls]) {
-    display: none;
-    height: 0;
-}
-
-/*
- * Addresses styling for `hidden` attribute not present in IE 8/9.
- */
-
-[hidden] {
-    display: none;
-}
-
-/* ==========================================================================
-   Base
-   ========================================================================== */
-
-/*
- * 1. Sets default font family to sans-serif.
- * 2. Prevents iOS text size adjust after orientation change, without disabling
- *    user zoom.
- */
-
-html {
-    font-family: sans-serif; /* 1 */
-    -webkit-text-size-adjust: 100%; /* 2 */
-    -ms-text-size-adjust: 100%; /* 2 */
-}
-
-/*
- * Removes default margin.
- */
-
-body {
-    margin: 0;
-}
-
-/* ==========================================================================
-   Links
-   ========================================================================== */
-
-/*
- * Addresses `outline` inconsistency between Chrome and other browsers.
- */
-
-a:focus {
-    outline: thin dotted;
-}
-
-/*
- * Improves readability when focused and also mouse hovered in all browsers.
- */
-
-a:active,
-a:hover {
-    outline: 0;
-}
-
-/* ==========================================================================
-   Typography
-   ========================================================================== */
-
-/*
- * Addresses `h1` font sizes within `section` and `article` in Firefox 4+,
- * Safari 5, and Chrome.
- */
-
-h1 {
-    font-size: 2em;
-}
-
-/*
- * Addresses styling not present in IE 8/9, Safari 5, and Chrome.
- */
-
-abbr[title] {
-    border-bottom: 1px dotted;
-}
-
-/*
- * Addresses style set to `bolder` in Firefox 4+, Safari 5, and Chrome.
- */
-
-b,
-strong {
-    font-weight: bold;
-}
-
-/*
- * Addresses styling not present in Safari 5 and Chrome.
- */
-
-dfn {
-    font-style: italic;
-}
-
-/*
- * Addresses styling not present in IE 8/9.
- */
-
-mark {
-    background: #ff0;
-    color: #000;
-}
-
-
-/*
- * Corrects font family set oddly in Safari 5 and Chrome.
- */
-
-code,
-kbd,
-pre,
-samp {
-    font-family: monospace, serif;
-    font-size: 1em;
-}
-
-/*
- * Improves readability of pre-formatted text in all browsers.
- */
-
-pre {
-    white-space: pre;
-    white-space: pre-wrap;
-    word-wrap: break-word;
-}
-
-/*
- * Sets consistent quote types.
- */
-
-q {
-    quotes: "\201C" "\201D" "\2018" "\2019";
-}
-
-/*
- * Addresses inconsistent and variable font size in all browsers.
- */
-
-small {
-    font-size: 80%;
-}
-
-/*
- * Prevents `sub` and `sup` affecting `line-height` in all browsers.
- */
-
-sub,
-sup {
-    font-size: 75%;
-    line-height: 0;
-    position: relative;
-    vertical-align: baseline;
-}
-
-sup {
-    top: -0.5em;
-}
-
-sub {
-    bottom: -0.25em;
-}
-
-/* ==========================================================================
-   Embedded content
-   ========================================================================== */
-
-/*
- * Removes border when inside `a` element in IE 8/9.
- */
-
-img {
-    border: 0;
-}
-
-/*
- * Corrects overflow displayed oddly in IE 9.
- */
-
-svg:not(:root) {
-    overflow: hidden;
-}
-
-/* ==========================================================================
-   Figures
-   ========================================================================== */
-
-/*
- * Addresses margin not present in IE 8/9 and Safari 5.
- */
-
-figure {
-    margin: 0;
-}
-
-/* ==========================================================================
-   Forms
-   ========================================================================== */
-
-/*
- * Define consistent border, margin, and padding.
- */
-
-fieldset {
-    border: 1px solid #c0c0c0;
-    margin: 0 2px;
-    padding: 0.35em 0.625em 0.75em;
-}
-
-/*
- * 1. Corrects color not being inherited in IE 8/9.
- * 2. Remove padding so people aren't caught out if they zero out fieldsets.
- */
-
-legend {
-    border: 0; /* 1 */
-    padding: 0; /* 2 */
-}
-
-/*
- * 1. Corrects font family not being inherited in all browsers.
- * 2. Corrects font size not being inherited in all browsers.
- * 3. Addresses margins set differently in Firefox 4+, Safari 5, and Chrome
- */
-
-button,
-input,
-select,
-textarea {
-    font-family: inherit; /* 1 */
-    font-size: 100%; /* 2 */
-    margin: 0; /* 3 */
-}
-
-/*
- * Addresses Firefox 4+ setting `line-height` on `input` using `!important` in
- * the UA stylesheet.
- */
-
-button,
-input {
-    line-height: normal;
-}
-
-/*
- * 1. Avoid the WebKit bug in Android 4.0.* where (2) destroys native `audio`
- *    and `video` controls.
- * 2. Corrects inability to style clickable `input` types in iOS.
- * 3. Improves usability and consistency of cursor style between image-type
- *    `input` and others.
- */
-
-button,
-html input[type="button"], /* 1 */
-input[type="reset"],
-input[type="submit"] {
-    -webkit-appearance: button; /* 2 */
-    cursor: pointer; /* 3 */
-}
-
-/*
- * Re-set default cursor for disabled elements.
- */
-
-button[disabled],
-input[disabled] {
-    cursor: default;
-}
-
-/*
- * 1. Addresses box sizing set to `content-box` in IE 8/9.
- * 2. Removes excess padding in IE 8/9.
- */
-
-input[type="checkbox"],
-input[type="radio"] {
-    box-sizing: border-box; /* 1 */
-    padding: 0; /* 2 */
-}
-
-/*
- * 1. Addresses `appearance` set to `searchfield` in Safari 5 and Chrome.
- * 2. Addresses `box-sizing` set to `border-box` in Safari 5 and Chrome
- *    (include `-moz` to future-proof).
- */
-
-input[type="search"] {
-    -webkit-appearance: textfield; /* 1 */
-    -moz-box-sizing: content-box;
-    -webkit-box-sizing: content-box; /* 2 */
-    box-sizing: content-box;
-}
-
-/*
- * Removes inner padding and search cancel button in Safari 5 and Chrome
- * on OS X.
- */
-
-input[type="search"]::-webkit-search-cancel-button,
-input[type="search"]::-webkit-search-decoration {
-    -webkit-appearance: none;
-}
-
-/*
- * Removes inner padding and border in Firefox 4+.
- */
-
-button::-moz-focus-inner,
-input::-moz-focus-inner {
-    border: 0;
-    padding: 0;
-}
-
-/*
- * 1. Removes default vertical scrollbar in IE 8/9.
- * 2. Improves readability and alignment in all browsers.
- */
-
-textarea {
-    overflow: auto; /* 1 */
-    vertical-align: top; /* 2 */
-}
-
-/* ==========================================================================
-   Tables
-   ========================================================================== */
-
-/*
- * Remove most spacing between table cells.
- */
-
-table {
-    border-collapse: collapse;
-    border-spacing: 0;
-}
--- a/examples/html/simple_search.html
+++ b/examples/html/simple_search.html
@@ -1,542 +0,0 @@
-<!DOCTYPE html>
-
-<html>
-<head>
-  <title>simple_search.rs</title>
-  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
-  <meta name="viewport" content="width=device-width, target-densitydpi=160dpi, initial-scale=1.0; maximum-scale=1.0; user-scalable=0;">
-  <link rel="stylesheet" media="all" href="docco.css" />
-</head>
-<body>
-  <div id="container">
-    <div id="background"></div>
-    
-    <ul class="sections">
-        
-          <li id="title">
-              <div class="annotation">
-                  <h1>simple_search.rs</h1>
-              </div>
-          </li>
-        
-        
-        
-        <li id="section-1">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-1">&#182;</a>
-              </div>
-              
-            </div>
-            
-            <div class="content"><div class='highlight'><pre><span class="hljs-keyword">extern</span> <span class="hljs-keyword">crate</span> tantivy;
-<span class="hljs-keyword">extern</span> <span class="hljs-keyword">crate</span> tempdir;
-
-<span class="hljs-meta">#[macro_use]</span>
-<span class="hljs-keyword">extern</span> <span class="hljs-keyword">crate</span> serde_json;
-
-<span class="hljs-keyword">use</span> std::path::Path;
-<span class="hljs-keyword">use</span> tempdir::TempDir;
-<span class="hljs-keyword">use</span> tantivy::Index;
-<span class="hljs-keyword">use</span> tantivy::schema::*;
-<span class="hljs-keyword">use</span> tantivy::collector::TopCollector;
-<span class="hljs-keyword">use</span> tantivy::query::QueryParser;
-
-<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">main</span></span>() {</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-2">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-2">&#182;</a>
-              </div>
-              <p>Let’s create a temporary directory for the
-sake of this example</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">if</span> <span class="hljs-keyword">let</span> <span class="hljs-literal">Ok</span>(dir) = TempDir::new(<span class="hljs-string">"tantivy_example_dir"</span>) {
-        run_example(dir.path()).unwrap();
-        dir.close().unwrap();
-    }
-}
-
-
-<span class="hljs-function"><span class="hljs-keyword">fn</span> <span class="hljs-title">run_example</span></span>(index_path: &amp;Path) -&gt; tantivy::<span class="hljs-built_in">Result</span>&lt;()&gt; {</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-3">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-3">&#182;</a>
-              </div>
-              <h1 id="defining-the-schema">Defining the schema</h1>
-<p>The Tantivy index requires a very strict schema.
-The schema declares which fields are in the index,
-and for each field, its type and “the way it should
-be indexed”.</p>
-
-            </div>
-            
-        </li>
-        
-        
-        <li id="section-4">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-4">&#182;</a>
-              </div>
-              <p>first we need to define a schema …</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> schema_builder = SchemaBuilder::<span class="hljs-keyword">default</span>();</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-5">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-5">&#182;</a>
-              </div>
-              <p>Our first field is title.
-We want full-text search for it, and we also want 
-to be able to retrieve the document after the search.</p>
-<p>TEXT | STORED is some syntactic sugar to describe
-that.</p>
-<p><code>TEXT</code> means the field should be tokenized and indexed,
-along with its term frequency and term positions.</p>
-<p><code>STORED</code> means that the field will also be saved
-in a compressed, row-oriented key-value store.
-This store is useful to reconstruct the
-documents that were selected during the search phase.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    schema_builder.add_text_field(<span class="hljs-string">"title"</span>, TEXT | STORED);</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-6">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-6">&#182;</a>
-              </div>
-              <p>Our second field is body.
-We want full-text search for it, but we do not 
-need to be able to be able to retrieve it
-for our application. </p>
-<p>We can make our index lighter and 
-by omitting <code>STORED</code> flag.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    schema_builder.add_text_field(<span class="hljs-string">"body"</span>, TEXT);
-
-    <span class="hljs-keyword">let</span> schema = schema_builder.build();</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-7">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-7">&#182;</a>
-              </div>
-              <h1 id="indexing-documents">Indexing documents</h1>
-<p>Let’s create a brand new index.</p>
-<p>This will actually just save a meta.json
-with our schema in the directory.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> index = Index::create(index_path, schema.clone())?;</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-8">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-8">&#182;</a>
-              </div>
-              <p>To insert document we need an index writer.
-There must be only one writer at a time.
-This single <code>IndexWriter</code> is already
-multithreaded.</p>
-<p>Here we use a buffer of 50MB per thread. Using a bigger
-heap for the indexer can increase its throughput.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> index_writer = index.writer(<span class="hljs-number">50_000_000</span>)?;</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-9">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-9">&#182;</a>
-              </div>
-              <p>Let’s index our documents!
-We first need a handle on the title and the body field.</p>
-
-            </div>
-            
-        </li>
-        
-        
-        <li id="section-10">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-10">&#182;</a>
-              </div>
-              <h3 id="create-a-document-manually-">Create a document “manually”.</h3>
-<p>We can create a document manually, by setting the fields
-one by one in a Document object.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> title = schema.get_field(<span class="hljs-string">"title"</span>).unwrap();
-    <span class="hljs-keyword">let</span> body = schema.get_field(<span class="hljs-string">"body"</span>).unwrap();
-
-    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> old_man_doc = Document::<span class="hljs-keyword">default</span>();
-    old_man_doc.add_text(title, <span class="hljs-string">"The Old Man and the Sea"</span>);
-    old_man_doc.add_text(
-        body,
-        <span class="hljs-string">"He was an old man who fished alone in a skiff in the Gulf Stream and \
-                          he had gone eighty-four days now without taking a fish."</span>,
-    );</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-11">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-11">&#182;</a>
-              </div>
-              <p>… and add it to the <code>IndexWriter</code>.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    index_writer.add_document(old_man_doc);</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-12">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-12">&#182;</a>
-              </div>
-              <h3 id="create-a-document-directly-from-json-">Create a document directly from json.</h3>
-<p>Alternatively, we can use our schema to parse a
-document object directly from json.
-The document is a string, but we use the <code>json</code> macro
-from <code>serde_json</code> for the convenience of multi-line support.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> json = json!({
-       <span class="hljs-string">"title"</span>: <span class="hljs-string">"Of Mice and Men"</span>,
-       <span class="hljs-string">"body"</span>: <span class="hljs-string">"A few miles south of Soledad, the Salinas River drops in close to the hillside \
-                bank and runs deep and green. The water is warm too, for it has slipped twinkling \
-                over the yellow sands in the sunlight before reaching the narrow pool. On one \
-                side of the river the golden foothill slopes curve up to the strong and rocky \
-                Gabilan Mountains, but on the valley side the water is lined with trees—willows \
-                fresh and green with every spring, carrying in their lower leaf junctures the \
-                debris of the winter’s flooding; and sycamores with mottled, white, recumbent \
-                limbs and branches that arch over the pool"</span>
-    });
-    <span class="hljs-keyword">let</span> mice_and_men_doc = schema.parse_document(&amp;json.to_string())?;
-
-    index_writer.add_document(mice_and_men_doc);</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-13">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-13">&#182;</a>
-              </div>
-              <p>Multi-valued field are allowed, they are
-expressed in JSON by an array.
-The following document has two titles.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> json = json!({
-       <span class="hljs-string">"title"</span>: [<span class="hljs-string">"Frankenstein"</span>, <span class="hljs-string">"The Modern Prometheus"</span>],
-       <span class="hljs-string">"body"</span>: <span class="hljs-string">"You will rejoice to hear that no disaster has accompanied the commencement of an \
-                enterprise which you have regarded with such evil forebodings.  I arrived here \
-                yesterday, and my first task is to assure my dear sister of my welfare and \
-                increasing confidence in the success of my undertaking."</span>
-    });
-    <span class="hljs-keyword">let</span> frankenstein_doc = schema.parse_document(&amp;json.to_string())?;
-
-    index_writer.add_document(frankenstein_doc);</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-14">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-14">&#182;</a>
-              </div>
-              <p>This is an example, so we will only index 3 documents
-here. You can check out tantivy’s tutorial to index
-the English wikipedia. Tantivy’s indexing is rather fast.
-Indexing 5 million articles of the English wikipedia takes
-around 4 minutes on my computer!</p>
-
-            </div>
-            
-        </li>
-        
-        
-        <li id="section-15">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-15">&#182;</a>
-              </div>
-              <h3 id="committing">Committing</h3>
-<p>At this point our documents are not searchable.</p>
-<p>We need to call .commit() explicitly to force the
-index_writer to finish processing the documents in the queue,
-flush the current index to the disk, and advertise
-the existence of new documents.</p>
-<p>This call is blocking.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    index_writer.commit()?;</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-16">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-16">&#182;</a>
-              </div>
-              <p>If <code>.commit()</code> returns correctly, then all of the
-documents that have been added are guaranteed to be
-persistently indexed.</p>
-<p>In the scenario of a crash or a power failure,
-tantivy behaves as if has rolled back to its last
-commit.</p>
-
-            </div>
-            
-        </li>
-        
-        
-        <li id="section-17">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-17">&#182;</a>
-              </div>
-              <h1 id="searching">Searching</h1>
-<p>Let’s search our index. Start by reloading
-searchers in the index. This should be done
-after every commit().</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    index.load_searchers()?;</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-18">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-18">&#182;</a>
-              </div>
-              <p>Afterwards create one (or more) searchers.</p>
-<p>You should create a searcher
-every time you start a “search query”.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> searcher = index.searcher();</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-19">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-19">&#182;</a>
-              </div>
-              <p>The query parser can interpret human queries.
-Here, if the user does not specify which
-field they want to search, tantivy will search
-in both title and body.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> query_parser = QueryParser::for_index(index, <span class="hljs-built_in">vec!</span>[title, body]);</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-20">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-20">&#182;</a>
-              </div>
-              <p>QueryParser may fail if the query is not in the right
-format. For user facing applications, this can be a problem.
-A ticket has been opened regarding this problem.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> query = query_parser.parse_query(<span class="hljs-string">"sea whale"</span>)?;</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-21">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-21">&#182;</a>
-              </div>
-              <p>A query defines a set of documents, as
-well as the way they should be scored.</p>
-<p>A query created by the query parser is scored according
-to a metric called Tf-Idf, and will consider
-any document matching at least one of our terms.</p>
-
-            </div>
-            
-        </li>
-        
-        
-        <li id="section-22">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-22">&#182;</a>
-              </div>
-              <h3 id="collectors">Collectors</h3>
-<p>We are not interested in all of the documents but
-only in the top 10. Keeping track of our top 10 best documents
-is the role of the TopCollector.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> <span class="hljs-keyword">mut</span> top_collector = TopCollector::with_limit(<span class="hljs-number">10</span>);</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-23">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-23">&#182;</a>
-              </div>
-              <p>We can now perform our query.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    searcher.search(&amp;*query, &amp;<span class="hljs-keyword">mut</span> top_collector)?;</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-24">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-24">&#182;</a>
-              </div>
-              <p>Our top collector now contains the 10
-most relevant doc ids…</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    <span class="hljs-keyword">let</span> doc_addresses = top_collector.docs();</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-25">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-25">&#182;</a>
-              </div>
-              <p>The actual documents still need to be
-retrieved from Tantivy’s store.</p>
-<p>Since the body field was not configured as stored,
-the document returned will only contain
-a title.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>
-    <span class="hljs-keyword">for</span> doc_address <span class="hljs-keyword">in</span> doc_addresses {
-        <span class="hljs-keyword">let</span> retrieved_doc = searcher.doc(&amp;doc_address)?;
-        <span class="hljs-built_in">println!</span>(<span class="hljs-string">"{}"</span>, schema.to_json(&amp;retrieved_doc));
-    }</pre></div></div>
-            
-        </li>
-        
-        
-        <li id="section-26">
-            <div class="annotation">
-              
-              <div class="pilwrap ">
-                <a class="pilcrow" href="#section-26">&#182;</a>
-              </div>
-              <p>Wait for indexing and merging threads to shut down.
-Usually this isn’t needed, but in <code>main</code> we try to
-delete the temporary directory and that fails on
-Windows if the files are still open.</p>
-
-            </div>
-            
-            <div class="content"><div class='highlight'><pre>    index_writer.wait_merging_threads()?;
-
-    <span class="hljs-literal">Ok</span>(())
-}</pre></div></div>
-            
-        </li>
-        
-    </ul>
-  </div>
-</body>
-</html>
--- a/examples/iterating_docs_and_positions.rs
+++ b/examples/iterating_docs_and_positions.rs
@@ -0,0 +1,139 @@
+// # Iterating docs and positions.
+//
+// At its core, tantivy relies on a data structure
+// called an inverted index.
+//
+// This example shows how to manually iterate through
+// the list of documents containing a term, getting
+// its term frequency, and accessing its positions.
+
+
+// ---
+// Importing tantivy...
+#[macro_use]
+extern crate tantivy;
+use tantivy::schema::*;
+use tantivy::Index;
+use tantivy::{DocSet, DocId, Postings};
+
+fn main() -> tantivy::Result<()> {
+
+
+    // We first create a schema for the sake of the
+    // example. Check the `basic_search` example for more information.
+    let mut schema_builder = SchemaBuilder::default();
+
+    // For this example, we need to make sure to index positions for our title
+    // field. `TEXT` precisely does this.
+    let title = schema_builder.add_text_field("title", TEXT | STORED);
+    let schema = schema_builder.build();
+
+    let index = Index::create_in_ram(schema.clone());
+
+    let mut index_writer = index.writer_with_num_threads(1, 50_000_000)?;
+    index_writer.add_document(doc!(title => "The Old Man and the Sea"));
+    index_writer.add_document(doc!(title => "Of Mice and Men"));
+    index_writer.add_document(doc!(title => "The modern Prometheus"));
+    index_writer.commit()?;
+
+    index.load_searchers()?;
+
+    let searcher = index.searcher();
+
+    // A tantivy index is actually a collection of segments.
+    // Similarly, a searcher just wraps a list of `segment_reader`s.
+    //
+    // (Because we indexed a very small number of documents over one thread
+    // there is actually only one segment here, but let's iterate through the list
+    // anyway)
+    for segment_reader in searcher.segment_readers() {
+
+        // A segment contains several data structures.
+        // The inverted index is the combination of
+        // - the term dictionary
+        // - the inverted lists associated with each term, and their positions
+        let inverted_index = segment_reader.inverted_index(title);
+
+        // A `Term` is a text token associated with a field.
+        // Let's go through all docs containing the term `title:the` and access their positions.
+        let term_the = Term::from_field_text(title, "the");
+
+
+        // This segment posting object is like a cursor over the documents matching the term.
+        // The `IndexRecordOption` argument tells tantivy that we are interested in both term frequencies
+        // and positions.
+        //
+        // If you don't need all of this information, you may get better performance by decompressing
+        // less of it.
+        if let Some(mut segment_postings) = inverted_index.read_postings(&term_the, IndexRecordOption::WithFreqsAndPositions) {
+
+            // This buffer is reused to receive the positions of each document.
+            let mut positions: Vec<u32> = Vec::with_capacity(100);
+            while segment_postings.advance() {
+
+                // The id of the current document in the posting list.
+                let doc_id: DocId = segment_postings.doc(); // Do not call this before `advance()` has been called once.
+
+                // The posting list MAY contain deleted documents as well.
+                if segment_reader.is_deleted(doc_id) {
+                    continue;
+                }
+
+                // The number of times the term appears in the document.
+                let term_freq: u32 = segment_postings.term_freq();
+                // Accessing positions is lazy and slightly expensive: do not request
+                // them for documents where you don't need them.
+                segment_postings.positions(&mut positions);
+
+                // By definition we should have `term_freq` positions.
+                assert_eq!(positions.len(), term_freq as usize);
+
+                // This prints:
+                // ```
+                // Doc 0: TermFreq 2: [0, 4]
+                // Doc 2: TermFreq 1: [0]
+                // ```
+                println!("Doc {}: TermFreq {}: {:?}", doc_id, term_freq, positions);
+            }
+        }
+    }
+
+
+    // A `Term` is a text token associated with a field.
+    // Let's go through all docs containing the term `title:the` again, this time block by block.
+    let term_the = Term::from_field_text(title, "the");
+
+    // Some other powerful operations (especially `.skip_to`) may be useful to consume these
+    // posting lists rapidly.
+    // You can check for them in the [`DocSet`](https://docs.rs/tantivy/~0/tantivy/trait.DocSet.html) trait
+    // and the [`Postings`](https://docs.rs/tantivy/~0/tantivy/trait.Postings.html) trait
+
+    // Also, for some VERY specific high-performance use cases, like an OLAP analysis of logs,
+    // you can get better performance by directly accessing the blocks of doc ids.
+    for segment_reader in searcher.segment_readers() {
+
+        // A segment contains different data structures.
+        // Inverted index stands for the combination of
+        // - the term dictionary
+        // - the inverted lists associated with each term and their positions
+        let inverted_index = segment_reader.inverted_index(title);
+
+        // This block segment postings object is like a cursor over blocks of doc ids
+        // matching the term.
+        // The `IndexRecordOption::Basic` argument tells tantivy we are only interested in
+        // doc ids, so term frequencies and positions are not requested.
+        if let Some(mut block_segment_postings) = inverted_index.read_block_postings(&term_the, IndexRecordOption::Basic) {
+            while block_segment_postings.advance() {
+                // Once again, these docs MAY contain deleted documents as well.
+                let docs = block_segment_postings.docs();
+                // Prints `Docs [0, 2].`
+                println!("Docs {:?}", docs);
+            }
+        }
+    }
+
+    Ok(())
+}
+
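Reviewer note: the example above walks a posting list and reads, for each document, the term frequency and the token positions. The following is a minimal, standard-library-only sketch of what such a positional posting list contains. It does not use tantivy; `build_index`, the tokenization rule (split on non-alphanumeric characters, then lowercase), and the three-title corpus are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Toy positional inverted index (hypothetical, not tantivy's layout):
/// term -> list of (doc_id, positions of the term in that doc).
pub fn build_index(docs: &[&str]) -> BTreeMap<String, Vec<(u32, Vec<u32>)>> {
    let mut index: BTreeMap<String, Vec<(u32, Vec<u32>)>> = BTreeMap::new();
    for (doc_id, text) in docs.iter().enumerate() {
        let doc_id = doc_id as u32;
        // Assumed tokenization: split on non-alphanumeric characters.
        let tokens = text
            .split(|c: char| !c.is_alphanumeric())
            .filter(|token| !token.is_empty());
        for (pos, token) in tokens.enumerate() {
            let postings = index.entry(token.to_lowercase()).or_default();
            // Start a new posting the first time this doc mentions the term.
            let needs_new = postings.last().map_or(true, |(d, _)| *d != doc_id);
            if needs_new {
                postings.push((doc_id, Vec::new()));
            }
            if let Some((_, positions)) = postings.last_mut() {
                positions.push(pos as u32);
            }
        }
    }
    index
}
```

In this sketch the term frequency is simply `positions.len()`, which is the invariant the example checks with `assert_eq!(positions.len(), term_freq as usize)`.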
--- a/examples/stop_words.rs
+++ b/examples/stop_words.rs
@@ -0,0 +1,129 @@
+// # Stop Words Example
+//
+// This example covers the basic usage of stop words
+// with tantivy
+//
+// We will:
+// - define our schema
+// - create an index in a directory
+// - add a few stop words
+// - index a few documents in our index
+
+extern crate tempdir;
+
+// ---
+// Importing tantivy...
+#[macro_use]
+extern crate tantivy;
+use tantivy::collector::TopCollector;
+use tantivy::query::QueryParser;
+use tantivy::schema::*;
+use tantivy::tokenizer::*;
+use tantivy::Index;
+
+fn main() -> tantivy::Result<()> {
+  // this example assumes you understand the content in `basic_search`
+  let index_path = TempDir::new("tantivy_stopwords_example_dir")?;
+  let mut schema_builder = SchemaBuilder::default();
+
+  // This configures your custom options for how tantivy will
+  // store and process your content in the index. The key point
+  // to note is that we are setting the tokenizer to `stoppy`,
+  // which will be defined and registered below.
+  let text_field_indexing = TextFieldIndexing::default()
+    .set_tokenizer("stoppy")
+    .set_index_option(IndexRecordOption::WithFreqsAndPositions);
+  let text_options = TextOptions::default()
+    .set_indexing_options(text_field_indexing)
+    .set_stored();
+
+  // Our first field is title.
+  schema_builder.add_text_field("title", text_options);
+
+  // Our second field is body.
+  let text_field_indexing = TextFieldIndexing::default()
+    .set_tokenizer("stoppy")
+    .set_index_option(IndexRecordOption::WithFreqsAndPositions);
+  let text_options = TextOptions::default()
+    .set_indexing_options(text_field_indexing)
+    .set_stored();
+  schema_builder.add_text_field("body", text_options);
+
+  let schema = schema_builder.build();
+
+  let index = Index::create_in_dir(&index_path, schema.clone())?;
+
+  // This tokenizer lowercases all of the text (to help with stop word matching)
+  // then removes all instances of `the` and `and` from the corpus
+  let tokenizer = SimpleTokenizer
+    .filter(LowerCaser)
+    .filter(StopWordFilter::remove(vec![
+      "the".to_string(),
+      "and".to_string(),
+    ]));
+
+  index.tokenizers().register("stoppy", tokenizer);
+
+  let mut index_writer = index.writer(50_000_000)?;
+
+  let title = schema.get_field("title").unwrap();
+  let body = schema.get_field("body").unwrap();
+
+  index_writer.add_document(doc!(
+    title => "The Old Man and the Sea",
+    body => "He was an old man who fished alone in a skiff in the Gulf Stream and \
+     he had gone eighty-four days now without taking a fish."
+  ));
+
+  index_writer.add_document(doc!(
+        title => "Of Mice and Men",
+        body => "A few miles south of Soledad, the Salinas River drops in close to the hillside \
+                bank and runs deep and green. The water is warm too, for it has slipped twinkling \
+                over the yellow sands in the sunlight before reaching the narrow pool. On one \
+                side of the river the golden foothill slopes curve up to the strong and rocky \
+                Gabilan Mountains, but on the valley side the water is lined with trees—willows \
+                fresh and green with every spring, carrying in their lower leaf junctures the \
+                debris of the winter’s flooding; and sycamores with mottled, white, recumbent \
+                limbs and branches that arch over the pool"
+    ));
+
+  index_writer.add_document(doc!(
+       title => "Frankenstein",
+       body => "You will rejoice to hear that no disaster has accompanied the commencement of an \
+                enterprise which you have regarded with such evil forebodings.  I arrived here \
+                yesterday, and my first task is to assure my dear sister of my welfare and \
+                increasing confidence in the success of my undertaking."
+    ));
+
+  index_writer.commit()?;
+
+  index.load_searchers()?;
+
+  let searcher = index.searcher();
+
+  let query_parser = QueryParser::for_index(&index, vec![title, body]);
+
+  // This query has NO hits: the query text is run through the same
+  // analyzer as the documents, so `the` is removed as a stop word.
+  // The query then becomes empty, and parsing it actually returns
+  // an error.
+  assert!(query_parser.parse_query("the").is_err());
+
+  // this will have hits
+  let query = query_parser.parse_query("is")?;
+
+  let mut top_collector = TopCollector::with_limit(10);
+
+  searcher.search(&*query, &mut top_collector)?;
+
+  let doc_addresses = top_collector.docs();
+
+  for doc_address in doc_addresses {
+    let retrieved_doc = searcher.doc(&doc_address)?;
+    println!("{}", schema.to_json(&retrieved_doc));
+  }
+
+  Ok(())
+}
+
+use tempdir::TempDir;
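Reviewer note: the `stoppy` pipeline above lowercases tokens and then drops stop words, which is why the query `the` ends up empty. Below is a rough standard-library-only sketch of that behaviour. `analyze` is a hypothetical helper, not a tantivy API, and the splitting rule (break on non-alphanumeric characters) is an assumption about how a simple tokenizer behaves.

```rust
/// Hypothetical stand-in for a "tokenize, lowercase, remove stop words" chain.
pub fn analyze(text: &str, stop_words: &[&str]) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        // Lowercase first, so that `The` and `the` are treated alike.
        .map(|token| token.to_lowercase())
        // Then drop every token that is in the stop word list.
        .filter(|token| !stop_words.contains(&token.as_str()))
        .collect()
}
```

Under these assumptions, analyzing `"The Old Man and the Sea"` with the stop words `the` and `and` leaves only `old`, `man`, `sea`, and analyzing the query `"the"` leaves no tokens at all.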
--- a/examples/working_with_json.rs
+++ b/examples/working_with_json.rs
@@ -0,0 +1,41 @@
+extern crate tantivy;
+use tantivy::schema::*;
+
+// # Document from json
+//
+// For convenience, `Document` can be parsed directly from json.
+fn main() -> tantivy::Result<()> {
+    // Let's first define a schema and an index.
+    // Check out the basic example if this is confusing to you.
+    //
+    // first we need to define a schema ...
+    let mut schema_builder = SchemaBuilder::default();
+    schema_builder.add_text_field("title", TEXT | STORED);
+    schema_builder.add_text_field("body", TEXT);
+    schema_builder.add_u64_field("year", INT_INDEXED);
+    let schema = schema_builder.build();
+
+    // Let's assume we have a json-serialized document.
+    let mice_and_men_doc_json = r#"{
+       "title": "Of Mice and Men",
+       "year": 1937
+    }"#;
+
+    // We can parse our document
+    let _mice_and_men_doc = schema.parse_document(&mice_and_men_doc_json)?;
+
+    // Multi-valued fields are allowed; they are
+    // expressed in JSON by an array.
+    // The following document has two titles.
+    let frankenstein_json = r#"{
+       "title": ["Frankenstein", "The Modern Prometheus"],
+       "year": 1818
+    }"#;
+    let _frankenstein_doc = schema.parse_document(&frankenstein_json)?;
+
+    // Note that the schema is saved in your index directory.
+    //
+    // As a result, indexes are aware of their schema, and you can use this feature
+    // just by opening an existing `Index`, and calling `index.schema().parse_document(json)`.
+    Ok(())
+}
--- a/src/collector/chained_collector.rs
+++ b/src/collector/chained_collector.rs
@@ -4,87 +4,111 @@ use Result;
 use Score;
 use SegmentLocalId;
 use SegmentReader;
-use collector::SegmentCollector;
-use collector::CollectorWrapper;

 /// Collector that does nothing.
 /// This is used in the chain Collector and will hopefully
 /// be optimized away by the compiler.
 pub struct DoNothingCollector;
 impl Collector for DoNothingCollector {
-    type Child = DoNothingCollector;
    #[inline]
-    fn for_segment(&mut self, _: SegmentLocalId, _: &SegmentReader) -> Result<DoNothingCollector> {
-        Ok(DoNothingCollector)
+    fn set_segment(&mut self, _: SegmentLocalId, _: &SegmentReader) -> Result<()> {
+        Ok(())
    }
    #[inline]
+    fn collect(&mut self, _doc: DocId, _score: Score) {}
+    #[inline]
    fn requires_scoring(&self) -> bool {
        false
    }
 }

-impl SegmentCollector for DoNothingCollector {
-    type CollectionResult = ();
-
-    #[inline]
-    fn collect(&mut self, _doc: DocId, _score: Score) {}
-
-    fn finalize(self) -> () {
-        ()
-    }
-}
-
 /// Zero-cost abstraction used to collect on multiple collectors.
 /// This contraption is only usable if the type of your collectors
 /// are known at compile time.
+///
+/// ```rust
+/// #[macro_use]
+/// extern crate tantivy;
+/// use tantivy::schema::{SchemaBuilder, TEXT};
+/// use tantivy::{Index, Result};
+/// use tantivy::collector::{CountCollector, TopCollector, chain};
+/// use tantivy::query::QueryParser;
+///
+/// # fn main() { example().unwrap(); }
+/// fn example() -> Result<()> {
+///     let mut schema_builder = SchemaBuilder::new();
+///     let title = schema_builder.add_text_field("title", TEXT);
+///     let schema = schema_builder.build();
+///     let index = Index::create_in_ram(schema);
+///     {
+///         let mut index_writer = index.writer(3_000_000)?;
+///         index_writer.add_document(doc!(
+///             title => "The Name of the Wind",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of Muadib",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "A Dairy Cow",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of a Young Girl",
+///         ));
+///         index_writer.commit().unwrap();
+///     }
+///
+///     index.load_searchers()?;
+///     let searcher = index.searcher();
+///
+///     {
+///         let mut top_collector = TopCollector::with_limit(2);
+///         let mut count_collector = CountCollector::default();
+///         {
+///             let mut collectors = chain().push(&mut top_collector).push(&mut count_collector);
+///             let query_parser = QueryParser::for_index(&index, vec![title]);
+///             let query = query_parser.parse_query("diary")?;
+///             searcher.search(&*query, &mut collectors).unwrap();
+///         }
+///         assert_eq!(count_collector.count(), 2);
+///         assert!(top_collector.at_capacity());
+///     }
+///
+///     Ok(())
+/// }
+/// ```
 pub struct ChainedCollector<Left: Collector, Right: Collector> {
    left: Left,
    right: Right,
 }

-pub struct ChainedSegmentCollector<Left: SegmentCollector, Right: SegmentCollector> {
-    left: Left,
-    right: Right,
-}
-
 impl<Left: Collector, Right: Collector> ChainedCollector<Left, Right> {
    /// Adds a collector
-    pub fn push<C: Collector>(self, new_collector: &mut C) -> ChainedCollector<Self, CollectorWrapper<C>> {
+    pub fn push<C: Collector>(self, new_collector: &mut C) -> ChainedCollector<Self, &mut C> {
        ChainedCollector {
            left: self,
-            right: CollectorWrapper::new(new_collector),
+            right: new_collector,
        }
    }
 }

 impl<Left: Collector, Right: Collector> Collector for ChainedCollector<Left, Right> {
-    type Child = ChainedSegmentCollector<Left::Child, Right::Child>;
-    fn for_segment(
+    fn set_segment(
        &mut self,
        segment_local_id: SegmentLocalId,
        segment: &SegmentReader,
-    ) -> Result<Self::Child> {
-        Ok(ChainedSegmentCollector {
-            left: self.left.for_segment(segment_local_id, segment)?,
-            right: self.right.for_segment(segment_local_id, segment)?,
-        })
+    ) -> Result<()> {
+        self.left.set_segment(segment_local_id, segment)?;
+        self.right.set_segment(segment_local_id, segment)?;
+        Ok(())
    }

-    fn requires_scoring(&self) -> bool {
-        self.left.requires_scoring() || self.right.requires_scoring()
-    }
-}
-
-impl<Left: SegmentCollector, Right: SegmentCollector> SegmentCollector for ChainedSegmentCollector<Left, Right> {
-    type CollectionResult = (Left::CollectionResult, Right::CollectionResult);
-
    fn collect(&mut self, doc: DocId, score: Score) {
        self.left.collect(doc, score);
        self.right.collect(doc, score);
    }

-    fn finalize(self) -> Self::CollectionResult {
-        (self.left.finalize(), self.right.finalize())
+    fn requires_scoring(&self) -> bool {
+        self.left.requires_scoring() || self.right.requires_scoring()
    }
 }

@@ -98,35 +122,19 @@ pub fn chain() -> ChainedCollector<DoNothingCollector, DoNothingCollector> {

 #[cfg(test)]
 mod tests {
+
    use super::*;
-    use collector::{CountCollector, SegmentCollector, TopCollector};
-    use schema::SchemaBuilder;
-    use Index;
-    use Document;
+    use collector::{Collector, CountCollector, TopCollector};

    #[test]
    fn test_chained_collector() {
-        let schema_builder = SchemaBuilder::new();
-        let schema = schema_builder.build();
-        let index = Index::create_in_ram(schema);
-
-        let mut index_writer = index.writer(3_000_000).unwrap();
-        let doc = Document::new();
-        index_writer.add_document(doc);
-        index_writer.commit().unwrap();
-        index.load_searchers().unwrap();
-        let searcher = index.searcher();
-        let segment_readers = searcher.segment_readers();
-
        let mut top_collector = TopCollector::with_limit(2);
        let mut count_collector = CountCollector::default();
        {
            let mut collectors = chain().push(&mut top_collector).push(&mut count_collector);
-            let mut segment_collector = collectors.for_segment(0, &segment_readers[0]).unwrap();
-            segment_collector.collect(1, 0.2);
-            segment_collector.collect(2, 0.1);
-            segment_collector.collect(3, 0.5);
-            collectors.merge_children(vec![segment_collector]);
+            collectors.collect(1, 0.2);
+            collectors.collect(2, 0.1);
+            collectors.collect(3, 0.5);
        }
        assert_eq!(count_collector.count(), 3);
        assert!(top_collector.at_capacity());
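Reviewer note: after this change `push` stores a plain `&mut C` on the right-hand side, which only type-checks if a mutable reference to a collector is itself a collector. The sketch below shows that pattern in isolation. It is a simplified stand-in, not the crate's code: the trait here has only `collect`, and `Count`, `BestDoc`, `DoNothing` and `Chained` are hypothetical names.

```rust
/// Simplified stand-in for the collector trait (assumption: no segments, no `Result`).
pub trait Collector {
    fn collect(&mut self, doc: u32, score: f32);
}

/// A mutable reference to a collector forwards to the collector it points to.
impl<'a, C: Collector> Collector for &'a mut C {
    fn collect(&mut self, doc: u32, score: f32) {
        (**self).collect(doc, score);
    }
}

pub struct DoNothing;
impl Collector for DoNothing {
    fn collect(&mut self, _: u32, _: f32) {}
}

#[derive(Default)]
pub struct Count {
    pub count: usize,
}
impl Collector for Count {
    fn collect(&mut self, _: u32, _: f32) {
        self.count += 1;
    }
}

/// Keeps the highest-scoring document seen so far.
#[derive(Default)]
pub struct BestDoc {
    pub best: Option<(u32, f32)>,
}
impl Collector for BestDoc {
    fn collect(&mut self, doc: u32, score: f32) {
        match self.best {
            Some((_, best_score)) if best_score >= score => {}
            _ => self.best = Some((doc, score)),
        }
    }
}

pub struct Chained<L, R> {
    left: L,
    right: R,
}

impl<L: Collector, R: Collector> Chained<L, R> {
    /// Nests the current chain on the left and borrows the new collector on the right.
    pub fn push<C: Collector>(self, new_collector: &mut C) -> Chained<Self, &mut C> {
        Chained { left: self, right: new_collector }
    }
}

impl<L: Collector, R: Collector> Collector for Chained<L, R> {
    fn collect(&mut self, doc: u32, score: f32) {
        self.left.collect(doc, score);
        self.right.collect(doc, score);
    }
}

pub fn chain() -> Chained<DoNothing, DoNothing> {
    Chained { left: DoNothing, right: DoNothing }
}
```

Because the chain only borrows its collectors, their results can be read directly once the chain goes out of scope, which is what the test above relies on.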
--- a/src/collector/count_collector.rs
+++ b/src/collector/count_collector.rs
@@ -4,11 +4,56 @@ use Result;
 use Score;
 use SegmentLocalId;
 use SegmentReader;
-use collector::SegmentCollector;
-use collector::Combinable;

 /// `CountCollector` collector only counts how many
 /// documents match the query.
+///
+/// ```rust
+/// #[macro_use]
+/// extern crate tantivy;
+/// use tantivy::schema::{SchemaBuilder, TEXT};
+/// use tantivy::{Index, Result};
+/// use tantivy::collector::CountCollector;
+/// use tantivy::query::QueryParser;
+///
+/// # fn main() { example().unwrap(); }
+/// fn example() -> Result<()> {
+///     let mut schema_builder = SchemaBuilder::new();
+///     let title = schema_builder.add_text_field("title", TEXT);
+///     let schema = schema_builder.build();
+///     let index = Index::create_in_ram(schema);
+///     {
+///         let mut index_writer = index.writer(3_000_000)?;
+///         index_writer.add_document(doc!(
+///             title => "The Name of the Wind",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of Muadib",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "A Dairy Cow",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of a Young Girl",
+///         ));
+///         index_writer.commit().unwrap();
+///     }
+///
+///     index.load_searchers()?;
+///     let searcher = index.searcher();
+///
+///     {
+///         let mut count_collector = CountCollector::default();
+///         let query_parser = QueryParser::for_index(&index, vec![title]);
+///         let query = query_parser.parse_query("diary")?;
+///         searcher.search(&*query, &mut count_collector).unwrap();
+///
+///         assert_eq!(count_collector.count(), 2);
+///     }
+///
+///     Ok(())
+/// }
+/// ```
 #[derive(Default)]
 pub struct CountCollector {
    count: usize,
@@ -23,10 +68,12 @@ impl CountCollector {
 }

 impl Collector for CountCollector {
-    type Child = CountCollector;
+    fn set_segment(&mut self, _: SegmentLocalId, _: &SegmentReader) -> Result<()> {
+        Ok(())
+    }

-    fn for_segment(&mut self, _: SegmentLocalId, _: &SegmentReader) -> Result<CountCollector> {
-        Ok(CountCollector::default())
+    fn collect(&mut self, _: DocId, _: Score) {
+        self.count += 1;
    }

    fn requires_scoring(&self) -> bool {
@@ -34,28 +81,10 @@ impl Collector for CountCollector {
    }
 }

-impl Combinable for CountCollector {
-    fn combine_into(&mut self, other: Self) {
-        self.count += other.count;
-    }
-}
-
-impl SegmentCollector for CountCollector {
-    type CollectionResult = CountCollector;
-
-    fn collect(&mut self, _: DocId, _: Score) {
-        self.count += 1;
-    }
-
-    fn finalize(self) -> CountCollector {
-        self
-    }
-}
-
 #[cfg(test)]
 mod tests {

-    use collector::{Collector, CountCollector, SegmentCollector};
+    use collector::{Collector, CountCollector};

    #[test]
    fn test_count_collector() {
--- a/src/collector/facet_collector.rs
+++ b/src/collector/facet_collector.rs
@@ -3,12 +3,14 @@ use docset::SkipResult;
 use fastfield::FacetReader;
 use schema::Facet;
 use schema::Field;
+use std::cell::UnsafeCell;
 use std::collections::btree_map;
 use std::collections::BTreeMap;
 use std::collections::BTreeSet;
 use std::collections::BinaryHeap;
 use std::collections::Bound;
 use std::iter::Peekable;
+use std::mem;
 use std::{u64, usize};
 use termdict::TermMerger;

@@ -18,7 +20,6 @@ use Result;
 use Score;
 use SegmentLocalId;
 use SegmentReader;
-use collector::SegmentCollector;

 struct Hit<'a> {
    count: u64,
@@ -193,22 +194,19 @@ fn facet_depth(facet_bytes: &[u8]) -> usize {
 /// }
 /// ```
 pub struct FacetCollector {
+    facet_ords: Vec<u64>,
    field: Field,
+    ff_reader: Option<UnsafeCell<FacetReader>>,
    segment_counters: Vec<SegmentFacetCounter>,
-    facets: BTreeSet<Facet>,
-}
-
-pub struct FacetSegmentCollector {
-    reader: FacetReader,
-
-    facet_ords_buf: Vec<u64>,

    // facet_ord -> collapse facet_id
-    collapse_mapping: Vec<usize>,
+    current_segment_collapse_mapping: Vec<usize>,
    // collapse facet_id -> count
-    counts: Vec<u64>,
+    current_segment_counts: Vec<u64>,
    // collapse facet_id -> facet_ord
-    collapse_facet_ords: Vec<u64>,
+    current_collapse_facet_ords: Vec<u64>,
+
+    facets: BTreeSet<Facet>,
 }

 fn skip<'a, I: Iterator<Item = &'a Facet>>(
@@ -242,9 +240,15 @@ impl FacetCollector {
    /// is of the proper type.
    pub fn for_field(field: Field) -> FacetCollector {
        FacetCollector {
+            facet_ords: Vec::with_capacity(255),
            segment_counters: Vec::new(),
            field,
+            ff_reader: None,
            facets: BTreeSet::new(),
+
+            current_segment_collapse_mapping: Vec::new(),
+            current_collapse_facet_ords: Vec::new(),
+            current_segment_counts: Vec::new(),
        }
    }

@@ -275,11 +279,69 @@ impl FacetCollector {
        self.facets.insert(facet);
    }

+    fn set_collapse_mapping(&mut self, facet_reader: &FacetReader) {
+        self.current_segment_collapse_mapping.clear();
+        self.current_collapse_facet_ords.clear();
+        self.current_segment_counts.clear();
+        let mut collapse_facet_it = self.facets.iter().peekable();
+        self.current_collapse_facet_ords.push(0);
+        let mut facet_streamer = facet_reader.facet_dict().range().into_stream();
+        if !facet_streamer.advance() {
+            return;
+        }
+        'outer: loop {
+            // at the beginning of this loop, facet_streamer
+            // is positioned on a term that has not been processed yet.
+            let skip_result = skip(facet_streamer.key(), &mut collapse_facet_it);
+            match skip_result {
+                SkipResult::Reached => {
+                    // we reach a facet we decided to collapse.
+                    let collapse_depth = facet_depth(facet_streamer.key());
+                    let mut collapsed_id = 0;
+                    self.current_segment_collapse_mapping.push(0);
+                    while facet_streamer.advance() {
+                        let depth = facet_depth(facet_streamer.key());
+                        if depth <= collapse_depth {
+                            continue 'outer;
+                        }
+                        if depth == collapse_depth + 1 {
+                            collapsed_id = self.current_collapse_facet_ords.len();
+                            self.current_collapse_facet_ords
+                                .push(facet_streamer.term_ord());
+                            self.current_segment_collapse_mapping.push(collapsed_id);
+                        } else {
+                            self.current_segment_collapse_mapping.push(collapsed_id);
+                        }
+                    }
+                    break;
+                }
+                SkipResult::End | SkipResult::OverStep => {
+                    self.current_segment_collapse_mapping.push(0);
+                    if !facet_streamer.advance() {
+                        break;
+                    }
+                }
+            }
+        }
+    }
+
+    fn finalize_segment(&mut self) {
+        if self.ff_reader.is_some() {
+            self.segment_counters.push(SegmentFacetCounter {
+                facet_reader: self.ff_reader.take().unwrap().into_inner(),
+                facet_ords: mem::replace(&mut self.current_collapse_facet_ords, Vec::new()),
+                facet_counts: mem::replace(&mut self.current_segment_counts, Vec::new()),
+            });
+        }
+    }
+
    /// Returns the results of the collection.
    ///
    /// This method does not just return the counters,
    /// it also translates the facet ordinals of the last segment.
-    pub fn harvest(self) -> FacetCounts {
+    pub fn harvest(mut self) -> FacetCounts {
+        self.finalize_segment();
+
        let collapsed_facet_ords: Vec<&[u64]> = self.segment_counters
            .iter()
            .map(|segment_counter| &segment_counter.facet_ords[..])
@@ -327,92 +389,30 @@ impl FacetCollector {
    }
 }

-impl FacetSegmentCollector {
-    fn into_segment_facet_counter(self) -> SegmentFacetCounter {
-        SegmentFacetCounter {
-            facet_reader: self.reader,
-            facet_ords: self.collapse_facet_ords,
-            facet_counts: self.counts,
-        }
-    }
-}
-
 impl Collector for FacetCollector {
-    type Child = FacetSegmentCollector;
-
-    fn for_segment(&mut self, _: SegmentLocalId, reader: &SegmentReader) -> Result<FacetSegmentCollector> {
+    fn set_segment(&mut self, _: SegmentLocalId, reader: &SegmentReader) -> Result<()> {
+        self.finalize_segment();
        let facet_reader = reader.facet_reader(self.field)?;
-
-        let mut collapse_mapping = Vec::new();
-        let mut counts = Vec::new();
-        let mut collapse_facet_ords = Vec::new();
-
-        let mut collapse_facet_it = self.facets.iter().peekable();
-        collapse_facet_ords.push(0);
-        {
-            let mut facet_streamer = facet_reader.facet_dict().range().into_stream();
-            if facet_streamer.advance() {
-                'outer: loop {
-                    // at the begining of this loop, facet_streamer
-                    // is positionned on a term that has not been processed yet.
-                    let skip_result = skip(facet_streamer.key(), &mut collapse_facet_it);
-                    match skip_result {
-                        SkipResult::Reached => {
-                            // we reach a facet we decided to collapse.
-                            let collapse_depth = facet_depth(facet_streamer.key());
-                            let mut collapsed_id = 0;
-                            collapse_mapping.push(0);
-                            while facet_streamer.advance() {
-                                let depth = facet_depth(facet_streamer.key());
-                                if depth <= collapse_depth {
-                                    continue 'outer;
-                                }
-                                if depth == collapse_depth + 1 {
-                                    collapsed_id = collapse_facet_ords.len();
-                                    collapse_facet_ords.push(facet_streamer.term_ord());
-                                    collapse_mapping.push(collapsed_id);
-                                } else {
-                                    collapse_mapping.push(collapsed_id);
-                                }
-                            }
-                            break;
-                        }
-                        SkipResult::End | SkipResult::OverStep => {
-                            collapse_mapping.push(0);
-                            if !facet_streamer.advance() {
-                                break;
-                            }
-                        }
-                    }
-                }
-            }
-        }
-
-        counts.resize(collapse_facet_ords.len(), 0);
-
-        Ok(FacetSegmentCollector {
-            reader: facet_reader,
-            facet_ords_buf: Vec::with_capacity(255),
-            collapse_mapping,
-            counts,
-            collapse_facet_ords,
-        })
+        self.set_collapse_mapping(&facet_reader);
+        self.current_segment_counts
+            .resize(self.current_collapse_facet_ords.len(), 0);
+        self.ff_reader = Some(UnsafeCell::new(facet_reader));
+        Ok(())
    }

-    fn requires_scoring(&self) -> bool {
-        false
-    }
-}
-
-impl SegmentCollector for FacetSegmentCollector {
-    type CollectionResult = Vec<SegmentFacetCounter>;
-
    fn collect(&mut self, doc: DocId, _: Score) {
-        self.reader.facet_ords(doc, &mut self.facet_ords_buf);
+        let facet_reader: &mut FacetReader = unsafe {
+            &mut *self.ff_reader
+                .as_ref()
+                .expect("collect() was called before set_segment. This should never happen.")
+                .get()
+        };
+        facet_reader.facet_ords(doc, &mut self.facet_ords);
        let mut previous_collapsed_ord: usize = usize::MAX;
-        for &facet_ord in &self.facet_ords_buf {
-            let collapsed_ord = self.collapse_mapping[facet_ord as usize];
-            self.counts[collapsed_ord] += if collapsed_ord == previous_collapsed_ord {
+        for &facet_ord in &self.facet_ords {
+            let collapsed_ord = self.current_segment_collapse_mapping[facet_ord as usize];
+            self.current_segment_counts[collapsed_ord] += if collapsed_ord == previous_collapsed_ord
+            {
                0
            } else {
                1
@@ -421,8 +421,8 @@ impl SegmentCollector for FacetSegmentCollector {
        }
    }

-    fn finalize(self) -> Vec<SegmentFacetCounter> {
-        vec![self.into_segment_facet_counter()]
+    fn requires_scoring(&self) -> bool {
+        false
    }
 }

@@ -470,17 +470,25 @@ impl FacetCounts {
        let mut heap = BinaryHeap::with_capacity(k);
        let mut it = self.get(facet);

+        // push the first k elements to first bring the heap
+        // to capacity
        for (facet, count) in (&mut it).take(k) {
            heap.push(Hit { count, facet });
        }

-        let mut lowest_count: u64 = heap.peek().map(|hit| hit.count).unwrap_or(u64::MIN);
+        let mut lowest_count: u64 = heap.peek().map(|hit| hit.count)
+            .unwrap_or(u64::MIN); //< the `unwrap_or` case may be triggered but the value
+                                  // is never used in that case.
+
        for (facet, count) in it {
            if count > lowest_count {
-                lowest_count = count;
                if let Some(mut head) = heap.peek_mut() {
                    *head = Hit { count, facet };
                }
+                // the heap gets reconstructed at this point
+                if let Some(head) = heap.peek() {
+                    lowest_count = head.count;
+                }
            }
        }
        heap.into_sorted_vec()
@@ -495,6 +503,7 @@ mod tests {
    use super::{FacetCollector, FacetCounts};
    use core::Index;
    use query::AllQuery;
+    use rand::distributions::Uniform;
    use rand::{thread_rng, Rng};
    use schema::Field;
    use schema::{Document, Facet, SchemaBuilder};
@@ -507,7 +516,7 @@ mod tests {
        let schema = schema_builder.build();
        let index = Index::create_in_ram(schema);

-        let mut index_writer = index.writer(3_000_000).unwrap();
+        let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
        let num_facets: usize = 3 * 4 * 5;
        let facets: Vec<Facet> = (0..num_facets)
            .map(|mut n| {
@@ -563,6 +572,31 @@ mod tests {
        facet_collector.add_facet(Facet::from("/country/europe"));
    }

+    #[test]
+    fn test_doc_unsorted_multifacet() {
+        let mut schema_builder = SchemaBuilder::new();
+        let facet_field = schema_builder.add_facet_field("facets");
+        let schema = schema_builder.build();
+        let index = Index::create_in_ram(schema);
+        let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
+        index_writer.add_document(doc!(
+            facet_field => Facet::from_text(&"/subjects/A/a"),
+            facet_field => Facet::from_text(&"/subjects/B/a"),
+            facet_field => Facet::from_text(&"/subjects/A/b"),
+            facet_field => Facet::from_text(&"/subjects/B/b"),
+        ));
+        index_writer.commit().unwrap();
+        index.load_searchers().unwrap();
+        let searcher = index.searcher();
+        assert_eq!(searcher.num_docs(), 1);
+        let mut facet_collector = FacetCollector::for_field(facet_field);
+        facet_collector.add_facet("/subjects");
+        searcher.search(&AllQuery, &mut facet_collector).unwrap();
+        let counts = facet_collector.harvest();
+        let facets: Vec<(&Facet, u64)> = counts.get("/subjects").collect();
+        assert_eq!(facets[0].1, 1);
+    }
+
    #[test]
    fn test_non_used_facet_collector() {
        let mut facet_collector = FacetCollector::for_field(Field(0));
@@ -577,17 +611,19 @@ mod tests {
        let schema = schema_builder.build();
        let index = Index::create_in_ram(schema);

+        let uniform = Uniform::new_inclusive(1, 100_000);
        let mut docs: Vec<Document> = vec![("a", 10), ("b", 100), ("c", 7), ("d", 12), ("e", 21)]
            .into_iter()
            .flat_map(|(c, count)| {
-                let facet = Facet::from(&format!("/facet_{}", c));
+                let facet = Facet::from(&format!("/facet/{}", c));
                let doc = doc!(facet_field => facet);
                iter::repeat(doc).take(count)
            })
+            .map(|mut doc| { doc.add_facet(facet_field, &format!("/facet/{}", thread_rng().sample(&uniform) )); doc})
            .collect();
        thread_rng().shuffle(&mut docs[..]);

-        let mut index_writer = index.writer(3_000_000).unwrap();
+        let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
        for doc in docs {
            index_writer.add_document(doc);
        }
@@ -597,18 +633,18 @@ mod tests {
        let searcher = index.searcher();

        let mut facet_collector = FacetCollector::for_field(facet_field);
-        facet_collector.add_facet("/");
+        facet_collector.add_facet("/facet");
        searcher.search(&AllQuery, &mut facet_collector).unwrap();

        let counts: FacetCounts = facet_collector.harvest();
        {
-            let facets: Vec<(&Facet, u64)> = counts.top_k("/", 3);
+            let facets: Vec<(&Facet, u64)> = counts.top_k("/facet", 3);
            assert_eq!(
                facets,
                vec![
-                    (&Facet::from("/facet_b"), 100),
-                    (&Facet::from("/facet_e"), 21),
-                    (&Facet::from("/facet_d"), 12),
+                    (&Facet::from("/facet/b"), 100),
+                    (&Facet::from("/facet/e"), 21),
+                    (&Facet::from("/facet/d"), 12),
                ]
            );
        }
@@ -644,7 +680,7 @@ mod bench {
        // 40425 docs
        thread_rng().shuffle(&mut docs[..]);

-        let mut index_writer = index.writer(3_000_000).unwrap();
+        let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
        for doc in docs {
            index_writer.add_document(doc);
        }
--- a/src/collector/mod.rs
+++ b/src/collector/mod.rs
@@ -7,15 +7,12 @@ use Result;
 use Score;
 use SegmentLocalId;
 use SegmentReader;
-use query::Query;
-use Searcher;
-use downcast;

 mod count_collector;
 pub use self::count_collector::CountCollector;

-//mod multi_collector;
-//pub use self::multi_collector::MultiCollector;
+mod multi_collector;
+pub use self::multi_collector::MultiCollector;

 mod top_collector;
 pub use self::top_collector::TopCollector;
@@ -24,7 +21,7 @@ mod facet_collector;
 pub use self::facet_collector::FacetCollector;

 mod chained_collector;
-pub use self::chained_collector::chain;
+pub use self::chained_collector::{chain, ChainedCollector};

 /// Collectors are in charge of collecting and retaining relevant
 /// information from the document found and scored by the query.
@@ -56,90 +53,31 @@ pub use self::chained_collector::chain;
 ///
 /// Segments are not guaranteed to be visited in any specific order.
 pub trait Collector {
-    type Child : SegmentCollector + 'static;
    /// `set_segment` is called before beginning to enumerate
    /// on this segment.
-    fn for_segment(
+    fn set_segment(
        &mut self,
        segment_local_id: SegmentLocalId,
        segment: &SegmentReader,
-    ) -> Result<Self::Child>;
-
-    /// Returns true iff the collector requires to compute scores for documents.
-    fn requires_scoring(&self) -> bool;
-
-    /// Search works as follows :
-    ///
-    /// First the weight object associated to the query is created.
-    ///
-    /// Then, the query loops over the segments and for each segment :
-    /// - setup the collector and informs it that the segment being processed has changed.
-    /// - creates a SegmentCollector for collecting documents associated to the segment
-    /// - creates a `Scorer` object associated for this segment
-    /// - iterate through the matched documents and push them to the segment collector.
-    /// - turn the segment collector into a Combinable segment result
-    ///
-    /// Combining all of the segment results gives a single Child::CollectionResult, which is returned.
-    ///
-    /// The result will be Ok(None) in case of having no segments.
-    fn search(&mut self, searcher: &Searcher, query: &Query) -> Result<Option<<Self::Child as SegmentCollector>::CollectionResult>> {
-        let scoring_enabled = self.requires_scoring();
-        let weight = query.weight(searcher, scoring_enabled)?;
-        let mut results = Vec::new();
-        for (segment_ord, segment_reader) in searcher.segment_readers().iter().enumerate() {
-            let mut child: Self::Child = self.for_segment(segment_ord as SegmentLocalId, segment_reader)?;
-            let mut scorer = weight.scorer(segment_reader)?;
-            scorer.collect(&mut child, segment_reader.delete_bitset());
-            results.push(child.finalize());
-        }
-        Ok(results.into_iter().fold1(|x,y| {
-            x.combine_into(y);
-            x
-        }))
-    }
-}
-
-pub trait Combinable {
-    fn combine_into(&mut self, other: Self);
-}
-
-impl Combinable for () {
-    fn combine_into(&mut self, other: Self) {
-        ()
-    }
-}
-
-impl<T> Combinable for Vec<T> {
-    fn combine_into(&mut self, other: Self) {
-        self.extend(other.into_iter());
-    }
-}
-
-impl<L: Combinable, R: Combinable> Combinable for (L, R) {
-    fn combine_into(&mut self, other: Self) {
-        self.0.combine_into(other.0);
-        self.1.combine_into(other.1);
-    }
-}
-
-pub trait SegmentCollector: downcast::Any + 'static {
-    type CollectionResult: Combinable + downcast::Any + 'static;
+    ) -> Result<()>;
    /// The query pushes the scored document to the collector via this method.
    fn collect(&mut self, doc: DocId, score: Score);

-    /// Turn into the final result
-    fn finalize(self) -> Self::CollectionResult;
+    /// Returns true iff the collector requires scores to be computed for documents.
+    fn requires_scoring(&self) -> bool;
 }

 impl<'a, C: Collector> Collector for &'a mut C {
-    type Child = C::Child;
-
-    fn for_segment(
-        &mut self, // TODO Ask Jason : why &mut self here!?
+    fn set_segment(
+        &mut self,
        segment_local_id: SegmentLocalId,
        segment: &SegmentReader,
-    ) -> Result<C::Child> {
-        (*self).for_segment(segment_local_id, segment)
+    ) -> Result<()> {
+        (*self).set_segment(segment_local_id, segment)
+    }
+    /// The query pushes the scored document to the collector via this method.
+    fn collect(&mut self, doc: DocId, score: Score) {
+        C::collect(self, doc, score)
    }

    fn requires_scoring(&self) -> bool {
@@ -147,61 +85,6 @@ impl<'a, C: Collector> Collector for &'a mut C {
    }
 }

-pub struct CollectorWrapper<'a, TCollector: 'a + Collector>(&'a mut TCollector);
-
-impl<'a, T: 'a + Collector> CollectorWrapper<'a, T> {
-    pub fn new(collector: &'a mut T) -> CollectorWrapper<'a, T> {
-        CollectorWrapper(collector)
-    }
-}
-
-impl<'a, T: 'a + Collector> Collector for CollectorWrapper<'a, T> {
-    type Child = T::Child;
-
-    fn for_segment(&mut self, segment_local_id: u32, segment: &SegmentReader) -> Result<T::Child> {
-        self.0.for_segment(segment_local_id, segment)
-    }
-
-    fn requires_scoring(&self) -> bool {
-        self.0.requires_scoring()
-    }
-}
-
-trait UntypedCollector {
-    fn for_segment(&mut self, segment_local_id: u32, segment: &SegmentReader) -> Result<Box<UntypedSegmentCollector>>;
-}
-
-
-impl<'a, TCollector:'a + Collector> UntypedCollector for CollectorWrapper<'a, TCollector> {
-    fn for_segment(&mut self, segment_local_id: u32, segment: &SegmentReader) -> Result<Box<UntypedSegmentCollector>> {
-        let segment_collector = self.0.for_segment(segment_local_id, segment)?;
-        Ok(Box::new(segment_collector))
-    }
-}
-
-trait UntypedSegmentCollector {
-    fn finalize(self) -> Box<UntypedCombinable>;
-}
-
-trait UntypedCombinable {
-    fn combine_into(&mut self, other: Box<UntypedCombinable>);
-}
-
-pub struct CombinableWrapper<'a, T: 'a + Combinable>(&'a mut T);
-
-impl<'a, T: 'a + Combinable> CombinableWrapper<'a, T> {
-    pub fn new(combinable: &'a mut T) -> CombinableWrapper<'a, T> {
-        CombinableWrapper(combinable)
-    }
-}
-
-impl<'a, T: 'a + Combinable> Combinable for CombinableWrapper<'a, T> {
-    fn combine_into(&mut self, other: Self) {
-        self.0.combine_into(*::downcast::Downcast::<T>::downcast(other).unwrap())
-    }
-}
-
-
 #[cfg(test)]
 pub mod tests {

@@ -219,13 +102,8 @@ pub mod tests {
    /// It is unusable in practise, as it does not store
    /// the segment ordinals
    pub struct TestCollector {
-        next_offset: DocId,
-        docs: Vec<DocId>,
-        scores: Vec<Score>,
-    }
-
-    pub struct TestSegmentCollector {
        offset: DocId,
+        segment_max_doc: DocId,
        docs: Vec<DocId>,
        scores: Vec<Score>,
    }
@@ -244,7 +122,8 @@ pub mod tests {
    impl Default for TestCollector {
        fn default() -> TestCollector {
            TestCollector {
-                next_offset: 0,
+                offset: 0,
+                segment_max_doc: 0,
                docs: Vec::new(),
                scores: Vec::new(),
            }
@@ -252,33 +131,19 @@ pub mod tests {
    }

    impl Collector for TestCollector {
-        type Child = TestSegmentCollector;
-
-        fn for_segment(&mut self, _: SegmentLocalId, reader: &SegmentReader) -> Result<TestSegmentCollector> {
-            let offset = self.next_offset;
-            self.next_offset += reader.max_doc();
-            Ok(TestSegmentCollector {
-                offset,
-                docs: Vec::new(),
-                scores: Vec::new(),
-            })
+        fn set_segment(&mut self, _: SegmentLocalId, reader: &SegmentReader) -> Result<()> {
+            self.offset += self.segment_max_doc;
+            self.segment_max_doc = reader.max_doc();
+            Ok(())
        }

-        fn requires_scoring(&self) -> bool {
-            true
-        }
-    }
-
-    impl SegmentCollector for TestSegmentCollector {
-        type CollectionResult = Vec<TestSegmentCollector>;
-
        fn collect(&mut self, doc: DocId, score: Score) {
            self.docs.push(doc + self.offset);
            self.scores.push(score);
        }

-        fn finalize(self) -> Vec<TestSegmentCollector> {
-            vec![self]
+        fn requires_scoring(&self) -> bool {
+            true
        }
    }

@@ -287,26 +152,17 @@ pub mod tests {
    ///
    /// This collector is mainly useful for tests.
    pub struct FastFieldTestCollector {
-        next_counter: usize,
-        field: Field,
-    }
-
-    #[derive(Default)]
-    pub struct FastFieldSegmentCollectorState {
-        counter: usize,
        vals: Vec<u64>,
-    }
-
-    pub struct FastFieldSegmentCollector {
-        state: FastFieldSegmentCollectorState,
-        reader: FastFieldReader<u64>,
+        field: Field,
+        ff_reader: Option<FastFieldReader<u64>>,
    }

    impl FastFieldTestCollector {
        pub fn for_field(field: Field) -> FastFieldTestCollector {
            FastFieldTestCollector {
-                next_counter: 0,
+                vals: Vec::new(),
                field,
+                ff_reader: None,
            }
        }

@@ -316,32 +172,17 @@ pub mod tests {
    }

    impl Collector for FastFieldTestCollector {
-        type Child = FastFieldSegmentCollector;
-
-        fn for_segment(&mut self, _: SegmentLocalId, reader: &SegmentReader) -> Result<FastFieldSegmentCollector> {
-            let counter = self.next_counter;
-            self.next_counter += 1;
-            Ok(FastFieldSegmentCollector {
-                state: FastFieldSegmentCollectorState::default(),
-                reader: reader.fast_field_reader(self.field)?,
-            })
+        fn set_segment(&mut self, _: SegmentLocalId, reader: &SegmentReader) -> Result<()> {
+            self.ff_reader = Some(reader.fast_field_reader(self.field)?);
+            Ok(())
        }

-        fn requires_scoring(&self) -> bool {
-            false
-        }
-    }
-
-    impl SegmentCollector for FastFieldSegmentCollector {
-        type CollectionResult = Vec<FastFieldSegmentCollectorState>;
-
        fn collect(&mut self, doc: DocId, _score: Score) {
-            let val = self.reader.get(doc);
+            let val = self.ff_reader.as_ref().unwrap().get(doc);
            self.vals.push(val);
        }
-
-        fn finalize(self) -> Vec<FastFieldSegmentCollectorState> {
-            vec![self.state]
+        fn requires_scoring(&self) -> bool {
+            false
        }
    }

@@ -352,11 +193,7 @@ pub mod tests {
    pub struct BytesFastFieldTestCollector {
        vals: Vec<u8>,
        field: Field,
-    }
-
-    pub struct BytesFastFieldSegmentCollector {
-        vals: Vec<u8>,
-        reader: BytesFastFieldReader,
+        ff_reader: Option<BytesFastFieldReader>,
    }

    impl BytesFastFieldTestCollector {
@@ -364,6 +201,7 @@ pub mod tests {
            BytesFastFieldTestCollector {
                vals: Vec::new(),
                field,
+                ff_reader: None,
            }
        }

@@ -373,32 +211,20 @@ pub mod tests {
    }

    impl Collector for BytesFastFieldTestCollector {
-        type Child = BytesFastFieldSegmentCollector;
+        fn set_segment(&mut self, _segment_local_id: u32, segment: &SegmentReader) -> Result<()> {
+            self.ff_reader = Some(segment.bytes_fast_field_reader(self.field)?);
+            Ok(())
+        }

-        fn for_segment(&mut self, _segment_local_id: u32, segment: &SegmentReader) -> Result<BytesFastFieldSegmentCollector> {
-            Ok(BytesFastFieldSegmentCollector {
-                vals: Vec::new(),
-                reader: segment.bytes_fast_field_reader(self.field)?,
-            })
+        fn collect(&mut self, doc: u32, _score: f32) {
+            let val = self.ff_reader.as_ref().unwrap().get_val(doc);
+            self.vals.extend(val);
        }

        fn requires_scoring(&self) -> bool {
            false
        }
    }
-
-    impl SegmentCollector for BytesFastFieldSegmentCollector {
-        type CollectionResult = Vec<Vec<u8>>;
-
-        fn collect(&mut self, doc: u32, _score: f32) {
-            let val = self.reader.get_val(doc);
-            self.vals.extend(val);
-        }
-
-        fn finalize(self) -> Vec<Vec<u8>> {
-            vec![self.vals]
-        }
-    }
 }

 #[cfg(all(test, feature = "unstable"))]
--- a/src/collector/multi_collector.rs
+++ b/src/collector/multi_collector.rs
@@ -1,122 +1,119 @@
 use super::Collector;
-use super::SegmentCollector;
 use DocId;
-use Score;
 use Result;
+use Score;
 use SegmentLocalId;
 use SegmentReader;
-use downcast::Downcast;

+/// `MultiCollector` makes it possible to collect on more than one collector.
+/// It should only be used when the types of the collectors are unknown
+/// at compile time.
+/// If the types of the collectors are known, prefer `ChainedCollector`.
+///
+/// ```rust
+/// #[macro_use]
+/// extern crate tantivy;
+/// use tantivy::schema::{SchemaBuilder, TEXT};
+/// use tantivy::{Index, Result};
+/// use tantivy::collector::{CountCollector, TopCollector, MultiCollector};
+/// use tantivy::query::QueryParser;
+///
+/// # fn main() { example().unwrap(); }
+/// fn example() -> Result<()> {
+///     let mut schema_builder = SchemaBuilder::new();
+///     let title = schema_builder.add_text_field("title", TEXT);
+///     let schema = schema_builder.build();
+///     let index = Index::create_in_ram(schema);
+///     {
+///         let mut index_writer = index.writer(3_000_000)?;
+///         index_writer.add_document(doc!(
+///             title => "The Name of the Wind",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of Muadib",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "A Dairy Cow",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of a Young Girl",
+///         ));
+///         index_writer.commit().unwrap();
+///     }
+///
+///     index.load_searchers()?;
+///     let searcher = index.searcher();
+///
+///     {
+///         let mut top_collector = TopCollector::with_limit(2);
+///         let mut count_collector = CountCollector::default();
+///         {
+///             let mut collectors =
+///                 MultiCollector::from(vec![&mut top_collector, &mut count_collector]);
+///             let query_parser = QueryParser::for_index(&index, vec![title]);
+///             let query = query_parser.parse_query("diary")?;
+///             searcher.search(&*query, &mut collectors).unwrap();
+///         }
+///         assert_eq!(count_collector.count(), 2);
+///         assert!(top_collector.at_capacity());
+///     }
+///
+///     Ok(())
+/// }
+/// ```
 pub struct MultiCollector<'a> {
-    collector_wrappers: Vec<Box<UntypedCollector + 'a>>
+    collectors: Vec<&'a mut Collector>,
 }

 impl<'a> MultiCollector<'a> {
-    pub fn new() -> MultiCollector<'a> {
-        MultiCollector {
-            collector_wrappers: Vec::new()
-        }
-    }
-
-    pub fn add_collector<TCollector: 'a + Collector>(&mut self, collector: &'a mut TCollector) {
-        let collector_wrapper = CollectorWrapper(collector);
-        self.collector_wrappers.push(Box::new(collector_wrapper));
+    /// Creates a `MultiCollector` from a list of collectors.
+    pub fn from(collectors: Vec<&'a mut Collector>) -> MultiCollector {
+        MultiCollector { collectors }
    }
 }

 impl<'a> Collector for MultiCollector<'a> {
-
-    type Child = MultiCollectorChild;
-
-    fn for_segment(&mut self, segment_local_id: SegmentLocalId, segment: &SegmentReader) -> Result<MultiCollectorChild> {
-        let children = self.collector_wrappers
-            .iter_mut()
-            .map(|collector_wrapper| {
-                collector_wrapper.for_segment(segment_local_id, segment)
-            })
-            .collect::<Result<Vec<_>>>()?;
-        Ok(MultiCollectorChild {
-            children
-        })
-    }
-
-    fn requires_scoring(&self) -> bool {
-        self.collector_wrappers
-            .iter()
-            .any(|c| c.requires_scoring())
-    }
-
-    fn merge_children(&mut self, children: Vec<MultiCollectorChild>) {
-        let mut per_collector_children: Vec<Vec<Box<SegmentCollector>>> =
-            (0..self.collector_wrappers.len())
-                .map(|_| Vec::with_capacity(children.len()))
-                .collect::<Vec<_>>();
-        for child in children {
-            for (idx, segment_collector) in child.children.into_iter().enumerate() {
-                per_collector_children[idx].push(segment_collector);
-            }
-        }
-        for (collector, children) in self.collector_wrappers.iter_mut().zip(per_collector_children) {
-            collector.merge_children_anys(children);
+    fn set_segment(
+        &mut self,
+        segment_local_id: SegmentLocalId,
+        segment: &SegmentReader,
+    ) -> Result<()> {
+        for collector in &mut self.collectors {
+            collector.set_segment(segment_local_id, segment)?;
        }
+        Ok(())
    }

-}
-
-pub struct MultiCollectorChild {
-    children: Vec<Box<SegmentCollector>>
-}
-
-impl SegmentCollector for MultiCollectorChild {
    fn collect(&mut self, doc: DocId, score: Score) {
-        for child in &mut self.children {
-            child.collect(doc, score);
+        for collector in &mut self.collectors {
+            collector.collect(doc, score);
        }
    }
+    fn requires_scoring(&self) -> bool {
+        self.collectors
+            .iter()
+            .any(|collector| collector.requires_scoring())
+    }
 }

-
 #[cfg(test)]
 mod tests {

    use super::*;
    use collector::{Collector, CountCollector, TopCollector};
-    use schema::{TEXT, SchemaBuilder};
-    use query::TermQuery;
-    use Index;
-    use Term;
-    use schema::IndexRecordOption;

    #[test]
    fn test_multi_collector() {
-        let mut schema_builder = SchemaBuilder::new();
-        let text = schema_builder.add_text_field("text", TEXT);
-        let schema = schema_builder.build();
-
-        let index = Index::create_in_ram(schema);
-        {
-            let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
-            index_writer.add_document(doc!(text=>"abc"));
-            index_writer.add_document(doc!(text=>"abc abc abc"));
-            index_writer.add_document(doc!(text=>"abc abc"));
-            index_writer.commit().unwrap();
-            index_writer.add_document(doc!(text=>""));
-            index_writer.add_document(doc!(text=>"abc abc abc abc"));
-            index_writer.add_document(doc!(text=>"abc"));
-            index_writer.commit().unwrap();
-        }
-        index.load_searchers().unwrap();
-        let searcher = index.searcher();
-        let term = Term::from_field_text(text, "abc");
-        let query = TermQuery::new(term, IndexRecordOption::Basic);
        let mut top_collector = TopCollector::with_limit(2);
        let mut count_collector = CountCollector::default();
        {
-            let mut collectors = MultiCollector::new();
-            collectors.add_collector(&mut top_collector);
-            collectors.add_collector(&mut count_collector);
-            collectors.search(&*searcher, &query).unwrap();
+            let mut collectors =
+                MultiCollector::from(vec![&mut top_collector, &mut count_collector]);
+            collectors.collect(1, 0.2);
+            collectors.collect(2, 0.1);
+            collectors.collect(3, 0.5);
        }
-        assert_eq!(count_collector.count(), 5);
+        assert_eq!(count_collector.count(), 3);
+        assert!(top_collector.at_capacity());
    }
 }
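Review note: the `set_segment` / `collect` design restored by this diff can be illustrated outside of tantivy. The following is a minimal sketch, assuming a simplified `Collector` trait without the `SegmentReader` argument or the `Result` return type; `BestDocCollector` and `from_collectors` are hypothetical names introduced only for this illustration and are not part of the crate.

```rust
type DocId = u32;
type Score = f32;
type SegmentLocalId = u32;

/// Simplified stand-in for the `Collector` trait in this diff.
/// Assumption: the real trait also receives a `SegmentReader` and returns a `Result`.
pub trait Collector {
    fn set_segment(&mut self, segment_local_id: SegmentLocalId);
    fn collect(&mut self, doc: DocId, score: Score);
    fn requires_scoring(&self) -> bool;
}

/// Counts every collected document.
#[derive(Default)]
pub struct CountCollector {
    count: usize,
}

impl CountCollector {
    pub fn count(&self) -> usize {
        self.count
    }
}

impl Collector for CountCollector {
    fn set_segment(&mut self, _: SegmentLocalId) {}
    fn collect(&mut self, _: DocId, _: Score) {
        self.count += 1;
    }
    fn requires_scoring(&self) -> bool {
        false
    }
}

/// Hypothetical collector that remembers the best-scoring document
/// together with the segment it was found in.
#[derive(Default)]
pub struct BestDocCollector {
    segment: SegmentLocalId,
    best: Option<(SegmentLocalId, DocId, Score)>,
}

impl BestDocCollector {
    pub fn best(&self) -> Option<(SegmentLocalId, DocId, Score)> {
        self.best
    }
}

impl Collector for BestDocCollector {
    fn set_segment(&mut self, segment_local_id: SegmentLocalId) {
        // Per-segment state lives in the collector itself.
        self.segment = segment_local_id;
    }
    fn collect(&mut self, doc: DocId, score: Score) {
        let is_better = self.best.map_or(true, |(_, _, best)| score > best);
        if is_better {
            self.best = Some((self.segment, doc, score));
        }
    }
    fn requires_scoring(&self) -> bool {
        true
    }
}

/// Fans every call out to a list of collectors behind trait objects.
pub struct MultiCollector<'a> {
    collectors: Vec<&'a mut dyn Collector>,
}

impl<'a> MultiCollector<'a> {
    pub fn from_collectors(collectors: Vec<&'a mut dyn Collector>) -> MultiCollector<'a> {
        MultiCollector { collectors }
    }
}

impl<'a> Collector for MultiCollector<'a> {
    fn set_segment(&mut self, segment_local_id: SegmentLocalId) {
        for collector in &mut self.collectors {
            collector.set_segment(segment_local_id);
        }
    }
    fn collect(&mut self, doc: DocId, score: Score) {
        for collector in &mut self.collectors {
            collector.collect(doc, score);
        }
    }
    fn requires_scoring(&self) -> bool {
        // Scores are needed as soon as one child needs them.
        self.collectors.iter().any(|c| c.requires_scoring())
    }
}
```

Because the children are only borrowed, their results can be read directly once the `MultiCollector` goes out of scope, which is the pattern the updated test relies on.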
--- a/src/collector/top_collector.rs
+++ b/src/collector/top_collector.rs
@@ -7,8 +7,6 @@ use Result;
 use Score;
 use SegmentLocalId;
 use SegmentReader;
-use collector::SegmentCollector;
-use collector::Combinable;

 // Rust heap is a max-heap and we need a min heap.
 #[derive(Clone, Copy)]
@@ -45,7 +43,61 @@ impl Eq for GlobalScoredDoc {}
 /// with the best scores.
 ///
 /// The implementation is based on a `BinaryHeap`.
-/// The theorical complexity is `O(n log K)`.
+/// The theoretical complexity for collecting the top `K` out of `n` documents
+/// is `O(n log K)`.
+///
+/// ```rust
+/// #[macro_use]
+/// extern crate tantivy;
+/// use tantivy::schema::{SchemaBuilder, TEXT};
+/// use tantivy::{Index, Result, DocId, Score};
+/// use tantivy::collector::TopCollector;
+/// use tantivy::query::QueryParser;
+///
+/// # fn main() { example().unwrap(); }
+/// fn example() -> Result<()> {
+///     let mut schema_builder = SchemaBuilder::new();
+///     let title = schema_builder.add_text_field("title", TEXT);
+///     let schema = schema_builder.build();
+///     let index = Index::create_in_ram(schema);
+///     {
+///         let mut index_writer = index.writer_with_num_threads(1, 3_000_000)?;
+///         index_writer.add_document(doc!(
+///             title => "The Name of the Wind",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of Muadib",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "A Dairy Cow",
+///         ));
+///         index_writer.add_document(doc!(
+///             title => "The Diary of a Young Girl",
+///         ));
+///         index_writer.commit().unwrap();
+///     }
+///
+///     index.load_searchers()?;
+///     let searcher = index.searcher();
+///
+///     {
+///         let mut top_collector = TopCollector::with_limit(2);
+///         let query_parser = QueryParser::for_index(&index, vec![title]);
+///         let query = query_parser.parse_query("diary")?;
+///         searcher.search(&*query, &mut top_collector).unwrap();
+///
+///         let score_docs: Vec<(Score, DocId)> = top_collector
+///           .score_docs()
+///           .into_iter()
+///           .map(|(score, doc_address)| (score, doc_address.doc()))
+///           .collect();
+///
+///         assert_eq!(score_docs, vec![(0.7261542, 1), (0.6099695, 3)]);
+///     }
+///
+///     Ok(())
+/// }
+/// ```
 pub struct TopCollector {
    limit: usize,
    heap: BinaryHeap<GlobalScoredDoc>,
@@ -101,34 +153,11 @@ impl TopCollector {
 }

 impl Collector for TopCollector {
-    type Child = TopCollector;
-
-    fn for_segment(&mut self, segment_id: SegmentLocalId, _: &SegmentReader) -> Result<TopCollector> {
-        Ok(TopCollector {
-            limit: self.limit,
-            heap: BinaryHeap::new(),
-            segment_id,
-        })
+    fn set_segment(&mut self, segment_id: SegmentLocalId, _: &SegmentReader) -> Result<()> {
+        self.segment_id = segment_id;
+        Ok(())
    }

-    fn requires_scoring(&self) -> bool {
-        true
-    }
-}
-
-impl Combinable for TopCollector {
-    // TODO: I think this could be a bit better
-    fn combine_into(&mut self, other: Self) {
-        self.segment_id = other.segment_id;
-        while let Some(doc) = other.heap.pop() {
-            self.collect(doc.doc_address.doc(), doc.score);
-        }
-    }
-}
-
-impl SegmentCollector for TopCollector {
-    type CollectionResult = TopCollector;
-
    fn collect(&mut self, doc: DocId, score: Score) {
        if self.at_capacity() {
            // It's ok to unwrap as long as a limit of 0 is forbidden.
@@ -151,8 +180,8 @@ impl SegmentCollector for TopCollector {
        }
    }

-    fn finalize(self) -> TopCollector {
-        self
+    fn requires_scoring(&self) -> bool {
+        true
    }
 }

@@ -160,6 +189,7 @@ impl SegmentCollector for TopCollector {
 mod tests {

    use super::*;
+    use collector::Collector;
    use DocId;
    use Score;

@@ -210,4 +240,5 @@ mod tests {
    fn test_top_0() {
        TopCollector::with_limit(0);
    }
+
 }
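Review note: the `O(n log K)` bound quoted in the doc comment comes from keeping at most `K` entries in a heap whose top is the current lowest score. The following is a minimal sketch of that technique, assuming plain `(doc, score)` pairs; `ScoredDoc` and `top_k` are hypothetical names for illustration and do not reproduce `TopCollector`'s actual internals (which also track segment ids).

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical heap entry. The ordering is reversed on the score so that
/// `BinaryHeap`, which is a max-heap, keeps the lowest score on top.
#[derive(Clone, Copy, Debug)]
struct ScoredDoc {
    score: f32,
    doc: u32,
}

impl Ord for ScoredDoc {
    fn cmp(&self, other: &Self) -> Ordering {
        other
            .score
            .total_cmp(&self.score)
            .then_with(|| self.doc.cmp(&other.doc))
    }
}

impl PartialOrd for ScoredDoc {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for ScoredDoc {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for ScoredDoc {}

/// Returns the `k` best `(score, doc)` pairs, best score first.
/// Each document costs at most one O(log k) heap update.
pub fn top_k<I>(scored_docs: I, k: usize) -> Vec<(f32, u32)>
where
    I: IntoIterator<Item = (u32, f32)>,
{
    assert!(k > 0, "a limit of 0 is forbidden");
    let mut heap: BinaryHeap<ScoredDoc> = BinaryHeap::with_capacity(k);
    for (doc, score) in scored_docs {
        if heap.len() < k {
            heap.push(ScoredDoc { score, doc });
        } else if let Some(mut lowest) = heap.peek_mut() {
            // Replace the current lowest entry only if the new score beats it.
            if score > lowest.score {
                *lowest = ScoredDoc { score, doc };
            }
        }
    }
    // Ascending in the reversed order means descending by score.
    heap.into_sorted_vec()
        .into_iter()
        .map(|entry| (entry.score, entry.doc))
        .collect()
}
```

Calling `top_k(vec![(0, 0.1), (1, 0.7), (2, 0.3), (3, 0.6), (4, 0.2)], 2)` keeps documents 1 and 3, in that order.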
--- a/src/common/bitpacker.rs
+++ b/src/common/bitpacker.rs
@@ -46,7 +46,7 @@ impl BitPacker {
    pub fn flush<TWrite: Write>(&mut self, output: &mut TWrite) -> io::Result<()> {
        if self.mini_buffer_written > 0 {
            let num_bytes = (self.mini_buffer_written + 7) / 8;
-            let arr: [u8; 8] = unsafe { mem::transmute::<u64, [u8; 8]>(self.mini_buffer) };
+            let arr: [u8; 8] = unsafe { mem::transmute::<u64, [u8; 8]>(self.mini_buffer.to_le()) };
            output.write_all(&arr[..num_bytes])?;
            self.mini_buffer_written = 0;
        }
@@ -98,31 +98,14 @@ where
        let addr_in_bits = idx * num_bits;
        let addr = addr_in_bits >> 3;
        let bit_shift = addr_in_bits & 7;
-        if cfg!(feature = "simdcompression") {
-            // for simdcompression,
-            // the bitpacker is only used for fastfields,
-            // and we expect them to be always padded.
-            debug_assert!(
-                addr + 8 <= data.len(),
-                "The fast field field should have been padded with 7 bytes."
-            );
-            let val_unshifted_unmasked: u64 =
-                unsafe { ptr::read_unaligned(data[addr..].as_ptr() as *const u64) };
-            let val_shifted = (val_unshifted_unmasked >> bit_shift) as u64;
-            val_shifted & mask
-        } else {
-            let val_unshifted_unmasked: u64 = if addr + 8 <= data.len() {
-                unsafe { ptr::read_unaligned(data[addr..].as_ptr() as *const u64) }
-            } else {
-                let mut buffer = [0u8; 8];
-                for i in addr..data.len() {
-                    buffer[i - addr] += data[i];
-                }
-                unsafe { ptr::read_unaligned(buffer[..].as_ptr() as *const u64) }
-            };
-            let val_shifted = val_unshifted_unmasked >> (bit_shift as u64);
-            val_shifted & mask
-        }
+        debug_assert!(
+            addr + 8 <= data.len(),
+            "The fast field should have been padded with 7 bytes."
+        );
+        let val_unshifted_unmasked: u64 =
+            u64::from_le(unsafe { ptr::read_unaligned(data[addr..].as_ptr() as *const u64) });
+        let val_shifted = (val_unshifted_unmasked >> bit_shift) as u64;
+        val_shifted & mask
    }

    /// Reads a range of values from the fast field.
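Review note: the `to_le()` / `from_le()` calls added above pin the on-disk layout to little-endian regardless of the host. The same read can be written without `unsafe`. The following is a minimal sketch, assuming values of 1 to 56 bits and a buffer padded with 7 trailing bytes as the `debug_assert!` expects; `pack` and `get_bitpacked` are hypothetical helpers and not tantivy's `BitPacker` / `BitUnpacker` API.

```rust
use std::convert::TryInto;

/// Packs `values` using `num_bits` bits each, little-endian, and pads the
/// output with 7 bytes so that any value can be read back as a full `u64`.
pub fn pack(values: &[u64], num_bits: usize) -> Vec<u8> {
    assert!(num_bits >= 1 && num_bits <= 56, "sketch supports 1..=56 bits");
    let mask = (1u64 << num_bits) - 1;
    let total_bits = values.len() * num_bits;
    let mut data = vec![0u8; (total_bits + 7) / 8 + 7];
    for (idx, &val) in values.iter().enumerate() {
        let addr_in_bits = idx * num_bits;
        let addr = addr_in_bits >> 3;
        let bit_shift = addr_in_bits & 7;
        let window: [u8; 8] = data[addr..addr + 8].try_into().unwrap();
        let word = u64::from_le_bytes(window) | ((val & mask) << bit_shift);
        data[addr..addr + 8].copy_from_slice(&word.to_le_bytes());
    }
    data
}

/// Reads the `idx`-th value back. `from_le_bytes` makes the result
/// independent of the host byte order.
pub fn get_bitpacked(data: &[u8], num_bits: usize, idx: usize) -> u64 {
    assert!(num_bits >= 1 && num_bits <= 56, "sketch supports 1..=56 bits");
    let mask = (1u64 << num_bits) - 1;
    let addr_in_bits = idx * num_bits;
    let addr = addr_in_bits >> 3;
    let bit_shift = addr_in_bits & 7;
    let window: [u8; 8] = data[addr..addr + 8]
        .try_into()
        .expect("the data should have been padded with 7 bytes");
    (u64::from_le_bytes(window) >> bit_shift) & mask
}
```

The 56-bit limit is a simplification of this sketch: with a bit shift of up to 7, a single 8-byte window is only guaranteed to contain values of at most 57 bits.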
--- a/src/common/bitset.rs
+++ b/src/common/bitset.rs
@@ -342,7 +342,7 @@ mod tests {
    #[test]
    fn test_bitset_clear() {
        let mut bitset = BitSet::with_max_value(1_000);
-        let els = tests::sample(1_000, 0.01f32);
+        let els = tests::sample(1_000, 0.01f64);
        for &el in &els {
            bitset.insert(el);
        }
--- a/src/common/composite_file.rs
+++ b/src/common/composite_file.rs
@@ -64,7 +64,7 @@ impl<W: Write> CompositeWrite<W> {
        &mut self.write
    }

-    /// Close the composite file.
+    /// Close the composite file
    ///
    /// An index of the different field offsets
    /// will be written as a footer.
@@ -112,7 +112,6 @@ impl CompositeFile {
        let end = data.len();
        let footer_len_data = data.slice_from(end - 4);
        let footer_len = u32::deserialize(&mut footer_len_data.as_slice())? as usize;
-
        let footer_start = end - 4 - footer_len;
        let footer_data = data.slice(footer_start, footer_start + footer_len);
        let mut footer_buffer = footer_data.as_slice();
--- a/src/common/vint.rs
+++ b/src/common/vint.rs
@@ -7,7 +7,11 @@ use std::io::Write;
 #[derive(Debug, Eq, PartialEq)]
 pub struct VInt(pub u64);

+const STOP_BIT: u8 = 128;
+
 impl VInt {
+
+
    pub fn val(&self) -> u64 {
        self.0
    }
@@ -15,24 +19,35 @@ impl VInt {
    pub fn deserialize_u64<R: Read>(reader: &mut R) -> io::Result<u64> {
        VInt::deserialize(reader).map(|vint| vint.0)
    }
+
+    pub fn serialize_into_vec(&self, output: &mut Vec<u8>) {
+        let mut buffer = [0u8; 10];
+        let num_bytes = self.serialize_into(&mut buffer);
+        output.extend(&buffer[0..num_bytes]);
+    }
+
+    fn serialize_into(&self, buffer: &mut [u8; 10]) -> usize {
+
+        let mut remaining = self.0;
+        for (i, b) in buffer.iter_mut().enumerate() {
+            let next_byte: u8 = (remaining % 128u64) as u8;
+            remaining /= 128u64;
+            if remaining == 0u64 {
+                *b = next_byte | STOP_BIT;
+                return i + 1;
+            } else {
+                *b = next_byte;
+            }
+        }
+        unreachable!();
+    }
 }

 impl BinarySerializable for VInt {
    fn serialize<W: Write>(&self, writer: &mut W) -> io::Result<()> {
-        let mut remaining = self.0;
        let mut buffer = [0u8; 10];
-        let mut i = 0;
-        loop {
-            let next_byte: u8 = (remaining % 128u64) as u8;
-            remaining /= 128u64;
-            if remaining == 0u64 {
-                buffer[i] = next_byte | 128u8;
-                return writer.write_all(&buffer[0..i + 1]);
-            } else {
-                buffer[i] = next_byte;
-            }
-            i += 1;
-        }
+        let num_bytes = self.serialize_into(&mut buffer);
+        writer.write_all(&buffer[0..num_bytes])
    }

    fn deserialize<R: Read>(reader: &mut R) -> io::Result<Self> {
@@ -42,20 +57,59 @@ impl BinarySerializable for VInt {
        loop {
            match bytes.next() {
                Some(Ok(b)) => {
-                    result += u64::from(b % 128u8) << shift;
-                    if b & 128u8 != 0u8 {
-                        break;
+                    result |= u64::from(b % 128u8) << shift;
+                    if b >= STOP_BIT {
+                        return Ok(VInt(result));
                    }
                    shift += 7;
                }
                _ => {
                    return Err(io::Error::new(
                        io::ErrorKind::InvalidData,
-                        "Reach end of buffer",
+                        "Reach end of buffer while reading VInt",
                    ))
                }
            }
        }
-        Ok(VInt(result))
    }
 }
+
+
+#[cfg(test)]
+mod tests {
+
+    use super::VInt;
+    use common::BinarySerializable;
+
+    fn aux_test_vint(val: u64) {
+        let mut v = [14u8; 10];
+        let num_bytes = VInt(val).serialize_into(&mut v);
+        for i in num_bytes..10 {
+            assert_eq!(v[i], 14u8);
+        }
+        assert!(num_bytes > 0);
+        if num_bytes < 10 {
+            assert!(1u64 << (7*num_bytes) > val);
+        }
+        if num_bytes > 1 {
+            assert!(1u64 << (7*(num_bytes-1)) <= val);
+        }
+        let serdeser_val = VInt::deserialize(&mut &v[..]).unwrap();
+        assert_eq!(val, serdeser_val.0);
+    }
+
+    #[test]
+    fn test_vint() {
+        aux_test_vint(0);
+        aux_test_vint(1);
+        aux_test_vint(5);
+        aux_test_vint(u64::max_value());
+        for i in 1..9 {
+            let power_of_128 = 1u64 << (7*i);
+            aux_test_vint(power_of_128 - 1u64);
+            aux_test_vint(power_of_128);
+            aux_test_vint(power_of_128 + 1u64);
+        }
+        aux_test_vint(10);
+    }
+}
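Review note on the VInt hunk above: the byte layout that `serialize_into` and `deserialize` agree on can be sketched as standalone Rust. This is a minimal sketch under stated assumptions, not tantivy's implementation: the free functions `vint_serialize_into` and `vint_deserialize` are hypothetical names, and `STOP_BIT = 128` is inferred from the removed `| 128u8` and `b & 128u8` lines rather than shown in this diff.

```rust
// Assumed value: inferred from the removed `| 128u8` line, not shown in the diff.
const STOP_BIT: u8 = 128;

/// Hypothetical helper: writes `val` in groups of 7 bits, least-significant
/// group first. The last byte carries STOP_BIT. Returns the number of bytes used.
fn vint_serialize_into(mut val: u64, buffer: &mut [u8; 10]) -> usize {
    for (i, slot) in buffer.iter_mut().enumerate() {
        let next_byte = (val % 128) as u8;
        val /= 128;
        if val == 0 {
            // No payload left: mark this byte as the final one.
            *slot = next_byte | STOP_BIT;
            return i + 1;
        }
        *slot = next_byte;
    }
    // 10 groups of 7 bits cover 70 bits, so a u64 always terminates above.
    unreachable!("a u64 always fits in 10 bytes")
}

/// Hypothetical helper: reads bytes until one has STOP_BIT set.
/// Returns `None` if the input ends first or the value overflows a u64 shift.
fn vint_deserialize(bytes: &[u8]) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    for &b in bytes {
        // `checked_shl` guards against malformed input longer than 10 bytes.
        result |= u64::from(b % 128).checked_shl(shift)?;
        if b >= STOP_BIT {
            return Some(result);
        }
        shift += 7;
    }
    None
}
```

With this layout, 300 (2 * 128 + 44) takes two bytes: 44 without the stop bit, then 2 with it (130). Bytes past the returned length are left untouched, which is what the `14u8` sentinel in `aux_test_vint` checks.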
--- a/src/compression/stream.rs
+++ b/src/compression/stream.rs
@@ -1,159 +0,0 @@
-use compression::compressed_block_size;
-use compression::BlockDecoder;
-use compression::COMPRESSION_BLOCK_SIZE;
-use directory::{ReadOnlySource, SourceRead};
-
-/// Reads a stream of compressed ints.
-///
-/// Tantivy uses `CompressedIntStream` to read
-/// the position file.
-/// The `.skip(...)` makes it possible to avoid
-/// decompressing blocks that are not required.
-pub struct CompressedIntStream {
-    buffer: SourceRead,
-
-    block_decoder: BlockDecoder,
-    cached_addr: usize,      // address of the currently decoded block
-    cached_next_addr: usize, // address following the currently decoded block
-
-    addr: usize, // address of the block associated to the current position
-    inner_offset: usize,
-}
-
-impl CompressedIntStream {
-    /// Opens a compressed int stream.
-    pub(crate) fn wrap(source: ReadOnlySource) -> CompressedIntStream {
-        CompressedIntStream {
-            buffer: SourceRead::from(source),
-            block_decoder: BlockDecoder::new(),
-            cached_addr: usize::max_value(),
-            cached_next_addr: usize::max_value(),
-
-            addr: 0,
-            inner_offset: 0,
-        }
-    }
-
-    /// Loads the block at the given address and return the address of the
-    /// following block
-    pub fn read_block(&mut self, addr: usize) -> usize {
-        if self.cached_addr == addr {
-            // we are already on this block.
-            // no need to read.
-            self.cached_next_addr
-        } else {
-            let next_addr = addr + self.block_decoder
-                .uncompress_block_unsorted(self.buffer.slice_from(addr));
-            self.cached_addr = addr;
-            self.cached_next_addr = next_addr;
-            next_addr
-        }
-    }
-
-    /// Fills a buffer with the next `output.len()` integers.
-    /// This does not consume / advance the stream.
-    pub fn read(&mut self, output: &mut [u32]) {
-        let mut cursor = self.addr;
-        let mut inner_offset = self.inner_offset;
-        let mut num_els: usize = output.len();
-        let mut start = 0;
-        loop {
-            cursor = self.read_block(cursor);
-            let block = &self.block_decoder.output_array()[inner_offset..];
-            let block_len = block.len();
-            if num_els >= block_len {
-                output[start..start + block_len].clone_from_slice(&block);
-                start += block_len;
-                num_els -= block_len;
-                inner_offset = 0;
-            } else {
-                output[start..].clone_from_slice(&block[..num_els]);
-                break;
-            }
-        }
-    }
-
-    /// Skip the next `skip_len` integer.
-    ///
-    /// If a full block is skipped, calling
-    /// `.skip(...)` will avoid decompressing it.
-    ///
-    /// May panic if the end of the stream is reached.
-    pub fn skip(&mut self, mut skip_len: usize) {
-        loop {
-            let available = COMPRESSION_BLOCK_SIZE - self.inner_offset;
-            if available >= skip_len {
-                self.inner_offset += skip_len;
-                break;
-            } else {
-                skip_len -= available;
-                // entirely skip decompressing some blocks.
-                let num_bits: u8 = self.buffer.get(self.addr);
-                let block_len = compressed_block_size(num_bits);
-                self.addr += block_len;
-                self.inner_offset = 0;
-            }
-        }
-    }
-}
-
-#[cfg(test)]
-pub mod tests {
-
-    use super::CompressedIntStream;
-    use compression::compressed_block_size;
-    use compression::BlockEncoder;
-    use compression::COMPRESSION_BLOCK_SIZE;
-    use directory::ReadOnlySource;
-
-    fn create_stream_buffer() -> ReadOnlySource {
-        let mut buffer: Vec<u8> = vec![];
-        let mut encoder = BlockEncoder::new();
-        let vals: Vec<u32> = (0u32..1152u32).collect();
-        for chunk in vals.chunks(COMPRESSION_BLOCK_SIZE) {
-            let compressed_block = encoder.compress_block_unsorted(chunk);
-            let num_bits = compressed_block[0];
-            assert_eq!(compressed_block_size(num_bits), compressed_block.len());
-            buffer.extend_from_slice(compressed_block);
-        }
-        if cfg!(simd) {
-            buffer.extend_from_slice(&[0u8; 7]);
-        }
-        ReadOnlySource::from(buffer)
-    }
-
-    #[test]
-    fn test_compressed_int_stream() {
-        let buffer = create_stream_buffer();
-        let mut stream = CompressedIntStream::wrap(buffer);
-        let mut block: [u32; COMPRESSION_BLOCK_SIZE] = [0u32; COMPRESSION_BLOCK_SIZE];
-
-        stream.read(&mut block[0..2]);
-        assert_eq!(block[0], 0);
-        assert_eq!(block[1], 1);
-
-        // reading does not consume the stream
-        stream.read(&mut block[0..2]);
-        assert_eq!(block[0], 0);
-        assert_eq!(block[1], 1);
-        stream.skip(2);
-
-        stream.skip(5);
-        stream.read(&mut block[0..3]);
-        stream.skip(3);
-
-        assert_eq!(block[0], 7);
-        assert_eq!(block[1], 8);
-        assert_eq!(block[2], 9);
-        stream.skip(500);
-        stream.read(&mut block[0..3]);
-        stream.skip(3);
-
-        assert_eq!(block[0], 510);
-        assert_eq!(block[1], 511);
-        assert_eq!(block[2], 512);
-        stream.skip(511);
-        stream.read(&mut block[..1]);
-        assert_eq!(block[0], 1024);
-    }
-}
--- a/src/core/index.rs
+++ b/src/core/index.rs
@@ -1,9 +1,10 @@
 use core::SegmentId;
-use error::{ErrorKind, ResultExt};
+use error::TantivyError;
 use schema::Schema;
 use serde_json;
 use std::borrow::BorrowMut;
 use std::fmt;
+use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::Arc;
 use Result;

@@ -16,11 +17,12 @@ use core::IndexMeta;
 use core::SegmentMeta;
 use core::SegmentReader;
 use core::META_FILEPATH;
-use directory::ManagedDirectory;
 #[cfg(feature = "mmap")]
 use directory::MmapDirectory;
 use directory::{Directory, RAMDirectory};
+use directory::{DirectoryClone, ManagedDirectory};
 use indexer::index_writer::open_index_writer;
+use indexer::index_writer::HEAP_SIZE_MIN;
 use indexer::segment_updater::save_new_metas;
 use indexer::DirectoryLock;
 use num_cpus;
@@ -28,18 +30,18 @@ use std::path::Path;
 use tokenizer::TokenizerManager;
 use IndexWriter;

-const NUM_SEARCHERS: usize = 12;
-
 fn load_metas(directory: &Directory) -> Result<IndexMeta> {
    let meta_data = directory.atomic_read(&META_FILEPATH)?;
    let meta_string = String::from_utf8_lossy(&meta_data);
-    serde_json::from_str(&meta_string).chain_err(|| ErrorKind::CorruptedFile(META_FILEPATH.clone()))
+    serde_json::from_str(&meta_string)
+        .map_err(|_| TantivyError::CorruptedFile(META_FILEPATH.clone()))
 }

 /// Search Index
 pub struct Index {
    directory: ManagedDirectory,
    schema: Schema,
+    num_searchers: Arc<AtomicUsize>,
    searcher_pool: Arc<Pool<Searcher>>,
    tokenizers: TokenizerManager,
 }
@@ -51,12 +53,7 @@ impl Index {
    /// This should only be used for unit tests.
    pub fn create_in_ram(schema: Schema) -> Index {
        let ram_directory = RAMDirectory::create();
-        // unwrap is ok here
-        let directory = ManagedDirectory::new(ram_directory).expect(
-            "Creating a managed directory from a brand new RAM directory \
-             should never fail.",
-        );
-        Index::from_directory(directory, schema).expect("Creating a RAMDirectory should never fail")
+        Index::create(ram_directory, schema).expect("Creating a RAMDirectory should never fail")
    }

    /// Creates a new index in a given filepath.
@@ -64,15 +61,9 @@ impl Index {
    ///
    /// If a previous index was in this directory, then its meta file will be destroyed.
    #[cfg(feature = "mmap")]
-    pub fn create<P: AsRef<Path>>(directory_path: P, schema: Schema) -> Result<Index> {
+    pub fn create_in_dir<P: AsRef<Path>>(directory_path: P, schema: Schema) -> Result<Index> {
        let mmap_directory = MmapDirectory::open(directory_path)?;
-        let directory = ManagedDirectory::new(mmap_directory)?;
-        Index::from_directory(directory, schema)
-    }
-
-    /// Accessor for the tokenizer manager.
-    pub fn tokenizers(&self) -> &TokenizerManager {
-        &self.tokenizers
+        Index::create(mmap_directory, schema)
    }

    /// Creates a new index in a temp directory.
@@ -86,16 +77,30 @@ impl Index {
    #[cfg(feature = "mmap")]
    pub fn create_from_tempdir(schema: Schema) -> Result<Index> {
        let mmap_directory = MmapDirectory::create_from_tempdir()?;
-        let directory = ManagedDirectory::new(mmap_directory)?;
+        Index::create(mmap_directory, schema)
+    }
+
+    /// Creates a new index given an implementation of the trait `Directory`
+    pub fn create<Dir: Directory>(dir: Dir, schema: Schema) -> Result<Index> {
+        let directory = ManagedDirectory::new(dir)?;
        Index::from_directory(directory, schema)
    }

+    /// Create a new index from a directory.
+    fn from_directory(mut directory: ManagedDirectory, schema: Schema) -> Result<Index> {
+        save_new_metas(schema.clone(), 0, directory.borrow_mut())?;
+        let metas = IndexMeta::with_schema(schema);
+        Index::create_from_metas(directory, &metas)
+    }
+
    /// Creates a new index given a directory and an `IndexMeta`.
    fn create_from_metas(directory: ManagedDirectory, metas: &IndexMeta) -> Result<Index> {
        let schema = metas.schema.clone();
+        let n_cpus = num_cpus::get();
        let index = Index {
            directory,
            schema,
+            num_searchers: Arc::new(AtomicUsize::new(n_cpus)),
            searcher_pool: Arc::new(Pool::new()),
            tokenizers: TokenizerManager::default(),
        };
@@ -103,24 +108,22 @@ impl Index {
        Ok(index)
    }

-    /// Open the index using the provided directory
-    pub fn open_directory<D: Directory>(directory: D) -> Result<Index> {
-        let directory = ManagedDirectory::new(directory)?;
-        let metas = load_metas(&directory)?;
-        Index::create_from_metas(directory, &metas)
+    /// Accessor for the tokenizer manager.
+    pub fn tokenizers(&self) -> &TokenizerManager {
+        &self.tokenizers
    }

    /// Opens a new directory from an index path.
    #[cfg(feature = "mmap")]
-    pub fn open<P: AsRef<Path>>(directory_path: P) -> Result<Index> {
+    pub fn open_in_dir<P: AsRef<Path>>(directory_path: P) -> Result<Index> {
        let mmap_directory = MmapDirectory::open(directory_path)?;
-        Index::open_directory(mmap_directory)
+        Index::open(mmap_directory)
    }

-    /// Create a new index from a directory.
-    pub fn from_directory(mut directory: ManagedDirectory, schema: Schema) -> Result<Index> {
-        save_new_metas(schema.clone(), 0, directory.borrow_mut())?;
-        let metas = IndexMeta::with_schema(schema);
+    /// Open the index using the provided directory
+    pub fn open<D: Directory>(directory: D) -> Result<Index> {
+        let directory = ManagedDirectory::new(directory)?;
+        let metas = load_metas(&directory)?;
        Index::create_from_metas(directory, &metas)
    }

@@ -137,9 +140,13 @@ impl Index {
    /// `IndexWriter` on the system is accessing the index directory,
    /// it is safe to manually delete the lockfile.
    ///
-    /// num_threads specifies the number of indexing workers that
+    /// - `num_threads` defines the number of indexing workers that
    /// should work at the same time.
    ///
+    /// - `overall_heap_size_in_bytes` sets the amount of memory
+    /// allocated for all indexing threads.
+    /// Each thread will receive a budget of `overall_heap_size_in_bytes / num_threads`.
+    ///
    /// # Errors
    /// If the lockfile already exists, returns `Error::FileAlreadyExists`.
    /// # Panics
@@ -147,21 +154,35 @@ impl Index {
    pub fn writer_with_num_threads(
        &self,
        num_threads: usize,
-        heap_size_in_bytes: usize,
+        overall_heap_size_in_bytes: usize,
    ) -> Result<IndexWriter> {
        let directory_lock = DirectoryLock::lock(self.directory().box_clone())?;
-        open_index_writer(self, num_threads, heap_size_in_bytes, directory_lock)
+        let heap_size_in_bytes_per_thread = overall_heap_size_in_bytes / num_threads;
+        open_index_writer(
+            self,
+            num_threads,
+            heap_size_in_bytes_per_thread,
+            directory_lock,
+        )
    }

    /// Creates a multithreaded writer
-    /// It just calls `writer_with_num_threads` with the number of cores as `num_threads`
+    ///
+    /// Tantivy will automatically define the number of threads to use.
+    /// `overall_heap_size_in_bytes` is the total target memory usage that will be split
+    /// between a given number of threads.
    ///
    /// # Errors
    /// If the lockfile already exists, returns `Error::FileAlreadyExists`.
    /// # Panics
    /// If the heap size per thread is too small, panics.
-    pub fn writer(&self, heap_size_in_bytes: usize) -> Result<IndexWriter> {
-        self.writer_with_num_threads(num_cpus::get(), heap_size_in_bytes)
+    pub fn writer(&self, overall_heap_size_in_bytes: usize) -> Result<IndexWriter> {
+        let mut num_threads = num_cpus::get();
+        let heap_size_in_bytes_per_thread = overall_heap_size_in_bytes / num_threads;
+        if heap_size_in_bytes_per_thread < HEAP_SIZE_MIN {
+            num_threads = (overall_heap_size_in_bytes / HEAP_SIZE_MIN).max(1);
+        }
+        self.writer_with_num_threads(num_threads, overall_heap_size_in_bytes)
    }

    /// Accessor to the index schema
@@ -186,8 +207,8 @@ impl Index {

    /// Creates a new segment.
    pub fn new_segment(&self) -> Segment {
-        let segment_meta = SegmentMeta::new(SegmentId::generate_random());
-        create_segment(self.clone(), segment_meta)
+        let segment_meta = SegmentMeta::new(SegmentId::generate_random(), 0);
+        self.segment(segment_meta)
    }

    /// Return a reference to the index directory.
@@ -214,6 +235,13 @@ impl Index {
            .collect())
    }

+    /// Sets the number of searchers to use.
+    ///
+    /// The new value takes effect only after the next call to `load_searchers`.
+    pub fn set_num_searchers(&mut self, num_searchers: usize) {
+        self.num_searchers.store(num_searchers, Ordering::Release);
+    }
+
    /// Creates a new generation of searchers after

    /// a change of the set of searchable indexes.
@@ -227,7 +255,8 @@ impl Index {
            .map(SegmentReader::open)
            .collect::<Result<_>>()?;
        let schema = self.schema();
-        let searchers = (0..NUM_SEARCHERS)
+        let num_searchers: usize = self.num_searchers.load(Ordering::Acquire);
+        let searchers = (0..num_searchers)
            .map(|_| Searcher::new(schema.clone(), segment_readers.clone()))
            .collect();
        self.searcher_pool.publish_new_generation(searchers);
@@ -238,7 +267,7 @@ impl Index {
    ///
    /// This method should be called every single time a search
    /// query is performed.
-    /// The searchers are taken from a pool of `NUM_SEARCHERS` searchers.
+    /// The searchers are taken from a pool of `num_searchers` searchers.
    /// If no searcher is available
    /// this may block.
    ///
@@ -260,6 +289,7 @@ impl Clone for Index {
        Index {
            directory: self.directory.clone(),
            schema: self.schema.clone(),
+            num_searchers: Arc::clone(&self.num_searchers),
            searcher_pool: Arc::clone(&self.searcher_pool),
            tokenizers: self.tokenizers.clone(),
        }
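Review note on the `writer` / `writer_with_num_threads` change above: the thread-count fallback can be sketched in isolation. This is a rough sketch, assuming a placeholder minimum: the real `HEAP_SIZE_MIN` lives in `indexer::index_writer` and its value is not shown in this diff, and `choose_num_threads` and `per_thread_budget` are hypothetical names. It also assumes `num_cpus` is at least 1.

```rust
// Placeholder value for illustration only; the real constant is
// `indexer::index_writer::HEAP_SIZE_MIN`, whose value is not shown here.
const HEAP_SIZE_MIN: usize = 1_000_000;

/// Hypothetical helper mirroring the logic added to `Index::writer`:
/// start from the CPU count and reduce it when each thread would get
/// less than the minimum budget. Assumes `num_cpus >= 1`.
fn choose_num_threads(num_cpus: usize, overall_heap_size_in_bytes: usize) -> usize {
    let per_thread = overall_heap_size_in_bytes / num_cpus;
    if per_thread < HEAP_SIZE_MIN {
        // Use as many threads as the budget can feed, but at least one.
        (overall_heap_size_in_bytes / HEAP_SIZE_MIN).max(1)
    } else {
        num_cpus
    }
}

/// Hypothetical helper mirroring `writer_with_num_threads`:
/// the overall budget is split evenly between the threads.
fn per_thread_budget(num_threads: usize, overall_heap_size_in_bytes: usize) -> usize {
    overall_heap_size_in_bytes / num_threads
}
```

With the placeholder minimum, 8 CPUs and a 3.5 MB budget give 3 threads. A budget below the minimum still yields one thread whose share is too small, which appears to be the case the `# Panics` section describes.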
--- a/src/core/inverted_index_reader.rs
+++ b/src/core/inverted_index_reader.rs
@@ -1,13 +1,13 @@
 use common::BinarySerializable;
-use compression::CompressedIntStream;
-use directory::{ReadOnlySource, SourceRead};
-use postings::FreqReadingOption;
+use directory::ReadOnlySource;
 use postings::TermInfo;
 use postings::{BlockSegmentPostings, SegmentPostings};
 use schema::FieldType;
 use schema::IndexRecordOption;
 use schema::Term;
 use termdict::TermDictionary;
+use owned_read::OwnedRead;
+use positions::PositionReader;

 /// The inverted index reader is in charge of accessing
 /// the inverted index associated to a specific field.
@@ -26,6 +26,7 @@ pub struct InvertedIndexReader {
    termdict: TermDictionary,
    postings_source: ReadOnlySource,
    positions_source: ReadOnlySource,
+    positions_idx_source: ReadOnlySource,
    record_option: IndexRecordOption,
    total_num_tokens: u64,
 }
@@ -35,6 +36,7 @@ impl InvertedIndexReader {
        termdict: TermDictionary,
        postings_source: ReadOnlySource,
        positions_source: ReadOnlySource,
+        positions_idx_source: ReadOnlySource,
        record_option: IndexRecordOption,
    ) -> InvertedIndexReader {
        let total_num_tokens_data = postings_source.slice(0, 8);
@@ -44,6 +46,7 @@ impl InvertedIndexReader {
            termdict,
            postings_source: postings_source.slice_from(8),
            positions_source,
+            positions_idx_source,
            record_option,
            total_num_tokens,
        }
@@ -59,6 +62,7 @@ impl InvertedIndexReader {
            termdict: TermDictionary::empty(field_type),
            postings_source: ReadOnlySource::empty(),
            positions_source: ReadOnlySource::empty(),
+            positions_idx_source: ReadOnlySource::empty(),
            record_option,
            total_num_tokens: 0u64,
        }
@@ -92,8 +96,22 @@ impl InvertedIndexReader {
        let offset = term_info.postings_offset as usize;
        let end_source = self.postings_source.len();
        let postings_slice = self.postings_source.slice(offset, end_source);
-        let postings_reader = SourceRead::from(postings_slice);
-        block_postings.reset(term_info.doc_freq as usize, postings_reader);
+        let postings_reader = OwnedRead::new(postings_slice);
+        block_postings.reset(term_info.doc_freq, postings_reader);
+    }
+
+
+    /// Returns a block postings given a `Term`.
+    /// This method is for advanced usage only.
+    ///
+    /// Most users should prefer `read_postings` instead.
+    pub fn read_block_postings(
+        &self,
+        term: &Term,
+        option: IndexRecordOption,
+    ) -> Option<BlockSegmentPostings> {
+        self.get_term_info(term)
+            .map(move |term_info| self.read_block_postings_from_terminfo(&term_info, option))
    }

    /// Returns a block postings given a `term_info`.
@@ -107,15 +125,11 @@ impl InvertedIndexReader {
    ) -> BlockSegmentPostings {
        let offset = term_info.postings_offset as usize;
        let postings_data = self.postings_source.slice_from(offset);
-        let freq_reading_option = match (self.record_option, requested_option) {
-            (IndexRecordOption::Basic, _) => FreqReadingOption::NoFreq,
-            (_, IndexRecordOption::Basic) => FreqReadingOption::SkipFreq,
-            (_, _) => FreqReadingOption::ReadFreq,
-        };
        BlockSegmentPostings::from_data(
-            term_info.doc_freq as usize,
-            SourceRead::from(postings_data),
-            freq_reading_option,
+            term_info.doc_freq,
+            OwnedRead::new(postings_data),
+            self.record_option,
+            requested_option,
        )
    }

@@ -131,11 +145,10 @@ impl InvertedIndexReader {
        let block_postings = self.read_block_postings_from_terminfo(term_info, option);
        let position_stream = {
            if option.has_positions() {
-                let position_offset = term_info.positions_offset;
-                let positions_source = self.positions_source.slice_from(position_offset as usize);
-                let mut stream = CompressedIntStream::wrap(positions_source);
-                stream.skip(term_info.positions_inner_offset as usize);
-                Some(stream)
+                let position_reader = self.positions_source.clone();
+                let skip_reader = self.positions_idx_source.clone();
+                let position_reader = PositionReader::new(position_reader, skip_reader, term_info.positions_idx);
+                Some(position_reader)
            } else {
                None
            }
@@ -160,8 +173,8 @@ impl InvertedIndexReader {
    /// `TextIndexingOptions` that does not index position will return a `SegmentPostings`
    /// with `DocId`s and frequencies.
    pub fn read_postings(&self, term: &Term, option: IndexRecordOption) -> Option<SegmentPostings> {
-        let term_info = get!(self.get_term_info(term));
-        Some(self.read_postings_from_terminfo(&term_info, option))
+        self.get_term_info(term)
+            .map(move |term_info| self.read_postings_from_terminfo(&term_info, option))
    }

    pub(crate) fn read_postings_no_deletes(
@@ -169,8 +182,8 @@ impl InvertedIndexReader {
        term: &Term,
        option: IndexRecordOption,
    ) -> Option<SegmentPostings> {
-        let term_info = get!(self.get_term_info(term));
-        Some(self.read_postings_from_terminfo(&term_info, option))
+        self.get_term_info(term)
+            .map(|term_info| self.read_postings_from_terminfo(&term_info, option))
    }

    /// Returns the number of documents containing the term.
--- a/src/core/pool.rs
+++ b/src/core/pool.rs
@@ -1,4 +1,4 @@
-use crossbeam::sync::MsQueue;
+use crossbeam::queue::MsQueue;
 use std::mem;
 use std::ops::{Deref, DerefMut};
 use std::sync::atomic::AtomicUsize;
--- a/src/core/searcher.rs
+++ b/src/core/searcher.rs
@@ -73,7 +73,7 @@ impl Searcher {

    /// Runs a query on the segment readers wrapped by the searcher
    pub fn search<C: Collector>(&self, query: &Query, collector: &mut C) -> Result<()> {
-        collector.search(self, query)
+        query.search(self, collector)
    }

    /// Return the field searcher associated to a `Field`.
--- a/src/core/segment.rs
+++ b/src/core/segment.rs
@@ -4,7 +4,7 @@ use core::SegmentId;
 use core::SegmentMeta;
 use directory::error::{OpenReadError, OpenWriteError};
 use directory::Directory;
-use directory::{FileProtection, ReadOnlySource, WritePtr};
+use directory::{ReadOnlySource, WritePtr};
 use indexer::segment_serializer::SegmentSerializer;
 use schema::Schema;
 use std::fmt;
@@ -28,6 +28,7 @@ impl fmt::Debug for Segment {
 /// Creates a new segment given an `Index` and a `SegmentId`
 ///
 /// The function is here to make it private outside `tantivy`.
+#[doc(hidden)]
 pub fn create_segment(index: Index, meta: SegmentMeta) -> Segment {
    Segment { index, meta }
 }
@@ -49,8 +50,11 @@ impl Segment {
    }

    #[doc(hidden)]
-    pub fn set_delete_meta(&mut self, num_deleted_docs: u32, opstamp: u64) {
-        self.meta.set_delete_meta(num_deleted_docs, opstamp);
+    pub fn with_delete_meta(self, num_deleted_docs: u32, opstamp: u64) -> Segment {
+        Segment {
+            index: self.index,
+            meta: self.meta.with_delete_meta(num_deleted_docs, opstamp),
+        }
    }

    /// Returns the segment's id.
@@ -66,16 +70,6 @@ impl Segment {
        self.meta.relative_path(component)
    }

-    /// Protects a specific component file from being deleted.
-    ///
-    /// Returns a FileProtection object. The file is guaranteed
-    /// to not be garbage collected as long as this `FileProtection`  object
-    /// lives.
-    pub fn protect_from_delete(&self, component: SegmentComponent) -> FileProtection {
-        let path = self.relative_path(component);
-        self.index.directory().protect_file_from_delete(&path)
-    }
-
    /// Open one of the component file for a *regular* read.
    pub fn open_read(
        &self,
@@ -105,35 +99,3 @@ pub trait SerializableSegment {
    /// The number of documents in the segment.
    fn write(&self, serializer: SegmentSerializer) -> Result<u32>;
 }
-
-#[cfg(test)]
-mod tests {
-
-    use core::SegmentComponent;
-    use directory::Directory;
-    use schema::SchemaBuilder;
-    use std::collections::HashSet;
-    use Index;
-
-    #[test]
-    fn test_segment_protect_component() {
-        let mut index = Index::create_in_ram(SchemaBuilder::new().build());
-        let segment = index.new_segment();
-        let path = segment.relative_path(SegmentComponent::POSTINGS);
-
-        let directory = index.directory_mut();
-        directory.atomic_write(&*path, &vec![0u8]).unwrap();
-
-        let living_files = HashSet::new();
-        {
-            let _file_protection = segment.protect_from_delete(SegmentComponent::POSTINGS);
-            assert!(directory.exists(&*path));
-            directory.garbage_collect(|| living_files.clone());
-            assert!(directory.exists(&*path));
-        }
-
-        directory.garbage_collect(|| living_files);
-        assert!(!directory.exists(&*path));
-    }
-
-}
--- a/src/core/segment_component.rs
+++ b/src/core/segment_component.rs
@@ -10,6 +10,8 @@ pub enum SegmentComponent {
    POSTINGS,
    /// Positions of terms in each document.
    POSITIONS,
+    /// Index to seek within the position file
+    POSITIONSSKIP,
    /// Column-oriented random-access storage of fields.
    FASTFIELDS,
    /// Stores the sum  of the length (in terms) of each field for each document.
@@ -29,9 +31,10 @@ pub enum SegmentComponent {
 impl SegmentComponent {
    /// Iterates through the components.
    pub fn iterator() -> slice::Iter<'static, SegmentComponent> {
-        static SEGMENT_COMPONENTS: [SegmentComponent; 7] = [
+        static SEGMENT_COMPONENTS: [SegmentComponent; 8] = [
            SegmentComponent::POSTINGS,
            SegmentComponent::POSITIONS,
+            SegmentComponent::POSITIONSSKIP,
            SegmentComponent::FASTFIELDS,
            SegmentComponent::FIELDNORMS,
            SegmentComponent::TERMS,
--- a/src/core/segment_meta.rs
+++ b/src/core/segment_meta.rs
@@ -1,8 +1,15 @@
 use super::SegmentComponent;
+use census::{Inventory, TrackedObject};
 use core::SegmentId;
+use serde;
 use std::collections::HashSet;
+use std::fmt;
 use std::path::PathBuf;

+lazy_static! {
+    static ref INVENTORY: Inventory<InnerSegmentMeta> = { Inventory::new() };
+}
+
 #[derive(Clone, Debug, Serialize, Deserialize)]
 struct DeleteMeta {
    num_deleted_docs: u32,
@@ -13,32 +20,72 @@ struct DeleteMeta {
 ///
 /// For instance the number of docs it contains,
 /// how many are deleted, etc.
-#[derive(Clone, Debug, Serialize, Deserialize)]
+#[derive(Clone)]
 pub struct SegmentMeta {
-    segment_id: SegmentId,
-    max_doc: u32,
-    deletes: Option<DeleteMeta>,
+    tracked: TrackedObject<InnerSegmentMeta>,
+}
+
+impl fmt::Debug for SegmentMeta {
+    fn fmt(&self, f: &mut fmt::Formatter) -> Result<(), fmt::Error> {
+        self.tracked.fmt(f)
+    }
+}
+
+impl serde::Serialize for SegmentMeta {
+    fn serialize<S>(
+        &self,
+        serializer: S,
+    ) -> Result<<S as serde::Serializer>::Ok, <S as serde::Serializer>::Error>
+    where
+        S: serde::Serializer,
+    {
+        self.tracked.serialize(serializer)
+    }
+}
+
+impl<'a> serde::Deserialize<'a> for SegmentMeta {
+    fn deserialize<D>(deserializer: D) -> Result<Self, <D as serde::Deserializer<'a>>::Error>
+    where
+        D: serde::Deserializer<'a>,
+    {
+        let inner = InnerSegmentMeta::deserialize(deserializer)?;
+        let tracked = INVENTORY.track(inner);
+        Ok(SegmentMeta { tracked })
+    }
 }

 impl SegmentMeta {
-    /// Creates a new segment meta for
-    /// a segment with no deletes and no documents.
-    pub fn new(segment_id: SegmentId) -> SegmentMeta {
-        SegmentMeta {
+    /// Lists all living `SegmentMeta` objects at the time of the call.
+    pub fn all() -> Vec<SegmentMeta> {
+        INVENTORY
+            .list()
+            .into_iter()
+            .map(|inner| SegmentMeta { tracked: inner })
+            .collect::<Vec<_>>()
+    }
+
+    /// Creates a new `SegmentMeta` object.
+    #[doc(hidden)]
+    pub fn new(segment_id: SegmentId, max_doc: u32) -> SegmentMeta {
+        let inner = InnerSegmentMeta {
            segment_id,
-            max_doc: 0,
+            max_doc,
            deletes: None,
+        };
+        SegmentMeta {
+            tracked: INVENTORY.track(inner),
        }
    }

    /// Returns the segment id.
    pub fn id(&self) -> SegmentId {
-        self.segment_id
+        self.tracked.segment_id
    }

    /// Returns the number of deleted documents.
    pub fn num_deleted_docs(&self) -> u32 {
-        self.deletes
+        self.tracked
+            .deletes
            .as_ref()
            .map(|delete_meta| delete_meta.num_deleted_docs)
            .unwrap_or(0u32)
@@ -63,8 +110,9 @@ impl SegmentMeta {
    pub fn relative_path(&self, component: SegmentComponent) -> PathBuf {
        let mut path = self.id().uuid_string();
        path.push_str(&*match component {
-            SegmentComponent::POSITIONS => ".pos".to_string(),
            SegmentComponent::POSTINGS => ".idx".to_string(),
+            SegmentComponent::POSITIONS => ".pos".to_string(),
+            SegmentComponent::POSITIONSSKIP => ".posidx".to_string(),
            SegmentComponent::TERMS => ".term".to_string(),
            SegmentComponent::STORE => ".store".to_string(),
            SegmentComponent::FASTFIELDS => ".fast".to_string(),
@@ -80,7 +128,7 @@ impl SegmentMeta {
    /// and all the doc ids contains in this segment
    /// are exactly (0..max_doc).
    pub fn max_doc(&self) -> u32 {
-        self.max_doc
+        self.tracked.max_doc
    }

    /// Return the number of documents in the segment.
@@ -91,25 +139,36 @@ impl SegmentMeta {
    /// Returns the opstamp of the last delete operation
    /// taken in account in this segment.
    pub fn delete_opstamp(&self) -> Option<u64> {
-        self.deletes.as_ref().map(|delete_meta| delete_meta.opstamp)
+        self.tracked
+            .deletes
+            .as_ref()
+            .map(|delete_meta| delete_meta.opstamp)
    }

    /// Returns true iff the segment meta contains
    /// delete information.
    pub fn has_deletes(&self) -> bool {
-        self.deletes.is_some()
+        self.num_deleted_docs() > 0
    }

    #[doc(hidden)]
-    pub fn set_max_doc(&mut self, max_doc: u32) {
-        self.max_doc = max_doc;
-    }
-
-    #[doc(hidden)]
-    pub fn set_delete_meta(&mut self, num_deleted_docs: u32, opstamp: u64) {
-        self.deletes = Some(DeleteMeta {
+    pub fn with_delete_meta(self, num_deleted_docs: u32, opstamp: u64) -> SegmentMeta {
+        let delete_meta = DeleteMeta {
            num_deleted_docs,
            opstamp,
+        };
+        let tracked = self.tracked.map(move |inner_meta| InnerSegmentMeta {
+            segment_id: inner_meta.segment_id,
+            max_doc: inner_meta.max_doc,
+            deletes: Some(delete_meta),
        });
+        SegmentMeta { tracked }
    }
 }
+
+#[derive(Clone, Debug, Serialize, Deserialize)]
+struct InnerSegmentMeta {
+    segment_id: SegmentId,
+    max_doc: u32,
+    deletes: Option<DeleteMeta>,
+}
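Review note on the `segment_meta.rs` changes above: two behaviors change together, and a stripped-down model makes them easier to see. This is a sketch only: `Meta` is a hypothetical stand-in for `SegmentMeta` with the `census` tracking left out, so it says nothing about how `TrackedObject` behaves. It shows the consuming `with_delete_meta` builder that replaces the `set_delete_meta` setter, and the new `has_deletes` definition based on the deleted-document count.

```rust
#[derive(Clone, Debug)]
struct DeleteMeta {
    num_deleted_docs: u32,
    opstamp: u64,
}

/// Hypothetical stand-in for `SegmentMeta`, without inventory tracking.
#[derive(Clone, Debug)]
struct Meta {
    max_doc: u32,
    deletes: Option<DeleteMeta>,
}

impl Meta {
    fn new(max_doc: u32) -> Meta {
        Meta { max_doc, deletes: None }
    }

    /// Consumes the value and returns an updated copy, instead of mutating in place.
    fn with_delete_meta(self, num_deleted_docs: u32, opstamp: u64) -> Meta {
        Meta {
            max_doc: self.max_doc,
            deletes: Some(DeleteMeta { num_deleted_docs, opstamp }),
        }
    }

    fn max_doc(&self) -> u32 {
        self.max_doc
    }

    fn num_deleted_docs(&self) -> u32 {
        self.deletes.as_ref().map(|d| d.num_deleted_docs).unwrap_or(0)
    }

    fn delete_opstamp(&self) -> Option<u64> {
        self.deletes.as_ref().map(|d| d.opstamp)
    }

    /// New definition from the diff: true only if at least one doc is deleted.
    fn has_deletes(&self) -> bool {
        self.num_deleted_docs() > 0
    }
}
```

Under the previous `deletes.is_some()` definition, a meta carrying delete information with zero deleted documents reported `has_deletes() == true`. With the new definition it reports `false`, while `delete_opstamp()` still returns the recorded opstamp.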
--- a/src/core/segment_reader.rs
+++ b/src/core/segment_reader.rs
@@ -5,7 +5,7 @@ use core::Segment;
 use core::SegmentComponent;
 use core::SegmentId;
 use core::SegmentMeta;
-use error::ErrorKind;
+use error::TantivyError;
 use fastfield::DeleteBitSet;
 use fastfield::FacetReader;
 use fastfield::FastFieldReader;
@@ -49,6 +49,7 @@ pub struct SegmentReader {
    termdict_composite: CompositeFile,
    postings_composite: CompositeFile,
    positions_composite: CompositeFile,
+    positions_idx_composite: CompositeFile,
    fast_fields_composite: CompositeFile,
    fieldnorms_composite: CompositeFile,

@@ -170,7 +171,7 @@ impl SegmentReader {
    pub fn facet_reader(&self, field: Field) -> Result<FacetReader> {
        let field_entry = self.schema.get_field_entry(field);
        if field_entry.field_type() != &FieldType::HierarchicalFacet {
-            return Err(ErrorKind::InvalidArgument(format!(
+            return Err(TantivyError::InvalidArgument(format!(
                "The field {:?} is not a \
                 hierarchical facet.",
                field_entry
@@ -178,7 +179,7 @@ impl SegmentReader {
        }
        let term_ords_reader = self.multi_fast_field_reader(field)?;
        let termdict_source = self.termdict_composite.open_read(field).ok_or_else(|| {
-            ErrorKind::InvalidArgument(format!(
+            TantivyError::InvalidArgument(format!(
                "The field \"{}\" is a hierarchical \
                 but this segment does not seem to have the field term \
                 dictionary.",
@@ -235,6 +236,14 @@ impl SegmentReader {
            }
        };

+        let positions_idx_composite = {
+            if let Ok(source) = segment.open_read(SegmentComponent::POSITIONSSKIP) {
+                CompositeFile::open(&source)?
+            } else {
+                CompositeFile::empty()
+            }
+        };
+
        let fast_fields_data = segment.open_read(SegmentComponent::FASTFIELDS)?;
        let fast_fields_composite = CompositeFile::open(&fast_fields_data)?;

@@ -260,6 +269,7 @@ impl SegmentReader {
            store_reader,
            delete_bitset_opt,
            positions_composite,
+            positions_idx_composite,
            schema,
        })
    }
@@ -309,10 +319,15 @@ impl SegmentReader {
            .open_read(field)
            .expect("Index corrupted. Failed to open field positions in composite file.");

+        let positions_idx_source = self.positions_idx_composite
+            .open_read(field)
+            .expect("Index corrupted. Failed to open field positions skip index in composite file.");
+
        let inv_idx_reader = Arc::new(InvertedIndexReader::new(
            TermDictionary::from_source(termdict_source),
            postings_source,
            positions_source,
+            positions_idx_source,
            record_option,
        ));

--- a/src/datastruct/mod.rs
+++ b/src/datastruct/mod.rs
@@ -1,4 +0,0 @@
-mod skip;
-pub mod stacker;
-
-pub use self::skip::{SkipList, SkipListBuilder};
--- a/src/datastruct/stacker/expull.rs
+++ b/src/datastruct/stacker/expull.rs
@@ -1,168 +0,0 @@
-use super::heap::{Heap, HeapAllocable};
-use std::mem;
-
-#[inline]
-pub fn is_power_of_2(val: u32) -> bool {
-    val & (val - 1) == 0
-}
-
-#[inline]
-pub fn jump_needed(val: u32) -> bool {
-    val > 3 && is_power_of_2(val)
-}
-
-#[derive(Debug, Clone)]
-pub struct ExpUnrolledLinkedList {
-    len: u32,
-    end: u32,
-    val0: u32,
-    val1: u32,
-    val2: u32,
-    next: u32, // inline  of the first block
-}
-
-impl ExpUnrolledLinkedList {
-    pub fn iter<'a>(&self, addr: u32, heap: &'a Heap) -> ExpUnrolledLinkedListIterator<'a> {
-        ExpUnrolledLinkedListIterator {
-            heap,
-            addr: addr + 2u32 * (mem::size_of::<u32>() as u32),
-            len: self.len,
-            consumed: 0,
-        }
-    }
-
-    pub fn push(&mut self, val: u32, heap: &Heap) {
-        self.len += 1;
-        if jump_needed(self.len) {
-            // we need to allocate another block.
-            // ... As we want to grow block exponentially
-            // the next block as a size of (length so far),
-            // and we need to add 1u32 to store the pointer
-            // to the next element.
-            let new_block_size: usize = (self.len as usize + 1) * mem::size_of::<u32>();
-            let new_block_addr: u32 = heap.allocate_space(new_block_size);
-            heap.set(self.end, &new_block_addr);
-            self.end = new_block_addr;
-        }
-        heap.set(self.end, &val);
-        self.end += mem::size_of::<u32>() as u32;
-    }
-}
-
-impl HeapAllocable for u32 {
-    fn with_addr(_addr: u32) -> u32 {
-        0u32
-    }
-}
-
-impl HeapAllocable for ExpUnrolledLinkedList {
-    fn with_addr(addr: u32) -> ExpUnrolledLinkedList {
-        let last_addr = addr + mem::size_of::<u32>() as u32 * 2u32;
-        ExpUnrolledLinkedList {
-            len: 0u32,
-            end: last_addr,
-            val0: 0u32,
-            val1: 0u32,
-            val2: 0u32,
-            next: 0u32,
-        }
-    }
-}
-
-pub struct ExpUnrolledLinkedListIterator<'a> {
-    heap: &'a Heap,
-    addr: u32,
-    len: u32,
-    consumed: u32,
-}
-
-impl<'a> Iterator for ExpUnrolledLinkedListIterator<'a> {
-    type Item = u32;
-
-    fn next(&mut self) -> Option<u32> {
-        if self.consumed == self.len {
-            None
-        } else {
-            let addr: u32;
-            self.consumed += 1;
-            if jump_needed(self.consumed) {
-                addr = *self.heap.get_mut_ref(self.addr);
-            } else {
-                addr = self.addr;
-            }
-            self.addr = addr + mem::size_of::<u32>() as u32;
-            Some(*self.heap.get_mut_ref(addr))
-        }
-    }
-}
-
-#[cfg(test)]
-mod tests {
-
-    use super::super::heap::Heap;
-    use super::*;
-
-    #[test]
-    fn test_stack() {
-        let heap = Heap::with_capacity(1_000_000);
-        let (addr, stack) = heap.allocate_object::<ExpUnrolledLinkedList>();
-        stack.push(1u32, &heap);
-        stack.push(2u32, &heap);
-        stack.push(4u32, &heap);
-        stack.push(8u32, &heap);
-        {
-            let mut it = stack.iter(addr, &heap);
-            assert_eq!(it.next().unwrap(), 1u32);
-            assert_eq!(it.next().unwrap(), 2u32);
-            assert_eq!(it.next().unwrap(), 4u32);
-            assert_eq!(it.next().unwrap(), 8u32);
-            assert!(it.next().is_none());
-        }
-    }
-
-}
-
-#[cfg(all(test, feature = "unstable"))]
-mod bench {
-    use super::ExpUnrolledLinkedList;
-    use super::Heap;
-    use test::Bencher;
-
-    const NUM_STACK: usize = 10_000;
-    const STACK_SIZE: u32 = 1000;
-
-    #[bench]
-    fn bench_push_vec(bench: &mut Bencher) {
-        bench.iter(|| {
-            let mut vecs = Vec::with_capacity(100);
-            for _ in 0..NUM_STACK {
-                vecs.push(Vec::new());
-            }
-            for s in 0..NUM_STACK {
-                for i in 0u32..STACK_SIZE {
-                    let t = s * 392017 % NUM_STACK;
-                    vecs[t].push(i);
-                }
-            }
-        });
-    }
-
-    #[bench]
-    fn bench_push_stack(bench: &mut Bencher) {
-        let heap = Heap::with_capacity(64_000_000);
-        bench.iter(|| {
-            let mut stacks = Vec::with_capacity(100);
-            for _ in 0..NUM_STACK {
-                let (_, stack) = heap.allocate_object::<ExpUnrolledLinkedList>();
-                stacks.push(stack);
-            }
-            for s in 0..NUM_STACK {
-                for i in 0u32..STACK_SIZE {
-                    let t = s * 392017 % NUM_STACK;
-                    stacks[t].push(i, &heap);
-                }
-            }
-            heap.clear();
-        });
-    }
-}
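Review note on the removed `expull.rs`: the growth rule of `ExpUnrolledLinkedList` can be summarised with a small sketch. It assumes only what the removed `push` shows: a new block is allocated whenever the length reaches a power of two greater than 3, and that block holds `len` values plus one `u32` pointer slot. `block_allocations` is a hypothetical helper written for illustration, and the `val != 0` guard is an addition that is not in the removed code.

```rust
/// True when `val` is a power of two. The `val != 0` guard is an
/// addition for this sketch; the removed code relied on its caller
/// checking `val > 3` first.
fn is_power_of_2(val: u32) -> bool {
    val != 0 && val & (val - 1) == 0
}

/// Same rule as the removed `jump_needed`: a new block is needed when
/// the length reaches 4, 8, 16, ...
fn jump_needed(len: u32) -> bool {
    len > 3 && is_power_of_2(len)
}

/// Hypothetical helper: for `num_pushes` pushes, lists each length at
/// which a block is allocated, paired with the block size in `u32`
/// slots (`len` values plus one pointer to the next block).
fn block_allocations(num_pushes: u32) -> Vec<(u32, u32)> {
    (1..=num_pushes)
        .filter(|&len| jump_needed(len))
        .map(|len| (len, len + 1))
        .collect()
}
```

For 20 pushes this yields allocations at lengths 4, 8 and 16, with blocks of 5, 9 and 17 slots, so each block is roughly as large as everything stored before it.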
--- a/src/datastruct/stacker/hashmap.rs
+++ b/src/datastruct/stacker/hashmap.rs
@@ -1,335 +0,0 @@
-use super::heap::{BytesRef, Heap, HeapAllocable};
-use postings::UnorderedTermId;
-use std::iter;
-use std::mem;
-use std::slice;
-
-mod murmurhash2 {
-
-    const SEED: u32 = 3_242_157_231u32;
-    const M: u32 = 0x5bd1_e995;
-
-    #[inline(always)]
-    pub fn murmurhash2(key: &[u8]) -> u32 {
-        let mut key_ptr: *const u32 = key.as_ptr() as *const u32;
-        let len = key.len() as u32;
-        let mut h: u32 = SEED ^ len;
-
-        let num_blocks = len >> 2;
-        for _ in 0..num_blocks {
-            let mut k: u32 = unsafe { *key_ptr }; // ok because of num_blocks definition
-            k = k.wrapping_mul(M);
-            k ^= k >> 24;
-            k = k.wrapping_mul(M);
-            h = h.wrapping_mul(M);
-            h ^= k;
-            key_ptr = key_ptr.wrapping_offset(1);
-        }
-
-        // Handle the last few bytes of the input array
-        let remaining: &[u8] = &key[key.len() & !3..];
-        match remaining.len() {
-            3 => {
-                h ^= u32::from(remaining[2]) << 16;
-                h ^= u32::from(remaining[1]) << 8;
-                h ^= u32::from(remaining[0]);
-                h = h.wrapping_mul(M);
-            }
-            2 => {
-                h ^= u32::from(remaining[1]) << 8;
-                h ^= u32::from(remaining[0]);
-                h = h.wrapping_mul(M);
-            }
-            1 => {
-                h ^= u32::from(remaining[0]);
-                h = h.wrapping_mul(M);
-            }
-            _ => {}
-        }
-        h ^= h >> 13;
-        h = h.wrapping_mul(M);
-        h ^ (h >> 15)
-    }
-}
-
-/// Split the thread memory budget into
-/// - the heap size
-/// - the hash table "table" itself.
-///
-/// Returns (the heap size in bytes, the hash table size in number of bits)
-pub(crate) fn split_memory(per_thread_memory_budget: usize) -> (usize, usize) {
-    let table_size_limit: usize = per_thread_memory_budget / 3;
-    let compute_table_size = |num_bits: usize| (1 << num_bits) * mem::size_of::<KeyValue>();
-    let table_num_bits: usize = (1..)
-        .into_iter()
-        .take_while(|num_bits: &usize| compute_table_size(*num_bits) < table_size_limit)
-        .last()
-        .expect(&format!(
-            "Per thread memory is too small: {}",
-            per_thread_memory_budget
-        ));
-    let table_size = compute_table_size(table_num_bits);
-    let heap_size = per_thread_memory_budget - table_size;
-    (heap_size, table_num_bits)
-}
-
-/// `KeyValue` is the item stored in the hash table.
-/// The key is actually a `BytesRef` object stored in an external heap.
-/// The `value_addr` also points to an address in the heap.
-///
-/// The key and the value are actually stored contiguously.
-/// For this reason, the (start, stop) information is actually redundant
-/// and can be simplified in the future
-#[derive(Copy, Clone, Default)]
-struct KeyValue {
-    key_value_addr: BytesRef,
-    hash: u32,
-}
-
-impl KeyValue {
-    fn is_empty(&self) -> bool {
-        self.key_value_addr.is_null()
-    }
-}
-
-/// Customized `HashMap` with string keys
-///
-/// This `HashMap` takes String as keys. Keys are
-/// stored in a user defined heap.
-///
-/// The quirky API has the benefit of avoiding
-/// the computation of the hash of the key twice,
-/// or copying the key as long as there is no insert.
-///
-pub struct TermHashMap<'a> {
-    table: Box<[KeyValue]>,
-    heap: &'a Heap,
-    mask: usize,
-    occupied: Vec<usize>,
-}
-
-struct QuadraticProbing {
-    hash: usize,
-    i: usize,
-    mask: usize,
-}
-
-impl QuadraticProbing {
-    fn compute(hash: usize, mask: usize) -> QuadraticProbing {
-        QuadraticProbing { hash, i: 0, mask }
-    }
-
-    #[inline]
-    fn next_probe(&mut self) -> usize {
-        self.i += 1;
-        (self.hash + self.i * self.i) & self.mask
-    }
-}
-
-pub struct Iter<'a: 'b, 'b> {
-    hashmap: &'b TermHashMap<'a>,
-    inner: slice::Iter<'a, usize>,
-}
-
-impl<'a, 'b> Iterator for Iter<'a, 'b> {
-    type Item = (&'b [u8], u32, UnorderedTermId);
-
-    fn next(&mut self) -> Option<Self::Item> {
-        self.inner.next().cloned().map(move |bucket: usize| {
-            let kv = self.hashmap.table[bucket];
-            let (key, offset): (&'b [u8], u32) = self.hashmap.get_key_value(kv.key_value_addr);
-            (key, offset, bucket as UnorderedTermId)
-        })
-    }
-}
-
-impl<'a> TermHashMap<'a> {
-    pub fn new(num_bucket_power_of_2: usize, heap: &'a Heap) -> TermHashMap<'a> {
-        let table_size = 1 << num_bucket_power_of_2;
-        let table: Vec<KeyValue> = iter::repeat(KeyValue::default()).take(table_size).collect();
-        TermHashMap {
-            table: table.into_boxed_slice(),
-            heap,
-            mask: table_size - 1,
-            occupied: Vec::with_capacity(table_size / 2),
-        }
-    }
-
-    fn probe(&self, hash: u32) -> QuadraticProbing {
-        QuadraticProbing::compute(hash as usize, self.mask)
-    }
-
-    pub fn is_saturated(&self) -> bool {
-        self.table.len() < self.occupied.len() * 3
-    }
-
-    #[inline(never)]
-    fn get_key_value(&self, bytes_ref: BytesRef) -> (&[u8], u32) {
-        let key_bytes: &[u8] = self.heap.get_slice(bytes_ref);
-        let expull_addr: u32 = bytes_ref.addr() + 2 + key_bytes.len() as u32;
-        (key_bytes, expull_addr)
-    }
-
-    pub fn set_bucket(&mut self, hash: u32, key_value_addr: BytesRef, bucket: usize) {
-        self.occupied.push(bucket);
-        self.table[bucket] = KeyValue {
-            key_value_addr,
-            hash,
-        };
-    }
-
-    pub fn iter<'b: 'a>(&'b self) -> Iter<'a, 'b> {
-        Iter {
-            inner: self.occupied.iter(),
-            hashmap: &self,
-        }
-    }
-
-    pub fn get_or_create<S: AsRef<[u8]>, V: HeapAllocable>(
-        &mut self,
-        key: S,
-    ) -> (UnorderedTermId, &mut V) {
-        let key_bytes: &[u8] = key.as_ref();
-        let hash = murmurhash2::murmurhash2(key.as_ref());
-        let mut probe = self.probe(hash);
-        loop {
-            let bucket = probe.next_probe();
-            let kv: KeyValue = self.table[bucket];
-            if kv.is_empty() {
-                let key_bytes_ref = self.heap.allocate_and_set(key_bytes);
-                let (addr, val): (u32, &mut V) = self.heap.allocate_object();
-                assert_eq!(addr, key_bytes_ref.addr() + 2 + key_bytes.len() as u32);
-                self.set_bucket(hash, key_bytes_ref, bucket);
-                return (bucket as UnorderedTermId, val);
-            } else if kv.hash == hash {
-                let (stored_key, expull_addr): (&[u8], u32) = self.get_key_value(kv.key_value_addr);
-                if stored_key == key_bytes {
-                    return (
-                        bucket as UnorderedTermId,
-                        self.heap.get_mut_ref(expull_addr),
-                    );
-                }
-            }
-        }
-    }
-}
-
-#[cfg(all(test, feature = "unstable"))]
-mod bench {
-    use super::murmurhash2::murmurhash2;
-    use test::Bencher;
-
-    #[bench]
-    fn bench_murmurhash2(b: &mut Bencher) {
-        let keys: [&'static str; 3] = ["wer qwe qwe qwe ", "werbq weqweqwe2 ", "weraq weqweqwe3 "];
-        b.iter(|| {
-            let mut s = 0;
-            for &key in &keys {
-                s ^= murmurhash2(key.as_bytes());
-            }
-            s
-        });
-    }
-}
-
-#[cfg(test)]
-mod tests {
-
-    use super::super::heap::{Heap, HeapAllocable};
-    use super::murmurhash2::murmurhash2;
-    use super::split_memory;
-    use super::*;
-    use std::collections::HashSet;
-
-    struct TestValue {
-        val: u32,
-        _addr: u32,
-    }
-
-    impl HeapAllocable for TestValue {
-        fn with_addr(addr: u32) -> TestValue {
-            TestValue {
-                val: 0u32,
-                _addr: addr,
-            }
-        }
-    }
-
-    #[test]
-    fn test_hashmap_size() {
-        assert_eq!(split_memory(100_000), (67232, 12));
-        assert_eq!(split_memory(1_000_000), (737856, 15));
-        assert_eq!(split_memory(10_000_000), (7902848, 18));
-    }
-
-    #[test]
-    fn test_hash_map() {
-        let heap = Heap::with_capacity(2_000_000);
-        let mut hash_map: TermHashMap = TermHashMap::new(18, &heap);
-        {
-            let v: &mut TestValue = hash_map.get_or_create("abc").1;
-            assert_eq!(v.val, 0u32);
-            v.val = 3u32;
-        }
-        {
-            let v: &mut TestValue = hash_map.get_or_create("abcd").1;
-            assert_eq!(v.val, 0u32);
-            v.val = 4u32;
-        }
-        {
-            let v: &mut TestValue = hash_map.get_or_create("abc").1;
-            assert_eq!(v.val, 3u32);
-        }
-        {
-            let v: &mut TestValue = hash_map.get_or_create("abcd").1;
-            assert_eq!(v.val, 4u32);
-        }
-        let mut iter_values = hash_map.iter();
-        {
-            let (_, addr, _) = iter_values.next().unwrap();
-            let val: &TestValue = heap.get_ref(addr);
-            assert_eq!(val.val, 3u32);
-        }
-        {
-            let (_, addr, _) = iter_values.next().unwrap();
-            let val: &TestValue = heap.get_ref(addr);
-            assert_eq!(val.val, 4u32);
-        }
-        assert!(iter_values.next().is_none());
-    }
-
-    #[test]
-    fn test_murmur() {
-        let s1 = "abcdef";
-        let s2 = "abcdeg";
-        for i in 0..5 {
-            assert_eq!(
-                murmurhash2(&s1[i..5].as_bytes()),
-                murmurhash2(&s2[i..5].as_bytes())
-            );
-        }
-    }
-
-    #[test]
-    fn test_murmur_against_reference_impl() {
-        assert_eq!(murmurhash2("".as_bytes()), 3632506080);
-        assert_eq!(murmurhash2("a".as_bytes()), 455683869);
-        assert_eq!(murmurhash2("ab".as_bytes()), 2448092234);
-        assert_eq!(murmurhash2("abc".as_bytes()), 2066295634);
-        assert_eq!(murmurhash2("abcd".as_bytes()), 2588571162);
-        assert_eq!(murmurhash2("abcde".as_bytes()), 2988696942);
-        assert_eq!(murmurhash2("abcdefghijklmnop".as_bytes()), 2350868870);
-    }
-
-    #[test]
-    fn test_murmur_collisions() {
-        let mut set: HashSet<u32> = HashSet::default();
-        for i in 0..10_000 {
-            let s = format!("hash{}", i);
-            let hash = murmurhash2(s.as_bytes());
-            set.insert(hash);
-        }
-        assert_eq!(set.len(), 10_000);
-    }
-
-}
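Review note on the removed `split_memory`: the budget split it documents (at most a third for the hash table, the rest for the heap) can be reproduced in isolation. The sketch below is an approximation under stated assumptions: `KeyValue` is replaced by a stand-in with two `u32` fields (8 bytes), and the function returns `None` instead of panicking when the budget is too small, which differs from the removed code.

```rust
use std::mem;

/// Assumed stand-in for the table entry: two `u32` fields, 8 bytes.
#[allow(dead_code)]
#[derive(Copy, Clone, Default)]
struct KeyValue {
    key_value_addr: u32,
    hash: u32,
}

/// Splits a per-thread budget into (heap size in bytes, table size in bits).
/// The table is the largest power-of-two table that stays strictly under
/// one third of the budget; the heap receives whatever is left.
/// Returns `None` when even a 2-entry table does not fit (sketch-only change).
fn split_memory(per_thread_memory_budget: usize) -> Option<(usize, usize)> {
    let table_size_limit = per_thread_memory_budget / 3;
    let compute_table_size =
        |num_bits: usize| (1usize << num_bits) * mem::size_of::<KeyValue>();
    let table_num_bits = (1usize..)
        .take_while(|&num_bits| compute_table_size(num_bits) < table_size_limit)
        .last()?;
    let heap_size = per_thread_memory_budget - compute_table_size(table_num_bits);
    Some((heap_size, table_num_bits))
}
```

With a budget of 100_000 bytes this gives a 12-bit table (32_768 bytes) and a 67_232-byte heap, matching the values asserted in the removed `test_hashmap_size`.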
--- a/src/datastruct/stacker/heap.rs
+++ b/src/datastruct/stacker/heap.rs
@@ -1,233 +0,0 @@
-use byteorder::{ByteOrder, NativeEndian};
-use std::cell::UnsafeCell;
-use std::mem;
-use std::ptr;
-
-/// `BytesRef` refers to a slice in tantivy's custom `Heap`.
-///
-/// The slice will encode the length of the `&[u8]` slice
-/// on 16-bits, and then the data is encoded.
-#[derive(Copy, Clone)]
-pub struct BytesRef(u32);
-
-impl BytesRef {
-    pub fn is_null(&self) -> bool {
-        self.0 == u32::max_value()
-    }
-
-    pub fn addr(&self) -> u32 {
-        self.0
-    }
-}
-
-impl Default for BytesRef {
-    fn default() -> BytesRef {
-        BytesRef(u32::max_value())
-    }
-}
-
-/// Object that can be allocated in tantivy's custom `Heap`.
-pub trait HeapAllocable {
-    fn with_addr(addr: u32) -> Self;
-}
-
-/// Tantivy's custom `Heap`.
-pub struct Heap {
-    inner: UnsafeCell<InnerHeap>,
-}
-
-#[cfg_attr(feature = "cargo-clippy", allow(mut_from_ref))]
-impl Heap {
-    /// Creates a new heap with a given capacity
-    pub fn with_capacity(num_bytes: usize) -> Heap {
-        Heap {
-            inner: UnsafeCell::new(InnerHeap::with_capacity(num_bytes)),
-        }
-    }
-
-    fn inner(&self) -> &mut InnerHeap {
-        unsafe { &mut *self.inner.get() }
-    }
-
-    /// Clears the heap. All the underlying data is lost.
-    ///
-    /// This heap does not support deallocation.
-    /// This method is the only way to free memory.
-    pub fn clear(&self) {
-        self.inner().clear();
-    }
-
-    /// Return amount of free space, in bytes.
-    pub fn num_free_bytes(&self) -> u32 {
-        self.inner().num_free_bytes()
-    }
-
-    /// Allocate a given amount of space and returns an address
-    /// in the Heap.
-    pub fn allocate_space(&self, num_bytes: usize) -> u32 {
-        self.inner().allocate_space(num_bytes)
-    }
-
-    /// Allocate an object in the heap
-    pub fn allocate_object<V: HeapAllocable>(&self) -> (u32, &mut V) {
-        let addr = self.inner().allocate_space(mem::size_of::<V>());
-        let v: V = V::with_addr(addr);
-        self.inner().set(addr, &v);
-        (addr, self.inner().get_mut_ref(addr))
-    }
-
-    /// Stores a `&[u8]` in the heap and returns the destination BytesRef.
-    pub fn allocate_and_set(&self, data: &[u8]) -> BytesRef {
-        self.inner().allocate_and_set(data)
-    }
-
-    /// Fetches the `&[u8]` stored on the slice defined by the `BytesRef`
-    /// given as argumetn
-    pub fn get_slice(&self, bytes_ref: BytesRef) -> &[u8] {
-        self.inner().get_slice(bytes_ref)
-    }
-
-    /// Stores an item's data in the heap, at the given `address`.
-    pub fn set<Item>(&self, addr: u32, val: &Item) {
-        self.inner().set(addr, val);
-    }
-
-    /// Returns a mutable reference for an object at a given Item.
-    pub fn get_mut_ref<Item>(&self, addr: u32) -> &mut Item {
-        self.inner().get_mut_ref(addr)
-    }
-
-    /// Returns a mutable reference to an `Item` at a given `addr`.
-    #[cfg(test)]
-    pub fn get_ref<Item>(&self, addr: u32) -> &mut Item {
-        self.get_mut_ref(addr)
-    }
-}
-
-struct InnerHeap {
-    buffer: Vec<u8>,
-    buffer_len: u32,
-    used: u32,
-    next_heap: Option<Box<InnerHeap>>,
-}
-
-impl InnerHeap {
-    pub fn with_capacity(num_bytes: usize) -> InnerHeap {
-        let buffer: Vec<u8> = vec![0u8; num_bytes];
-        InnerHeap {
-            buffer,
-            buffer_len: num_bytes as u32,
-            next_heap: None,
-            used: 0u32,
-        }
-    }
-
-    pub fn clear(&mut self) {
-        self.used = 0u32;
-        self.next_heap = None;
-    }
-
-    // Returns the number of free bytes. If the buffer
-    // has reached it's capacity and overflowed to another buffer, return 0.
-    pub fn num_free_bytes(&self) -> u32 {
-        if self.next_heap.is_some() {
-            0u32
-        } else {
-            self.buffer_len - self.used
-        }
-    }
-
-    pub fn allocate_space(&mut self, num_bytes: usize) -> u32 {
-        let addr = self.used;
-        self.used += num_bytes as u32;
-        if self.used <= self.buffer_len {
-            addr
-        } else {
-            if self.next_heap.is_none() {
-                info!(
-                    r#"Exceeded heap size. The segment will be committed right
-                         after indexing this document."#,
-                );
-                self.next_heap = Some(Box::new(InnerHeap::with_capacity(self.buffer_len as usize)));
-            }
-            self.next_heap.as_mut().unwrap().allocate_space(num_bytes) + self.buffer_len
-        }
-    }
-
-    fn get_slice(&self, bytes_ref: BytesRef) -> &[u8] {
-        let start = bytes_ref.0;
-        if start >= self.buffer_len {
-            self.next_heap
-                .as_ref()
-                .unwrap()
-                .get_slice(BytesRef(start - self.buffer_len))
-        } else {
-            let start = start as usize;
-            let len = NativeEndian::read_u16(&self.buffer[start..start + 2]) as usize;
-            &self.buffer[start + 2..start + 2 + len]
-        }
-    }
-
-    fn get_mut_slice(&mut self, start: u32, stop: u32) -> &mut [u8] {
-        if start >= self.buffer_len {
-            self.next_heap
-                .as_mut()
-                .unwrap()
-                .get_mut_slice(start - self.buffer_len, stop - self.buffer_len)
-        } else {
-            &mut self.buffer[start as usize..stop as usize]
-        }
-    }
-
-    fn allocate_and_set(&mut self, data: &[u8]) -> BytesRef {
-        assert!(data.len() < u16::max_value() as usize);
-        let total_len = 2 + data.len();
-        let start = self.allocate_space(total_len);
-        let total_buff = self.get_mut_slice(start, start + total_len as u32);
-        NativeEndian::write_u16(&mut total_buff[0..2], data.len() as u16);
-        total_buff[2..].clone_from_slice(data);
-        BytesRef(start)
-    }
-
-    fn get_mut(&mut self, addr: u32) -> *mut u8 {
-        if addr >= self.buffer_len {
-            self.next_heap
-                .as_mut()
-                .unwrap()
-                .get_mut(addr - self.buffer_len)
-        } else {
-            let addr_isize = addr as isize;
-            unsafe { self.buffer.as_mut_ptr().offset(addr_isize) }
-        }
-    }
-
-    fn get_mut_ref<Item>(&mut self, addr: u32) -> &mut Item {
-        if addr >= self.buffer_len {
-            self.next_heap
-                .as_mut()
-                .unwrap()
-                .get_mut_ref(addr - self.buffer_len)
-        } else {
-            let v_ptr_u8 = self.get_mut(addr) as *mut u8;
-            let v_ptr = v_ptr_u8 as *mut Item;
-            unsafe { &mut *v_ptr }
-        }
-    }
-
-    pub fn set<Item>(&mut self, addr: u32, val: &Item) {
-        if addr >= self.buffer_len {
-            self.next_heap
-                .as_mut()
-                .unwrap()
-                .set(addr - self.buffer_len, val);
-        } else {
-            let v_ptr: *const Item = val as *const Item;
-            let v_ptr_u8: *const u8 = v_ptr as *const u8;
-            debug_assert!(addr + mem::size_of::<Item>() as u32 <= self.used);
-            unsafe {
-                let dest_ptr: *mut u8 = self.get_mut(addr);
-                ptr::copy(v_ptr_u8, dest_ptr, mem::size_of::<Item>());
-            }
-        }
-    }
-}
--- a/src/datastruct/stacker/mod.rs
+++ b/src/datastruct/stacker/mod.rs
@@ -1,43 +0,0 @@
-mod expull;
-pub(crate) mod hashmap;
-mod heap;
-
-pub use self::expull::ExpUnrolledLinkedList;
-pub use self::hashmap::TermHashMap;
-pub use self::heap::{Heap, HeapAllocable};
-
-#[test]
-fn test_unrolled_linked_list() {
-    use std::collections;
-    let heap = Heap::with_capacity(30_000_000);
-    {
-        heap.clear();
-        let mut ks: Vec<usize> = (1..5).map(|k| k * 100).collect();
-        ks.push(2);
-        ks.push(3);
-        for k in (1..5).map(|k| k * 100) {
-            let mut hashmap: TermHashMap = TermHashMap::new(10, &heap);
-            for j in 0..k {
-                for i in 0..500 {
-                    let v: &mut ExpUnrolledLinkedList = hashmap.get_or_create(i.to_string()).1;
-                    v.push(i * j, &heap);
-                }
-            }
-            let mut map_addr: collections::HashMap<Vec<u8>, u32> = collections::HashMap::new();
-            for (key, addr, _) in hashmap.iter() {
-                map_addr.insert(Vec::from(key), addr);
-            }
-
-            for i in 0..500 {
-                let key: String = i.to_string();
-                let addr: u32 = *map_addr.get(key.as_bytes()).unwrap();
-                let exp_pull: &ExpUnrolledLinkedList = heap.get_ref(addr);
-                let mut it = exp_pull.iter(addr, &heap);
-                for j in 0..k {
-                    assert_eq!(it.next().unwrap(), i * j);
-                }
-                assert!(!it.next().is_some());
-            }
-        }
-    }
-}
--- a/src/directory/directory.rs
+++ b/src/directory/directory.rs
@@ -17,7 +17,7 @@ use std::result;
 /// - The [`RAMDirectory`](struct.RAMDirectory.html), which
 /// should be used mostly for tests.
 ///
-pub trait Directory: fmt::Debug + Send + Sync + 'static {
+pub trait Directory: DirectoryClone + fmt::Debug + Send + Sync + 'static {
    /// Opens a virtual file for read.
    ///
    /// Once a virtual file is open, its data may not
@@ -73,7 +73,19 @@ pub trait Directory: fmt::Debug + Send + Sync + 'static {
    ///
    /// The file may or may not previously exist.
    fn atomic_write(&mut self, path: &Path, data: &[u8]) -> io::Result<()>;
-
-    /// Clones the directory and boxes the clone
-    fn box_clone(&self) -> Box<Directory>;
+}
+
+/// `DirectoryClone` lets a boxed `Directory` be cloned; it is implemented for every `Directory` that is `Clone`.
+pub trait DirectoryClone {
+    /// Clones the directory and boxes the clone
+    fn box_clone(&self) -> Box<Directory>;
+}
+
+impl<T> DirectoryClone for T
+where
+    T: 'static + Directory + Clone,
+{
+    fn box_clone(&self) -> Box<Directory> {
+        Box::new(self.clone())
+    }
 }
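Review note on the `DirectoryClone` change: the blanket impl means individual directories no longer have to write `box_clone` by hand, which is why the hand-written copy in `ManagedDirectory` is dropped further down. A minimal standalone sketch of the pattern follows. The `Directory` trait here is a simplified stand-in with a single hypothetical `exists` method, `ToyDirectory` is invented for illustration, and the sketch uses the newer `dyn` syntax rather than the bare `Box<Directory>` of this diff.

```rust
use std::fmt;

/// Simplified stand-in for tantivy's `Directory` (assumption: one method only).
pub trait Directory: DirectoryClone + fmt::Debug + Send + Sync + 'static {
    /// Returns true if the virtual file exists.
    fn exists(&self, path: &str) -> bool;
}

/// Object-safe cloning for `Directory` trait objects.
pub trait DirectoryClone {
    /// Clones the directory and boxes the clone.
    fn box_clone(&self) -> Box<dyn Directory>;
}

// Blanket impl: any `Directory` that is also `Clone` gets `box_clone`
// without writing it by hand.
impl<T> DirectoryClone for T
where
    T: 'static + Directory + Clone,
{
    fn box_clone(&self) -> Box<dyn Directory> {
        Box::new(self.clone())
    }
}

/// Hypothetical in-memory directory used only for this sketch.
#[derive(Clone, Debug, Default)]
pub struct ToyDirectory {
    files: Vec<String>,
}

impl ToyDirectory {
    /// Builds a directory that "contains" the given file names.
    pub fn with_files(files: &[&str]) -> ToyDirectory {
        ToyDirectory {
            files: files.iter().map(|f| f.to_string()).collect(),
        }
    }
}

// Note that `box_clone` is not implemented here: the blanket impl provides it.
impl Directory for ToyDirectory {
    fn exists(&self, path: &str) -> bool {
        self.files.iter().any(|f| f == path)
    }
}
```

Calling `box_clone()` on a `ToyDirectory`, or on an existing `Box<dyn Directory>`, returns an independent boxed copy. Keeping `Clone` out of the `Directory` trait itself is what keeps the trait usable as a trait object.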
--- a/src/directory/error.rs
+++ b/src/directory/error.rs
@@ -173,9 +173,6 @@ pub enum DeleteError {
    /// Any kind of IO error that happens when
    /// interacting with the underlying IO device.
    IOError(IOError),
-    /// The file may not be deleted because it is
-    /// protected.
-    FileProtected(PathBuf),
 }

 impl From<IOError> for DeleteError {
@@ -190,9 +187,6 @@ impl fmt::Display for DeleteError {
            DeleteError::FileDoesNotExist(ref path) => {
                write!(f, "the file '{:?}' does not exist", path)
            }
-            DeleteError::FileProtected(ref path) => {
-                write!(f, "the file '{:?}' is protected and can't be deleted", path)
-            }
            DeleteError::IOError(ref err) => {
                write!(f, "an io error occurred while deleting a file: '{}'", err)
            }
@@ -207,7 +201,7 @@ impl StdError for DeleteError {

    fn cause(&self) -> Option<&StdError> {
        match *self {
-            DeleteError::FileDoesNotExist(_) | DeleteError::FileProtected(_) => None,
+            DeleteError::FileDoesNotExist(_) => None,
            DeleteError::IOError(ref err) => Some(err),
        }
    }
--- a/src/directory/managed_directory.rs
+++ b/src/directory/managed_directory.rs
@@ -1,11 +1,9 @@
 use core::MANAGED_FILEPATH;
 use directory::error::{DeleteError, IOError, OpenReadError, OpenWriteError};
 use directory::{ReadOnlySource, WritePtr};
-use error::{ErrorKind, Result, ResultExt};
+use error::TantivyError;
 use serde_json;
-use std::collections::HashMap;
 use std::collections::HashSet;
-use std::fmt;
 use std::io;
 use std::io::Write;
 use std::path::{Path, PathBuf};
@@ -13,6 +11,7 @@ use std::result;
 use std::sync::RwLockWriteGuard;
 use std::sync::{Arc, RwLock};
 use Directory;
+use Result;

 /// Wrapper of directories that keeps track of files created by Tantivy.
 ///
@@ -32,37 +31,6 @@ pub struct ManagedDirectory {
 #[derive(Debug, Default)]
 struct MetaInformation {
    managed_paths: HashSet<PathBuf>,
-    protected_files: HashMap<PathBuf, usize>,
-}
-
-/// A `FileProtection` prevents the garbage collection of a file.
-///
-/// See `ManagedDirectory.protect_file_from_delete`.
-pub struct FileProtection {
-    directory: ManagedDirectory,
-    path: PathBuf,
-}
-
-fn unprotect_file_from_delete(directory: &ManagedDirectory, path: &Path) {
-    let mut meta_informations_wlock = directory
-        .meta_informations
-        .write()
-        .expect("Managed file lock poisoned");
-    if let Some(counter_ref_mut) = meta_informations_wlock.protected_files.get_mut(path) {
-        (*counter_ref_mut) -= 1;
-    }
-}
-
-impl fmt::Debug for FileProtection {
-    fn fmt(&self, formatter: &mut fmt::Formatter) -> result::Result<(), fmt::Error> {
-        write!(formatter, "FileProtectionFor({:?})", self.path)
-    }
-}
-
-impl Drop for FileProtection {
-    fn drop(&mut self) {
-        unprotect_file_from_delete(&self.directory, &*self.path);
-    }
 }

 /// Saves the file containing the list of existing files
@@ -84,12 +52,11 @@ impl ManagedDirectory {
            Ok(data) => {
                let managed_files_json = String::from_utf8_lossy(&data);
                let managed_files: HashSet<PathBuf> = serde_json::from_str(&managed_files_json)
-                    .chain_err(|| ErrorKind::CorruptedFile(MANAGED_FILEPATH.clone()))?;
+                    .map_err(|_| TantivyError::CorruptedFile(MANAGED_FILEPATH.clone()))?;
                Ok(ManagedDirectory {
                    directory: Box::new(directory),
                    meta_informations: Arc::new(RwLock::new(MetaInformation {
                        managed_paths: managed_files,
-                        protected_files: HashMap::default(),
                    })),
                })
            }
@@ -158,9 +125,6 @@ impl ManagedDirectory {
                                    error!("Failed to delete {:?}", file_to_delete);
                                }
                            }
-                            DeleteError::FileProtected(_) => {
-                                // this is expected.
-                            }
                        }
                    }
                }
@@ -185,28 +149,6 @@ impl ManagedDirectory {
        }
    }

-    /// Protects a file from being garbage collected.
-    ///
-    /// The method returns a `FileProtection` object.
-    /// The file will not be garbage collected as long as the
-    /// `FileProtection` object is kept alive.
-    pub fn protect_file_from_delete(&self, path: &Path) -> FileProtection {
-        let pathbuf = path.to_owned();
-        {
-            let mut meta_informations_wlock = self.meta_informations
-                .write()
-                .expect("Managed file lock poisoned on protect");
-            *meta_informations_wlock
-                .protected_files
-                .entry(pathbuf.clone())
-                .or_insert(0) += 1;
-        }
-        FileProtection {
-            directory: self.clone(),
-            path: pathbuf.clone(),
-        }
-    }
-
    /// Registers a file as managed
    ///
    /// This method must be called before the file is
@@ -247,26 +189,12 @@ impl Directory for ManagedDirectory {
    }

    fn delete(&self, path: &Path) -> result::Result<(), DeleteError> {
-        {
-            let metas_rlock = self.meta_informations
-                .read()
-                .expect("poisoned lock in managed directory meta");
-            if let Some(counter) = metas_rlock.protected_files.get(path) {
-                if *counter > 0 {
-                    return Err(DeleteError::FileProtected(path.to_owned()));
-                }
-            }
-        }
        self.directory.delete(path)
    }

    fn exists(&self, path: &Path) -> bool {
        self.directory.exists(path)
    }
-
-    fn box_clone(&self) -> Box<Directory> {
-        Box::new(self.clone())
-    }
 }

 impl Clone for ManagedDirectory {
@@ -372,28 +300,4 @@ mod tests {
        }
    }

-    #[test]
-    #[cfg(feature = "mmap")]
-    fn test_managed_directory_protect() {
-        let tempdir = TempDir::new("index").unwrap();
-        let tempdir_path = PathBuf::from(tempdir.path());
-        let living_files = HashSet::new();
-
-        let mmap_directory = MmapDirectory::open(&tempdir_path).unwrap();
-        let mut managed_directory = ManagedDirectory::new(mmap_directory).unwrap();
-        managed_directory
-            .atomic_write(*TEST_PATH1, &vec![0u8, 1u8])
-            .unwrap();
-        assert!(managed_directory.exists(*TEST_PATH1));
-
-        {
-            let _file_protection = managed_directory.protect_file_from_delete(*TEST_PATH1);
-            managed_directory.garbage_collect(|| living_files.clone());
-            assert!(managed_directory.exists(*TEST_PATH1));
-        }
-
-        managed_directory.garbage_collect(|| living_files.clone());
-        assert!(!managed_directory.exists(*TEST_PATH1));
-    }
-
 }
--- a/src/directory/mmap_directory.rs
+++ b/src/directory/mmap_directory.rs
@@ -352,10 +352,6 @@ impl Directory for MmapDirectory {
        meta_file.write(|f| f.write_all(data))?;
        Ok(())
    }
-
-    fn box_clone(&self) -> Box<Directory> {
-        Box::new(self.clone())
-    }
 }

 #[cfg(test)]
--- a/src/directory/mod.rs
+++ b/src/directory/mod.rs
@@ -18,15 +18,14 @@ pub mod error;

 use std::io::{BufWriter, Seek, Write};

-pub use self::directory::Directory;
+pub use self::directory::{Directory, DirectoryClone};
 pub use self::ram_directory::RAMDirectory;
 pub use self::read_only_source::ReadOnlySource;

 #[cfg(feature = "mmap")]
 pub use self::mmap_directory::MmapDirectory;

-pub(crate) use self::managed_directory::{FileProtection, ManagedDirectory};
-pub(crate) use self::read_only_source::SourceRead;
+pub(crate) use self::managed_directory::ManagedDirectory;

 /// Synonym of Seek + Write
 pub trait SeekableWrite: Seek + Write {}
--- a/src/directory/ram_directory.rs
+++ b/src/directory/ram_directory.rs
@@ -203,8 +203,4 @@ impl Directory for RAMDirectory {
        vec_writer.flush()?;
        Ok(())
    }
-
-    fn box_clone(&self) -> Box<Directory> {
-        Box::new(self.clone())
-    }
 }
--- a/src/directory/read_only_source.rs
+++ b/src/directory/read_only_source.rs
@@ -3,9 +3,8 @@ use common::HasLen;
 #[cfg(feature = "mmap")]
 use fst::raw::MmapReadOnly;
 use stable_deref_trait::{CloneStableDeref, StableDeref};
-use std::io::{self, Read};
 use std::ops::Deref;
-use std::slice;
+

 /// Read object that represents files in tantivy.
 ///
@@ -120,49 +119,3 @@ impl From<Vec<u8>> for ReadOnlySource {
        ReadOnlySource::Anonymous(shared_data)
    }
 }
-
-/// Acts as a owning cursor over the data backed up by a `ReadOnlySource`
-pub(crate) struct SourceRead {
-    _data_owner: ReadOnlySource,
-    cursor: &'static [u8],
-}
-
-impl SourceRead {
-    // Advance the cursor by a given number of bytes.
-    pub fn advance(&mut self, len: usize) {
-        self.cursor = &self.cursor[len..];
-    }
-
-    pub fn slice_from(&self, start: usize) -> &[u8] {
-        &self.cursor[start..]
-    }
-
-    pub fn get(&self, idx: usize) -> u8 {
-        self.cursor[idx]
-    }
-}
-
-impl AsRef<[u8]> for SourceRead {
-    fn as_ref(&self) -> &[u8] {
-        self.cursor
-    }
-}
-
-impl From<ReadOnlySource> for SourceRead {
-    // Creates a new `SourceRead` from a given `ReadOnlySource`
-    fn from(source: ReadOnlySource) -> SourceRead {
-        let len = source.len();
-        let slice_ptr = source.as_slice().as_ptr();
-        let static_slice = unsafe { slice::from_raw_parts(slice_ptr, len) };
-        SourceRead {
-            _data_owner: source,
-            cursor: static_slice,
-        }
-    }
-}
-
-impl Read for SourceRead {
-    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
-        self.cursor.read(buf)
-    }
-}
--- a/src/error.rs
+++ b/src/error.rs
@@ -10,129 +10,114 @@ use serde_json;
 use std::path::PathBuf;
 use std::sync::PoisonError;

-error_chain!(
-    errors {
-        /// Path does not exist.
-        PathDoesNotExist(buf: PathBuf) {
-            description("path does not exist")
-            display("path does not exist: '{:?}'", buf)
-        }
-        /// File already exists, this is a problem when we try to write into a new file.
-        FileAlreadyExists(buf: PathBuf) {
-            description("file already exists")
-            display("file already exists: '{:?}'", buf)
-        }
-        /// IO Error.
-        IOError(err: IOError) {
-            description("an IO error occurred")
-            display("an IO error occurred: '{}'", err)
-        }
-        /// The data within is corrupted.
-        ///
-        /// For instance, it contains invalid JSON.
-        CorruptedFile(buf: PathBuf) {
-            description("file contains corrupted data")
-            display("file contains corrupted data: '{:?}'", buf)
-        }
-        /// A thread holding the locked panicked and poisoned the lock.
-        Poisoned {
-            description("a thread holding the locked panicked and poisoned the lock")
-        }
-        /// Invalid argument was passed by the user.
-        InvalidArgument(arg: String) {
-            description("an invalid argument was passed")
-            display("an invalid argument was passed: '{}'", arg)
-        }
-        /// An Error happened in one of the thread.
-        ErrorInThread(err: String) {
-            description("an error occurred in a thread")
-            display("an error occurred in a thread: '{}'", err)
-        }
-        /// An Error appeared related to the schema.
-        SchemaError(message: String) {
-            description("the schema is not matching expectations.")
-            display("Schema error: '{}'", message)
-        }
-        /// Tried to access a fastfield reader for a field not configured accordingly.
-        FastFieldError(err: FastFieldNotAvailableError) {
-            description("fast field not available")
-            display("fast field not available: '{:?}'", err)
-        }
-    }
-);
+/// The library's `failure`-based error enum.
+#[derive(Debug, Fail)]
+pub enum TantivyError {
+    /// Path does not exist.
+    #[fail(display = "path does not exist: '{:?}'", _0)]
+    PathDoesNotExist(PathBuf),
+    /// File already exists, this is a problem when we try to write into a new file.
+    #[fail(display = "file already exists: '{:?}'", _0)]
+    FileAlreadyExists(PathBuf),
+    /// IO Error.
+    #[fail(display = "an IO error occurred: '{}'", _0)]
+    IOError(#[cause] IOError),
+    /// The data within is corrupted.
+    ///
+    /// For instance, it contains invalid JSON.
+    #[fail(display = "file contains corrupted data: '{:?}'", _0)]
+    CorruptedFile(PathBuf),
+    /// A thread holding the lock panicked and poisoned the lock.
+    #[fail(display = "a thread holding the lock panicked and poisoned the lock")]
+    Poisoned,
+    /// Invalid argument was passed by the user.
+    #[fail(display = "an invalid argument was passed: '{}'", _0)]
+    InvalidArgument(String),
+    /// An error happened in one of the threads.
+    #[fail(display = "an error occurred in a thread: '{}'", _0)]
+    ErrorInThread(String),
+    /// An error related to the schema occurred.
+    #[fail(display = "Schema error: '{}'", _0)]
+    SchemaError(String),
+    /// Tried to access a fastfield reader for a field not configured accordingly.
+    #[fail(display = "fast field not available: '{:?}'", _0)]
+    FastFieldError(#[cause] FastFieldNotAvailableError),
+}

-impl From<FastFieldNotAvailableError> for Error {
-    fn from(fastfield_error: FastFieldNotAvailableError) -> Error {
-        ErrorKind::FastFieldError(fastfield_error).into()
+impl From<FastFieldNotAvailableError> for TantivyError {
+    fn from(fastfield_error: FastFieldNotAvailableError) -> TantivyError {
+        TantivyError::FastFieldError(fastfield_error).into()
    }
 }

-impl From<IOError> for Error {
-    fn from(io_error: IOError) -> Error {
-        ErrorKind::IOError(io_error).into()
+impl From<IOError> for TantivyError {
+    fn from(io_error: IOError) -> TantivyError {
+        TantivyError::IOError(io_error).into()
    }
 }

-impl From<io::Error> for Error {
-    fn from(io_error: io::Error) -> Error {
-        ErrorKind::IOError(io_error.into()).into()
+impl From<io::Error> for TantivyError {
+    fn from(io_error: io::Error) -> TantivyError {
+        TantivyError::IOError(io_error.into()).into()
    }
 }

-impl From<query::QueryParserError> for Error {
-    fn from(parsing_error: query::QueryParserError) -> Error {
-        ErrorKind::InvalidArgument(format!("Query is invalid. {:?}", parsing_error)).into()
+impl From<query::QueryParserError> for TantivyError {
+    fn from(parsing_error: query::QueryParserError) -> TantivyError {
+        TantivyError::InvalidArgument(format!("Query is invalid. {:?}", parsing_error)).into()
    }
 }

-impl<Guard> From<PoisonError<Guard>> for Error {
-    fn from(_: PoisonError<Guard>) -> Error {
-        ErrorKind::Poisoned.into()
+impl<Guard> From<PoisonError<Guard>> for TantivyError {
+    fn from(_: PoisonError<Guard>) -> TantivyError {
+        TantivyError::Poisoned.into()
    }
 }

-impl From<OpenReadError> for Error {
-    fn from(error: OpenReadError) -> Error {
+impl From<OpenReadError> for TantivyError {
+    fn from(error: OpenReadError) -> TantivyError {
        match error {
            OpenReadError::FileDoesNotExist(filepath) => {
-                ErrorKind::PathDoesNotExist(filepath).into()
+                TantivyError::PathDoesNotExist(filepath).into()
            }
-            OpenReadError::IOError(io_error) => ErrorKind::IOError(io_error).into(),
+            OpenReadError::IOError(io_error) => TantivyError::IOError(io_error).into(),
        }
    }
 }

-impl From<schema::DocParsingError> for Error {
-    fn from(error: schema::DocParsingError) -> Error {
-        ErrorKind::InvalidArgument(format!("Failed to parse document {:?}", error)).into()
+impl From<schema::DocParsingError> for TantivyError {
+    fn from(error: schema::DocParsingError) -> TantivyError {
+        TantivyError::InvalidArgument(format!("Failed to parse document {:?}", error)).into()
    }
 }

-impl From<OpenWriteError> for Error {
-    fn from(error: OpenWriteError) -> Error {
+impl From<OpenWriteError> for TantivyError {
+    fn from(error: OpenWriteError) -> TantivyError {
        match error {
-            OpenWriteError::FileAlreadyExists(filepath) => ErrorKind::FileAlreadyExists(filepath),
-            OpenWriteError::IOError(io_error) => ErrorKind::IOError(io_error),
+            OpenWriteError::FileAlreadyExists(filepath) => {
+                TantivyError::FileAlreadyExists(filepath)
+            }
+            OpenWriteError::IOError(io_error) => TantivyError::IOError(io_error),
        }.into()
    }
 }

-impl From<OpenDirectoryError> for Error {
-    fn from(error: OpenDirectoryError) -> Error {
+impl From<OpenDirectoryError> for TantivyError {
+    fn from(error: OpenDirectoryError) -> TantivyError {
        match error {
            OpenDirectoryError::DoesNotExist(directory_path) => {
-                ErrorKind::PathDoesNotExist(directory_path).into()
+                TantivyError::PathDoesNotExist(directory_path).into()
            }
-            OpenDirectoryError::NotADirectory(directory_path) => ErrorKind::InvalidArgument(
+            OpenDirectoryError::NotADirectory(directory_path) => TantivyError::InvalidArgument(
                format!("{:?} is not a directory", directory_path),
            ).into(),
        }
    }
 }

-impl From<serde_json::Error> for Error {
-    fn from(error: serde_json::Error) -> Error {
+impl From<serde_json::Error> for TantivyError {
+    fn from(error: serde_json::Error) -> TantivyError {
        let io_err = io::Error::from(error);
-        ErrorKind::IOError(io_err.into()).into()
+        TantivyError::IOError(io_err.into()).into()
    }
 }
--- a/src/fastfield/error.rs
+++ b/src/fastfield/error.rs
@@ -4,7 +4,8 @@ use std::result;
 /// `FastFieldNotAvailableError` is returned when the
 /// user requested for a fast field reader, and the field was not
 /// defined in the schema as a fast field.
-#[derive(Debug)]
+#[derive(Debug, Fail)]
+#[fail(display = "field not available: '{:?}'", field_name)]
 pub struct FastFieldNotAvailableError {
    field_name: String,
 }
--- a/src/fastfield/mod.rs
+++ b/src/fastfield/mod.rs
@@ -368,8 +368,8 @@ mod tests {
    }

    pub fn generate_permutation() -> Vec<u64> {
-        let seed: &[u32; 4] = &[1, 2, 3, 4];
-        let mut rng = XorShiftRng::from_seed(*seed);
+        let seed: [u8; 16] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];
+        let mut rng = XorShiftRng::from_seed(seed);
        let mut permutation: Vec<u64> = (0u64..1_000_000u64).collect();
        rng.shuffle(&mut permutation);
        permutation
--- a/src/fastfield/multivalued/reader.rs
+++ b/src/fastfield/multivalued/reader.rs
@@ -102,7 +102,7 @@ mod tests {
        let mut vals = Vec::new();
        {
            facet_reader.facet_ords(0, &mut vals);
-            assert_eq!(&vals[..], &[3, 2]);
+            assert_eq!(&vals[..], &[2, 3]);
        }
        {
            facet_reader.facet_ords(1, &mut vals);
--- a/src/fastfield/multivalued/writer.rs
+++ b/src/fastfield/multivalued/writer.rs
@@ -90,10 +90,10 @@ impl MultiValueIntFastFieldWriter {

    /// Serializes fast field values by pushing them to the `FastFieldSerializer`.
    ///
-    /// HashMap makes it possible to remap them before serializing.
-    /// Specifically, string terms are first stored in the writer as their
-    /// position in the `IndexWriter`'s `HashMap`. This value is called
-    /// an `UnorderedTermId`.
+    /// If a mapping is given, the values are remapped *and sorted* before serialization.
+    /// This is used when serializing `facets`. Specifically, their terms are
+    /// first stored in the writer as their position in the `IndexWriter`'s `HashMap`.
+    /// This value is called an `UnorderedTermId`.
    ///
    /// During the serialization of the segment, terms gets sorted and
    /// `tantivy` builds a mapping to convert this `UnorderedTermId` into
@@ -125,10 +125,30 @@ impl MultiValueIntFastFieldWriter {
                        mapping.len() as u64,
                        1,
                    )?;
-                    for val in &self.vals {
-                        let remapped_val = *mapping.get(val).expect("Missing term ordinal");
-                        value_serializer.add_val(remapped_val)?;
+
+                    let last_interval = (
+                        self.doc_index.last().cloned().unwrap(),
+                        self.vals.len() as u64,
+                    );
+
+                    let mut doc_vals: Vec<u64> = Vec::with_capacity(100);
+                    for (start, stop) in self.doc_index
+                        .windows(2)
+                        .map(|interval| (interval[0], interval[1]))
+                        .chain(Some(last_interval).into_iter())
+                        .map(|(start, stop)| (start as usize, stop as usize))
+                    {
+                        doc_vals.clear();
+                        let remapped_vals = self.vals[start..stop]
+                            .iter()
+                            .map(|val| *mapping.get(val).expect("Missing term ordinal"));
+                        doc_vals.extend(remapped_vals);
+                        doc_vals.sort();
+                        for &val in &doc_vals {
+                            value_serializer.add_val(val)?;
+                        }
                    }
+
                }
                None => {
                    let val_min_max = self.vals.iter().cloned().minmax();
--- a/src/functional_test.rs
+++ b/src/functional_test.rs
@@ -1,7 +1,8 @@
 use rand::thread_rng;
 use std::collections::HashSet;

-use rand::distributions::{IndependentSample, Range};
+use rand::Rng;
+use rand::distributions::Range;
 use schema::*;
 use Index;
 use Searcher;
@@ -32,7 +33,7 @@ fn test_indexing() {
    let mut uncommitted_docs: HashSet<u64> = HashSet::new();

    for _ in 0..200 {
-        let random_val = universe.ind_sample(&mut rng);
+        let random_val = rng.sample(&universe);
        if random_val == 0 {
            index_writer.commit().expect("Commit failed");
            committed_docs.extend(&uncommitted_docs);
--- a/src/indexer/index_writer.rs
+++ b/src/indexer/index_writer.rs
@@ -2,18 +2,15 @@ use super::operation::AddOperation;
 use super::segment_updater::SegmentUpdater;
 use super::PreparedCommit;
 use bit_set::BitSet;
-use chan;
 use core::Index;
 use core::Segment;
 use core::SegmentComponent;
 use core::SegmentId;
 use core::SegmentMeta;
 use core::SegmentReader;
-use datastruct::stacker::hashmap::split_memory;
-use datastruct::stacker::Heap;
-use directory::FileProtection;
+use crossbeam_channel as channel;
 use docset::DocSet;
-use error::{Error, ErrorKind, Result, ResultExt};
+use error::TantivyError;
 use fastfield::write_delete_bitset;
 use futures::sync::oneshot::Receiver;
 use indexer::delete_queue::{DeleteCursor, DeleteQueue};
@@ -24,6 +21,7 @@ use indexer::DirectoryLock;
 use indexer::MergePolicy;
 use indexer::SegmentEntry;
 use indexer::SegmentWriter;
+use postings::compute_table_size;
 use schema::Document;
 use schema::IndexRecordOption;
 use schema::Term;
@@ -31,20 +29,40 @@ use std::mem;
 use std::mem::swap;
 use std::thread;
 use std::thread::JoinHandle;
+use Result;

 // Size of the margin for the heap. A segment is closed when the remaining memory
 // in the heap goes below MARGIN_IN_BYTES.
-pub const MARGIN_IN_BYTES: u32 = 1_000_000u32;
+pub const MARGIN_IN_BYTES: usize = 1_000_000;

 // We impose the memory per thread to be at least 3 MB.
-pub const HEAP_SIZE_LIMIT: u32 = MARGIN_IN_BYTES * 3u32;
+pub const HEAP_SIZE_MIN: usize = ((MARGIN_IN_BYTES as u32) * 3u32) as usize;
+pub const HEAP_SIZE_MAX: usize = u32::max_value() as usize - MARGIN_IN_BYTES;

 // Add document will block if the number of docs waiting in the queue to be indexed
 // reaches `PIPELINE_MAX_SIZE_IN_DOCS`
 const PIPELINE_MAX_SIZE_IN_DOCS: usize = 10_000;

-type DocumentSender = chan::Sender<AddOperation>;
-type DocumentReceiver = chan::Receiver<AddOperation>;
+type DocumentSender = channel::Sender<AddOperation>;
+type DocumentReceiver = channel::Receiver<AddOperation>;
+
+/// Computes the initial size of the hash table for an indexing thread,
+/// given its memory budget. The table is allowed to take less than
+/// a third of that budget, and its size is capped at 19 bits.
+///
+/// Returns the hash table size expressed as a number of bits.
+fn initial_table_size(per_thread_memory_budget: usize) -> usize {
+    let table_size_limit: usize = per_thread_memory_budget / 3;
+    (1..)
+        .into_iter()
+        .take_while(|num_bits: &usize| compute_table_size(*num_bits) < table_size_limit)
+        .last()
+        .expect(&format!(
+            "Per thread memory is too small: {}",
+            per_thread_memory_budget
+        ))
+        .min(19) // we cap it at 512K
+}

 /// `IndexWriter` is the user entry-point to add document to an index.
 ///
@@ -100,14 +118,19 @@ pub fn open_index_writer(
    heap_size_in_bytes_per_thread: usize,
    directory_lock: DirectoryLock,
 ) -> Result<IndexWriter> {
-    if heap_size_in_bytes_per_thread < HEAP_SIZE_LIMIT as usize {
-        panic!(format!(
+    if heap_size_in_bytes_per_thread < HEAP_SIZE_MIN {
+        let err_msg = format!(
            "The heap size per thread needs to be at least {}.",
-            HEAP_SIZE_LIMIT
-        ));
+            HEAP_SIZE_MIN
+        );
+        return Err(TantivyError::InvalidArgument(err_msg));
+    }
+    if heap_size_in_bytes_per_thread >= HEAP_SIZE_MAX {
+        let err_msg = format!("The heap size per thread cannot exceed {}", HEAP_SIZE_MAX);
+        return Err(TantivyError::InvalidArgument(err_msg));
    }
    let (document_sender, document_receiver): (DocumentSender, DocumentReceiver) =
-        chan::sync(PIPELINE_MAX_SIZE_IN_DOCS);
+        channel::bounded(PIPELINE_MAX_SIZE_IN_DOCS);

    let delete_queue = DeleteQueue::new();

@@ -193,15 +216,13 @@ pub fn advance_deletes(
    mut segment: Segment,
    segment_entry: &mut SegmentEntry,
    target_opstamp: u64,
-) -> Result<Option<FileProtection>> {
-    let mut file_protect: Option<FileProtection> = None;
+) -> Result<()> {
    {
-        if let Some(previous_opstamp) = segment_entry.meta().delete_opstamp() {
+        if segment_entry.meta().delete_opstamp() == Some(target_opstamp) {
            // We are already up-to-date here.
-            if target_opstamp == previous_opstamp {
-                return Ok(file_protect);
-            }
+            return Ok(());
        }
+
        let segment_reader = SegmentReader::open(&segment)?;
        let max_doc = segment_reader.max_doc();

@@ -220,6 +241,7 @@ pub fn advance_deletes(
            target_opstamp,
        )?;

+        // TODO optimize
        for doc in 0u32..max_doc {
            if segment_reader.is_deleted(doc) {
                delete_bitset.insert(doc as usize);
@@ -228,54 +250,39 @@ pub fn advance_deletes(

        let num_deleted_docs = delete_bitset.len();
        if num_deleted_docs > 0 {
-            segment.set_delete_meta(num_deleted_docs as u32, target_opstamp);
-            file_protect = Some(segment.protect_from_delete(SegmentComponent::DELETE));
+            segment = segment.with_delete_meta(num_deleted_docs as u32, target_opstamp);
            let mut delete_file = segment.open_write(SegmentComponent::DELETE)?;
            write_delete_bitset(&delete_bitset, &mut delete_file)?;
        }
    }
-    segment_entry.set_meta(segment.meta().clone());
-    Ok(file_protect)
+    segment_entry.set_meta((*segment.meta()).clone());
+    Ok(())
 }

 fn index_documents(
-    heap: &mut Heap,
-    table_size: usize,
+    memory_budget: usize,
    segment: &Segment,
    generation: usize,
    document_iterator: &mut Iterator<Item = AddOperation>,
    segment_updater: &mut SegmentUpdater,
    mut delete_cursor: DeleteCursor,
 ) -> Result<bool> {
-    heap.clear();
    let schema = segment.schema();
    let segment_id = segment.id();
-    let mut segment_writer =
-        SegmentWriter::for_segment(heap, table_size, segment.clone(), &schema)?;
+    let table_size = initial_table_size(memory_budget);
+    let mut segment_writer = SegmentWriter::for_segment(table_size, segment.clone(), &schema)?;
    for doc in document_iterator {
        segment_writer.add_document(doc, &schema)?;
-        // There is two possible conditions to close the segment.
-        // One is the memory arena dedicated to the segment is
-        // getting full.
-        if segment_writer.is_buffer_full() {
+
+        let mem_usage = segment_writer.mem_usage();
+
+        if mem_usage >= memory_budget - MARGIN_IN_BYTES {
            info!(
                "Buffer limit reached, flushing segment with maxdoc={}.",
                segment_writer.max_doc()
            );
            break;
        }
-        // The second is the term dictionary hash table
-        // is reaching saturation.
-        //
-        // Tantivy does not resize its hashtable. When it reaches
-        // capacity, we just stop indexing new document.
-        if segment_writer.is_term_saturated() {
-            info!(
-                "Term dic saturated, flushing segment with maxdoc={}.",
-                segment_writer.max_doc()
-            );
-            break;
-        }
    }

    if !segment_updater.is_alive() {
@@ -290,8 +297,7 @@ fn index_documents(

    let doc_opstamps: Vec<u64> = segment_writer.finalize()?;

-    let mut segment_meta = SegmentMeta::new(segment_id);
-    segment_meta.set_max_doc(num_docs);
+    let segment_meta = SegmentMeta::new(segment_id, num_docs);

    let last_docstamp: u64 = *(doc_opstamps.last().unwrap());

@@ -329,13 +335,15 @@ impl IndexWriter {
            join_handle
                .join()
                .expect("Indexing Worker thread panicked")
-                .chain_err(|| ErrorKind::ErrorInThread("Error in indexing worker thread.".into()))?;
+                .map_err(|_| {
+                    TantivyError::ErrorInThread("Error in indexing worker thread.".into())
+                })?;
        }
        drop(self.workers_join_handle);

        let result = self.segment_updater
            .wait_merging_thread()
-            .chain_err(|| ErrorKind::ErrorInThread("Failed to join merging thread.".into()));
+            .map_err(|_| TantivyError::ErrorInThread("Failed to join merging thread.".into()));

        if let Err(ref e) = result {
            error!("Some merging thread failed {:?}", e);
@@ -367,14 +375,12 @@ impl IndexWriter {
    fn add_indexing_worker(&mut self) -> Result<()> {
        let document_receiver_clone = self.document_receiver.clone();
        let mut segment_updater = self.segment_updater.clone();
-        let (heap_size, table_size) = split_memory(self.heap_size_in_bytes_per_thread);
-        info!("heap size {}, table_size {}", heap_size, table_size);
-        let mut heap = Heap::with_capacity(heap_size);

        let generation = self.generation;

        let mut delete_cursor = self.delete_queue.cursor();

+        let mem_budget = self.heap_size_in_bytes_per_thread;
        let join_handle: JoinHandle<Result<()>> = thread::Builder::new()
            .name(format!(
                "indexing thread {} for gen {}",
@@ -402,8 +408,7 @@ impl IndexWriter {
                    }
                    let segment = segment_updater.new_segment();
                    index_documents(
-                        &mut heap,
-                        table_size,
+                        mem_budget,
                        &segment,
                        generation,
                        &mut document_iterator,
@@ -441,7 +446,9 @@ impl IndexWriter {
    }

    /// Merges a given list of segments
-    pub fn merge(&mut self, segment_ids: &[SegmentId]) -> Receiver<SegmentMeta> {
+    ///
+    /// `segment_ids` is required to be non-empty.
+    pub fn merge(&mut self, segment_ids: &[SegmentId]) -> Result<Receiver<SegmentMeta>> {
        self.segment_updater.start_merge(segment_ids)
    }

@@ -457,7 +464,7 @@ impl IndexWriter {
        let (mut document_sender, mut document_receiver): (
            DocumentSender,
            DocumentReceiver,
-        ) = chan::sync(PIPELINE_MAX_SIZE_IN_DOCS);
+        ) = channel::bounded(PIPELINE_MAX_SIZE_IN_DOCS);
        swap(&mut self.document_sender, &mut document_sender);
        swap(&mut self.document_receiver, &mut document_receiver);
        document_receiver
@@ -555,7 +562,7 @@ impl IndexWriter {
        for worker_handle in former_workers_join_handle {
            let indexing_worker_result = worker_handle
                .join()
-                .map_err(|e| Error::from_kind(ErrorKind::ErrorInThread(format!("{:?}", e))))?;
+                .map_err(|e| TantivyError::ErrorInThread(format!("{:?}", e)))?;

            indexing_worker_result?;
            // add a new worker for the next generation.
@@ -637,7 +644,7 @@ impl IndexWriter {
 #[cfg(test)]
 mod tests {

-    use env_logger;
+    use super::initial_table_size;
    use error::*;
    use indexer::NoMergePolicy;
    use schema::{self, Document};
@@ -650,7 +657,7 @@ mod tests {
        let index = Index::create_in_ram(schema_builder.build());
        let _index_writer = index.writer(40_000_000).unwrap();
        match index.writer(40_000_000) {
-            Err(Error(ErrorKind::FileAlreadyExists(_), _)) => {}
+            Err(TantivyError::FileAlreadyExists(_)) => {}
            _ => panic!("Expected FileAlreadyExists error"),
        }
    }
@@ -699,7 +706,7 @@ mod tests {

        {
            // writing the segment
-            let mut index_writer = index.writer_with_num_threads(3, 40_000_000).unwrap();
+            let mut index_writer = index.writer(3_000_000).unwrap();
            index_writer.add_document(doc!(text_field=>"a"));
            index_writer.rollback().unwrap();

@@ -721,7 +728,6 @@ mod tests {

    #[test]
    fn test_with_merges() {
-        let _ = env_logger::init();
        let mut schema_builder = schema::SchemaBuilder::default();
        let text_field = schema_builder.add_text_field("text", schema::TEXT);
        let index = Index::create_in_ram(schema_builder.build());
@@ -732,7 +738,7 @@ mod tests {
        };
        {
            // writing the segment
-            let mut index_writer = index.writer_with_num_threads(4, 4 * 30_000_000).unwrap();
+            let mut index_writer = index.writer(12_000_000).unwrap();
            // create 8 segments with 100 tiny docs
            for _doc in 0..100 {
                let mut doc = Document::default();
@@ -759,14 +765,13 @@ mod tests {

    #[test]
    fn test_prepare_with_commit_message() {
-        let _ = env_logger::init();
        let mut schema_builder = schema::SchemaBuilder::default();
        let text_field = schema_builder.add_text_field("text", schema::TEXT);
        let index = Index::create_in_ram(schema_builder.build());

        {
            // writing the segment
-            let mut index_writer = index.writer_with_num_threads(4, 4 * 30_000_000).unwrap();
+            let mut index_writer = index.writer(12_000_000).unwrap();
            // create 8 segments with 100 tiny docs
            for _doc in 0..100 {
                index_writer.add_document(doc!(text_field => "a"));
@@ -794,14 +799,13 @@ mod tests {

    #[test]
    fn test_prepare_but_rollback() {
-        let _ = env_logger::init();
        let mut schema_builder = schema::SchemaBuilder::default();
        let text_field = schema_builder.add_text_field("text", schema::TEXT);
        let index = Index::create_in_ram(schema_builder.build());

        {
            // writing the segment
-            let mut index_writer = index.writer_with_num_threads(4, 4 * 30_000_000).unwrap();
+            let mut index_writer = index.writer_with_num_threads(4, 12_000_000).unwrap();
            // create 8 segments with 100 tiny docs
            for _doc in 0..100 {
                index_writer.add_document(doc!(text_field => "a"));
@@ -831,4 +835,12 @@ mod tests {
        assert_eq!(num_docs_containing("b"), 100);
    }

+    #[test]
+    fn test_hashmap_size() {
+        assert_eq!(initial_table_size(100_000), 12);
+        assert_eq!(initial_table_size(1_000_000), 15);
+        assert_eq!(initial_table_size(10_000_000), 18);
+        assert_eq!(initial_table_size(1_000_000_000), 19);
+    }
+
 }
--- a/src/indexer/log_merge_policy.rs
+++ b/src/indexer/log_merge_policy.rs
@@ -80,10 +80,6 @@ impl MergePolicy for LogMergePolicy {
            .map(|ind_vec| MergeCandidate(ind_vec.iter().map(|&ind| segments[ind].id()).collect()))
            .collect()
    }
-
-    fn box_clone(&self) -> Box<MergePolicy> {
-        Box::new(self.clone())
-    }
 }

 impl Default for LogMergePolicy {
@@ -116,15 +112,17 @@ mod tests {
        assert!(result_list.is_empty());
    }

-    fn seg_meta(num_docs: u32) -> SegmentMeta {
-        let mut segment_metas = SegmentMeta::new(SegmentId::generate_random());
-        segment_metas.set_max_doc(num_docs);
-        segment_metas
+    fn create_random_segment_meta(num_docs: u32) -> SegmentMeta {
+        SegmentMeta::new(SegmentId::generate_random(), num_docs)
    }

    #[test]
    fn test_log_merge_policy_pair() {
-        let test_input = vec![seg_meta(10), seg_meta(10), seg_meta(10)];
+        let test_input = vec![
+            create_random_segment_meta(10),
+            create_random_segment_meta(10),
+            create_random_segment_meta(10),
+        ];
        let result_list = test_merge_policy().compute_merge_candidates(&test_input);
        assert_eq!(result_list.len(), 1);
    }
@@ -137,17 +135,17 @@ mod tests {
        // * one with the 3 * 1000-docs segments
        // no MergeCandidate expected for the 2 * 10_000-docs segments as min_merge_size=3
        let test_input = vec![
-            seg_meta(10),
-            seg_meta(10),
-            seg_meta(10),
-            seg_meta(1000),
-            seg_meta(1000),
-            seg_meta(1000),
-            seg_meta(10000),
-            seg_meta(10000),
-            seg_meta(10),
-            seg_meta(10),
-            seg_meta(10),
+            create_random_segment_meta(10),
+            create_random_segment_meta(10),
+            create_random_segment_meta(10),
+            create_random_segment_meta(1000),
+            create_random_segment_meta(1000),
+            create_random_segment_meta(1000),
+            create_random_segment_meta(10000),
+            create_random_segment_meta(10000),
+            create_random_segment_meta(10),
+            create_random_segment_meta(10),
+            create_random_segment_meta(10),
        ];
        let result_list = test_merge_policy().compute_merge_candidates(&test_input);
        assert_eq!(result_list.len(), 2);
@@ -157,12 +155,12 @@ mod tests {
    fn test_log_merge_policy_within_levels() {
        // multiple levels all get merged correctly
        let test_input = vec![
-            seg_meta(10),   // log2(10) = ~3.32 (> 3.58 - 0.75)
-            seg_meta(11),   // log2(11) = ~3.46
-            seg_meta(12),   // log2(12) = ~3.58
-            seg_meta(800),  // log2(800) = ~9.64 (> 9.97 - 0.75)
-            seg_meta(1000), // log2(1000) = ~9.97
-            seg_meta(1000),
+            create_random_segment_meta(10),   // log2(10) = ~3.32 (> 3.58 - 0.75)
+            create_random_segment_meta(11),   // log2(11) = ~3.46
+            create_random_segment_meta(12),   // log2(12) = ~3.58
+            create_random_segment_meta(800),  // log2(800) = ~9.64 (> 9.97 - 0.75)
+            create_random_segment_meta(1000), // log2(1000) = ~9.97
+            create_random_segment_meta(1000),
        ]; // log2(1000) = ~9.97
        let result_list = test_merge_policy().compute_merge_candidates(&test_input);
        assert_eq!(result_list.len(), 2);
@@ -171,12 +169,12 @@ mod tests {
    fn test_log_merge_policy_small_segments() {
        // segments under min_layer_size are merged together
        let test_input = vec![
-            seg_meta(1),
-            seg_meta(1),
-            seg_meta(1),
-            seg_meta(2),
-            seg_meta(2),
-            seg_meta(2),
+            create_random_segment_meta(1),
+            create_random_segment_meta(1),
+            create_random_segment_meta(1),
+            create_random_segment_meta(2),
+            create_random_segment_meta(2),
+            create_random_segment_meta(2),
        ];
        let result_list = test_merge_policy().compute_merge_candidates(&test_input);
        assert_eq!(result_list.len(), 1);
--- a/src/indexer/merge_policy.rs
+++ b/src/indexer/merge_policy.rs
@@ -11,18 +11,31 @@ pub struct MergeCandidate(pub Vec<SegmentId>);
 ///
 /// Every time the list of segments changes, the segment updater
 /// asks the merge policy if some segments should be merged.
-pub trait MergePolicy: marker::Send + marker::Sync + Debug {
+pub trait MergePolicy: MergePolicyClone + marker::Send + marker::Sync + Debug {
    /// Given the list of segment metas, returns the list of merge candidates.
    ///
    /// This call happens on the segment updater thread, and will block
    /// other segment updates, so all implementations should happen rapidly.
    fn compute_merge_candidates(&self, segments: &[SegmentMeta]) -> Vec<MergeCandidate>;
-    /// Returns a boxed clone of the MergePolicy.
-    fn box_clone(&self) -> Box<MergePolicy>;
+}
+
+/// Helper trait that lets a boxed `MergePolicy` be cloned.
+pub trait MergePolicyClone {
+    /// Returns a boxed clone of the MergePolicy.
+    fn box_clone(&self) -> Box<MergePolicy>;
+}
+
+impl<T> MergePolicyClone for T
+where
+    T: 'static + MergePolicy + Clone,
+{
+    fn box_clone(&self) -> Box<MergePolicy> {
+        Box::new(self.clone())
+    }
 }

 /// Never merge segments.
-#[derive(Debug)]
+#[derive(Debug, Clone)]
 pub struct NoMergePolicy;

 impl Default for NoMergePolicy {
@@ -35,10 +48,6 @@ impl MergePolicy for NoMergePolicy {
    fn compute_merge_candidates(&self, _segments: &[SegmentMeta]) -> Vec<MergeCandidate> {
        Vec::new()
    }
-
-    fn box_clone(&self) -> Box<MergePolicy> {
-        Box::new(NoMergePolicy)
-    }
 }

 #[cfg(test)]
@@ -52,7 +61,7 @@ pub mod tests {
    ///
    /// Every time there is more than one segment,
    /// it will suggest to merge them.
-    #[derive(Debug)]
+    #[derive(Debug, Clone)]
    pub struct MergeWheneverPossible;

    impl MergePolicy for MergeWheneverPossible {
@@ -67,9 +76,5 @@ pub mod tests {
                vec![]
            }
        }
-
-        fn box_clone(&self) -> Box<MergePolicy> {
-            Box::new(MergeWheneverPossible)
-        }
    }
 }
--- a/src/indexer/merger.rs
+++ b/src/indexer/merger.rs
@@ -2,7 +2,6 @@ use core::Segment;
 use core::SegmentReader;
 use core::SerializableSegment;
 use docset::DocSet;
-use error::Result;
 use fastfield::DeleteBitSet;
 use fastfield::FastFieldReader;
 use fastfield::FastFieldSerializer;
@@ -23,6 +22,7 @@ use store::StoreWriter;
 use termdict::TermMerger;
 use termdict::TermOrdinal;
 use DocId;
+use Result;

 fn compute_total_num_tokens(readers: &[SegmentReader], field: Field) -> u64 {
    let mut total_tokens = 0u64;
@@ -683,7 +683,7 @@ mod tests {
        };

        {
-            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
+            let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
            {
                // writing the segment
                {
@@ -733,9 +733,10 @@ mod tests {
            let segment_ids = index
                .searchable_segment_ids()
                .expect("Searchable segments failed.");
-            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
+            let mut index_writer = index.writer_with_num_threads(1, 3_000_000).unwrap();
            index_writer
                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
                .wait()
                .expect("Merging failed");
            index_writer.wait_merging_threads().unwrap();
@@ -979,6 +980,7 @@ mod tests {
                .expect("Searchable segments failed.");
            index_writer
                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
                .wait()
                .expect("Merging failed");
            index.load_searchers().unwrap();
@@ -1075,6 +1077,7 @@ mod tests {
                .expect("Searchable segments failed.");
            index_writer
                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
                .wait()
                .expect("Merging failed");
            index.load_searchers().unwrap();
@@ -1128,6 +1131,7 @@ mod tests {
                .expect("Searchable segments failed.");
            index_writer
                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
                .wait()
                .expect("Merging failed");
            index.load_searchers().unwrap();
@@ -1138,126 +1142,126 @@ mod tests {
        }
    }

-//    #[test]
-//    fn test_merge_facets() {
-//        let mut schema_builder = schema::SchemaBuilder::default();
-//        let facet_field = schema_builder.add_facet_field("facet");
-//        let index = Index::create_in_ram(schema_builder.build());
-//        use schema::Facet;
-//        {
-//            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
-//            let index_doc = |index_writer: &mut IndexWriter, doc_facets: &[&str]| {
-//                let mut doc = Document::default();
-//                for facet in doc_facets {
-//                    doc.add_facet(facet_field, Facet::from(facet));
-//                }
-//                index_writer.add_document(doc);
-//            };
-//
-//            index_doc(&mut index_writer, &["/top/a/firstdoc", "/top/b"]);
-//            index_doc(&mut index_writer, &["/top/a/firstdoc", "/top/b", "/top/c"]);
-//            index_doc(&mut index_writer, &["/top/a", "/top/b"]);
-//            index_doc(&mut index_writer, &["/top/a"]);
-//
-//            index_doc(&mut index_writer, &["/top/b", "/top/d"]);
-//            index_doc(&mut index_writer, &["/top/d"]);
-//            index_doc(&mut index_writer, &["/top/e"]);
-//            index_writer.commit().expect("committed");
-//
-//            index_doc(&mut index_writer, &["/top/a"]);
-//            index_doc(&mut index_writer, &["/top/b"]);
-//            index_doc(&mut index_writer, &["/top/c"]);
-//            index_writer.commit().expect("committed");
-//
-//            index_doc(&mut index_writer, &["/top/e", "/top/f"]);
-//            index_writer.commit().expect("committed");
-//        }
-//        index.load_searchers().unwrap();
-//        let test_searcher = |expected_num_docs: usize, expected: &[(&str, u64)]| {
-//            let searcher = index.searcher();
-//            let mut facet_collector = FacetCollector::for_field(facet_field);
-//            facet_collector.add_facet(Facet::from("/top"));
-//            use collector::{CountCollector, MultiCollector};
-//            let mut count_collector = CountCollector::default();
-//            {
-//                let mut multi_collectors = MultiCollector::new();
-//                multi_collectors.add_collector(&mut count_collector);
-//                multi_collectors.add_collector(&mut facet_collector);
-//                searcher.search(&AllQuery, &mut multi_collectors).unwrap();
-//            }
-//            assert_eq!(count_collector.count(), expected_num_docs);
-//            let facet_counts = facet_collector.harvest();
-//            let facets: Vec<(String, u64)> = facet_counts
-//                .get("/top")
-//                .map(|(facet, count)| (facet.to_string(), count))
-//                .collect();
-//            assert_eq!(
-//                facets,
-//                expected
-//                    .iter()
-//                    .map(|&(facet_str, count)| (String::from(facet_str), count))
-//                    .collect::<Vec<_>>()
-//            );
-//        };
-//        test_searcher(
-//            11,
-//            &[
-//                ("/top/a", 5),
-//                ("/top/b", 5),
-//                ("/top/c", 2),
-//                ("/top/d", 2),
-//                ("/top/e", 2),
-//                ("/top/f", 1),
-//            ],
-//        );
-//
-//        // Merging the segments
-//        {
-//            let segment_ids = index
-//                .searchable_segment_ids()
-//                .expect("Searchable segments failed.");
-//            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
-//            index_writer
-//                .merge(&segment_ids)
-//                .wait()
-//                .expect("Merging failed");
-//            index_writer.wait_merging_threads().unwrap();
-//
-//            index.load_searchers().unwrap();
-//            test_searcher(
-//                11,
-//                &[
-//                    ("/top/a", 5),
-//                    ("/top/b", 5),
-//                    ("/top/c", 2),
-//                    ("/top/d", 2),
-//                    ("/top/e", 2),
-//                    ("/top/f", 1),
-//                ],
-//            );
-//        }
-//
-//        // Deleting one term
-//        {
-//            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
-//            let facet = Facet::from_path(vec!["top", "a", "firstdoc"]);
-//            let facet_term = Term::from_facet(facet_field, &facet);
-//            index_writer.delete_term(facet_term);
-//            index_writer.commit().unwrap();
-//            index.load_searchers().unwrap();
-//            test_searcher(
-//                9,
-//                &[
-//                    ("/top/a", 3),
-//                    ("/top/b", 3),
-//                    ("/top/c", 1),
-//                    ("/top/d", 2),
-//                    ("/top/e", 2),
-//                    ("/top/f", 1),
-//                ],
-//            );
-//        }
-//    }
+    #[test]
+    fn test_merge_facets() {
+        let mut schema_builder = schema::SchemaBuilder::default();
+        let facet_field = schema_builder.add_facet_field("facet");
+        let index = Index::create_in_ram(schema_builder.build());
+        use schema::Facet;
+        {
+            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
+            let index_doc = |index_writer: &mut IndexWriter, doc_facets: &[&str]| {
+                let mut doc = Document::default();
+                for facet in doc_facets {
+                    doc.add_facet(facet_field, Facet::from(facet));
+                }
+                index_writer.add_document(doc);
+            };
+
+            index_doc(&mut index_writer, &["/top/a/firstdoc", "/top/b"]);
+            index_doc(&mut index_writer, &["/top/a/firstdoc", "/top/b", "/top/c"]);
+            index_doc(&mut index_writer, &["/top/a", "/top/b"]);
+            index_doc(&mut index_writer, &["/top/a"]);
+
+            index_doc(&mut index_writer, &["/top/b", "/top/d"]);
+            index_doc(&mut index_writer, &["/top/d"]);
+            index_doc(&mut index_writer, &["/top/e"]);
+            index_writer.commit().expect("committed");
+
+            index_doc(&mut index_writer, &["/top/a"]);
+            index_doc(&mut index_writer, &["/top/b"]);
+            index_doc(&mut index_writer, &["/top/c"]);
+            index_writer.commit().expect("committed");
+
+            index_doc(&mut index_writer, &["/top/e", "/top/f"]);
+            index_writer.commit().expect("committed");
+        }
+        index.load_searchers().unwrap();
+        let test_searcher = |expected_num_docs: usize, expected: &[(&str, u64)]| {
+            let searcher = index.searcher();
+            let mut facet_collector = FacetCollector::for_field(facet_field);
+            facet_collector.add_facet(Facet::from("/top"));
+            use collector::{CountCollector, MultiCollector};
+            let mut count_collector = CountCollector::default();
+            {
+                let mut multi_collectors =
+                    MultiCollector::from(vec![&mut count_collector, &mut facet_collector]);
+                searcher.search(&AllQuery, &mut multi_collectors).unwrap();
+            }
+            assert_eq!(count_collector.count(), expected_num_docs);
+            let facet_counts = facet_collector.harvest();
+            let facets: Vec<(String, u64)> = facet_counts
+                .get("/top")
+                .map(|(facet, count)| (facet.to_string(), count))
+                .collect();
+            assert_eq!(
+                facets,
+                expected
+                    .iter()
+                    .map(|&(facet_str, count)| (String::from(facet_str), count))
+                    .collect::<Vec<_>>()
+            );
+        };
+        test_searcher(
+            11,
+            &[
+                ("/top/a", 5),
+                ("/top/b", 5),
+                ("/top/c", 2),
+                ("/top/d", 2),
+                ("/top/e", 2),
+                ("/top/f", 1),
+            ],
+        );
+
+        // Merging the segments
+        {
+            let segment_ids = index
+                .searchable_segment_ids()
+                .expect("Searchable segments failed.");
+            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
+            index_writer
+                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
+                .wait()
+                .expect("Merging failed");
+            index_writer.wait_merging_threads().unwrap();
+
+            index.load_searchers().unwrap();
+            test_searcher(
+                11,
+                &[
+                    ("/top/a", 5),
+                    ("/top/b", 5),
+                    ("/top/c", 2),
+                    ("/top/d", 2),
+                    ("/top/e", 2),
+                    ("/top/f", 1),
+                ],
+            );
+        }
+
+        // Deleting one term
+        {
+            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
+            let facet = Facet::from_path(vec!["top", "a", "firstdoc"]);
+            let facet_term = Term::from_facet(facet_field, &facet);
+            index_writer.delete_term(facet_term);
+            index_writer.commit().unwrap();
+            index.load_searchers().unwrap();
+            test_searcher(
+                9,
+                &[
+                    ("/top/a", 3),
+                    ("/top/b", 3),
+                    ("/top/c", 1),
+                    ("/top/d", 2),
+                    ("/top/e", 2),
+                    ("/top/f", 1),
+                ],
+            );
+        }
+    }

    #[test]
    fn test_merge_multivalued_int_fields_all_deleted() {
@@ -1290,6 +1294,7 @@ mod tests {
            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
            index_writer
                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
                .wait()
                .expect("Merging failed");
            index_writer.wait_merging_threads().unwrap();
@@ -1392,6 +1397,7 @@ mod tests {
            let mut index_writer = index.writer_with_num_threads(1, 40_000_000).unwrap();
            index_writer
                .merge(&segment_ids)
+                .expect("Failed to initiate merge")
                .wait()
                .expect("Merging failed");
            index_writer.wait_merging_threads().unwrap();
--- a/src/indexer/segment_manager.rs
+++ b/src/indexer/segment_manager.rs
@@ -2,6 +2,7 @@ use super::segment_register::SegmentRegister;
 use core::SegmentId;
 use core::SegmentMeta;
 use core::{LOCKFILE_FILEPATH, META_FILEPATH};
+use error::TantivyError;
 use indexer::delete_queue::DeleteCursor;
 use indexer::SegmentEntry;
 use std::collections::hash_set::HashSet;
@@ -9,6 +10,7 @@ use std::fmt::{self, Debug, Formatter};
 use std::path::PathBuf;
 use std::sync::RwLock;
 use std::sync::{RwLockReadGuard, RwLockWriteGuard};
+use Result as TantivyResult;

 #[derive(Default)]
 struct SegmentRegisters {
@@ -64,8 +66,9 @@ impl SegmentManager {

    /// Returns all of the segment entries (committed or uncommitted)
    pub fn segment_entries(&self) -> Vec<SegmentEntry> {
-        let mut segment_entries = self.read().uncommitted.segment_entries();
-        segment_entries.extend(self.read().committed.segment_entries());
+        let registers_lock = self.read();
+        let mut segment_entries = registers_lock.uncommitted.segment_entries();
+        segment_entries.extend(registers_lock.committed.segment_entries());
        segment_entries
    }

@@ -76,32 +79,15 @@ impl SegmentManager {
    }

    pub fn list_files(&self) -> HashSet<PathBuf> {
-        let registers_lock = self.read();
        let mut files = HashSet::new();
        files.insert(META_FILEPATH.clone());
        files.insert(LOCKFILE_FILEPATH.clone());
-
-        let segment_metas: Vec<SegmentMeta> = registers_lock
-            .committed
-            .get_all_segments()
-            .into_iter()
-            .chain(registers_lock.uncommitted.get_all_segments().into_iter())
-            .chain(registers_lock.writing.iter().cloned().map(SegmentMeta::new))
-            .collect();
-        for segment_meta in segment_metas {
+        for segment_meta in SegmentMeta::all() {
            files.extend(segment_meta.list_files());
        }
        files
    }

-    pub fn segment_entry(&self, segment_id: &SegmentId) -> Option<SegmentEntry> {
-        let registers = self.read();
-        registers
-            .committed
-            .segment_entry(segment_id)
-            .or_else(|| registers.uncommitted.segment_entry(segment_id))
-    }
-
    // Lock poisoning should never happen :
    // The lock is acquired and released within this class,
    // and the operations cannot panic.
@@ -126,19 +112,38 @@ impl SegmentManager {
        }
    }

-    pub fn start_merge(&self, segment_ids: &[SegmentId]) {
+    /// Marks a list of segments as in merge.
+    ///
+    /// Returns an error if some segments are missing, or if
+    /// the `segment_ids` are not either all committed or all
+    /// uncommitted.
+    pub fn start_merge(&self, segment_ids: &[SegmentId]) -> TantivyResult<Vec<SegmentEntry>> {
        let mut registers_lock = self.write();
+        let mut segment_entries = vec![];
        if registers_lock.uncommitted.contains_all(segment_ids) {
            for segment_id in segment_ids {
-                registers_lock.uncommitted.start_merge(segment_id);
+                let segment_entry = registers_lock.uncommitted
+                    .start_merge(segment_id)
+                    .expect("Segment id not found. Should never happen because of the contains_all if-block.");
+                segment_entries.push(segment_entry);
            }
        } else if registers_lock.committed.contains_all(segment_ids) {
+            for segment_id in segment_ids {
+                let segment_entry = registers_lock.committed
+                    .start_merge(segment_id)
+                    .expect("Segment id not found. Should never happen because of the contains_all if-block.");
+                segment_entries.push(segment_entry);
+            }
            for segment_id in segment_ids {
                registers_lock.committed.start_merge(segment_id);
            }
        } else {
-            error!("Merge operation sent for segments that are not all uncommited or commited.");
+            let error_msg = "Merge operation sent for segments that are not \
+                             all uncommitted or committed."
+                .to_string();
+            return Err(TantivyError::InvalidArgument(error_msg));
        }
+        Ok(segment_entries)
    }

    pub fn cancel_merge(
--- a/src/indexer/segment_register.rs
+++ b/src/indexer/segment_register.rs
@@ -3,8 +3,7 @@ use core::SegmentMeta;
 use indexer::delete_queue::DeleteCursor;
 use indexer::segment_entry::SegmentEntry;
 use std::collections::HashMap;
-use std::fmt;
-use std::fmt::{Debug, Formatter};
+use std::fmt::{self, Debug, Formatter};

 /// The segment register keeps track
 /// of the list of segment, their size as well
@@ -39,13 +38,6 @@ impl SegmentRegister {
        self.segment_states.len()
    }

-    pub fn get_all_segments(&self) -> Vec<SegmentMeta> {
-        self.segment_states
-            .values()
-            .map(|segment_entry| segment_entry.meta().clone())
-            .collect()
-    }
-
    pub fn get_mergeable_segments(&self) -> Vec<SegmentMeta> {
        self.segment_states
            .values()
@@ -67,10 +59,6 @@ impl SegmentRegister {
        segment_ids
    }

-    pub fn segment_entry(&self, segment_id: &SegmentId) -> Option<SegmentEntry> {
-        self.segment_states.get(segment_id).cloned()
-    }
-
    pub fn contains_all(&mut self, segment_ids: &[SegmentId]) -> bool {
        segment_ids
            .iter()
@@ -93,11 +81,13 @@ impl SegmentRegister {
            .cancel_merge();
    }

-    pub fn start_merge(&mut self, segment_id: &SegmentId) {
-        self.segment_states
-            .get_mut(segment_id)
-            .expect("Received a merge notification for a segment that is not registered")
-            .start_merge();
+    pub fn start_merge(&mut self, segment_id: &SegmentId) -> Option<SegmentEntry> {
+        if let Some(segment_entry) = self.segment_states.get_mut(segment_id) {
+            segment_entry.start_merge();
+            Some(segment_entry.clone())
+        } else {
+            None
+        }
    }

    pub fn new(segment_metas: Vec<SegmentMeta>, delete_cursor: &DeleteCursor) -> SegmentRegister {
@@ -109,6 +99,11 @@ impl SegmentRegister {
        }
        SegmentRegister { segment_states }
    }
+
+    #[cfg(test)]
+    pub fn segment_entry(&self, segment_id: &SegmentId) -> Option<SegmentEntry> {
+        self.segment_states.get(segment_id).cloned()
+    }
 }

 #[cfg(test)]
@@ -137,7 +132,7 @@ mod tests {
        let segment_id_merged = SegmentId::generate_random();

        {
-            let segment_meta = SegmentMeta::new(segment_id_a);
+            let segment_meta = SegmentMeta::new(segment_id_a, 0u32);
            let segment_entry = SegmentEntry::new(segment_meta, delete_queue.cursor(), None);
            segment_register.add_segment_entry(segment_entry);
        }
@@ -150,7 +145,7 @@ mod tests {
        );
        assert_eq!(segment_ids(&segment_register), vec![segment_id_a]);
        {
-            let segment_meta = SegmentMeta::new(segment_id_b);
+            let segment_meta = SegmentMeta::new(segment_id_b, 0u32);
            let segment_entry = SegmentEntry::new(segment_meta, delete_queue.cursor(), None);
            segment_register.add_segment_entry(segment_entry);
        }
@@ -180,7 +175,7 @@ mod tests {
        segment_register.remove_segment(&segment_id_a);
        segment_register.remove_segment(&segment_id_b);
        {
-            let segment_meta_merged = SegmentMeta::new(segment_id_merged);
+            let segment_meta_merged = SegmentMeta::new(segment_id_merged, 0u32);
            let segment_entry = SegmentEntry::new(segment_meta_merged, delete_queue.cursor(), None);
            segment_register.add_segment_entry(segment_entry);
        }
--- a/src/indexer/segment_updater.rs
+++ b/src/indexer/segment_updater.rs
@@ -6,12 +6,12 @@ use core::SegmentId;
 use core::SegmentMeta;
 use core::SerializableSegment;
 use core::META_FILEPATH;
-use directory::Directory;
-use directory::FileProtection;
-use error::{Error, ErrorKind, Result};
+use directory::{Directory, DirectoryClone};
+use error::TantivyError;
 use futures::oneshot;
 use futures::sync::oneshot::Receiver;
 use futures::Future;
+use futures_cpupool::Builder as CpuPoolBuilder;
 use futures_cpupool::CpuFuture;
 use futures_cpupool::CpuPool;
 use indexer::delete_queue::DeleteCursor;
@@ -29,12 +29,12 @@ use std::collections::HashMap;
 use std::io::Write;
 use std::mem;
 use std::ops::DerefMut;
-use std::sync::atomic::Ordering;
-use std::sync::atomic::{AtomicBool, AtomicUsize};
+use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
 use std::sync::Arc;
 use std::sync::RwLock;
 use std::thread;
 use std::thread::JoinHandle;
+use Result;

 /// Save the index meta file.
 /// This operation is atomic :
@@ -87,38 +87,19 @@ pub fn save_metas(
 pub struct SegmentUpdater(Arc<InnerSegmentUpdater>);

 fn perform_merge(
-    segment_ids: &[SegmentId],
-    segment_updater: &SegmentUpdater,
+    index: &Index,
+    mut segment_entries: Vec<SegmentEntry>,
    mut merged_segment: Segment,
    target_opstamp: u64,
 ) -> Result<SegmentEntry> {
    // first we need to apply deletes to our segment.
-    info!("Start merge: {:?}", segment_ids);

-    let index = &segment_updater.0.index;
+    // TODO add logging
    let schema = index.schema();
-    let mut segment_entries = vec![];

-    let mut file_protections: Vec<FileProtection> = vec![];
-
-    for segment_id in segment_ids {
-        if let Some(mut segment_entry) = segment_updater.0.segment_manager.segment_entry(segment_id)
-        {
-            let segment = index.segment(segment_entry.meta().clone());
-            if let Some(file_protection) =
-                advance_deletes(segment, &mut segment_entry, target_opstamp)?
-            {
-                file_protections.push(file_protection);
-            }
-            segment_entries.push(segment_entry);
-        } else {
-            error!("Error, had to abort merge as some of the segment is not managed anymore.");
-            let msg = format!(
-                "Segment {:?} requested for merge is not managed.",
-                segment_id
-            );
-            bail!(ErrorKind::InvalidArgument(msg));
-        }
+    for segment_entry in &mut segment_entries {
+        let segment = index.segment(segment_entry.meta().clone());
+        advance_deletes(segment, segment_entry, target_opstamp)?;
    }

    let delete_cursor = segment_entries[0].delete_cursor().clone();
@@ -134,14 +115,11 @@ fn perform_merge(
    // ... we just serialize this index merger in our new segment
    // to merge the two segments.

-    let segment_serializer = SegmentSerializer::for_segment(&mut merged_segment)
-        .expect("Creating index serializer failed");
+    let segment_serializer = SegmentSerializer::for_segment(&mut merged_segment)?;

-    let num_docs = merger
-        .write(segment_serializer)
-        .expect("Serializing merged index failed");
-    let mut segment_meta = SegmentMeta::new(merged_segment.id());
-    segment_meta.set_max_doc(num_docs);
+    let num_docs = merger.write(segment_serializer)?;
+
+    let segment_meta = SegmentMeta::new(merged_segment.id(), num_docs);

    let after_merge_segment_entry = SegmentEntry::new(segment_meta.clone(), delete_cursor, None);
    Ok(after_merge_segment_entry)
@@ -167,8 +145,12 @@ impl SegmentUpdater {
    ) -> Result<SegmentUpdater> {
        let segments = index.searchable_segment_metas()?;
        let segment_manager = SegmentManager::from_segments(segments, delete_cursor);
+        let pool = CpuPoolBuilder::new()
+            .name_prefix("segment_updater")
+            .pool_size(1)
+            .create();
        Ok(SegmentUpdater(Arc::new(InnerSegmentUpdater {
-            pool: CpuPool::new(1),
+            pool,
            index,
            segment_manager,
            merge_policy: RwLock::new(Box::new(DefaultMergePolicy::default())),
@@ -202,7 +184,7 @@ impl SegmentUpdater {
    fn run_async<T: 'static + Send, F: 'static + Send + FnOnce(SegmentUpdater) -> T>(
        &self,
        f: F,
-    ) -> CpuFuture<T, Error> {
+    ) -> CpuFuture<T, TantivyError> {
        let me_clone = self.clone();
        self.0.pool.spawn_fn(move || Ok(f(me_clone)))
    }
@@ -283,69 +265,85 @@ impl SegmentUpdater {
        }).wait()
    }

-    pub fn start_merge(&self, segment_ids: &[SegmentId]) -> Receiver<SegmentMeta> {
-        self.0.segment_manager.start_merge(segment_ids);
+    pub fn start_merge(&self, segment_ids: &[SegmentId]) -> Result<Receiver<SegmentMeta>> {
+        // Schedule the merge on the segment updater's thread and wait for it to start.
+        let segment_ids_vec = segment_ids.to_vec();
+        self.run_async(move |segment_updater| {
+            segment_updater.start_merge_impl(&segment_ids_vec[..])
+        }).wait()?
+    }
+
+    // `segment_ids` is required to be non-empty.
+    fn start_merge_impl(&self, segment_ids: &[SegmentId]) -> Result<Receiver<SegmentMeta>> {
+        assert!(!segment_ids.is_empty(), "Segment_ids cannot be empty.");
+
        let segment_updater_clone = self.clone();
+        let segment_entries: Vec<SegmentEntry> = self.0.segment_manager.start_merge(segment_ids)?;

        let segment_ids_vec = segment_ids.to_vec();

        let merging_thread_id = self.get_merging_thread_id();
+        info!(
+            "Starting merge thread #{} - {:?}",
+            merging_thread_id, segment_ids
+        );
        let (merging_future_send, merging_future_recv) = oneshot();

-        if segment_ids.is_empty() {
-            return merging_future_recv;
-        }
-
        let target_opstamp = self.0.stamper.stamp();
-        let merging_join_handle = thread::spawn(move || {
-            // first we need to apply deletes to our segment.
-            let merged_segment = segment_updater_clone.new_segment();
-            let merged_segment_id = merged_segment.id();
-            let merge_result = perform_merge(
-                &segment_ids_vec,
-                &segment_updater_clone,
-                merged_segment,
-                target_opstamp,
-            );

-            match merge_result {
-                Ok(after_merge_segment_entry) => {
-                    let merged_segment_meta = after_merge_segment_entry.meta().clone();
-                    segment_updater_clone
-                        .end_merge(segment_ids_vec, after_merge_segment_entry)
-                        .expect("Segment updater thread is corrupted.");
+        // Run the merge in a dedicated, named thread.
+        let merging_join_handle = thread::Builder::new()
+            .name(format!("mergingthread-{}", merging_thread_id))
+            .spawn(move || {
+                // first we need to apply deletes to our segment.
+                let merged_segment = segment_updater_clone.new_segment();
+                let merged_segment_id = merged_segment.id();
+                let merge_result = perform_merge(
+                    &segment_updater_clone.0.index,
+                    segment_entries,
+                    merged_segment,
+                    target_opstamp,
+                );

-                    // the future may fail if the listener of the oneshot future
-                    // has been destroyed.
-                    //
-                    // This is not a problem here, so we just ignore any
-                    // possible error.
-                    let _merging_future_res = merging_future_send.send(merged_segment_meta);
-                }
-                Err(e) => {
-                    error!("Merge of {:?} was cancelled: {:?}", segment_ids_vec, e);
-                    // ... cancel merge
-                    if cfg!(test) {
-                        panic!("Merge failed.");
+                match merge_result {
+                    Ok(after_merge_segment_entry) => {
+                        let merged_segment_meta = after_merge_segment_entry.meta().clone();
+                        segment_updater_clone
+                            .end_merge(segment_ids_vec, after_merge_segment_entry)
+                            .expect("Segment updater thread is corrupted.");
+
+                        // the future may fail if the listener of the oneshot future
+                        // has been destroyed.
+                        //
+                        // This is not a problem here, so we just ignore any
+                        // possible error.
+                        let _merging_future_res = merging_future_send.send(merged_segment_meta);
+                    }
+                    Err(e) => {
+                        warn!("Merge of {:?} was cancelled: {:?}", segment_ids_vec, e);
+                        // ... cancel merge
+                        if cfg!(test) {
+                            panic!("Merge failed.");
+                        }
+                        segment_updater_clone.cancel_merge(&segment_ids_vec, merged_segment_id);
+                        // merging_future_send will be dropped, sending an error to the future.
                    }
-                    segment_updater_clone.cancel_merge(&segment_ids_vec, merged_segment_id);
-                    // merging_future_send will be dropped, sending an error to the future.
                }
-            }
-            segment_updater_clone
-                .0
-                .merging_threads
-                .write()
-                .unwrap()
-                .remove(&merging_thread_id);
-            Ok(())
-        });
+                segment_updater_clone
+                    .0
+                    .merging_threads
+                    .write()
+                    .unwrap()
+                    .remove(&merging_thread_id);
+                Ok(())
+            })
+            .expect("Failed to spawn a thread.");
        self.0
            .merging_threads
            .write()
            .unwrap()
            .insert(merging_thread_id, merging_join_handle);
-        merging_future_recv
+        Ok(merging_future_recv)
    }

    fn consider_merge_options(&self) {
@@ -358,8 +356,18 @@ impl SegmentUpdater {
        let committed_merge_candidates = merge_policy.compute_merge_candidates(&committed_segments);
        merge_candidates.extend_from_slice(&committed_merge_candidates[..]);
        for MergeCandidate(segment_metas) in merge_candidates {
-            if let Err(e) = self.start_merge(&segment_metas).fuse().poll() {
-                error!("The merge task failed quickly after starting: {:?}", e);
+            match self.start_merge_impl(&segment_metas) {
+                Ok(merge_future) => {
+                    if let Err(e) = merge_future.fuse().poll() {
+                        error!("The merge task failed quickly after starting: {:?}", e);
+                    }
+                }
+                Err(err) => {
+                    warn!(
+                        "Starting the merge failed for the following reason. This is not fatal. {}",
+                        err
+                    );
+                }
            }
        }
    }
@@ -382,7 +390,6 @@ impl SegmentUpdater {
        self.run_async(move |segment_updater| {
            info!("End merge {:?}", after_merge_segment_entry.meta());
            let mut delete_cursor = after_merge_segment_entry.delete_cursor().clone();
-            let mut _file_protection_opt = None;
            if let Some(delete_operation) = delete_cursor.get() {
                let committed_opstamp = segment_updater
                    .0
@@ -393,29 +400,22 @@ impl SegmentUpdater {
                if delete_operation.opstamp < committed_opstamp {
                    let index = &segment_updater.0.index;
                    let segment = index.segment(after_merge_segment_entry.meta().clone());
-                    match advance_deletes(
-                        segment,
-                        &mut after_merge_segment_entry,
-                        committed_opstamp,
-                    ) {
-                        Ok(file_protection_opt_res) => {
-                            _file_protection_opt = file_protection_opt_res;
-                        }
-                        Err(e) => {
-                            error!(
-                                "Merge of {:?} was cancelled (advancing deletes failed): {:?}",
-                                before_merge_segment_ids, e
-                            );
-                            // ... cancel merge
-                            if cfg!(test) {
-                                panic!("Merge failed.");
-                            }
-                            segment_updater.cancel_merge(
-                                &before_merge_segment_ids,
-                                after_merge_segment_entry.segment_id(),
-                            );
-                            return;
+                    if let Err(e) =
+                        advance_deletes(segment, &mut after_merge_segment_entry, committed_opstamp)
+                    {
+                        error!(
+                            "Merge of {:?} was cancelled (advancing deletes failed): {:?}",
+                            before_merge_segment_ids, e
+                        );
+                        // ... cancel merge
+                        if cfg!(test) {
+                            panic!("Merge failed.");
                        }
+                        segment_updater.cancel_merge(
+                            &before_merge_segment_ids,
+                            after_merge_segment_entry.segment_id(),
+                        );
+                        return;
                    }
                }
            }
@@ -461,7 +461,7 @@ impl SegmentUpdater {
                merging_thread_handle
                    .join()
                    .map(|_| ())
-                    .map_err(|_| ErrorKind::ErrorInThread("Merging thread failed.".into()))?;
+                    .map_err(|_| TantivyError::ErrorInThread("Merging thread failed.".into()))?;
            }
            // Our merging thread may have queued their completed
            self.run_async(move |_| {}).wait()?;
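
Reviewer note on `start_merge_impl`: the change above spawns the merge through `thread::Builder` with a thread name and reports the outcome through a oneshot channel whose sender is simply dropped on failure. The sketch below illustrates that pattern in isolation. It is an assumption-laden stand-in, not the code from this change: it uses `std::sync::mpsc` instead of the `futures` oneshot, and `perform_merge` here is a hypothetical function that only succeeds or fails on demand.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Hypothetical stand-in for the real merge: returns a document count or an error.
fn perform_merge(should_succeed: bool) -> Result<u32, String> {
    if should_succeed {
        Ok(42)
    } else {
        Err("merge failed".to_string())
    }
}

/// Spawns a named merging thread and returns the receiving end of the result channel.
/// Spawn failures are returned to the caller instead of panicking.
fn start_merge(
    merging_thread_id: usize,
    should_succeed: bool,
) -> std::io::Result<(Receiver<u32>, thread::JoinHandle<()>)> {
    let (sender, receiver) = channel();
    let handle = thread::Builder::new()
        .name(format!("mergingthread-{}", merging_thread_id))
        .spawn(move || match perform_merge(should_succeed) {
            Ok(num_docs) => {
                // The receiver may already be gone; that is not an error here.
                let _ = sender.send(num_docs);
            }
            Err(_e) => {
                // Nothing is sent: `sender` is dropped when the closure ends,
                // so the receiver observes a disconnection error.
            }
        })?;
    Ok((receiver, handle))
}
```

A caller would see `Ok(42)` from `recv()` on success and an `Err` on failure, which mirrors how dropping `merging_future_send` signals a cancelled merge.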
--- a/src/indexer/segment_writer.rs
+++ b/src/indexer/segment_writer.rs
@@ -1,10 +1,8 @@
 use super::operation::AddOperation;
 use core::Segment;
 use core::SerializableSegment;
-use datastruct::stacker::Heap;
 use fastfield::FastFieldsWriter;
 use fieldnorm::FieldNormsWriter;
-use indexer::index_writer::MARGIN_IN_BYTES;
 use indexer::segment_serializer::SegmentSerializer;
 use postings::MultiFieldPostingsWriter;
 use schema::FieldType;
@@ -24,10 +22,9 @@ use Result;
 ///
 /// They creates the postings list in anonymous memory.
 /// The segment is layed on disk when the segment gets `finalized`.
-pub struct SegmentWriter<'a> {
-    heap: &'a Heap,
+pub struct SegmentWriter {
    max_doc: DocId,
-    multifield_postings: MultiFieldPostingsWriter<'a>,
+    multifield_postings: MultiFieldPostingsWriter,
    segment_serializer: SegmentSerializer,
    fast_field_writers: FastFieldsWriter,
    fieldnorms_writer: FieldNormsWriter,
@@ -35,7 +32,7 @@ pub struct SegmentWriter<'a> {
    tokenizers: Vec<Option<Box<BoxedTokenizer>>>,
 }

-impl<'a> SegmentWriter<'a> {
+impl SegmentWriter {
    /// Creates a new `SegmentWriter`
    ///
    /// The arguments are defined as follows
@@ -46,13 +43,12 @@ impl<'a> SegmentWriter<'a> {
    /// - segment: The segment being written
    /// - schema
    pub fn for_segment(
-        heap: &'a Heap,
        table_bits: usize,
        mut segment: Segment,
        schema: &Schema,
-    ) -> Result<SegmentWriter<'a>> {
+    ) -> Result<SegmentWriter> {
        let segment_serializer = SegmentSerializer::for_segment(&mut segment)?;
-        let multifield_postings = MultiFieldPostingsWriter::new(schema, table_bits, heap);
+        let multifield_postings = MultiFieldPostingsWriter::new(schema, table_bits);
        let tokenizers = schema
            .fields()
            .iter()
@@ -68,7 +64,6 @@ impl<'a> SegmentWriter<'a> {
            })
            .collect();
        Ok(SegmentWriter {
-            heap,
            max_doc: 0,
            multifield_postings,
            fieldnorms_writer: FieldNormsWriter::for_schema(schema),
@@ -94,22 +89,8 @@ impl<'a> SegmentWriter<'a> {
        Ok(self.doc_opstamps)
    }

-    /// Returns true iff the segment writer's buffer has reached capacity.
-    ///
-    /// The limit is defined as `the user defined heap size - an arbitrary margin of 10MB`
-    /// The `Segment` is `finalize`d when the buffer gets full.
-    ///
-    /// Because, we cannot cut through a document, the margin is there to ensure that we rarely
-    /// exceeds the heap size.
-    pub fn is_buffer_full(&self) -> bool {
-        self.heap.num_free_bytes() <= MARGIN_IN_BYTES
-    }
-
-    /// Return true if the term dictionary hashmap is reaching capacity.
-    /// It is one of the condition that triggers a `SegmentWriter` to
-    /// be finalized.
-    pub(crate) fn is_term_saturated(&self) -> bool {
-        self.multifield_postings.is_term_saturated()
+    pub fn mem_usage(&self) -> usize {
+        self.multifield_postings.mem_usage()
    }

    /// Indexes a new document
@@ -248,7 +229,7 @@ fn write(
    Ok(())
 }

-impl<'a> SerializableSegment for SegmentWriter<'a> {
+impl SerializableSegment for SegmentWriter {
    fn write(&self, serializer: SegmentSerializer) -> Result<u32> {
        let max_doc = self.max_doc;
        write(
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -7,6 +7,7 @@
 #![allow(new_without_default)]
 #![allow(decimal_literal_representation)]
 #![warn(missing_docs)]
+#![recursion_limit = "80"]

 //! # `tantivy`
 //!
@@ -55,7 +56,7 @@
 //!
 //! // Indexing documents
 //!
-//! let index = Index::create(index_path, schema.clone())?;
+//! let index = Index::create_in_dir(index_path, schema.clone())?;
 //!
 //! // Here we use a buffer of 100MB that will be split
 //! // between indexing threads.
@@ -123,7 +124,7 @@ extern crate serde_json;
 extern crate log;

 #[macro_use]
-extern crate error_chain;
+extern crate failure;

 #[cfg(feature = "mmap")]
 extern crate atomicwrites;
@@ -131,16 +132,19 @@ extern crate base64;
 extern crate bit_set;
 extern crate bitpacking;
 extern crate byteorder;
-extern crate chan;
+
+#[macro_use]
 extern crate combine;
+
 extern crate crossbeam;
+extern crate crossbeam_channel;
 extern crate fnv;
 extern crate fst;
+extern crate fst_regex;
 extern crate futures;
 extern crate futures_cpupool;
 extern crate itertools;
 extern crate levenshtein_automata;
-extern crate lz4;
 extern crate num_cpus;
 extern crate owning_ref;
 extern crate regex;
@@ -155,9 +159,6 @@ extern crate uuid;
 #[macro_use]
 extern crate matches;

-#[cfg(test)]
-extern crate env_logger;
-
 #[cfg(windows)]
 extern crate winapi;

@@ -178,18 +179,22 @@ mod functional_test;
 #[macro_use]
 mod macros;

-pub use error::{Error, ErrorKind, ResultExt};
+pub use error::TantivyError;
+
+#[deprecated(since="0.7.0", note="please use `tantivy::TantivyError` instead")]
+pub use error::TantivyError as Error;
+
+extern crate census;
+extern crate owned_read;

 /// Tantivy result.
-pub type Result<T> = std::result::Result<T, Error>;
+pub type Result<T> = std::result::Result<T, error::TantivyError>;

 mod common;
-mod compression;
 mod core;
 mod indexer;

-mod datastruct;
-#[allow(unused_doc_comment)]
+#[allow(unused_doc_comments)]
 mod error;
 pub mod tokenizer;

@@ -197,6 +202,7 @@ pub mod collector;
 pub mod directory;
 pub mod fastfield;
 pub mod fieldnorm;
+pub(crate) mod positions;
 pub mod postings;
 pub mod query;
 pub mod schema;
@@ -283,7 +289,8 @@ mod tests {
    use core::SegmentReader;
    use docset::DocSet;
    use query::BooleanQuery;
-    use rand::distributions::{IndependentSample, Range};
+    use rand::distributions::Bernoulli;
+    use rand::distributions::Range;
    use rand::{Rng, SeedableRng, XorShiftRng};
    use schema::*;
    use Index;
@@ -304,21 +311,24 @@ mod tests {
    }

    pub fn generate_nonunique_unsorted(max_value: u32, n_elems: usize) -> Vec<u32> {
-        let seed: &[u32; 4] = &[1, 2, 3, 4];
-        let mut rng: XorShiftRng = XorShiftRng::from_seed(*seed);
-        let between = Range::new(0u32, max_value);
-        (0..n_elems)
-            .map(|_| between.ind_sample(&mut rng))
+        let seed: [u8; 16] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
+        XorShiftRng::from_seed(seed)
+            .sample_iter(&Range::new(0u32, max_value))
+            .take(n_elems)
            .collect::<Vec<u32>>()
    }

-    pub fn sample_with_seed(n: u32, ratio: f32, seed_val: u32) -> Vec<u32> {
-        let seed: &[u32; 4] = &[1, 2, 3, seed_val];
-        let mut rng: XorShiftRng = XorShiftRng::from_seed(*seed);
-        (0..n).filter(|_| rng.next_f32() < ratio).collect()
+    pub fn sample_with_seed(n: u32, ratio: f64, seed_val: u8) -> Vec<u32> {
+        let seed: [u8; 16] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, seed_val];
+        XorShiftRng::from_seed(seed)
+            .sample_iter(&Bernoulli::new(ratio))
+            .take(n as usize)
+            .enumerate()
+            .filter_map(|(val, keep)| if keep { Some(val as u32) } else { None })
+            .collect()
    }

-    pub fn sample(n: u32, ratio: f32) -> Vec<u32> {
+    pub fn sample(n: u32, ratio: f64) -> Vec<u32> {
        sample_with_seed(n, ratio, 4)
    }

--- a/src/macros.rs
+++ b/src/macros.rs
@@ -1,7 +1,3 @@
-macro_rules! get(
-    ($e:expr) => (match $e { Some(e) => e, None => return None })
-);
-
 /// `doc!` is a shortcut that helps building `Document`
 /// objects.
 ///
--- a/src/positions/mod.rs
+++ b/src/positions/mod.rs
@@ -0,0 +1,148 @@
+
+/// Positions are stored in three parts and over two files.
+///
+/// The `SegmentComponent::POSITIONS` file contains all of the bitpacked position deltas,
+/// for all terms of a given field, one term after the other.
+///
+/// Each block is serialized one after the other.
+/// If the last block is incomplete, it is simply padded with zeros.
+///
+/// This file cannot be read alone, as it does not contain the number of bits used for
+/// each block.
+///
+/// The `SegmentComponent::POSITIONSSKIP` file contains the number of bits used in each
+/// block, as a stream of `u8`.
+///
+/// This makes it possible to rapidly skip over `n` positions.
+///
+/// For every block #n where n = k * `LONG_SKIP_IN_BLOCKS` (k >= 1), we also store
+/// in this file the sum of the number of bits used for all of the previous blocks
+/// (blocks `[0, n[`).
+///
+/// That is useful to start reading the positions for a given term: the `TermInfo` contains
+/// an address in the positions stream, expressed in "number of positions".
+/// The long skip structure makes it possible to skip rapidly to a checkpoint close to this
+/// value, and then skip normally.
+///
+
+mod reader;
+mod serializer;
+
+pub use self::reader::PositionReader;
+pub use self::serializer::PositionSerializer;
+use bitpacking::{BitPacker4x, BitPacker};
+
+const COMPRESSION_BLOCK_SIZE: usize = BitPacker4x::BLOCK_LEN;
+const LONG_SKIP_IN_BLOCKS: usize = 1_024;
+const LONG_SKIP_INTERVAL: u64 = (LONG_SKIP_IN_BLOCKS * COMPRESSION_BLOCK_SIZE) as u64;
+
+lazy_static! {
+    static ref BIT_PACKER: BitPacker4x = BitPacker4x::new();
+}
+
+#[cfg(test)]
+pub mod tests {
+
+    use std::iter;
+    use super::{PositionSerializer, PositionReader};
+    use directory::ReadOnlySource;
+    use positions::COMPRESSION_BLOCK_SIZE;
+
+    fn create_stream_buffer(vals: &[u32]) -> (ReadOnlySource, ReadOnlySource) {
+        let mut skip_buffer = vec![];
+        let mut stream_buffer = vec![];
+        {
+            let mut serializer = PositionSerializer::new(&mut stream_buffer, &mut skip_buffer);
+            for (i, &val) in vals.iter().enumerate() {
+                assert_eq!(serializer.positions_idx(), i as u64);
+                serializer.write_all(&[val]).unwrap();
+            }
+            serializer.close().unwrap();
+        }
+        (ReadOnlySource::from(stream_buffer), ReadOnlySource::from(skip_buffer))
+    }
+
+    #[test]
+    fn test_position_read() {
+        let v: Vec<u32> = (0..1000).collect();
+        let (stream, skip) = create_stream_buffer(&v[..]);
+        assert_eq!(skip.len(), 12);
+        assert_eq!(stream.len(), 1168);
+        let mut position_reader = PositionReader::new(stream, skip, 0u64);
+        for &n in &[1, 10, 127, 128, 130, 312] {
+            let mut v = vec![0u32; n];
+            position_reader.read(&mut v[..n]);
+            for i in 0..n {
+                assert_eq!(v[i], i as u32);
+            }
+        }
+    }
+
+    #[test]
+    fn test_position_skip() {
+        let v: Vec<u32> = (0..1_000).collect();
+        let (stream, skip) = create_stream_buffer(&v[..]);
+        assert_eq!(skip.len(), 12);
+        assert_eq!(stream.len(), 1168);
+
+        let mut position_reader = PositionReader::new(stream, skip, 0u64);
+        position_reader.skip(10);
+        for &n in &[10, 127, COMPRESSION_BLOCK_SIZE, 130, 312] {
+            let mut v = vec![0u32; n];
+            position_reader.read(&mut v[..n]);
+            for i in 0..n {
+                assert_eq!(v[i], 10u32 + i as u32);
+            }
+        }
+    }
+
+    #[test]
+    fn test_position_read_after_skip() {
+        let v: Vec<u32> = (0..1_000).collect();
+        let (stream, skip) = create_stream_buffer(&v[..]);
+        assert_eq!(skip.len(), 12);
+        assert_eq!(stream.len(), 1168);
+
+        let mut position_reader = PositionReader::new(stream, skip, 0u64);
+        let mut buf = [0u32; 7];
+        let mut c = 0;
+        for _ in 0..100 {
+            position_reader.read(&mut buf);
+            position_reader.read(&mut buf);
+            position_reader.skip(4);
+            position_reader.skip(3);
+            for &el in &buf {
+                assert_eq!(c, el);
+                c += 1;
+            }
+        }
+    }
+
+    #[test]
+    fn test_position_long_skip_const() {
+        const CONST_VAL: u32 = 9u32;
+        let v: Vec<u32> = iter::repeat(CONST_VAL).take(2_000_000).collect();
+        let (stream, skip) = create_stream_buffer(&v[..]);
+        assert_eq!(skip.len(), 15_749);
+        assert_eq!(stream.len(), 1_000_000);
+        let mut position_reader = PositionReader::new(stream, skip, 128 * 1024);
+        let mut buf = [0u32; 1];
+        position_reader.read(&mut buf);
+        assert_eq!(buf[0], CONST_VAL);
+    }
+
+    #[test]
+    fn test_position_long_skip_2() {
+        let v: Vec<u32> = (0..2_000_000).collect();
+        let (stream, skip) = create_stream_buffer(&v[..]);
+        assert_eq!(skip.len(), 15_749);
+        assert_eq!(stream.len(), 4_987_872);
+        for &offset in &[10, 128 * 1024, 128 * 1024 - 1, 128 * 1024 + 7, 128 * 10 * 1024 + 10] {
+            let mut position_reader = PositionReader::new(stream.clone(), skip.clone(), offset);
+            let mut buf = [0u32; 1];
+            position_reader.read(&mut buf);
+            assert_eq!(buf[0], offset as u32);
+        }
+    }
+}
+
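
Reviewer note on the layout documented in `src/positions/mod.rs`: the following is a minimal sketch of the offset arithmetic that the skip file enables, not the code from this change. `long_skip_checkpoints` and `block_byte_offset` are hypothetical helpers introduced for illustration, and the long-skip interval is a parameter so that small inputs can exercise it (the diff uses 1,024 blocks).

```rust
/// Number of integers per bitpacked block (`BitPacker4x::BLOCK_LEN` in the diff).
const BLOCK_LEN: usize = 128;

/// Size in bytes of one compressed block using `num_bits` bits per integer.
fn compressed_block_size(num_bits: u8) -> usize {
    (num_bits as usize) * BLOCK_LEN / 8
}

/// Hypothetical helper: the cumulated bit widths recorded after every
/// `long_skip_in_blocks` complete blocks.
fn long_skip_checkpoints(num_bits_per_block: &[u8], long_skip_in_blocks: usize) -> Vec<u64> {
    let mut checkpoints = Vec::new();
    let mut cumulated_num_bits = 0u64;
    for (block_id, &num_bits) in num_bits_per_block.iter().enumerate() {
        cumulated_num_bits += u64::from(num_bits);
        if (block_id + 1) % long_skip_in_blocks == 0 {
            checkpoints.push(cumulated_num_bits);
        }
    }
    checkpoints
}

/// Hypothetical helper: byte offset of block `block_id` in the positions stream.
/// Jumps to the closest preceding checkpoint, then sums the remaining block sizes.
fn block_byte_offset(
    num_bits_per_block: &[u8],
    checkpoints: &[u64],
    long_skip_in_blocks: usize,
    block_id: usize,
) -> usize {
    let long_skip_id = block_id / long_skip_in_blocks;
    let mut offset = if long_skip_id > 0 {
        // One bit of width costs BLOCK_LEN / 8 = 16 bytes per block.
        (checkpoints[long_skip_id - 1] as usize) * (BLOCK_LEN / 8)
    } else {
        0
    };
    let first_block = long_skip_id * long_skip_in_blocks;
    for &num_bits in &num_bits_per_block[first_block..block_id] {
        offset += compressed_block_size(num_bits);
    }
    offset
}
```

With the bit widths `[7, 8, 9, 9, 10, 10, 10, 10]` and an interval of 4 blocks, the offset one past the last block comes out at 1,168 bytes, which is consistent with the stream length asserted in `test_position_read`.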
--- a/src/positions/reader.rs
+++ b/src/positions/reader.rs
@@ -0,0 +1,146 @@
+use bitpacking::{BitPacker4x, BitPacker};
+use owned_read::OwnedRead;
+use common::{BinarySerializable, FixedSize};
+use postings::compression::compressed_block_size;
+use directory::ReadOnlySource;
+use positions::COMPRESSION_BLOCK_SIZE;
+use positions::LONG_SKIP_IN_BLOCKS;
+use positions::LONG_SKIP_INTERVAL;
+use super::BIT_PACKER;
+
+pub struct PositionReader {
+    skip_read: OwnedRead,
+    position_read: OwnedRead,
+    inner_offset: usize,
+    buffer: Box<[u32; 128]>,
+    ahead: Option<usize>, // if None, no block is loaded.
+                          // if Some(num_blocks), the block currently loaded is num_blocks ahead
+                          // of the block of the next int to read.
+}
+
+
+// `ahead` represents the offset of the block currently loaded
+// compared to the cursor of the actual stream.
+//
+// By contract, when this function is called, the current block has to be
+// decompressed.
+//
+// If the requested number of elements ends exactly at a block boundary, the next
+// block is not decompressed.
+fn read_impl(
+    mut position: &[u8],
+    buffer: &mut [u32; 128],
+    mut inner_offset: usize,
+    num_bits: &[u8],
+    output: &mut [u32]) -> usize {
+    let mut output_start = 0;
+    let mut output_len = output.len();
+    let mut ahead = 0;
+    loop {
+        let available_len = 128 - inner_offset;
+        if output_len <= available_len {
+            output[output_start..].copy_from_slice(&buffer[inner_offset..][..output_len]);
+            return ahead;
+        } else {
+            output[output_start..][..available_len].copy_from_slice(&buffer[inner_offset..]);
+            output_len -= available_len;
+            output_start += available_len;
+            inner_offset = 0;
+            let num_bits = num_bits[ahead];
+            BitPacker4x::new()
+                .decompress(position, &mut buffer[..], num_bits);
+            let block_len = compressed_block_size(num_bits);
+            position = &position[block_len..];
+            ahead += 1;
+        }
+    }
+}
+
+
+impl PositionReader {
+    pub fn new(position_source: ReadOnlySource,
+               skip_source: ReadOnlySource,
+               offset: u64) -> PositionReader {
+        let skip_len = skip_source.len();
+        let (body, footer) = skip_source.split(skip_len - u32::SIZE_IN_BYTES);
+        let num_long_skips = u32::deserialize(&mut footer.as_slice()).expect("Index corrupted");
+        let body_split = body.len() - u64::SIZE_IN_BYTES * (num_long_skips as usize);
+        let (skip_body, long_skips) = body.split(body_split);
+        let long_skip_id = (offset / LONG_SKIP_INTERVAL) as usize;
+        let small_skip = (offset - (long_skip_id as u64) * (LONG_SKIP_INTERVAL as u64)) as usize;
+        let offset_num_bytes: u64 = {
+            if long_skip_id > 0 {
+                let mut long_skip_blocks: &[u8] = &long_skips.as_slice()[(long_skip_id - 1) * 8..][..8];
+                u64::deserialize(&mut long_skip_blocks).expect("Index corrupted") * 16
+            } else {
+                0
+            }
+        };
+        let mut position_read = OwnedRead::new(position_source);
+        position_read.advance(offset_num_bytes as usize);
+        let mut skip_read = OwnedRead::new(skip_body);
+        skip_read.advance(long_skip_id * LONG_SKIP_IN_BLOCKS);
+        let mut position_reader = PositionReader {
+            skip_read,
+            position_read,
+            inner_offset: 0,
+            buffer: Box::new([0u32; 128]),
+            ahead: None
+        };
+        position_reader.skip(small_skip);
+        position_reader
+    }
+
+    /// Fills a buffer with the next `output.len()` integers.
+    /// This does not consume / advance the stream.
+    pub fn read(&mut self, output: &mut [u32]) {
+        let skip_data = self.skip_read.as_ref();
+        let position_data = self.position_read.as_ref();
+        let num_bits = self.skip_read.get(0);
+        if self.ahead != Some(0) {
+            // the block currently available is not the block
+            // for the current position
+            BIT_PACKER.decompress(position_data, self.buffer.as_mut(), num_bits);
+        }
+        let block_len = compressed_block_size(num_bits);
+        self.ahead = Some(read_impl(
+            &position_data[block_len..],
+            self.buffer.as_mut(),
+            self.inner_offset,
+            &skip_data[1..],
+            output));
+    }
+
+    /// Skips the next `skip_len` integers.
+    ///
+    /// If a full block is skipped, calling
+    /// `.skip(...)` will avoid decompressing it.
+    ///
+    /// May panic if the end of the stream is reached.
+    pub fn skip(&mut self, skip_len: usize) {
+
+        let skip_len_plus_inner_offset = skip_len + self.inner_offset;
+
+        let num_blocks_to_advance = skip_len_plus_inner_offset / COMPRESSION_BLOCK_SIZE;
+        self.inner_offset = skip_len_plus_inner_offset % COMPRESSION_BLOCK_SIZE;
+
+        self.ahead = self.ahead
+            .and_then(|num_blocks| {
+                if num_blocks >= num_blocks_to_advance {
+                    Some(num_blocks - num_blocks_to_advance)
+                } else {
+                    None
+                }
+            });
+
+        let skip_len = self.skip_read
+            .as_ref()[..num_blocks_to_advance]
+            .iter()
+            .cloned()
+            .map(|num_bit| num_bit as usize)
+            .sum::<usize>() * (COMPRESSION_BLOCK_SIZE / 8);
+
+        self.skip_read.advance(num_blocks_to_advance);
+        self.position_read.advance(skip_len);
+    }
+}
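
Reviewer note on `PositionReader::skip`: the bookkeeping splits a skip into whole blocks plus an offset inside the current block, and tracks how far ahead the currently decompressed block is. The sketch below isolates that arithmetic. It assumes the intent is that `ahead` shrinks by the number of blocks advanced and becomes `None` once the loaded block falls behind the cursor; `Cursor` is a hypothetical type that does not exist in the change.

```rust
/// Number of integers per bitpacked block (`COMPRESSION_BLOCK_SIZE` in the diff).
const BLOCK_LEN: usize = 128;

/// Hypothetical reduction of the reader state to the two fields `skip` updates.
#[derive(Debug, PartialEq)]
struct Cursor {
    /// Offset of the next integer inside the current block.
    inner_offset: usize,
    /// `Some(n)` if a block is loaded and it is `n` blocks ahead of the cursor.
    ahead: Option<usize>,
}

impl Cursor {
    /// Skips `skip_len` integers and returns the number of whole blocks to advance.
    fn skip(&mut self, skip_len: usize) -> usize {
        let skip_len_plus_inner_offset = skip_len + self.inner_offset;
        let num_blocks_to_advance = skip_len_plus_inner_offset / BLOCK_LEN;
        self.inner_offset = skip_len_plus_inner_offset % BLOCK_LEN;
        // `checked_sub` yields `None` when the loaded block is now behind the cursor.
        self.ahead = self
            .ahead
            .and_then(|num_blocks| num_blocks.checked_sub(num_blocks_to_advance));
        num_blocks_to_advance
    }
}
```

For example, a cursor at `inner_offset: 120` with `ahead: Some(2)` that skips 10 integers advances one block and ends at `inner_offset: 2` with `ahead: Some(1)`, so the loaded block can still be reused later.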
--- a/src/positions/serializer.rs
+++ b/src/positions/serializer.rs
@@ -0,0 +1,79 @@
+use std::io;
+use bitpacking::BitPacker;
+use positions::{COMPRESSION_BLOCK_SIZE, LONG_SKIP_INTERVAL};
+use common::BinarySerializable;
+use super::BIT_PACKER;
+
+pub struct PositionSerializer<W: io::Write> {
+    write_stream: W,
+    write_skiplist: W,
+    block: Vec<u32>,
+    buffer: Vec<u8>,
+    num_ints: u64,
+    long_skips: Vec<u64>,
+    cumulated_num_bits: u64,
+}
+
+impl<W: io::Write> PositionSerializer<W> {
+    pub fn new(write_stream: W, write_skiplist: W) -> PositionSerializer<W> {
+        PositionSerializer {
+            write_stream,
+            write_skiplist,
+            block: Vec::with_capacity(128),
+            buffer: vec![0u8; 128 * 4],
+            num_ints: 0u64,
+            long_skips: Vec::new(),
+            cumulated_num_bits: 0u64
+        }
+    }
+
+    pub fn positions_idx(&self) -> u64 {
+        self.num_ints
+    }
+
+
+    fn remaining_block_len(&self) -> usize {
+        COMPRESSION_BLOCK_SIZE - self.block.len()
+    }
+
+    pub fn write_all(&mut self, mut vals: &[u32]) -> io::Result<()> {
+        while !vals.is_empty() {
+            let remaining_block_len = self.remaining_block_len();
+            let num_to_write = remaining_block_len.min(vals.len());
+            self.block.extend(&vals[..num_to_write]);
+            self.num_ints += num_to_write as u64;
+            vals = &vals[num_to_write..];
+            if self.remaining_block_len() == 0 {
+                self.flush_block()?;
+            }
+        }
+        Ok(())
+    }
+
+    fn flush_block(&mut self) -> io::Result<()> {
+        let num_bits = BIT_PACKER.num_bits(&self.block[..]);
+        self.cumulated_num_bits += num_bits as u64;
+        self.write_skiplist.write_all(&[num_bits])?;
+        let written_len = BIT_PACKER.compress(&self.block[..], &mut self.buffer, num_bits);
+        self.write_stream.write_all(&self.buffer[..written_len])?;
+        self.block.clear();
+        if (self.num_ints % LONG_SKIP_INTERVAL) == 0u64 {
+            self.long_skips.push(self.cumulated_num_bits);
+        }
+        Ok(())
+    }
+
+    pub fn close(mut self) -> io::Result<()> {
+        if !self.block.is_empty() {
+            self.block.resize(COMPRESSION_BLOCK_SIZE, 0u32);
+            self.flush_block()?;
+        }
+        for &long_skip in &self.long_skips {
+            long_skip.serialize(&mut self.write_skiplist)?;
+        }
+        (self.long_skips.len() as u32).serialize(&mut self.write_skiplist)?;
+        self.write_skiplist.flush()?;
+        self.write_stream.flush()?;
+        Ok(())
+    }
+}
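
Reviewer note on `PositionSerializer`: values are buffered into blocks of 128, the last block is padded with zeros, and one bit width per block goes to the skip file. The sketch below reproduces only that blocking step, without any actual bitpacking. The `num_bits` function is a stand-in that assumes the bitpacker reports the bit width of the largest value in the block; `block_bit_widths` and `stream_len_in_bytes` are hypothetical helpers.

```rust
/// Number of integers per bitpacked block (`COMPRESSION_BLOCK_SIZE` in the diff).
const BLOCK_LEN: usize = 128;

/// Stand-in for the bitpacker's width computation (an assumption):
/// the number of bits needed to represent every value in the block.
fn num_bits(block: &[u32]) -> u8 {
    let all_bits = block.iter().fold(0u32, |acc, &val| acc | val);
    (32 - all_bits.leading_zeros()) as u8
}

/// Splits `vals` into blocks of `BLOCK_LEN`, pads the last one with zeros,
/// and returns the bit width chosen for each block.
fn block_bit_widths(vals: &[u32]) -> Vec<u8> {
    vals.chunks(BLOCK_LEN)
        .map(|chunk| {
            let mut block = chunk.to_vec();
            block.resize(BLOCK_LEN, 0u32);
            num_bits(&block)
        })
        .collect()
}

/// Total size in bytes of the compressed stream for the given widths.
fn stream_len_in_bytes(widths: &[u8]) -> usize {
    widths
        .iter()
        .map(|&width| (width as usize) * BLOCK_LEN / 8)
        .sum()
}
```

Under these assumptions, the values `0..1000` give eight blocks and 1,168 bytes, and two million copies of `9` give 1,000,000 bytes, which agrees with the lengths asserted in the tests of `src/positions/mod.rs`.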
--- a/src/postings/compression/mod.rs
+++ b/src/postings/compression/mod.rs
@@ -1,18 +1,14 @@
-#![allow(dead_code)]
-
-
-mod stream;
-
-pub const COMPRESSION_BLOCK_SIZE: usize = 128;
-const COMPRESSED_BLOCK_MAX_SIZE: usize = COMPRESSION_BLOCK_SIZE * 4 + 1;
-
-pub use self::stream::CompressedIntStream;
-
 use bitpacking::{BitPacker, BitPacker4x};
+use common::FixedSize;
+
+pub const COMPRESSION_BLOCK_SIZE: usize = BitPacker4x::BLOCK_LEN;
+const COMPRESSED_BLOCK_MAX_SIZE: usize = COMPRESSION_BLOCK_SIZE * u32::SIZE_IN_BYTES;
+
+mod vint;

 /// Returns the size in bytes of a compressed block, given `num_bits`.
 pub fn compressed_block_size(num_bits: u8) -> usize {
-    1 + (num_bits as usize) * COMPRESSION_BLOCK_SIZE / 8
+    (num_bits as usize) * COMPRESSION_BLOCK_SIZE / 8
 }

 pub struct BlockEncoder {
@@ -30,21 +26,18 @@ impl BlockEncoder {
        }
    }

-    pub fn compress_block_sorted(&mut self, block: &[u32], offset: u32) -> &[u8] {
+    pub fn compress_block_sorted(&mut self, block: &[u32], offset: u32) -> (u8, &[u8]) {
        let num_bits = self.bitpacker.num_bits_sorted(offset, block);
-        self.output[0] = num_bits;
-        let written_size =
-            1 + self.bitpacker
-                .compress_sorted(offset, block, &mut self.output[1..], num_bits);
-        &self.output[..written_size]
+        let written_size = self.bitpacker
+            .compress_sorted(offset, block, &mut self.output[..], num_bits);
+        (num_bits, &self.output[..written_size])
    }

-    pub fn compress_block_unsorted(&mut self, block: &[u32]) -> &[u8] {
+    pub fn compress_block_unsorted(&mut self, block: &[u32]) -> (u8, &[u8]) {
        let num_bits = self.bitpacker.num_bits(block);
-        self.output[0] = num_bits;
-        let written_size = 1 + self.bitpacker
-            .compress(block, &mut self.output[1..], num_bits);
-        &self.output[..written_size]
+        let written_size = self.bitpacker
+            .compress(block, &mut self.output[..], num_bits);
+        (num_bits, &self.output[..written_size])
    }
 }

@@ -69,22 +62,19 @@ impl BlockDecoder {
        }
    }

-    pub fn uncompress_block_sorted(&mut self, compressed_data: &[u8], offset: u32) -> usize {
-        let num_bits = compressed_data[0];
+    pub fn uncompress_block_sorted(&mut self, compressed_data: &[u8], offset: u32, num_bits: u8) -> usize {
        self.output_len = COMPRESSION_BLOCK_SIZE;
-        1 + self.bitpacker.decompress_sorted(
+        self.bitpacker.decompress_sorted(
            offset,
-            &compressed_data[1..],
+            &compressed_data,
            &mut self.output,
            num_bits,
        )
    }

-    pub fn uncompress_block_unsorted<'a>(&mut self, compressed_data: &'a [u8]) -> usize {
-        let num_bits = compressed_data[0];
+    pub fn uncompress_block_unsorted(&mut self, compressed_data: &[u8], num_bits: u8) -> usize {
        self.output_len = COMPRESSION_BLOCK_SIZE;
-        1 + self.bitpacker
-            .decompress(&compressed_data[1..], &mut self.output, num_bits)
+        self.bitpacker.decompress(&compressed_data, &mut self.output, num_bits)
    }

    #[inline]
@@ -98,11 +88,10 @@ impl BlockDecoder {
    }
 }

-mod vint;

 pub trait VIntEncoder {
    /// Compresses an array of `u32` integers,
-    /// using [delta-encoding](https://en.wikipedia.org/wiki/Delta_encoding)
+    /// using [delta-encoding](https://en.wikipedia.org/wiki/Delta_ encoding)
    /// and variable bytes encoding.
    ///
    /// The method takes an array of ints to compress, and returns
@@ -185,10 +174,10 @@ pub mod tests {
    fn test_encode_sorted_block() {
        let vals: Vec<u32> = (0u32..128u32).map(|i| i * 7).collect();
        let mut encoder = BlockEncoder::new();
-        let compressed_data = encoder.compress_block_sorted(&vals, 0);
+        let (num_bits, compressed_data) = encoder.compress_block_sorted(&vals, 0);
        let mut decoder = BlockDecoder::new();
        {
-            let consumed_num_bytes = decoder.uncompress_block_sorted(compressed_data, 0);
+            let consumed_num_bytes = decoder.uncompress_block_sorted(compressed_data, 0, num_bits);
            assert_eq!(consumed_num_bytes, compressed_data.len());
        }
        for i in 0..128 {
@@ -200,10 +189,10 @@ pub mod tests {
    fn test_encode_sorted_block_with_offset() {
        let vals: Vec<u32> = (0u32..128u32).map(|i| 11 + i * 7).collect();
        let mut encoder = BlockEncoder::new();
-        let compressed_data = encoder.compress_block_sorted(&vals, 10);
+        let (num_bits, compressed_data) = encoder.compress_block_sorted(&vals, 10);
        let mut decoder = BlockDecoder::new();
        {
-            let consumed_num_bytes = decoder.uncompress_block_sorted(compressed_data, 10);
+            let consumed_num_bytes = decoder.uncompress_block_sorted(compressed_data, 10, num_bits);
            assert_eq!(consumed_num_bytes, compressed_data.len());
        }
        for i in 0..128 {
@@ -217,12 +206,12 @@ pub mod tests {
        let n = 128;
        let vals: Vec<u32> = (0..n).map(|i| 11u32 + (i as u32) * 7u32).collect();
        let mut encoder = BlockEncoder::new();
-        let compressed_data = encoder.compress_block_sorted(&vals, 10);
+        let (num_bits, compressed_data) = encoder.compress_block_sorted(&vals, 10);
        compressed.extend_from_slice(compressed_data);
        compressed.push(173u8);
        let mut decoder = BlockDecoder::new();
        {
-            let consumed_num_bytes = decoder.uncompress_block_sorted(&compressed, 10);
+            let consumed_num_bytes = decoder.uncompress_block_sorted(&compressed, 10, num_bits);
            assert_eq!(consumed_num_bytes, compressed.len() - 1);
            assert_eq!(compressed[consumed_num_bytes], 173u8);
        }
@@ -237,12 +226,12 @@ pub mod tests {
        let n = 128;
        let vals: Vec<u32> = (0..n).map(|i| 11u32 + (i as u32) * 7u32 % 12).collect();
        let mut encoder = BlockEncoder::new();
-        let compressed_data = encoder.compress_block_unsorted(&vals);
+        let (num_bits, compressed_data) = encoder.compress_block_unsorted(&vals);
        compressed.extend_from_slice(compressed_data);
        compressed.push(173u8);
        let mut decoder = BlockDecoder::new();
        {
-            let consumed_num_bytes = decoder.uncompress_block_unsorted(&compressed);
+            let consumed_num_bytes = decoder.uncompress_block_unsorted(&compressed, num_bits);
            assert_eq!(consumed_num_bytes + 1, compressed.len());
            assert_eq!(compressed[consumed_num_bytes], 173u8);
        }
@@ -305,7 +294,7 @@ mod bench {
    fn bench_uncompress(b: &mut Bencher) {
        let mut encoder = BlockEncoder::new();
        let data = generate_array(COMPRESSION_BLOCK_SIZE, 0.1);
-        let compressed = encoder.compress_block_sorted(&data, 0u32);
+        let (_, compressed) = encoder.compress_block_sorted(&data, 0u32);
        let mut decoder = BlockDecoder::new();
        b.iter(|| {
            decoder.uncompress_block_sorted(compressed, 0u32);
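
Note on the `compressed_block_size` change above: since `num_bits` is no longer stored as the first byte of each block, a block's byte length depends only on its bit width, and a reader that holds the bit widths separately can locate any block by summing sizes. The following is a minimal sketch of that arithmetic, not code from this changeset. `block_offset` is a hypothetical helper, and the block length of 128 is assumed from the constant removed in the diff and the 128-element tests.

```rust
// Assumed block length (the diff replaces a literal 128 with BitPacker4x::BLOCK_LEN).
const COMPRESSION_BLOCK_SIZE: usize = 128;

/// Size in bytes of a bit-packed block when the bit width is stored elsewhere,
/// mirroring the new formula in the diff (no leading `num_bits` byte).
fn compressed_block_size(num_bits: u8) -> usize {
    (num_bits as usize) * COMPRESSION_BLOCK_SIZE / 8
}

/// Hypothetical helper: given the bit width of every block, in order,
/// returns the byte offset at which block `block_id` starts.
fn block_offset(num_bits_per_block: &[u8], block_id: usize) -> usize {
    num_bits_per_block[..block_id]
        .iter()
        .map(|&num_bits| compressed_block_size(num_bits))
        .sum()
}
```

For example, with bit widths `[3, 7, 0, 32]` the blocks occupy 48, 112, 0 and 512 bytes, so the fourth block starts at byte 160. A block whose values are all zero takes no bytes at all, which the old layout (one header byte per block) could not express.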
Author	SHA1	Message	Date
Paul Masurel	4cbcc59e8f	First stab at #396	2018-08-28 10:01:19 +09:00
Paul Masurel	ede97eded6	Removed use	2018-08-28 09:54:04 +09:00
Paul Masurel	4b7ff78c5a	Added fundamentalss	2018-08-28 08:09:27 +09:00
Paul Masurel	948758ad78	First commit for the documentation	2018-08-27 09:49:49 +09:00
Paul Masurel	d71fa43ca3	Moving emoticon on the right side of the parenthesis	2018-08-23 08:59:11 +09:00
Paul Masurel	1e5266d4c9	Merge branch 'master' of github.com:tantivy-search/tantivy	2018-08-23 08:55:30 +09:00
Paul Masurel	537fc27231	Added bench line in features	2018-08-23 08:55:13 +09:00
Dru Sellers	af593b1116	Add default EN stopwords to the default analyzer (#381 ) * Add a default list of en stopwords * Add the default en stopword filter to the standard tokenizers * code review feedback	2018-08-22 10:49:39 +09:00
Paul Masurel	3d73c0c240	Update issue templates	2018-08-21 10:59:08 +09:00
Paul Masurel	3a8e524f77	Added example to show how to access the inverted list directly	2018-08-21 09:36:13 +09:00
Paul Masurel	c0641c2b47	Remove generate html script. It moved to tantivy-search.github.io	2018-08-21 08:26:46 +09:00
Dru Sellers	ef3a16a129	Switch from error-chain to failure crate (#376 ) * Switch from error-chain to failure crate * Added deprecated alias for * Started editing the changeld	2018-08-20 09:40:45 +09:00
Paul Masurel	a0a284fe91	Added a full fledge empty query and relyign on it in QueryParser, instead of using an empty clause.	2018-08-20 09:21:32 +09:00
dependabot[bot]	0feeef2684	Update owning_ref requirement from 0.3 to 0.4 (#379 ) Updates the requirements on [owning_ref](https://github.com/Kimundi/owning-ref-rs) to permit the latest version. - [Release notes](https://github.com/Kimundi/owning-ref-rs/releases) - [Commits](https://github.com/Kimundi/owning-ref-rs/commits) Signed-off-by: dependabot[bot] <support@dependabot.com>	2018-08-20 09:08:11 +09:00
Dru Sellers	cc50bdb06a	Add a basic faceted search example (#383 ) * Add a basic faceted search example * quieting the compiler	2018-08-19 08:07:54 +09:00
Paul Masurel	23c2c3ae7c	Building all examples on appveyor + running them on travis	2018-08-17 13:24:37 +09:00
Dru Sellers	674524ba91	Add an example of using the stopwords filter (#377 )	2018-08-17 12:52:21 +09:00
Paul Masurel	60a9a7f837	Added example showing how to delete/update documents	2018-08-17 09:43:55 +09:00
Paul Masurel	5b5c706581	Simplified examples	2018-08-16 22:38:39 +09:00
Paul Masurel	3e14a76623	Update regex_query.rs	2018-08-15 16:38:32 +09:00
Paul Masurel	8cde1c81e5	Update README.md	2018-08-13 18:03:30 +09:00
Paul Masurel	8d0a29b137	Added sourcerer wall of fame	2018-08-13 18:02:49 +09:00
Paul Masurel	cbfb2fe19d	Avoid building twice when doing code coverage	2018-08-13 10:38:01 +09:00
Vignesh Sarma K	09e00f1d42	add position_length to Token (#337 ) * add position_length to Token refer #291 * Add term offset to `PhraseQuery` ref #291 * Add new constructor for `PhraseQuery` that allows custom offset * fix the method name as per pr comment * Closes #291 Added unit test. Using offsets from the analyzer in QueryParser.	2018-08-13 10:14:50 +09:00
Paul Masurel	290620fdee	Added slashes	2018-08-13 09:13:01 +09:00
petr-tik	f0d1b85bd8	N370 pr fix num searchers (#371 ) * Change ordering to Acquire * set_num_searchers now uses AtomicUsize.store	2018-08-13 08:56:30 +09:00
petr-tik	aaef546f91	Moved NUM_SEARCHERS into a local variable (#369 ) * Moved NUM_SEARCHERS into a local variable dynamically determined as the number of available cpus. var name in lowercase (not a constant anymore). updated it in docstring * lowercased the varnames * User can set number of logical cores in create_from_metas * cargo fmt * Num_searchers as Arc<AtomicUsize> Retrieving the value with Relaxed ordering Reverted create_from_metas signature. However, it calls num_cpus and sets the Arc val	2018-08-12 20:08:14 +09:00
Paul Masurel	811ddf2226	Closes #364 (#365 ) * Closes #364 * Trying to raise the recursion limit * Better unit test and bug fix on token offsets	2018-08-08 11:15:20 +09:00
Paul Masurel	79a339d353	Removing env_logger dependency	2018-08-02 19:29:09 +09:00
Paul Masurel	e45e4c79d9	update crossbeam	2018-08-02 19:24:08 +09:00
Paul Masurel	848bf41bc9	Updating rand to 0.5 (#363 )	2018-08-02 19:19:04 +09:00
Paul Masurel	d11cb087a7	Updated to combine-0.3 (#362 )	2018-08-02 18:29:58 +09:00
Jacob Brown	2dd7422f42	replace chan with crossbeam-channel (#361 ) * replace chan with crossbeam-channel * Update Cargo.toml	2018-08-02 12:47:22 +09:00
Paul Masurel	e8707c02c0	Issue/333 (#335 ) * Add skip information for posting list (skip to doc ids) * Separate num bits from data for positions (skip n positions) * Address in the position using a n-position offset * Added a long skip structure to allow efficient opening of the position for a given term.	2018-07-31 10:51:53 +09:00
dependabot[bot]	55928d756a	Update rust-stemmers requirement to 1.0.2 (#350 ) * Update rust-stemmers requirement to 1.0.2 Updates the requirements on [rust-stemmers](https://github.com/CurrySoftware/rust-stemmers) to permit the latest version. - [Release notes](https://github.com/CurrySoftware/rust-stemmers/releases) - [Commits](https://github.com/CurrySoftware/rust-stemmers/commits) Signed-off-by: dependabot[bot] <support@dependabot.com> * Update Cargo.toml	2018-07-31 09:32:57 +09:00
dependabot[bot]	a4370bca64	Update owned-read requirement to 0.4 (#352 ) Updates the requirements on [owned-read](https://github.com/tantivy-search/owned-read) to permit the latest version. - [Release notes](https://github.com/tantivy-search/owned-read/releases) - [Commits](https://github.com/tantivy-search/owned-read/commits) Signed-off-by: dependabot[bot] <support@dependabot.com>	2018-07-31 09:32:01 +09:00
dependabot[bot]	5a5c5a8ca5	Update bit-set requirement to 0.5.0 (#351 ) * Update bit-set requirement to 0.5.0 Updates the requirements on [bit-set](https://github.com/contain-rs/bit-set) to permit the latest version. - [Release notes](https://github.com/contain-rs/bit-set/releases) - [Commits](https://github.com/contain-rs/bit-set/commits) Signed-off-by: dependabot[bot] <support@dependabot.com> * Update Cargo.toml * Update Cargo.toml	2018-07-31 09:31:41 +09:00
dependabot[bot]	1b470dd474	Update log requirement to 0.4.3 (#353 ) * Update log requirement to 0.4.3 Updates the requirements on [log](https://github.com/rust-lang/log) to permit the latest version. - [Release notes](https://github.com/rust-lang/log/releases) - [Changelog](https://github.com/rust-lang-nursery/log/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/log/commits/env_logger-0.4.3) Signed-off-by: dependabot[bot] <support@dependabot.com> * Update Cargo.toml	2018-07-31 09:31:19 +09:00
Paul Masurel	52b4575245	Issue/355 (#358 ) * issue with top_k sorting (#356) * Closes #355	2018-07-31 08:24:55 +09:00
dependabot[bot]	ddd2d5b04c	Update lazy_static requirement to 1.0.2 (#349 ) * Update lazy_static requirement to 1.0.2 Updates the requirements on [lazy_static](https://github.com/rust-lang-nursery/lazy-static.rs) to permit the latest version. - [Release notes](https://github.com/rust-lang-nursery/lazy-static.rs/releases) - [Commits](https://github.com/rust-lang-nursery/lazy-static.rs/commits/v1.0.2) Signed-off-by: dependabot[bot] <support@dependabot.com> * Update Cargo.toml	2018-07-30 12:34:06 +09:00
dependabot[bot]	fa22b4041a	Update itertools requirement to 0.7.8 (#346 ) * Update itertools requirement to 0.7.8 Updates the requirements on [itertools](https://github.com/bluss/rust-itertools) to permit the latest version. - [Release notes](https://github.com/bluss/rust-itertools/releases) - [Commits](https://github.com/bluss/rust-itertools/commits/0.7.8) Signed-off-by: dependabot[bot] <support@dependabot.com> * Update Cargo.toml	2018-07-30 11:32:12 +09:00
dependabot[bot]	8faee143fa	Update regex requirement to 1.0 (#347 ) Updates the requirements on [regex](https://github.com/rust-lang/regex) to permit the latest version. - [Release notes](https://github.com/rust-lang/regex/releases) - [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/regex/commits/1.0.2) Signed-off-by: dependabot[bot] <support@dependabot.com>	2018-07-30 09:59:19 +09:00
dependabot[bot]	366ce98f08	Update tempfile requirement to 3.0 (#348 ) Updates the requirements on [tempfile](https://github.com/Stebalien/tempfile) to permit the latest version. - [Release notes](https://github.com/Stebalien/tempfile/releases) - [Changelog](https://github.com/Stebalien/tempfile/blob/master/NEWS) - [Commits](https://github.com/Stebalien/tempfile/commits/v3.0.3) Signed-off-by: dependabot[bot] <support@dependabot.com>	2018-07-30 09:58:56 +09:00
Paul Masurel	190e60a41c	Closes #339 . (#340 ) As required per the FacetCollector, facet values needs to be sorted before being encoded in the multivalued field.	2018-07-25 18:21:48 +09:00
Vignesh Sarma K	b9558801a1	Declare and implement separate Clone Traits (#336 ) For traits, `Directory` and `MergePolicy`. refer #306	2018-07-18 12:36:43 +09:00
Paul Masurel	36728215ac	Using the codecov badge	2018-07-10 21:19:59 +09:00
Paul Masurel	39551a0418	fix travis	2018-07-10 13:08:22 +09:00
Paul Masurel	39b98b2e76	fix travis	2018-07-10 13:07:15 +09:00
Paul Masurel	616162400d	Add missing space	2018-07-10 12:49:32 +09:00
Paul Masurel	694d164db6	fix travis.yml	2018-07-10 09:39:39 +09:00
Paul Masurel	ef442cefb1	codecov	2018-07-10 09:38:59 +09:00
Paul Masurel	14da241f35	Readed cov	2018-07-10 09:25:24 +09:00
Paul Masurel	346a9e4287	Set dev version	2018-07-10 09:20:21 +09:00
Paul Masurel	31655e92d7	Preparing release 0.6.1	2018-07-10 09:12:26 +09:00
Paul Masurel	6b8d76685a	Tiny refactoring	2018-07-05 09:11:55 +09:00
Paul Masurel	ce5683fc6a	Removed useless counting_writer	2018-07-04 16:13:19 +09:00
Paul Masurel	5205579db6	Merge branch 'master' of github.com:tantivy-search/tantivy	2018-07-04 16:09:59 +09:00
Paul Masurel	d056ae60dc	Removed SourceRead. Relying on the new owned-read crate instead (#332 )	2018-07-04 16:08:52 +09:00
Paul Masurel	af9280c95f	Removed SourceRead. Relying on the new owned-read crate instead	2018-07-04 12:47:25 +09:00
David Hewson	2e538ce6e6	remove extra space in name (#331 ) the extra space that appeared breaks using the package	2018-07-02 05:32:19 +09:00
Jason Wolfe	00466d2b08	#328 : Support parsing unbounded range queries (#329 ) * #328: Support parsing unbounded range queries. Update CHANGELOG.md for query parser changes. * Set version to 0.7-dev	2018-06-30 13:24:02 +09:00
Paul Masurel	8ebbf6b336	Issue/325 (#330 ) * Introducing a SegmentMea inventory. * Depending on census=0.1 * Cargo fmt	2018-06-30 13:11:41 +09:00
Paul Masurel	1ce36bb211	Merge branch 'master' of github.com:tantivy-search/tantivy	2018-06-27 16:58:47 +09:00
Jason Wolfe	2ac43bf21b	Support parsing RangeQuery and AllQuery in Queryparser (#323 ) * (#321) Add support for range query parsing to grammar / parser. Still needs to be wired through the rest of the way. * (321) Finish wiring RangeQuery parsing through * (#321) Add logical AST query parser tests for RangeQuery * (#321) Support parsing AllQuery * (#321) Update documentation of QueryParser * (#321) Support negative numbers in range query parsing	2018-06-25 08:29:47 +09:00
Paul Masurel	3fd8c2aa5a	Removed one keywoard	2018-06-22 14:47:21 +09:00
Paul Masurel	c1022e23d2	Switching to stable rust in AppVeyor.	2018-06-22 14:33:42 +09:00
Paul Masurel	8ccbfdea5d	Preparing for release	2018-06-22 14:27:46 +09:00
Paul Masurel	badfce3a23	Preparing for release.	2018-06-22 14:09:14 +09:00
Dru Sellers	e301e0bc87	Add some simple doc tests (#320 ) * Add TopCollector doc test * Add CountCollector Doc Test * Add Doc Test for MultiCollector * Add ChainedCollector Doc Test * Expose Fuzzy Query where it should be * Add FuzzyTermQuery Doc Test * Expose RegexQuery * Regex Query Doc Test * Add TermQuery Doc Test * Add doc comments * fix test 🤦 * Added explanation about the complexity variables * Fixing unit tests * Single threads if you check docids	2018-06-19 10:45:20 +09:00
Dru Sellers	317baf4e75	Add in simple regex query support (#319 ) * Add fst_regex crate in * Reduce API surface area This doesn't need to be public * better test name * Pull Automaton weight out so it can be shared * Implement Regex Query	2018-06-16 14:08:30 +09:00
Paul Masurel	24398d94e4	Exposing the	2018-06-15 21:40:57 +09:00
Dru Sellers	360f4132eb	Standardizes the Index::open_* APIs (#318 ) * Relocate `from_directory` closer to its usage * Specific methods come before the generic method * Rename open methods to follow the lead of the create methods	2018-06-15 12:16:41 +09:00
Dru Sellers	2b8f02764b	Standardizes the Index::create_* APIs (#317 ) * Pull all creation methods next to each other The goal here is to make it clear which methods are performing the same function, and to assist with standardizing the API calls. * Make `from_directory` private This seems to be an internal function, so lets make it internal. * Rename `create` to `create_in_dir` This lets the name match the `create_in_ram` pattern and opens up `create` for the generic implementation. * Implement the generic create function All of the create methods now delegate to the common create function and future `create_in_*` functions now have a clear pattern to follow as well	2018-06-14 11:08:42 +09:00
Paul Masurel	0465876854	Issue/257 (#310 ) * Replaced lz4 by a pure rust implementation of snappy. Closes #257 * snappy is the default compression. One can use lz4 by enabling the lz4 feature flag. * Removed Compression trait	2018-06-12 19:02:57 +09:00
Dru Sellers	6f7b099370	Add AutomatonWeight to a fuzzy_search module and FuzzyQuery (#300 ) * Add AutomatonWeight to a fuzzy_search module * Hacking around ownership issues * Working through lifetime issues * Working through tests * fix test by lower casing the words (reducing distance) * code review changes * Suggestion on how to solve the borrow problem * clean up	2018-06-11 22:23:03 +09:00
Paul Masurel	84f5cc4388	Added an AUTHORS file. Closes #315 (#316 )	2018-06-11 22:21:58 +09:00
Paul Masurel	75aae0d2c2	Update README	2018-06-08 13:05:57 +09:00
Paul Masurel	009a3559be	atomicwrites 2.2.0 for ARM compilation	2018-06-06 07:13:09 +09:00
Paul Masurel	7a31669e9d	Disabling ARM targets	2018-06-05 12:22:00 +09:00
Paul Masurel	5185eb790b	Reduced heap usage in unit test	2018-06-05 10:02:10 +09:00
Paul Masurel	a3dffbf1c6	Added more ARM target.	2018-06-05 09:06:33 +09:00
Paul Masurel	857a5794d8	Updated nix version	2018-06-05 09:02:40 +09:00
Paul Masurel	b0a6fc1448	Reduce RAM usage	2018-06-04 11:20:24 +09:00
Paul Masurel	989d52bea4	Updated atomicwrites version.	2018-06-04 10:00:21 +09:00
Paul Masurel	09661ea7ec	Added cross testing on different platforms	2018-06-04 09:47:53 +09:00
Paul Masurel	b59132966f	Better heap (#311 ) * Changed the heap to a paged memory arena. * Trying to simplify the indexing term hashmap * Exploding datastruct * Removed some complexity in bitpacker	2018-06-04 09:39:18 +09:00
Paul Masurel	863d3411bc	Update Cargo.toml	2018-05-31 15:54:34 +09:00
Paul Masurel	8a55d133ab	Showing Appveyor CI badge for the master branch .. before the last build was shown.	2018-05-28 13:44:53 +09:00
Jason Wolfe	432d49d814	Expose parameters of RangeQuery for external usage (#309 )	2018-05-19 14:29:25 +09:00