Commit Graph

1002 Commits

Author SHA1 Message Date
Paul Masurel
3a8e524f77 Added example to show how to access the inverted list directly 2018-08-21 09:36:13 +09:00
Dru Sellers
ef3a16a129 Switch from error-chain to failure crate (#376)
* Switch from error-chain to failure crate

* Added deprecated alias for

* Started editing the changeld
2018-08-20 09:40:45 +09:00
Paul Masurel
a0a284fe91 Added a full fledge empty query and relyign on it in QueryParser, instead of using an empty clause. 2018-08-20 09:21:32 +09:00
Paul Masurel
60a9a7f837 Added example showing how to delete/update documents 2018-08-17 09:43:55 +09:00
Paul Masurel
3e14a76623 Update regex_query.rs 2018-08-15 16:38:32 +09:00
Vignesh Sarma K
09e00f1d42 add position_length to Token (#337)
* add position_length to Token

refer #291

* Add term offset to `PhraseQuery`

ref #291

* Add new constructor for `PhraseQuery` that allows custom offset

* fix the method name as per pr comment

* Closes #291

Added unit test.
Using offsets from the analyzer in QueryParser.
2018-08-13 10:14:50 +09:00
Paul Masurel
290620fdee Added slashes 2018-08-13 09:13:01 +09:00
petr-tik
f0d1b85bd8 N370 pr fix num searchers (#371)
* Change ordering to Acquire

* set_num_searchers now uses AtomicUsize.store
2018-08-13 08:56:30 +09:00
petr-tik
aaef546f91 Moved NUM_SEARCHERS into a local variable (#369)
* Moved NUM_SEARCHERS into a local variable

dynamically determined as the number of available cpus.

var name in lowercase (not a constant anymore).

updated it in docstring

* lowercased the varnames

* User can set number of logical cores in create_from_metas

* cargo fmt

* Num_searchers as Arc<AtomicUsize>

Retrieving the value with Relaxed ordering

Reverted create_from_metas signature. However, it calls num_cpus and
sets the Arc val
2018-08-12 20:08:14 +09:00
Paul Masurel
811ddf2226 Closes #364 (#365)
* Closes #364

* Trying to raise the recursion limit

* Better unit test and bug fix on token offsets
2018-08-08 11:15:20 +09:00
Paul Masurel
79a339d353 Removing env_logger dependency 2018-08-02 19:29:09 +09:00
Paul Masurel
e45e4c79d9 update crossbeam 2018-08-02 19:24:08 +09:00
Paul Masurel
848bf41bc9 Updating rand to 0.5 (#363) 2018-08-02 19:19:04 +09:00
Paul Masurel
d11cb087a7 Updated to combine-0.3 (#362) 2018-08-02 18:29:58 +09:00
Jacob Brown
2dd7422f42 replace chan with crossbeam-channel (#361)
* replace chan with crossbeam-channel

* Update Cargo.toml
2018-08-02 12:47:22 +09:00
Paul Masurel
e8707c02c0 Issue/333 (#335)
* Add skip information for posting list (skip to doc ids) 
* Separate num bits from data for positions (skip n positions)
* Address in the position using a n-position offset
* Added a long skip structure to allow efficient opening of the position for a given term.
2018-07-31 10:51:53 +09:00
Paul Masurel
52b4575245 Issue/355 (#358)
* issue with top_k sorting (#356)

* Closes #355
2018-07-31 08:24:55 +09:00
Paul Masurel
190e60a41c Closes #339. (#340)
As required per the FacetCollector,
facet values needs to be sorted before being encoded in the
multivalued field.
2018-07-25 18:21:48 +09:00
Vignesh Sarma K
b9558801a1 Declare and implement separate Clone Traits (#336)
For traits, `Directory` and `MergePolicy`.

refer #306
2018-07-18 12:36:43 +09:00
Paul Masurel
6b8d76685a Tiny refactoring 2018-07-05 09:11:55 +09:00
Paul Masurel
ce5683fc6a Removed useless counting_writer 2018-07-04 16:13:19 +09:00
Paul Masurel
af9280c95f Removed SourceRead. Relying on the new owned-read crate instead 2018-07-04 12:47:25 +09:00
Jason Wolfe
00466d2b08 #328: Support parsing unbounded range queries (#329)
* #328: Support parsing unbounded range queries. Update CHANGELOG.md for query parser changes.

* Set version to 0.7-dev
2018-06-30 13:24:02 +09:00
Paul Masurel
8ebbf6b336 Issue/325 (#330)
* Introducing a SegmentMea inventory.
* Depending on census=0.1
* Cargo fmt
2018-06-30 13:11:41 +09:00
Jason Wolfe
2ac43bf21b Support parsing RangeQuery and AllQuery in Queryparser (#323)
* (#321) Add support for range query parsing to grammar / parser. Still needs to be wired through the rest of the way.

* (321) Finish wiring RangeQuery parsing through

* (#321) Add logical AST query parser tests for RangeQuery

* (#321) Support parsing AllQuery

* (#321) Update documentation of QueryParser

* (#321) Support negative numbers in range query parsing
2018-06-25 08:29:47 +09:00
Paul Masurel
8ccbfdea5d Preparing for release 2018-06-22 14:27:46 +09:00
Paul Masurel
badfce3a23 Preparing for release. 2018-06-22 14:09:14 +09:00
Dru Sellers
e301e0bc87 Add some simple doc tests (#320)
* Add TopCollector doc test

* Add CountCollector Doc Test

* Add Doc Test for MultiCollector

* Add ChainedCollector Doc Test

* Expose Fuzzy Query where it should be

* Add FuzzyTermQuery Doc Test

* Expose RegexQuery

* Regex Query Doc Test

* Add TermQuery Doc Test

* Add doc comments

* fix test 🤦

* Added explanation about the complexity variables

* Fixing unit tests

* Single threads if you check docids
2018-06-19 10:45:20 +09:00
Dru Sellers
317baf4e75 Add in simple regex query support (#319)
* Add fst_regex crate in

* Reduce API surface area

This doesn't need to be public

* better test name

* Pull Automaton weight out so it can be shared

* Implement Regex Query
2018-06-16 14:08:30 +09:00
Paul Masurel
24398d94e4 Exposing the 2018-06-15 21:40:57 +09:00
Dru Sellers
360f4132eb Standardizes the Index::open_* APIs (#318)
* Relocate `from_directory` closer to its usage

* Specific methods come before the generic method

* Rename open methods to follow the lead of the create methods
2018-06-15 12:16:41 +09:00
Dru Sellers
2b8f02764b Standardizes the Index::create_* APIs (#317)
* Pull all creation methods next to each other

The goal here is to make it clear which methods are performing the
same function, and to assist with standardizing the API calls.

* Make `from_directory` private

This seems to be an internal function, so lets make it internal.

* Rename `create` to `create_in_dir`

This lets the name match the `create_in_ram` pattern and opens up
`create` for the generic implementation.

* Implement the generic create function

All of the create methods now delegate to the common create function
and future `create_in_*` functions now have a clear pattern
to follow as well
2018-06-14 11:08:42 +09:00
Paul Masurel
0465876854 Issue/257 (#310)
* Replaced lz4 by a pure rust implementation of snappy.

Closes #257

* snappy is the default compression. One can use lz4 by enabling the lz4 feature flag.

* Removed Compression trait
2018-06-12 19:02:57 +09:00
Dru Sellers
6f7b099370 Add AutomatonWeight to a fuzzy_search module and FuzzyQuery (#300)
* Add AutomatonWeight to a fuzzy_search module

* Hacking around ownership issues

* Working through lifetime issues

* Working through tests

* fix test by lower casing the words (reducing distance)

* code review changes

* Suggestion on how to solve the borrow problem

* clean up
2018-06-11 22:23:03 +09:00
Paul Masurel
5185eb790b Reduced heap usage in unit test 2018-06-05 10:02:10 +09:00
Paul Masurel
b0a6fc1448 Reduce RAM usage 2018-06-04 11:20:24 +09:00
Paul Masurel
b59132966f Better heap (#311)
* Changed the heap to a paged memory arena.
* Trying to simplify the indexing term hashmap
* Exploding datastruct
* Removed some complexity in bitpacker
2018-06-04 09:39:18 +09:00
Jason Wolfe
432d49d814 Expose parameters of RangeQuery for external usage (#309) 2018-05-19 14:29:25 +09:00
Jason Wolfe
0cea706f10 Add docs to new Query methods (#307) 2018-05-18 13:53:29 +09:00
Paul Masurel
bc69dab822 cargo fmt 2018-05-18 10:08:05 +09:00
Jason Wolfe
72acad0921 Add box_clone() and downcast::Any to Query (#303) 2018-05-18 09:53:11 +09:00
Paul Masurel
c9459f74e8 Update docs about TermDict. 2018-05-18 09:20:39 +09:00
Dru Sellers
08d2cc6c7b Make it possible to stream the terms matching an Automaton (#297)
* rustfmt and some English grammar

* sort cargo.toml crates

* WIP: something to show

* Remove example for now

* Implement desired method

* Resolving Generic Type Arguments

* Resolve Generic Types

* Banging around on the tests

* DANGER! Change unsafe usage based on compiler warnings

* Unscrew up my rebase

* Clean Up Type Spam

Default Types FTW

* typo

* better variable names

* Remove Duplicate Levenshtein crate
2018-05-11 12:41:14 -07:00
Dru Sellers
82d87416c2 Implement StopWords Filter (#292)
* Implement StopWords Filter

- added example doctest for alphanum_only.rs so that I could
drive my own test of the stopword filter

* Style Cop

* Switch HashSet Hasher to FNV for speed

* Update Change Log

* fix missed location renaming
2018-05-09 18:40:41 -07:00
Paul Masurel
96b2c2971e Testing actual doc ids in unit test 2018-05-09 09:14:22 -07:00
Dru Sellers
162afd73f6 Alive docs iterator (#293)
* Add non-deleted DocId iterator to SegmentReader

Closes #287

* Add Todo

* Add Unit Test

* Improving test based on feedback

- found bug and fixed it. :)

* Reestablish changes post rebase for clean merge
2018-05-09 09:03:27 -07:00
Paul Masurel
ddfd87fa59 Merge branch 'master' of github.com:tantivy-search/tantivy 2018-05-08 00:08:17 -07:00
Paul Masurel
24050d0eb5 Remove some unsafe stuff, justified some of it. 2018-05-07 23:57:53 -07:00
Jason Wolfe
89eb209ece #294: Make fieldnorm module public, add documentation (#295) 2018-05-07 20:20:38 -07:00
Paul Masurel
9a0b7f9855 Rustfmt 2018-05-07 19:50:35 -07:00