Commit Graph

1127 Commits

Author SHA1 Message Date
petr-tik
1a90a1f3b0 Merge branch 'master' of github.com:tantivy-search/tantivy into stamper_refactor 2019-04-26 08:47:12 +01:00
Paul Masurel
dac50c6aeb Dds merged (#539)
* add ascii folding support

* Minor change and added Changelog.

* add additional tests

* Add tests for ascii folding (#533)

* first tests for ascii folding

* use a `RawTokenizer` for tokens using punctuation

* add test for all (?) folding, inspired by Lucene

* Simplification of the unit test code
2019-04-26 10:25:08 +09:00
Paul Masurel
31b22c5acc Added logging when token is dropped. (#538) 2019-04-26 09:23:28 +09:00
petr-tik
8e50921363 Tidied up the Stamper module and upgraded to a 1.34 dependency
Added stamper.revert method to be used for rollback - rolling back to a previous
commit in case of deleting all documents or rolling operations back should reset
the stamper as well

Added type alias for Opstamp - helps code readibility instead of seeing u64
returned by functions.

Moved to AtomicU64 on stable rust (since 1.34) - where possible use standard
library interfaces.
2019-04-24 20:46:28 +01:00
Paul Masurel
96a4f503ec Closes #526 (#535) 2019-04-24 20:59:48 +09:00
Paul Masurel
b7c2d0de97 Clippy2 (#534)
* Clippy comments

Clippy complaints that about the cast of &[u32] to a *const __m128i,
because of the lack of alignment constraints.

This commit passes the OutputBuffer object (which enforces proper
    alignment) instead of `&[u32]`.

* Clippy. Block alignment

* Code simplification

* Added comment. Code simplification

* Removed the extraneous freq block len hack.
2019-04-24 12:31:32 +09:00
Paul Masurel
a228825462 Clippy comments (#532)
Clippy complaints that about the cast of &[u32] to a *const __m128i,
because of the lack of alignment constraints.

This commit passes the OutputBuffer object (which enforces proper
    alignment) instead of `&[u32]`.
2019-04-23 09:54:02 +09:00
Paul Masurel
d823163d52 Closes #527. (#529)
Fixing the bug that affects the result of `query.count()` in presence of
deletes.
2019-04-19 09:19:50 +09:00
Paul Masurel
acd29b535d Fix comment 2019-04-02 10:05:14 +09:00
Panagiotis Ktistakis
2cd31bcda2 Fix non english stemmers (#521) 2019-03-27 08:54:16 +09:00
Paul Masurel
a8cc5208f1 Linear simd (#519)
* linear simd search within block
2019-03-20 22:10:05 +09:00
Paul Masurel
83eb0d0cb7 Disabling tests on Android 2019-03-20 10:24:17 +09:00
Paul Masurel
ee6e273365 cleanup for nodefaultfeatures 2019-03-20 10:04:42 +09:00
Paul Masurel
5768d93171 Rename try to attempt as try is becoming a keyword in rust 2019-03-20 08:54:19 +09:00
Paul Masurel
663dd89c05 Feature/reader (#517)
Adding IndexReader to the API. Making it possible to watch for changes.

* Closes #500
2019-03-20 08:39:22 +09:00
barrotsteindev
a934577168 WIP: date field (#487)
* initial version, still a work in progress

* remove redudant or

* add chrono::DateTime and index i64

* add more tests

* fix tests

* pass DateTime by ptr

* remove println!

* document query_parser rfc 3339 date support

* added some more docs about implementation to schema.rs

* enforce DateTime is UTC, and re-export chrono

* added DateField to changelog

* fixed conflict

* use INDEXED instead of INT_INDEXED for date fields
2019-03-15 22:10:37 +09:00
Paul Masurel
94f1885334 Issue/513 (#514)
* Closes #513

* Clean up and doc

* Updated changelog
2019-03-07 09:39:30 +09:00
Paul Masurel
e67883138d Cargo fmt 2019-03-06 10:31:00 +09:00
Paul Masurel
f5c65f1f60 Added comment on the constructor fo TopDocSByField 2019-03-06 10:30:37 +09:00
Mauri de Souza Nunes
ec73a9a284 Remove note about panicking in get_field docs (#503)
Since get_field rely on calling get on the underlying InnerSchema HashMap
it shouldn't fail if the field was not found, it simply returns None.
2019-02-28 09:23:00 +09:00
Thomas Schaller
a814a31f1e Remove semicolon from doc! expansion (#509) 2019-02-28 09:20:43 +09:00
Paul Masurel
9acadb3756 Code cleaning 2019-02-26 10:50:36 +09:00
Paul Masurel
774fcecf23 cargo fmt 2019-02-26 10:44:59 +09:00
Paul Masurel
27c9fa6028 Jannickj prove bug with facets (#508)
* prove bug with facets

* Closing #505

Introduce a term id in the TermHashMap
2019-02-25 22:33:17 +09:00
Paul Masurel
b422f9c389 Partially addresses #500 (#502)
Using `tantivy_fst`. Storing `Weak<Mmap>` in the Mmap cache.
2019-02-23 10:33:59 +09:00
petr-tik
9451fd5b09 MsQueue to channel (#495)
* Format

Made the docstring consistent
remove empty line

* Move matches to dev deps

* Replace MsQueue with an unbounded crossbeam-channel

Questions:
queue.push ignores Result return

How to test pop() calls, if they block

* Format

Made the docstring consistent
remove empty line

* Unwrap the Result of queue.pop

* Addressed Paul's review

wrap the Result-returning send call with expect()

implemented the test not to fail after popping from empty queue

removed references to the Michael-Scott Queue

formatted
2019-02-23 09:06:50 +09:00
Jason Goldberger
e14701e9cd Add grouped operations (#493)
* [WIP] added UserOperation enum, added IndexWriter.run, and added MultiStamp

* removed MultiStamp in favor of std::ops::Range

* changed IndexWriter::run to return u64, Stamper::stamps to return a Range, added tests, and added docs

* changed delete_cursor skipping to use first operation's opstamp vice last. change index_writer test to use 1 thread

* added test for order batch of operations

* added a test comment
2019-02-14 08:56:01 +09:00
Paul Masurel
45e62d4329 Code simplification and adding comments 2019-02-06 10:05:15 +09:00
Paul Masurel
04e9606638 simplification of positions 2019-02-05 15:36:13 +01:00
Paul Masurel
a5c57ebbd9 Positions simplification 2019-02-05 14:50:51 +01:00
Paul Masurel
96eaa5bc63 Positions 2019-02-05 14:50:16 +01:00
Paul Masurel
f1d30ab196 fastfield reader fix 2019-02-05 14:10:16 +01:00
Paul Masurel
4507df9255 Closes #461 (#489)
Multivalued fast field uses `u64` indexes.
2019-02-04 13:24:00 +01:00
Paul Masurel
e8625548b7 Closes #461 (#488)
Multivalued fast field uses `u64` indexes.
2019-02-04 13:20:20 +01:00
Paul Masurel
50ed6fb534 Code cleanup
Fixed compilation without the mmap directory
2019-02-05 12:39:30 +01:00
Panagiotis Ktistakis
76609deadf Add Greek stemmer (#486) 2019-02-01 06:30:49 +01:00
Paul Masurel
749e62c40b renamed 2019-01-30 16:29:17 +01:00
Paul Masurel
259ce567d1 Using linear search 2019-01-29 15:59:24 +01:00
Paul Masurel
4c93b096eb Rustfmt 2019-01-29 11:45:30 +01:00
Paul Masurel
6a547b0b5f Issue/483 (#484)
* Downcast_ref

* fixing unit test
2019-01-28 11:43:42 +01:00
Paul Masurel
e99d1a2355 Better exponential search 2019-01-29 11:29:17 +01:00
Paul Masurel
c7bddc5fe3 Inlined exponential search 2019-01-28 17:28:07 +01:00
Paul Masurel
7b97dde335 Clippy + cargo fmt 2019-01-28 12:37:55 +01:00
Paul Masurel
644b4bd0a1 Issue/468b (#482)
* Moving lock to directory/

* added fs2

* doc

* Using fs2 for locking

* Added unit test

* Fixed error message related unit test

* Fixing location of import
2019-01-27 12:32:21 +01:00
Paul Masurel
bf94fd77db Issue/471 (#481)
* Closes 471

Removing writing_segments in the segment manager as it is now useless.
Removing the target merged segment id as it is useless as well.

* RAII for tracking which segment is in merge.

Closes #471

* fmt

* Using Inventory::default().
2019-01-27 12:18:59 +09:00
Paul Masurel
097eaf4aa6 impl Future as a result of merges 2019-01-28 03:56:43 +01:00
Paul Masurel
1fd46c1e9b Clippy 2019-01-28 03:46:23 +01:00
Paul Masurel
63b593bd0a Lower RAM usage in tests. 2019-01-24 09:10:38 +09:00
barrotsteindev
222b7f2580 Tantivy-288 (#472)
* add unit test

* improved test

* added SegmentManager#remove_empty_segments

* update old tests for new behaviour

* cleaner filter for empty segments

* PR adjustments

* rename x in closures

* simplify assert_eq!(vec.len(), 0)

* wait_merging_threads

* acquire searchers

* add comments to test

* rebased on latest master

* harden test

* fix merger#test_merge_multivalued_int_fields_all_deleted test
2019-01-24 08:58:56 +09:00
Paul Masurel
0b0bf59a32 Allow stemmers in languages other than English (#478)
Allow users to create stemmers for languages other than English. Add a
default stemmer for English.

Closes #478
2019-01-23 22:21:00 +09:00