58 Commits

Author SHA1 Message Date
Paul Masurel
02bfa9be52 Moving to termdict 2017-05-19 08:43:52 +09:00
Paul Masurel
b3f62b8acc Better API 2017-05-18 23:35:39 +09:00
Paul Masurel
2a08c247af Clippy 2017-05-18 23:20:41 +09:00
Paul Masurel
d2926b6ee0 Format 2017-05-18 23:09:20 +09:00
Paul Masurel
0272167c2e Code cleaning 2017-05-18 23:06:02 +09:00
Paul Masurel
ca76fd5ba0 Uncommenting unit test 2017-05-18 20:41:56 +09:00
Paul Masurel
e79a316e41 Issue 155 - Trying to avoid term lookup when merging terms
+ Adds a proper Streamer interface
2017-05-18 20:12:00 +09:00
Paul Masurel
7b2b181652 Merge branch 'master' into issue/136
Conflicts:
	src/datastruct/stacker/hashmap.rs
	src/datastruct/stacker/heap.rs
	src/datastruct/stacker/mod.rs
	src/indexer/index_writer.rs
	src/indexer/merger.rs
	src/indexer/segment_updater.rs
	src/indexer/segment_writer.rs
	src/postings/postings_writer.rs
	src/postings/recorder.rs
	src/schema/term.rs
2017-05-17 18:40:09 +09:00
Laurentiu Nicola
b3f39f2343 Remove unneeded suppressions, make clippy lints explicit 2017-05-17 15:50:07 +09:00
Laurentiu Nicola
c0538dbe9a clippy: fix mut_from_ref warnings 2017-05-17 15:50:07 +09:00
Laurentiu Nicola
3dde748b25 Make rustfmt happy 2017-05-16 00:49:05 +03:00
Paul Masurel
4c8f9742f8 format 2017-05-15 22:30:18 +09:00
Paul Masurel
ecbdd70c37 Removed the clunky linked list logic of the heap. 2017-05-12 14:01:52 +09:00
Paul Masurel
fb1b2be782 issue/136 Fix following CR 2017-05-12 13:51:09 +09:00
Paul Masurel
6fd17e0ead Code cleaning 2017-05-11 20:47:30 +09:00
Paul Masurel
477b9136b9 Fixed inconsistent serialization of Term's field.

Also cleaned up the code to make sure that the logic
is only in one place.
Removed allocate_vec.

Closes #141
Closes #139
Closes #142
Closes #138
2017-05-11 19:37:15 +09:00
Paul Masurel
54ab897755 Added comment 2017-05-10 19:30:24 +09:00
Paul Masurel
1369d2d144 Quadratic probing. 2017-05-10 10:38:47 +09:00
Paul Masurel
90bc3e3773 Added limitation on term dictionary saturation 2017-05-09 14:10:33 +09:00
Paul Masurel
ffb62b6835 working 2017-05-09 10:17:05 +09:00
Paul Masurel
e0a39fb273 issue/96 Added unit test, documentation and various tiny improvements. 2017-04-04 22:43:35 +09:00
Paul Masurel
f0dc0de4b7 Added helper to create Vec with a given size 2017-03-29 11:26:24 +09:00
Paul Masurel
597dac9cb6 NOBUG Adding doc. 2017-02-25 23:39:02 +09:00
Paul Masurel
503d0295cb issue/43 TODO hunt 2017-02-23 09:54:54 +09:00
Paul Masurel
0f332d1fd3 issue/43 Removed doc freq from recorders. 2017-02-19 22:39:31 +09:00
Paul Masurel
1b45539f32 issue/43 Added support for delete in merged index 2017-02-19 22:39:31 +09:00
Paul Masurel
7315000fd4 issue/43 Merging ok for postings / fastfields. 2017-02-19 22:39:31 +09:00
Paul Masurel
d5c161e196 issue/43 Computing deleted doc bitset 2017-02-19 22:38:14 +09:00
Paul Masurel
ca5f3e1d46 issue/67 First stab. Iterator working. 2016-12-17 00:58:12 +01:00
Paul Masurel
6dcac90f49 issue38 Slightly cleaner code. 2016-10-24 22:33:33 +09:00
Paul Masurel
9358eb32f0 bug/4 Removed useless use of Cursor. 2016-10-16 23:25:03 +09:00
Paul Masurel
20c089b9f1 bug/4 fixed for clippy 2016-10-16 18:31:29 +09:00
Paul Masurel
2b7444b11a bug/4 Removed race condition in SegmentUpdater 2016-10-16 17:04:45 +09:00
Paul Masurel
746d6284d9 #4 Using peek_mut API from binary heap instead of replace 2016-10-14 15:30:36 +09:00
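
Commit 746d6284d9 mentions the standard library's `BinaryHeap::peek_mut`. The commit's actual diff is not shown here, so the following is only a minimal sketch of the pattern, assuming a k-way merge of sorted inputs. The function name `merge_sorted` is hypothetical and is not taken from the repository.

```rust
use std::cmp::Reverse;
use std::collections::binary_heap::PeekMut;
use std::collections::BinaryHeap;

/// Merges already-sorted inputs into one sorted vector.
/// Hypothetical helper, for illustration only.
pub fn merge_sorted(inputs: Vec<Vec<u32>>) -> Vec<u32> {
    let mut iters: Vec<std::vec::IntoIter<u32>> =
        inputs.into_iter().map(|v| v.into_iter()).collect();

    // `Reverse` turns the max-heap into a min-heap on (value, input index).
    let mut heap = BinaryHeap::new();
    for (idx, it) in iters.iter_mut().enumerate() {
        if let Some(value) = it.next() {
            heap.push(Reverse((value, idx)));
        }
    }

    let mut merged = Vec::new();
    while let Some(mut head) = heap.peek_mut() {
        let Reverse((value, idx)) = *head;
        merged.push(value);
        match iters[idx].next() {
            // Overwrite the top element in place; the heap order is
            // restored when `head` is dropped at the end of the iteration.
            Some(next) => *head = Reverse((next, idx)),
            // This input is exhausted: remove its entry from the heap.
            None => {
                PeekMut::pop(head);
            }
        }
    }
    merged
}
```

Overwriting the head in place avoids a separate pop followed by a push for every element that is consumed.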
Paul Masurel
97099e9911 bug/4 Allocating heap in a slightly less unsafe way 2016-10-11 10:32:15 +09:00
Paul Masurel
fe905ff18b bug/4 Bugfix, and made unit test way faster 2016-10-11 08:59:41 +09:00
Paul Masurel
ca331e7fe5 Added documentation / HeapAllocable 2016-09-22 14:32:44 +09:00
Paul Masurel
b337adbd78 NOBUG Added comments. 2016-09-21 00:52:31 +09:00
Paul Masurel
e8d5baa44b NOBUG Adding documentation 2016-09-20 08:58:43 +09:00
Paul Masurel
f3a24f5b3c NOBUG Code cleaning, cargo clippy 2016-09-19 17:01:37 +09:00
Paul Masurel
95d16d916b Removed dead code. 2016-09-15 09:15:09 +09:00
Paul Masurel
7969fb3a71 Use logging. 2016-09-15 00:00:14 +09:00
Paul Masurel
346fc31ac2 Chaining heaps.
We commit/close segments when the indexer heap is close to its capacity
(currently we use a limit of 10_000_000).

Because this check happens before indexing a document, and because
serialization starts by closing the postings writer (and therefore all of
the recorders still open for the last document), we may still
overflow the heap.

We don't want to resize the heap because we may have references to objects
in the current heap.

Because of that, heaps are actually chained lists.
In an ideal setting, the limit should work fine and this overflow behavior should
never be activated.
2016-09-14 10:27:55 +09:00
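
The chaining described in commit 346fc31ac2 can be pictured with a minimal sketch. It is not the code from the commit: the names `ChainedHeap` and `Addr` are hypothetical, and the sketch assumes fixed-size chunks addressed by (chunk, offset) pairs. When an allocation does not fit, a new chunk is chained instead of resizing the existing one, so earlier addresses stay valid.

```rust
/// Address of an allocation: which chunk, and the byte offset inside it.
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Addr {
    pub chunk: usize,
    pub offset: usize,
}

/// A bump allocator made of fixed-size chunks chained together.
pub struct ChainedHeap {
    chunk_size: usize,
    chunks: Vec<Box<[u8]>>,
    used: usize, // bytes used in the last chunk
}

impl ChainedHeap {
    pub fn with_chunk_size(chunk_size: usize) -> ChainedHeap {
        assert!(chunk_size > 0);
        ChainedHeap {
            chunk_size,
            chunks: vec![vec![0u8; chunk_size].into_boxed_slice()],
            used: 0,
        }
    }

    /// Reserves `len` bytes and returns their address.
    pub fn allocate(&mut self, len: usize) -> Addr {
        assert!(len <= self.chunk_size, "allocation larger than a chunk");
        if self.used + len > self.chunk_size {
            // Overflow: chain a fresh chunk rather than resizing, so that
            // existing chunks (and addresses into them) are left untouched.
            self.chunks.push(vec![0u8; self.chunk_size].into_boxed_slice());
            self.used = 0;
        }
        let addr = Addr { chunk: self.chunks.len() - 1, offset: self.used };
        self.used += len;
        addr
    }

    pub fn slice(&self, addr: Addr, len: usize) -> &[u8] {
        &self.chunks[addr.chunk][addr.offset..addr.offset + len]
    }

    pub fn slice_mut(&mut self, addr: Addr, len: usize) -> &mut [u8] {
        &mut self.chunks[addr.chunk][addr.offset..addr.offset + len]
    }

    pub fn num_chunks(&self) -> usize {
        self.chunks.len()
    }
}
```

In this sketch a caller would normally commit before the first chunk fills up, so a second chunk only appears in the overflow case the commit message describes.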
Paul Masurel
b911c4dc98 Indexing works. 3'22 2016-09-13 00:36:42 +09:00
Paul Masurel
50687a1c7c Renaming + new unit test 2016-09-08 09:26:14 +09:00
Paul Masurel
a612504e26 #8 Hashmap size as a function of the heap size 2016-09-06 22:13:55 +09:00
Paul Masurel
24d2e3f6c1 switching for the stacker datastructure 2016-09-05 10:27:14 +09:00
Paul Masurel
a599614a94 Code clean up. 2016-08-27 17:00:14 +09:00
Paul Masurel
0972a1c6a0 Removing data copy in the RAMDirectory
The fst crate recently added support for sliced `Arc<Vec<u8>>`.
This called for a rewrite of tantivy's RAMDirectory.
Previously every single read was copying data.

In addition:
- RAMDirectory's Write object panics if it is not flushed
right before it is destroyed.
- In the same spirit, the postings serializer panics if someone
opens a term without closing the previous one.

Closes #16
2016-08-18 10:45:34 +09:00
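
The copy-free reads described in commit 0972a1c6a0 rely on sharing one buffer behind an `Arc`. The sketch below shows that idea only. It does not reproduce the fst crate's API or tantivy's RAMDirectory, and the type `SharedBytes` is hypothetical.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// A view over a shared, immutable byte buffer.
/// Hypothetical type, for illustration only.
#[derive(Clone)]
pub struct SharedBytes {
    data: Arc<Vec<u8>>,
    start: usize,
    end: usize,
}

impl SharedBytes {
    pub fn new(data: Vec<u8>) -> SharedBytes {
        let end = data.len();
        SharedBytes { data: Arc::new(data), start: 0, end }
    }

    /// Returns a sub-view; only the `Arc` pointer is cloned, never the bytes.
    /// `from` and `to` are relative to this view.
    pub fn slice(&self, from: usize, to: usize) -> SharedBytes {
        assert!(from <= to && to <= self.end - self.start, "slice out of bounds");
        SharedBytes {
            data: Arc::clone(&self.data),
            start: self.start + from,
            end: self.start + to,
        }
    }

    /// Number of views currently sharing the underlying buffer.
    pub fn num_owners(&self) -> usize {
        Arc::strong_count(&self.data)
    }
}

impl Deref for SharedBytes {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        &self.data[self.start..self.end]
    }
}
```

Each read then costs a reference-count increment and two offsets, whatever the size of the slice.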
Paul Masurel
e486495cb8 Code cleaning. 2016-07-31 15:34:32 +09:00