Commit Graph

16 Commits

Author SHA1 Message Date
Harrison Burt
1c7c6fd591 POC: Tantivy documents as a trait (#2071)
* fix windows build (#1)

* Fix windows build

* Add doc traits

* Add field value iter

* Add value and serialization

* Adjust order

* Fix bug

* Correct type

* Fix generic bugs

* Reformat code

* Add generic to index writer which I forgot about

* Fix missing generics on single segment writer

* Add missing type export

* Add default methods for convenience

* Cleanup

* Fix more-like-this query to use standard types

* Update API and fix tests

* Add doc traits

* Add field value iter

* Add value and serialization

* Adjust order

* Fix bug

* Correct type

* Rebase main and fix conflicts

* Reformat code

* Merge upstream

* Fix missing generics on single segment writer

* Add missing type export

* Add default methods for convenience

* Cleanup

* Fix more-like-this query to use standard types

* Update API and fix tests

* Add tokenizer improvements from previous commits

* Add tokenizer improvements from previous commits

* Reformat

* Fix unit tests

* Fix unit tests

* Use enum in changes

* Stage changes

* Add new deserializer logic

* Add serializer integration

* Add document deserializer

* Implement new (de)serialization api for existing types

* Fix bugs and type errors

* Add helper implementations

* Fix errors

* Reformat code

* Add unit tests and some code organisation for serialization

* Add unit tests to deserializer

* Add some small docs

* Add support for deserializing serde values

* Reformat

* Fix typo

* Fix typo

* Change repr of facet

* Remove unused trait methods

* Add child value type

* Resolve comments

* Fix build

* Fix more build errors

* Fix more build errors

* Fix the tests I missed

* Fix examples

* fix numerical order, serialize PreTok Str

* fix coverage

* rename Document to TantivyDocument, rename DocumentAccess to Document

add Binary prefix to binary de/serialization

* fix coverage

---------

Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2023-10-02 10:01:16 +02:00
PSeitz
2d7390341c increase min memory to 15MB for indexing (#2176)
With tantivy 0.20 the minimum memory consumption per SegmentWriter increased to
12MB. 7MB are for the different fast field collectors types (they could be
lazily created). Increase the minimum memory from 3MB to 15MB.

Change memory variable naming from arena to budget.

closes #2156
2023-09-13 07:38:34 +02:00
Paul Masurel
eca6628b3c Minor refactoring (#1266) 2022-01-28 15:55:55 +09:00
Paul Masurel
9679c5f306 Rename quickwit-inc -> quickwit-oss 2022-01-27 15:37:09 +09:00
Paul Masurel
7234bef0eb Issue/1198 (#1201)
* Unit test reproducing #1198
* Fixing unit test to handle the error from add_document.
* Bump project version
2021-11-11 16:42:19 +09:00
Pascal Seitz
99cd25beae use <T: Into<Box<dyn Directory>>> as parameter to open/create an Index
This is done in order to support Box<dyn Directory> additionally to generic implementations of the trait Directory.
Remove boxing in ManagedDirectory.
2021-10-25 12:34:40 +08:00
Pascal Seitz
0062fe705d cargo fmt 2021-07-01 18:17:08 +02:00
Pascal Seitz
1e4df54ab3 fix clippy 2021-07-01 17:41:53 +02:00
Paul Masurel
39dd8cfe24 Cargo clippy. Acronym should not be full uppercase apparently. 2021-04-26 11:49:18 +09:00
Adrien Guillo
7fd6054145 Modified Directory::exists API to return Result<bool, OpenReadError> 2020-11-09 18:00:14 -08:00
Paul Masurel
c23a03ad81 Large API Change in the Directory API. (#901)
Tantivy used to assume that all files could be somehow memory mapped. After this change, Directory return a `FileSlice` that can be reduced and eventually read into an `OwnedBytes` object. Long and blocking io operation are still required by they do not span over the entire file.
2020-10-08 16:36:51 +09:00
Paul Masurel
42756c7474 Removing futures-cpupool and upgrading to futures-0.3 2019-11-15 18:35:31 +09:00
Paul Masurel
7b21b3f25a Refactoring around Field (#673)
* Refactoring around Field

Removing the contract about the order of the field, and the
field id allocation.

* Update delete_queue.rs

* Update field.rs
2019-10-25 09:06:44 +09:00
fdb-hiroshima
d8894f0bd2 add checksum check in ManagedDirectory (#605)
* add checksum check in ManagedDirectory

fix #400

* flush after writing checksum

* don't checksum atomic file access and clone managed_paths

* implement a footer storing metadata about a file

this is more of a poc, it require some refactoring into multiple files
`terminate(self)` is implemented, but not used anywhere yet

* address comments and simplify things with new contract

use BitOrder for integer to raw byte conversion
consider atomic write imply atomic read, which might not actually be true
use some indirection to have a boxable terminating writer

* implement TerminatingWrite and make terminate() be called where it should

add dependancy to drop_bomb to help find where terminate() should be called
implement TerminatingWrite for wrapper writers
make tests pass
/!\ some tests seems to pass where they shouldn't

* remove usage of drop_bomb

* fmt

* add test for checksum

* address some review comments

* update changelog

* fmt
2019-09-18 18:26:25 +09:00
Kornel
754b55eee5 Bump deps (#613)
* Bump crossbeam

* Warnings--

* Remove outdated tempdir
2019-08-05 22:21:22 +09:00
Paul Masurel
7211df6719 Failrs (#600)
* Single thread tests

* Isolating fail tests into a different binary
2019-07-22 13:17:21 +09:00