Hamir Mahal
0c634adbe1
style: simplify strings with string interpolation (#2412)
* style: simplify strings with string interpolation
* fix: formatting
2024-05-27 09:16:47 +02:00
PSeitz
fdecb79273
tokenizer-api: reduce Tokenizer overhead (#2062)
* tokenizer-api: reduce Tokenizer overhead
Previously, a new `Token` was created for each text encountered, and each
`Token` contains a `String::with_capacity(200)`.
In the new API, the token_stream gets mutable access to the tokenizer,
which allows state to be shared (in this PR, the Token is shared).
Ideally the allocation for the BoxTokenStream would also be removed, but
this may require some lifetime tricks.
* simplify api
* move lowercase and ascii folding buffer to global
* empty Token text as default
2023-06-08 18:37:58 +08:00
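The idea in the tokenizer-api commit above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `WhitespaceTokenizer`, `WhitespaceTokenStream`, `collect_tokens`, and the simplified `Token` fields are hypothetical stand-ins. The sketch assumes only what the commit message states, namely that the token stream borrows the tokenizer mutably so that one `Token` (and its `String` buffer) is reused across texts.

```rust
/// Hypothetical, simplified token type (an assumption for illustration).
#[derive(Debug, Default)]
pub struct Token {
    pub offset_from: usize,
    pub offset_to: usize,
    pub position: usize,
    pub text: String,
}

/// Hypothetical tokenizer that owns the shared `Token`, so the `String`
/// buffer inside it is allocated once and reused for every text.
#[derive(Default)]
pub struct WhitespaceTokenizer {
    token: Token,
}

/// Stream that mutably borrows the tokenizer's `Token` instead of
/// allocating a fresh one per text.
pub struct WhitespaceTokenStream<'a> {
    text: &'a str,
    cursor: usize,
    token: &'a mut Token,
}

impl WhitespaceTokenizer {
    /// Taking `&mut self` is what lets the stream share the tokenizer's state.
    pub fn token_stream<'a>(&'a mut self, text: &'a str) -> WhitespaceTokenStream<'a> {
        // Reset the shared token; `clear` keeps the buffer's capacity.
        self.token.text.clear();
        self.token.position = usize::MAX; // wraps to 0 on the first advance
        WhitespaceTokenStream { text, cursor: 0, token: &mut self.token }
    }
}

impl<'a> WhitespaceTokenStream<'a> {
    /// Moves to the next whitespace-separated token; returns false at the end.
    pub fn advance(&mut self) -> bool {
        let rest = &self.text[self.cursor..];
        let trimmed = rest.trim_start();
        if trimmed.is_empty() {
            return false;
        }
        let start = self.cursor + (rest.len() - trimmed.len());
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        let end = start + len;
        // Overwrite the shared token in place rather than building a new one.
        self.token.text.clear();
        self.token.text.push_str(&self.text[start..end]);
        self.token.offset_from = start;
        self.token.offset_to = end;
        self.token.position = self.token.position.wrapping_add(1);
        self.cursor = end;
        true
    }

    /// Read-only view of the current (shared) token.
    pub fn token(&self) -> &Token {
        &*self.token
    }
}

/// Helper for demonstration: copies out the text of every token.
pub fn collect_tokens(tokenizer: &mut WhitespaceTokenizer, text: &str) -> Vec<String> {
    let mut stream = tokenizer.token_stream(text);
    let mut out = Vec::new();
    while stream.advance() {
        out.push(stream.token().text.clone());
    }
    out
}
```

Calling `collect_tokens` repeatedly on the same `WhitespaceTokenizer` reuses one `Token` buffer for every input. The trade-off is that a stream holds an exclusive borrow of its tokenizer, so only one stream per tokenizer can be live at a time.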
trinity-1686a
064518156f
refactor tokenization pipeline to use GATs (#1924)
* refactor tokenization pipeline to use GATs
* fix doctests
* fix clippy lints
* remove commented code
2023-03-09 09:39:37 +01:00
Paul Masurel
097fd6138d
Fix clippy comments (#1872)
2023-02-14 23:12:45 +09:00
Paul Masurel
bd5eea9852
Integrated columnar work.
2023-02-09 13:14:31 +01:00
Paul Masurel
811fd0cb9e
Dynamic analyzer (#755)
* Removed generics in tokenizers
* lowercaser
* Added TokenizerExt
* Introducing BoxedTokenizer
* Introducing BoxXXXXX helper struct
* Closes #762.
* Introducing a TextAnalyzer
2020-01-29 18:23:37 +09:00
Paul Masurel
462774b15c
Tiqb feature/2018 (#583)
* rust 2018
* Added CHANGELOG comment
2019-07-01 10:01:46 +09:00
Paul Masurel
a3042e956b
Facet remove unsafe (#454)
* Removing some unsafe
* Removing some unsafe (2)
2018-12-17 09:31:09 +09:00
Paul Masurel
24050d0eb5
Remove some unsafe stuff, justified some of it.
2018-05-07 23:57:53 -07:00
Paul Masurel
78673172d0
Cargo fmt
2018-04-21 20:05:36 +09:00
Paul Masurel
0cf274135b
Clippy
2018-03-10 13:07:18 +09:00
Paul Masurel
a7ffc0e610
Rustfmt
2018-02-12 10:31:29 +09:00
Paul Masurel
df53dc4ceb
Format
2018-02-03 00:21:05 +09:00
Paul Masurel
930010aa88
Unit test passing
2018-01-28 00:03:51 +09:00
Paul Masurel
3edb3dce6a
Test not passing
2018-01-25 12:46:32 +09:00