mirror of
https://github.com/quickwit-oss/tantivy.git
synced 2026-05-30 23:20:40 +00:00
This overhauls `SegmentReader` to put its various components behind `OnceLock`s so that they are opened and read on first use, as opposed to when a `SegmentReader` is constructed -- which happens once for every segment when an Index is opened.

This breaks one of Tantivy's expectations: that an existing `SegmentReader` can still read from physical files that were deleted by a merge. That no longer holds now that a segment's physical files aren't opened until needed. As such, I've `#[ignore]`'d six tests that expose this problem.

From our (pg_search's) side of things, we don't really have physical files and don't need to rely on the filesystem/kernel to allow reading unlinked files that are still open.

Overall, this cuts down a significant number of disk reads during pg_search's query planning. With my test data it goes from 808 individual reads totalling 999,799 bytes to 18 reads totalling 814,514 bytes. This reduces the time it takes to plan a simple query from about 1.4ms to 0.436ms -- roughly a 3.2x improvement.
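
For readers unfamiliar with the pattern, here is a minimal sketch of deferring a component's open behind a `OnceLock`. It is not the code in this change: `LazySegmentReader`, `Component`, and the `opens` counter are hypothetical names for illustration, and the "open" is simulated in memory rather than reading a real segment file.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for one segment component (e.g. a term dictionary).
pub struct Component {
    bytes: Vec<u8>,
}

impl Component {
    pub fn len(&self) -> usize {
        self.bytes.len()
    }
}

/// Hypothetical lazy reader; NOT tantivy's actual `SegmentReader`.
pub struct LazySegmentReader {
    path: String,
    // Empty until the component is first requested.
    component: OnceLock<Component>,
    // Counts simulated opens so the laziness is observable.
    opens: AtomicUsize,
}

impl LazySegmentReader {
    /// Construction performs no I/O; it only records where the data lives.
    pub fn new(path: &str) -> Self {
        LazySegmentReader {
            path: path.to_string(),
            component: OnceLock::new(),
            opens: AtomicUsize::new(0),
        }
    }

    /// Opens the component on first call; later calls return the cached value.
    pub fn component(&self) -> &Component {
        self.component.get_or_init(|| {
            self.opens.fetch_add(1, Ordering::SeqCst);
            // Simulated read: a real implementation would open a file here.
            Component {
                bytes: self.path.as_bytes().to_vec(),
            }
        })
    }

    /// Number of times the simulated open has run.
    pub fn open_count(&self) -> usize {
        self.opens.load(Ordering::SeqCst)
    }
}
```

In this sketch, constructing a reader costs nothing, and the first `component()` call pays for the open. That is also why a file unlinked between construction and first use can no longer be read: no handle was held open in the meantime.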