Commit Graph

  • 2edda73656 fix name, add comment bucket_id_agg Pascal Seitz 2026-01-05 15:21:12 +01:00
  • b269fd8bb8 move partitions to heap Pascal Seitz 2026-01-05 15:07:00 +01:00
  • 8d3a7abbe9 split subaggcache into two trait impls Pascal Seitz 2026-01-05 09:23:56 +01:00
  • b3db43bfb5 Explicit doc for the meaning of intersection_priority paul.masurel/skip-first-doc-on-intersection Paul Masurel 2026-01-05 11:15:45 +01:00
  • db2ecc6057 fix Column.first method parameter type (#2792) main ChangRui-Ryan 2026-01-05 17:03:01 +08:00
  • db9e35e7ee Property test for Comparator/ValueRange consistency, and fixes. stuhood.lazy-scorers-blocks-upstream Stu Hood 2025-12-27 17:56:55 -07:00
  • 7f39d5eab9 test_order_by_u64_prop Stu Hood 2025-12-27 16:04:24 -07:00
  • af53ffe5df Use a Buffer generic scratch buffer parameter on TopNComputer and push directly from ColumnValues into a TopNComputer buffer in some cases. Stu Hood 2025-12-26 17:52:40 -07:00
  • 041c6f01a3 Convert test_order_by_compound_filtering_with_none to a proptest. Stu Hood 2025-12-26 14:25:01 -07:00
  • 9615eb73b8 Implement collect_block for lazy scorers using SegmentSortKeyComputer::segment_sort_keys. Stu Hood 2025-11-01 15:46:46 -07:00
  • b8636e707c Seek into the danger zoner paul.masurel/seek_into_the_danger_zone_result Paul Masurel 2026-01-03 18:00:05 +01:00
  • 601541e9ae Added unit tests Paul Masurel 2026-01-02 10:42:05 +01:00
  • 77505c3d03 Making stemming optional. (#2791) Paul Masurel 2026-01-02 12:40:42 +01:00
  • 735c588f4f fix union performance regression (#2790) PSeitz 2026-01-02 19:06:51 +08:00
  • 242a1531bf fix flaky test (#2784) PSeitz 2026-01-02 18:30:51 +08:00
  • 6443b63177 document 1bit hole and some queries supporting running with just fastfield (#2779) trinity-1686a 2026-01-02 10:32:37 +01:00
  • 4987495ee4 Add an erased SortKeyComputer to sort on types which are not known until runtime (#2770) Stu Hood 2026-01-02 02:28:47 -07:00
  • 98be1a5423 Added seek_doc to intersections. Paul Masurel 2025-12-30 17:26:15 +01:00
  • b11605f045 Addressing clippy comments (#2789) Paul Masurel 2025-12-31 18:02:00 +01:00
  • 75d7989cc6 add benchmark for boolean query with range sub query (#2787) ChangRui-Ryan 2025-12-31 19:00:53 +08:00
  • f5939b2e4c Adds seek into the danger zone for fastfield range docsets. paul.masurel/search_into_danger_zone_fastfield Paul Masurel 2025-12-30 19:00:46 +01:00
  • 923f0508f2 seek_exact + cost based intersection (#2538) PSeitz 2025-12-30 21:43:25 +08:00
  • 147214b0eb Implement GreaterThanOrEqual and LessThanOrEqual to handle boundary cases in Chain. stuhood.lazy-scorers-blocks Stu Hood 2025-12-29 15:38:28 -07:00
  • 865a12f4bb Simpler implementation of first_vals_in_value_range. Stu Hood 2025-12-28 12:19:47 -07:00
  • 00110312c9 Fix compound filters, and remove redundant implementation in Chain implementation Stu Hood 2025-12-28 10:18:52 -07:00
  • e0b62e00ac optimize RangeDocSet for non-overlapping query ranges (#2783) ChangRui-Ryan 2025-12-29 23:55:28 +08:00
  • 3ddca31292 simplify fetch block in column_block_accessor Pascal Seitz 2025-12-12 17:38:37 +08:00
  • b2e980b450 Property test for Comparator/ValueRange consistency, and fixes. Stu Hood 2025-12-27 18:37:15 -07:00
  • 1a701b86bd Remove allow-dead-code annotation. Stu Hood 2025-12-27 17:56:55 -07:00
  • ee4538d6c2 test_order_by_u64_prop Stu Hood 2025-12-27 16:04:24 -07:00
  • 25f1e9aa9f Move ComparableDoc to a reusable location, allowing for pushing directly from ColumnValues into a TopNComputer buffer in some cases. Stu Hood 2025-12-27 14:01:02 -07:00
  • 6b03b28bac Use a Buffer generic scratch buffer parameter on TopNComputer to allow for internal iteration in SegmentSortKeyComputer. Stu Hood 2025-12-27 12:09:33 -07:00
  • 7a5241cb83 Update comments. Stu Hood 2025-12-26 17:52:40 -07:00
  • 0f5e0f6f87 TODO: Audit. Stu Hood 2025-12-26 16:12:15 -07:00
  • a654115d9a Squash with Optional. WIP: Still needs work: we are allocating. Stu Hood 2025-12-26 15:18:17 -07:00
  • 1a17515ead Convert test_order_by_compound_filtering_with_none to a proptest. Stu Hood 2025-12-26 14:25:01 -07:00
  • 0f1b0ce527 Optimize Optional indexes. TODO: Audit. Stu Hood 2025-12-26 13:01:30 -07:00
  • 0c920dfc61 Add a ValueRange filter to SegmentSortKeyComputer::segment_sort_keys. Stu Hood 2025-12-26 11:30:36 -07:00
  • 996fc936f6 Add null handling to first_vals_in_value_range. Stu Hood 2025-12-26 11:12:53 -07:00
  • 5ff38e1605 WIP: Add ValueRange cases for Comparators. Stu Hood 2025-12-26 11:02:19 -07:00
  • e8a4adeedd Replace Column::first_vals with Column::first_vals_in_value_range. Stu Hood 2025-12-25 15:26:33 -07:00
  • efc9e585a9 WIP: Add ValueRange::All Stu Hood 2025-12-25 15:07:21 -07:00
  • f4252fc184 WIP: Add ValueRange. Stu Hood 2025-12-25 14:53:15 -07:00
  • 53c067d1f3 Restore laziness in ChainSegmentSortKeyComputer. Stu Hood 2025-12-23 18:06:55 -07:00
  • 259c1ed965 Isolate accept_sort_key_lazy to ChainSegmentSortKeyComputer. Stu Hood 2025-12-23 17:37:02 -07:00
  • 1afc432df8 Use an internal buffer in the SegmentSortKeyComputer. Stu Hood 2025-12-23 17:23:10 -07:00
  • b8acd3ac94 WIP: Add and use segment_sort_keys to remove dynamic dispatch to the column. Stu Hood 2025-12-23 16:44:50 -07:00
  • b5321d2125 Implement laziness for collect_block. Stu Hood 2025-12-23 15:45:18 -07:00
  • ad3e2363fe WIP: Add failing test. Stu Hood 2025-12-19 15:14:54 -07:00
  • 9ec5750c25 Implement collect_block for lazy scorers. Stu Hood 2025-11-01 15:46:46 -07:00
  • 03f09a2b5b chore: Add support for natural-order-with-none-highest in TopDocs::order_by (#90) Stu Hood 2025-12-23 10:15:31 -07:00
  • ce97beb86f Add support for natural-order-with-none-highest in TopDocs::order_by (#2780) Stu Hood 2025-12-23 01:22:20 -07:00
  • c0f21a45ae Use a strict comparison in TopNComputer (#2777) Stu Hood 2025-12-18 03:13:23 -08:00
  • 9ffe4af096 Fix TopN performance regression. Stu Hood 2025-12-15 16:15:12 -08:00
  • c56ddcb6d7 Add an erased SortKeyComputer to sort on types which are not known until runtime. Stu Hood 2025-12-03 21:55:24 -08:00
  • 5b8fff154b fix: overflow in vint buffer (#88) Ming 2025-12-16 16:04:29 -05:00
  • 73657dff77 fix: fixed integer overflow in ExpUnrolledLinkedList for large datasets (#2735) Moe 2025-12-16 13:57:12 -08:00
  • e3c9be1f92 fix: boolean query incorrectly dropping documents when AllScorer is present (#2760) Moe 2025-12-16 13:52:02 -08:00
  • ba61ed6ef3 fix: vint buffer can overflow (#2778) Ming 2025-12-16 16:50:41 -05:00
  • d0e1600135 fix bug with minimum_should_match and AllScorer (#2774) trinity-1686a 2025-12-14 10:10:45 +01:00
  • 87fe3a311f share column block accessor Pascal Seitz 2025-12-12 16:54:44 +08:00
  • 71dc08424c add comment Pascal Seitz 2025-12-12 16:09:58 +08:00
  • e9020d17d4 fix coverage (#2769) PSeitz-dd 2025-12-11 11:35:58 +01:00
  • 5ba0031f7d move rand_distr to dev_dep (#2772) PSeitz-dd 2025-12-11 11:23:50 +01:00
  • 22dde8f9ae chore: Make some delete-related functions public (#46) (#2766) Philippe Noël 2025-12-10 19:22:15 -05:00
  • 14cc24614e Make DeleteMeta pub (#2765) Philippe Noël 2025-12-10 18:11:03 -05:00
  • 8a1079b2dc expose AddOperation and with_max_doc (#7) (#2762) Philippe Noël 2025-12-10 18:10:42 -05:00
  • fe293225d5 Fixed agg validation paradedb/fix-agg-validation Mohammad Dashti 2025-12-10 10:28:07 -08:00
  • ff6ee3a5db fix: post-rebase fixes Mohammad Dashti 2025-12-05 22:37:19 -08:00
      - Add missing size_hint module declaration
      - Remove test-only export serialize_and_load_u64_based_column_values
      - fixed quickwit CI issues
  • eda9aa437f fix: boolean query incorrectly dropping documents when AllScorer is present (#84) Moe 2025-12-05 16:59:55 -08:00
  • 538da08eb5 Add polish stemmer (#82) Piotr Olszak 2025-11-24 18:57:58 +01:00
      This commit adds support for Polish language stemming. The previously used rust-stemmers crate is abandoned and unmaintained, which blocked the addition of new languages. This change addresses a user request for Polish stemming to improve BM25 recall in their use case. The tantivy-stemmers crate is a modern, maintained alternative that also opens the door for supporting many other languages in the future.
      - Added the tantivy-stemmers crate as a dependency to the workspace, alongside the existing rust-stemmers dependency (for backward compatibility)
      - Introduced an internal enum that can hold an algorithm from either rust-stemmers or tantivy-stemmers
      - Added Polish to the main Language enum, mapped to the new tantivy-stemmers implementation
      - Updated the token stream to handle both types of stemmers internally
      - Added the POLISH variant to the stopwords list
      - Existing tests pass
      - Added test_pl_tokenizer to verify that the Polish stemmer works correctly
  • 7bd5cc5417 fix: fixed integer overflow in ExpUnrolledLinkedList for large datasets (#80) Moe 2025-11-10 19:47:56 -08:00
  • 5d46137556 feat: Added multiple snippet support (#76) Moe 2025-10-28 16:24:57 -07:00
  • 92c784f697 perf: Optimize TermSet for very large sets of terms. (#75) Stu Hood 2025-10-25 14:40:00 -07:00
  • b3541d10e1 chore: Use smaller merge buffers. (#74) Stu Hood 2025-10-23 12:57:52 -07:00
  • 7183ac6cbc fix: Use smaller buffers during merging (#71) Stu Hood 2025-10-20 10:02:09 -07:00
  • e0476d2eb2 fix: Add support for bool to the fast field TermSet implementation (#70) Stu Hood 2025-10-16 12:21:04 -07:00
  • 9fe0899934 perf: Implement a TermSet variant which uses fast fields (#69) Stu Hood 2025-10-16 09:18:16 -07:00
  • aaa5abb7d6 chore: Expose a method to create a segment with a particular id (#68) Stu Hood 2025-10-10 11:01:15 -07:00
  • f8b8fd0321 feat: SnippetGenerator accepts limit/offset (#66) Ming 2025-10-01 19:05:41 -04:00
  • cd878a5c90 fix: support MemoryArena allocations up to 4GB (#62) Eric Ridge 2025-09-19 15:08:45 -04:00
  • 30c237e895 perf: various optimizations around arenas (#60) Eric Ridge 2025-08-31 15:58:24 -04:00
  • b6cd39872b fix: Allow zero indexing & merging threads (#59) Eric Ridge 2025-08-11 07:46:37 -04:00
  • c96d801c68 perf: Lazily load in BitpackedCodec (#56) Stu Hood 2025-07-23 12:46:20 -07:00
  • 7a13e0294d Avoid copying into OwnedBytes when opening a fast field column Dictionary. (#55) Stu Hood 2025-07-14 06:59:43 -07:00
  • 20d00701ee perf: lazily open positions file (#54) Eric Ridge 2025-06-27 13:47:10 -04:00
  • 526afc6111 chore: internal API visibility adjustments (#53) Eric Ridge 2025-06-18 11:25:42 -04:00
  • f9e4a8413b make the directory BufWriter capacity configurable (#52) Ming 2025-06-17 17:01:26 -04:00
  • 58124bb164 changes to make merging work (#48) Ming 2025-06-11 16:46:00 -04:00
  • 176f7e852a perf: remove general overhead during segment merging (#47) Eric Ridge 2025-06-09 13:57:05 -04:00
  • cfa5f94114 chore: Make some delete-related functions public (#46) Ming 2025-06-09 09:50:26 -04:00
  • 5e449e7dda feat: SnippetGenerator can handle JSON fields (#42) Ming 2025-06-08 16:54:02 -04:00
  • 1617459b01 Expose some methods which are necessary to create a streaming version of sorted_ords_to_term_cb. (#43) Stu Hood 2025-05-28 13:36:09 -07:00
  • 0e1a7e213e chore: allow merge_foreground to ignore the store (#40) Ming 2025-04-28 16:04:01 -04:00
  • b0660ba196 chore: make some structs pub (#39) Ming 2025-04-28 12:06:25 -04:00
  • 936d6af471 feat: ability to directly merge segments in the foregound (#36) Eric Ridge 2025-04-01 13:49:55 -04:00
  • 2560de3a01 feat: IndexWriter::wait_merging_threads() return Err on merge failure (#34) Eric Ridge 2025-03-24 10:50:16 -04:00
  • 75a8384c2b feat: remove Directory::reconsider_merge_policy() and add other niceties to Directory API (#33) Eric Ridge 2025-03-07 14:37:58 -05:00
  • 5b6da9123c feat: introduce a MergeOptimizedInvertedIndexReader (#32) Eric Ridge 2025-03-04 15:42:53 -05:00
  • 8b7db36c99 feat: Add Directory::wants_cancel() function (#31) Eric Ridge 2025-02-28 10:07:48 -05:00