Paul Masurel
f88b7200b2
Optimization when posting lists are saturated. ( #2745 )
...
* Optimization when posting lists are saturated.
If a posting list's doc freq equals the segment reader's
max_doc, and if scoring does not matter, we can replace it
with an AllScorer.
In turn, in a boolean query, we can dismiss AllScorers and
EmptyScorers to accelerate the request.
* Added range query optimization
* CR comment
* CR comments
* CR comment
---------
Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2025-11-26 15:50:57 +01:00
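The saturated posting list optimization in the commit above can be illustrated with a small standalone sketch. It does not use tantivy's real types: the `Scorer` enum and the functions `term_scorer`, `simplify_intersection` and `simplify_union` are hypothetical names introduced only for this example, and the exact simplification rules applied to boolean queries in the actual change may differ.

```rust
/// Hypothetical stand-in for a scorer (not tantivy's actual types).
#[derive(Debug, Clone, PartialEq)]
enum Scorer {
    /// Matches every document of the segment.
    All,
    /// Matches no document.
    Empty,
    /// Matches the documents of a posting list (sorted doc ids).
    Postings(Vec<u32>),
}

/// Builds a scorer for a term. When the posting list is saturated
/// (its doc freq equals `max_doc`) and scores are not needed, the
/// posting list is replaced by `Scorer::All`.
fn term_scorer(postings: Vec<u32>, max_doc: u32, scoring_enabled: bool) -> Scorer {
    if postings.is_empty() {
        return Scorer::Empty;
    }
    if !scoring_enabled && postings.len() as u32 == max_doc {
        return Scorer::All;
    }
    Scorer::Postings(postings)
}

/// Simplifies the clauses of a conjunction (AND), assuming scores are
/// not needed and `clauses` is non-empty: an `Empty` clause makes the
/// whole result empty, and `All` clauses are neutral and can be dropped.
fn simplify_intersection(clauses: Vec<Scorer>) -> Vec<Scorer> {
    if clauses.iter().any(|s| *s == Scorer::Empty) {
        return vec![Scorer::Empty];
    }
    let kept: Vec<Scorer> = clauses.into_iter().filter(|s| *s != Scorer::All).collect();
    if kept.is_empty() { vec![Scorer::All] } else { kept }
}

/// Simplifies the clauses of a disjunction (OR), assuming scores are
/// not needed and `clauses` is non-empty: an `All` clause makes the
/// whole result match everything, and `Empty` clauses can be dropped.
fn simplify_union(clauses: Vec<Scorer>) -> Vec<Scorer> {
    if clauses.iter().any(|s| *s == Scorer::All) {
        return vec![Scorer::All];
    }
    let kept: Vec<Scorer> = clauses.into_iter().filter(|s| *s != Scorer::Empty).collect();
    if kept.is_empty() { vec![Scorer::Empty] } else { kept }
}
```

With these rules, a conjunction of a saturated term and a selective term only has to iterate the selective posting list, which is where the speed-up would come from.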
PSeitz-dd
203751f2fe
Optimize ExistsQuery for a high number of dynamic columns ( #2694 )
...
* Optimize ExistsQuery for a high number of dynamic columns
The previous algorithm checked _each_ doc in _each_ column for
existence. This is very costly on JSON fields with, e.g., 100k columns.
Compute a bitset instead if we have more than one column.
Add `iter_docs` to the multivalued_index.
* add benchmark
subfields=1
exists_json_union Memory: 89.3 KB (+2.01%) Avg: 0.4865ms (-26.03%) Median: 0.4865ms (-26.03%) [0.4865ms .. 0.4865ms]
subfields=2
exists_json_union Memory: 68.1 KB Avg: 1.7048ms (-0.46%) Median: 1.7048ms (-0.46%) [1.7048ms .. 1.7048ms]
subfields=3
exists_json_union Memory: 61.8 KB Avg: 2.0742ms (-2.22%) Median: 2.0742ms (-2.22%) [2.0742ms .. 2.0742ms]
subfields=4
exists_json_union Memory: 119.8 KB (+103.44%) Avg: 3.9500ms (+42.62%) Median: 3.9500ms (+42.62%) [3.9500ms .. 3.9500ms]
subfields=5
exists_json_union Memory: 120.4 KB (+107.65%) Avg: 3.9610ms (+20.65%) Median: 3.9610ms (+20.65%) [3.9610ms .. 3.9610ms]
subfields=6
exists_json_union Memory: 120.6 KB (+107.49%) Avg: 3.8903ms (+3.11%) Median: 3.8903ms (+3.11%) [3.8903ms .. 3.8903ms]
subfields=7
exists_json_union Memory: 120.9 KB (+106.93%) Avg: 3.6220ms (-16.22%) Median: 3.6220ms (-16.22%) [3.6220ms .. 3.6220ms]
subfields=8
exists_json_union Memory: 121.3 KB (+106.23%) Avg: 4.0981ms (-15.97%) Median: 4.0981ms (-15.97%) [4.0981ms .. 4.0981ms]
subfields=16
exists_json_union Memory: 123.1 KB (+103.09%) Avg: 4.3483ms (-92.26%) Median: 4.3483ms (-92.26%) [4.3483ms .. 4.3483ms]
subfields=256
exists_json_union Memory: 204.6 KB (+19.85%) Avg: 3.8874ms (-99.01%) Median: 3.8874ms (-99.01%) [3.8874ms .. 3.8874ms]
subfields=4096
exists_json_union Memory: 2.0 MB Avg: 3.5571ms (-99.90%) Median: 3.5571ms (-99.90%) [3.5571ms .. 3.5571ms]
subfields=65536
exists_json_union Memory: 28.3 MB Avg: 14.4417ms (-99.97%) Median: 14.4417ms (-99.97%) [14.4417ms .. 14.4417ms]
subfields=262144
exists_json_union Memory: 113.3 MB Avg: 66.2860ms (-99.95%) Median: 66.2860ms (-99.95%) [66.2860ms .. 66.2860ms]
* rename methods
2025-09-16 18:21:03 +02:00
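The difference between the per-document check and the bitset approach described in the commit above can be sketched as follows. This is a minimal standalone illustration, not the tantivy implementation: the `Column` struct, its `has_value` method, and the functions `exists_per_doc` and `exists_bitset` are hypothetical, and only the name `iter_docs` is borrowed from the commit message.

```rust
/// Hypothetical column: the sorted ids of the documents that have a value.
struct Column {
    docs_with_value: Vec<u32>,
}

impl Column {
    /// Returns true if `doc` has a value in this column.
    fn has_value(&self, doc: u32) -> bool {
        self.docs_with_value.binary_search(&doc).is_ok()
    }

    /// Iterates over the documents that have a value in this column.
    fn iter_docs(&self) -> impl Iterator<Item = u32> + '_ {
        self.docs_with_value.iter().copied()
    }
}

/// Per-document approach: probes every column for every document,
/// so the work grows with `max_doc * columns.len()`.
fn exists_per_doc(columns: &[Column], max_doc: u32) -> Vec<u32> {
    (0..max_doc)
        .filter(|&doc| columns.iter().any(|c| c.has_value(doc)))
        .collect()
}

/// Bitset approach: each column is scanned once and its documents are
/// OR-ed into a shared bitset. Assumes every doc id is below `max_doc`.
fn exists_bitset(columns: &[Column], max_doc: u32) -> Vec<u32> {
    let mut words = vec![0u64; (max_doc as usize + 63) / 64];
    for column in columns {
        for doc in column.iter_docs() {
            words[(doc / 64) as usize] |= 1u64 << (doc % 64);
        }
    }
    // Read the matching documents back in increasing doc id order.
    (0..max_doc)
        .filter(|&doc| words[(doc / 64) as usize] & (1u64 << (doc % 64)) != 0)
        .collect()
}
```

The bitset costs a fixed `max_doc / 8` bytes up front, which is consistent with the higher memory figures for small subfield counts in the benchmark, while the work no longer multiplies the number of documents by the number of columns.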
PSeitz
945af922d1
clippy ( #2661 )
...
* clippy
* use readable version
---------
Co-authored-by: Pascal Seitz <pascal.seitz@datadoghq.com>
2025-07-02 11:25:03 +02:00
Remi Dettai
71cf19870b
Exist queries match subpath fields ( #2558 )
...
* Exist queries match subpath fields
* Make subpath check optional
* Add async subpath listing
2025-01-06 10:17:39 +01:00
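One way to picture the subpath matching from the commit above is as a prefix check on column names. The sketch below is an illustration under stated assumptions: the function `matching_columns`, the `include_subpaths` flag, and the use of `.` as the path separator are all hypothetical and are not taken from the tantivy code.

```rust
/// Returns the column names selected by an exists query on `field`.
/// With `include_subpaths`, a column also matches when its name is
/// `field` followed by the separator and a subpath.
/// Assumption: paths are joined with '.', which may not be the
/// separator used internally.
fn matching_columns<'a>(
    columns: &[&'a str],
    field: &str,
    include_subpaths: bool,
) -> Vec<&'a str> {
    columns
        .iter()
        .copied()
        .filter(|name| {
            *name == field
                || (include_subpaths
                    && name
                        .strip_prefix(field)
                        .map_or(false, |rest| rest.starts_with('.')))
        })
        .collect()
}
```

Checking for the separator after the prefix keeps a column such as `attrs_v2` from matching a query on `attrs`, while `attrs.user` and `attrs.user.id` both match.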
PSeitz
1b4076691f
refactor fast field query ( #2452 )
...
In preparation for #2023 and #1709
* Use Term to pass parameters
* merge u64 and ip fast field range query
Side note: I did not rename range_query_u64_fastfield, because then git can't track the changes.
2024-07-15 18:08:05 +08:00
PSeitz
74940e9345
clippy ( #2349 )
...
* fix clippy
* fix clippy
* fix duplicate imports
2024-04-09 07:54:44 +02:00
PSeitz
48630ceec9
move into new index module ( #2259 )
...
move core modules to index module
2024-01-31 10:30:04 +01:00
Igor Motov
19325132b7
Fast-field based implementation of ExistsQuery ( #2160 )
...
Adds an implementation of ExistsQuery that takes advantage of fast fields.
Fixes #2159
2023-09-07 11:51:49 +09:00