Fix some mistypings

## Why do we need to include more information in EXPLAIN?

Neon contains two components, prefetch and the LFC (local file cache), which may have a critical impact on query performance.
Both try to compensate for the relatively large round trip between compute and page server (much larger than the average access time of modern SSDs).
This is why Neon can provide comparable performance only if the whole data set is present at the local node.

Certainly the fastest case of accessing data in Postgres is when it is present in the Postgres cache (shared buffers).
Unfortunately, the size of shared buffers cannot be changed on the fly: it requires a Postgres restart, which is not acceptable for autoscaling.
This is why we have relatively small shared buffers and a dynamically resized local file cache (LFC).
It is intended that the LFC fits in memory, although it can improve performance even if data is read from local disk.
See https://neondb.slack.com/archives/C03QLRH7PPD/p1718714926044699
To minimize the in-memory footprint of the LFC and to improve sequential scan, the LFC uses chunks whose size is larger than the Postgres page size (right now the chunk size is 1MB).

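As a rough illustration of the sizing involved, here is a sketch that assumes the LFC limit is exposed through the `neon.file_cache_size_limit` GUC mentioned in the statistics section below; it is not meant as the canonical way to manage the LFC:

```sql
-- Inspect the current LFC size limit (a GUC provided by Neon).
SHOW neon.file_cache_size_limit;

-- Each 1MB chunk holds 128 pages of 8kB, so, for example, a 1GB LFC
-- corresponds to 1024 chunks and can hold up to 131072 Postgres pages.
```
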
The LFC, like any other cache, is useless after a cold restart. Also, some data sets cannot fit on the local disk.
This is where another approach can help: prefetching. If we are able to predict which pages will be needed soon,
compute can send prefetch requests to the page server before these pages are actually requested by the executor.
Prefetch is also used by Postgres (using `fadvise`), but only for vacuum and bitmap scan. Neon provides prefetch for more execution plan nodes:
sequential scan, index scan (prefetch of referenced heap pages), and index-only scan (prefetch of B-Tree leaves).

As the work of prefetch and the LFC may have a critical impact on query performance, we need to provide this information to the users.
The most convenient and natural way is to include it in EXPLAIN. Two new keywords are added by Neon to the EXPLAIN options: `prefetch` and `filecache`.

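A minimal sketch of how the two options can be combined with a regular EXPLAIN ANALYZE invocation; the table name `t` is just a placeholder, and the exact counters printed depend on the plan nodes involved:

```sql
-- Ask Neon to include prefetch and LFC counters in the plan output.
EXPLAIN (ANALYZE, PREFETCH, FILECACHE)
SELECT count(*) FROM t;
```
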
## prefetch

The following information is available about prefetch:
* `hits` - number of pages which were received from the page server before they were actually requested by the executor. The prefetch distance is controlled by the `effective_io_concurrency` GUC. The larger it is, the better the chances that the page server will be able to complete the request before the page is needed. But it should not be larger than `neon.prefetch_buffer_size` (see the sketch after this list).
* `misses` - number of accessed pages which were not prefetched. Prefetch is not implemented for all plan nodes, and even for those nodes for which it is implemented (e.g. sequential scan) some mispredictions are possible. Please note that `hits + misses != accessed pages`: if a prefetch request for a page was issued but had not completed before the page was requested, such an access is counted neither as a prefetch hit nor as a miss.
* `expired` - a page can be updated by a backend after the prefetch request was sent to the page server, or the result of a prefetch may simply not be used because the executor does not need this page (for example, because of a `LIMIT` clause in the query). In both cases such requests are considered expired.
* `duplicates` - multiple prefetch requests for the same page. For some nodes, predicting the next pages is trivial, e.g. for sequential scan. But in the case of an index scan we need to prefetch the referenced heap pages, and index entries can certainly have multiple references to the same heap page. Such non-unique prefetch requests are considered duplicates.

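A hedged sketch of tuning the prefetch distance and checking its effect on these counters; the table `t`, its column `value`, and the chosen setting are illustrative placeholders:

```sql
-- The prefetch distance is bounded by neon.prefetch_buffer_size, so check it first.
SHOW neon.prefetch_buffer_size;

-- Increase the prefetch distance for this session and compare the counters.
SET effective_io_concurrency = 32;
EXPLAIN (ANALYZE, PREFETCH) SELECT * FROM t WHERE value > 42;
```
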
## filecache

The following information is available about the file cache (LFC):
* `hits` - number of accessed pages found in the LFC.
* `misses` - number of accessed pages not found in the LFC.

# LFC statistics

While the `filecache` option of the EXPLAIN command provides information about LFC usage in a particular query, global statistics about LFC usage are also available.
They are provided by the `neon` extension.

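If the `neon` extension is not already installed in the target database, the views described below would typically be created with the standard extension mechanism (a sketch; on managed Neon computes the extension may already be preinstalled):

```sql
CREATE EXTENSION IF NOT EXISTS neon;
```
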
## `neon_lfc_stats` view

The following keys are provided:
* `file_cache_used` - number of chunks used in the LFC
* `file_cache_writes` - number of pages written to the LFC
* `file_cache_size` - current cache size in chunks (cannot be larger than `neon.file_cache_size_limit`)
* `file_cache_used_pages` - number of used pages. As not all pages of a chunk may be filled with data, it can be smaller than `file_cache_used*128` (128 is the number of 8kB pages in a 1MB chunk; see the query sketch after this list)
* `file_cache_evicted_pages` - number of pages evicted from the LFC because the working set doesn't fit in the LFC
* `file_cache_limit` - current limit of the LFC size (in chunks)

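A hedged query sketch for reading these counters; the column names `lfc_key` and `lfc_value` are an assumption about the view's layout, since only the key names are documented here:

```sql
-- Dump all LFC counters.
SELECT * FROM neon_lfc_stats;

-- Average fill of the used chunks (assumed lfc_key/lfc_value columns).
SELECT (SELECT lfc_value FROM neon_lfc_stats WHERE lfc_key = 'file_cache_used_pages')::numeric
     / NULLIF((SELECT lfc_value FROM neon_lfc_stats WHERE lfc_key = 'file_cache_used') * 128, 0)
       AS chunk_fill_ratio;
```
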
## `local_cache` view