rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-05 20:42:54 +00:00

Author	SHA1	Message	Date
Andreas Scherbaum	5c6d78d469	Rename "zenith" to "neon" (#6957 ) Usually RFC documents are not modified, but the vast mentions of "zenith" in early RFC documents make it desirable to update the product name to today's name, to avoid confusion. ## Problem Early RFC documents use the old "zenith" product name a lot, which is not something everyone is aware of after the product was renamed. ## Summary of changes Replace occurrences of "zenith" with "neon". Images are excluded. --------- Co-authored-by: Andreas Scherbaum <andreas@neon.tech>	2024-03-04 13:02:18 +01:00
Christian Schwarz	3fd77eb0d4	layer file creation: remove redundant fsync()s (#6983 ) The `writer.finish()` methods already fsync the inode, using `VirtualFile::sync_all()`. All that the callers need to do is fsync their directory, i.e., the timeline directory. Note that there's a call in the new compaction code that is apparently dead-at-runtime, so, I couldn't fix up any fsyncs there [Link](`502b69b33b/pageserver/src/tenant/timeline/compaction.rs (L204-L211)`). Note that layer durability still matters somewhat, even after #5198 which made remote storage authoritative. We do have the layer file length as an indicator, but no checksums on the layer file contents. So, a series of overwrites without fsyncs in the middle, plus a subsequent crash, could cause us to end up in a state where the file length matches but the contents are garbage. part of https://github.com/neondatabase/neon/issues/6663	2024-03-04 12:33:42 +01:00
Anna Khanova	3114be034a	proxy: change is cold start to enum (#6948 ) ## Problem Actually it's good idea to distinguish between cases when it's a cold start, but we took the compute from the pool ## Summary of changes Updated to enum.	2024-03-04 10:31:28 +01:00
John Spray	8dc7dc79dd	tests: debugging for `test_secondary_downloads` failures (#6984 ) ## Problem - #6966 - Existing logs aren't pointing to a cause: it looks like heatmap upload and download are happening, but for some reason the evicted layer isn't removed on the secondary location. ## Summary of changes - Assert evicted layer is gone from heatmap before checking its gone from local disk: this will give clarity on whether the issue is with the uploads or downloads. - On assertion failures, log the contents of heatmap.	2024-03-04 09:10:04 +00:00
John Spray	fad9be4598	pageserver: mention key in walredo errors (#6988 ) ## Problem - Walredo errors, e.g. during image creation, mention the LSN affected but not the key. ## Summary of changes - Add key to "error applying ... WAL records" log message	2024-03-04 08:56:55 +00:00
John Spray	20d0939b00	control_plane/attachment_service: implement PlacementPolicy::Secondary, configuration updates (#6521 ) During onboarding, the control plane may attempt ad-hoc creation of a secondary location to facilitate live migration. This gives us two problems to solve: - Accept 'Secondary' mode in /location_config and use it to put the tenant into secondary mode on some physical pageserver, then pass through /tenant/xyz/secondary/download requests - Create tenants with no generation initially, since the initial `Secondary` mode call will not provide us a generation. This PR also fixes modification of a tenant's TenantConf during /location_conf, which was previously ignored, and refines the flow for config modification: - avoid bumping generations when the only reason we're reconciling an attached location is a config change - increment TenantState.sequence when spawning a reconciler: usually schedule() does this, but when we do config changes that doesn't happen, so without this change waiters would think reconciliation was done immediately. `sequence` is a bit of a murky thing right now, as it's dual-purposed for tracking waiters, and for checking if an existing reconciliation is already making updates to our current sequence. I'll follow up at some point to clarify it's purpose. - test config modification at the end of onboarding test	2024-03-01 20:25:53 +00:00
Alex Chi Z	ea0d35f3ca	neon_local: improved docs and fix wrong connstr (#6954 ) The user created with the `--create-test-user` flag is `test` instead of `user`. ref https://github.com/neondatabase/neon/pull/6848 Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-03-01 14:54:07 -05:00
John Spray	e34059cd18	pageserver: increase DEFAULT_MAX_WALRECEIVER_LSN_WAL_LAG (#6970 ) ## Problem At high ingest rates, pageservers spuriously disconnect from safekeepers because stats updates don't come in frequently enough to keep the broker/safekeeper LSN delta under the wal lag limit. ## Summary of changes - Increase DEFAULT_MAX_WALRECEIVER_LSN_WAL_LAG from 10MiB to 1GiB. This should be enough for realistic per-timeline throughputs.	2024-03-01 16:49:37 +00:00
John Spray	d999c46692	pageserver: handle temp_download files in secondary locations (#6990 ) ## Problem PR #6837 fixed secondary locations to avoid spamming log warnings on temp files, but we also have ".temp_download" files to consider. ## Summary of changes - Give temp_download files the same behavior as temp files. - Refactor the relevant helper to pub(crate) from pub	2024-03-01 16:19:40 +00:00
Arpad Müller	82853cc1d1	Fix warnings and compile errors on nightly (#6886 ) Nightly has added a bunch of compiler and linter warnings. There is also two dependencies that fail compilation on latest nightly due to using the old `stdsimd` feature name. This PR fixes them.	2024-03-01 17:14:19 +01:00
Vlad Lazar	1efaa16260	test: add test for checkpoint timeout flushing (#6950 ) ## Problem https://github.com/neondatabase/neon/pull/6661 changed the layer flushing logic and led to OOMs in staging. The issue turned out to be holding on to in-memory layers for too long. After OOMing we'd need to replay potentially a lot of WAL. ## Summary of changes Test that open layers get flushed after the `checkpoint_timeout` config and do not require WAL reingest upon restart. The workload creates a number of timelines and writes some data to each, but not enough to trigger flushes via the `checkpoint_distance` config. I ran this test against https://github.com/neondatabase/neon/pull/6661 and it was indeed failing.	2024-03-01 14:43:33 +00:00
Bodobolero	4dbb74b559	new test for LFC stats in explain (#6968 ) ## Problem PR https://github.com/neondatabase/neon/pull/6851 implemented new output in PostgreSQL explain. this is a test case for the new function. ## Summary of changes ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [no ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [no] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2024-03-01 14:33:08 +00:00
Joonas Koivunen	5ab10d051d	metrics: record more details of the responding (#6979 ) On eu-west-1 during benchmarks we sometimes lose samples. Add more time measurements.	2024-03-01 14:04:39 +00:00
John Spray	f8bdce1015	pageserver: fix duplicate shard_id in span (#6981 ) ## Problem shard_id in span is repeated: - https://github.com/neondatabase/neon/issues/6723 Closes: #6723 ## Summary of changes - Only add shard_id to the span when fetching a cached timeline, as it is already added when loading an uncached timeline.	2024-03-01 13:26:45 +00:00
Bodobolero	7ba50708e3	Testcase for neon extension function approximate_working_set_size() (#6980 ) ## Problem PR https://github.com/neondatabase/neon/pull/6935 introduced a new function in neon extension: approximate_working_set_size This test case verifies its working correctly. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-03-01 13:29:08 +01:00
Christian Schwarz	e9e77ee744	tests: add optional cursor to `log_contains` + fix truthiness issues in callers (#6960 ) Extracted from https://github.com/neondatabase/neon/pull/6953 Part of https://github.com/neondatabase/neon/issues/5899 Core Change ----------- In #6953, we need the ability to scan the log _after_ a specific line and ignore anything before that line. This PR changes `log_contains` to returns a tuple of `(matching line, cursor)`. Hand that cursor to a subsequent `log_contains` call to search the log for the next occurrence of the pattern. Other Changes ------------- - Inspect all the callsites of `log_contains` to handle the new tuple return type. - Above inspection unveiled many callers aren't using `assert log_contains(...) is not None` but some weaker version of the code that breaks if `log_contains` ever returns a not-None but falsy value. Fix that. - Above changes unveiled that `test_remote_storage_upload_queue_retries` was using `wait_until` incorrectly; after fixing the usage, I had to raise the `wait_until` timeout. So, maybe this will fix its flakiness.	2024-03-01 10:45:39 +01:00
Joonas Koivunen	ee93700a0f	dube: timeout individual layer evictions, log progress and record metrics (#6131 ) Because of bugs evictions could hang and pause disk usage eviction task. One such bug is known and fixed #6928. Guard each layer eviction with a modest timeout deeming timeouted evictions as failures, to be conservative. In addition, add logging and metrics recording on each eviction iteration: - log collection completed with duration and amount of layers - per tenant collection time is observed in a new histogram - per tenant layer count is observed in a new histogram - record metric for collected, selected and evicted layer counts - log if eviction takes more than 10s - log eviction completion with eviction duration Additionally remove dead code for which no dead code warnings appeared in earlier PR. Follow-up to: #6060.	2024-02-29 20:54:16 +00:00
Christian Schwarz	502b69b33b	refactor(compaction): `RequestContext` shouldn't be `Clone`, only `RequestContextAdaptor` uses it (#6961 ) Extracted from https://github.com/neondatabase/neon/pull/6953 Part of https://github.com/neondatabase/neon/issues/5899	2024-02-29 19:50:23 +00:00
Alex Chi Z	76ab57f33f	test: disable test_superuser on pg15 (#6972 ) ref https://github.com/neondatabase/neon/issues/6969 Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-02-29 18:51:15 +00:00
Vlad Lazar	5984edaecd	libs: fix expired token in auth decode test (#6963 ) The test token expired earlier today (1709200879). I regenerated the token, but without an expiration date this time.	2024-02-29 13:55:38 +00:00
Konstantin Knizhnik	3eb83a0ebb	Provide appoximation of working set using hyper-log-log algorithm in LFC (#6935 ) ## Summary of changes Calculate number of unique page accesses at compute. It can be used to estimate working set size and adjust cache size (shared_buffers or local file cache). Approximation is made using HyperLogLog algorithm. It is performed by local file cache and so is available only when local file cache is enabled. This calculation doesn't take in account access to the pages present in shared buffers, but includes pages available in local file cache. This information can be retrieved using approximate_working_set_size(reset bool) function from neon extension. reset parameter can be used to reset statistic and so collect unique accesses for the particular interval. Below is an example of estimating working set size after pgbench -c 10 -S -T 100 -s 10: ``` postgres=# select approximate_working_set_size(false); approximate_working_set_size ------------------------------ 19052 (1 row) postgres=# select pg_table_size('pgbench_accounts')/8192; ?column? ---------- 16402 (1 row) ``` ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-02-29 15:54:58 +02:00
Joonas Koivunen	4d426f6fbe	feat: support lazy, queued tenant attaches (#6907 ) Add off-by-default support for lazy queued tenant activation on attach. This should be useful on bulk migrations as some tenants will be activated faster due to operations or endpoint startup. Eventually all tenants will get activated by reusing the same mechanism we have at startup (`PageserverConf::concurrent_tenant_warmup`). The difference to lazy attached tenants to startup ones is that we leave their initial logical size calculation be triggered by WalReceiver or consumption metrics. Fixes: #6315 Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-02-29 13:26:29 +02:00
John Spray	d04af08567	control_plane: storage controller secrets by env (#6952 ) ## Problem Sometimes folks prefer not to expose secrets as CLI args. ## Summary of changes - Add ability to load secrets from environment variables. We can eventually remove the AWS SM code path here if nobody is using it -- we don't need to maintain three ways to load secrets.	2024-02-29 10:00:01 +00:00
Alexander Bayandin	54586d6b57	CI: create compute-tools image from compute-node image (#6899 ) ## Problem We build compute-tools binary twice — in `compute-node` and in `compute-tools` jobs, and we build them slightly differently: - `cargo build --locked --profile release-line-debug-size-lto` (previously in `compute-node`) - `mold -run cargo build -p compute_tools --locked --release` (previously in `compute-tools`) Before: - compute-node: 6m 34s - compute-tools (as a separate job): 7m 47s After: - compute-node: 7m 34s - compute-tools (as a separate step, within compute-node job): 5s ## Summary of changes - Move compute-tools image creation to `Dockerfile.compute-node` - Delete `Dockerfile.compute-tools`	2024-02-28 15:24:35 +00:00
John Spray	e5384ebefc	pageserver: accelerate tenant activation on HTTP API timeline read requests (#6944 ) ## Problem Callers of the timeline creation API may issue timeline GETs ahead of creation to e.g. check if their intended timeline already exists, or to learn the LSN of a parent timeline. Although the timeline creation API already triggers activation of a timeline if it's currently waiting to activate, the GET endpoint doesn't, so such callers will encounter 503 responses for several minutes after a pageserver restarts, while tenants are lazily warming up. The original scope of which APIs will activate a timeline was quite small, but really it makes sense to do it for any API that needs a particular timeline to be active. ## Summary of changes - In the timeline detail GET handler, use wait_to_become_active, which triggers immediate activation of a tenant if it was currently waiting for the warmup semaphore, then waits up to 5 seconds for the activation to complete. If it doesn't complete promptly, we return a 503 as before. - Modify active_timeline_for_active_tenant to also use wait_to_become_active, which indirectly makes several other timeline-scope request handlers fast-activate a tenant when called. This is important because a timeline creation flow could also use e.g. get_lsn_for_timestamp as a precursor to creating a timeline. - There is some risk to this change: an excessive number of timeline GET requests could cause too many tenant activations to happen at the same time, leading to excessive queue depth to the S3 client. However, this was already the case for e.g. many concurrent timeline creations.	2024-02-28 14:53:35 +00:00
Alexander Bayandin	60a232400b	CI(pin-build-tools-image): pass secrets to the job (#6949 ) ## Problem `pin-build-tools-image` job doesn't have access to secrets and thus fails. Missed in the original PR[0] - [0] https://github.com/neondatabase/neon/pull/6795 ## Summary of changes - pass secrets to `pin-build-tools-image` job	2024-02-28 14:36:17 +00:00
Andreas Scherbaum	edd809747b	English keyboard has "z" and "y" switched (#6947 ) ## Problem The "z" and "y" letters are switched on the English keyboard, and I'm used to a German keyboard. Very embarrassing. ## Summary of changes Fix syntax error in README Co-authored-by: Andreas Scherbaum <andreas@neon.tech>	2024-02-28 14:10:58 +01:00
Conrad Ludgate	48957e23b7	proxy: refactor span usage (#6946 ) ## Problem Hard to find error reasons by endpoint for HTTP flow. ## Summary of changes I want all root spans to have session id and endpoint id. I want all root spans to be consistent.	2024-02-28 17:10:07 +04:00
Alexander Bayandin	1d5e476c96	CI: use build-tools image from dockerhub (#6795 ) ## Problem Currently, after updating `Dockerfile.build-tools` in a PR, it requires a manual action to make it `pinned`, i.e., the default for everyone. It also makes all opened PRs use such images (even created in the PR and without such changes). This PR overhauls the way we build and use `build-tools` image (and uses the image from Docker Hub). ## Summary of changes - The `neondatabase/build-tools` image gets tagged with the latest commit sha for the `Dockerfile.build-tools` file - Each PR calculates the tag for `neondatabase/build-tools`, tries to pull it, and rebuilds the image with such tag if it doesn't exist. - Use `neondatabase/build-tools` as a default image - When running on `main` branch — create a `pinned` tag and push it to ECR - Use `concurrency` to ensure we don't build `build-tools` image for the same commit in parallel from different PRs	2024-02-28 12:38:11 +00:00
Vlad Lazar	2b11466b59	pageserver: optimise disk io for vectored get (#6780 ) ## Problem The vectored read path proposed in https://github.com/neondatabase/neon/pull/6576 seems to be functionally correct, but in my testing (see below) it is about 10-20% slower than the naive sequential vectored implementation. ## Summary of changes There's three parts to this PR: 1. Supporting vectored blob reads. This is actually trickier than it sounds because on disk blobs are prefixed with a variable length size header. Since the blobs are not necessarily fixed size, we need to juggle the offsets such that the callers can retrieve the blobs from the resulting buffer. 2. Merge disk read requests issued by the vectored read path up to a maximum size. Again, the merging is complicated by the fact that blobs are not fixed size. We keep track of the begin and end offset of each blob and pass them into the vectored blob reader. In turn, the reader will return a buffer and the offsets at which the blobs begin and end. 3. A benchmark for basebackup requests against tenant with large SLRU block counts is added. This required a small change to pagebench and a new config variable for the pageserver which toggles the vectored get validation. We can probably optimise things further by adding a little bit of concurrency for our IO. In principle, it's as simple as spawning a task which deals with issuing IO and doing the serialisation and handling on the parent task which receives input via a channel.	2024-02-28 12:06:00 +00:00
Christian Schwarz	b6bd75964f	Revert "pageserver: roll open layer in timeline writer (#6661 )" + PR #6842 (#6938 ) This reverts commits `587cb705b8` (PR #6661) and `fcbe9fb184` (PR #6842). Conflicts: pageserver/src/tenant.rs pageserver/src/tenant/timeline.rs The conflicts were with * pageserver: adjust checkpoint distance for sharded tenants (#6852) * pageserver: add vectored get implementation (#6576) Also we had to keep the `allowed_errors` to make `test_forward_compatibility` happy, see the PR thread on GitHub for details.	2024-02-28 11:38:23 +00:00
Joonas Koivunen	fcb77f3d8f	build: add a timeout for test-images (#6942 ) normal runtime seems to be 3min, add 20min timeout.	2024-02-28 12:58:13 +02:00
Vlad Lazar	c3a40a06f3	test: wait for storage controller readiness (#6930 ) ## Problem Starting up the pageserver before the storage controller is ready can lead to a round of reconciliation, which leads to the previous tenant being shut down. This disturbs some tests. ## Summary of changes Wait for the storage controller to become ready on neon env start-up. Closes https://github.com/neondatabase/neon/issues/6724	2024-02-28 09:52:22 +00:00
Joonas Koivunen	1b1320a263	fix: allow evicting wanted deleted layers (#6931 ) Not allowing evicting wanted deleted layers is something I've forgotten to implement on #5645. This PR makes it possible to evict such layers, which should reduce the amount of hanging evictions. Fixes: #6928 Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-02-28 00:02:44 +02:00
Konstantin Knizhnik	e1b4d96b5b	Limit number of AUX files deltas to reduce reconstruct time (#6874 ) ## Problem After commit [`840abe3954`] (store AUX files as deltas) we avoid quadratic growth of storage size when storing LR snapshots but get quadratic slowdown of reconstruct time. As a result storing 70k snapshots at my local Neon instance took more than 3 hours and starting node (creation of basecbackup): ~10 minutes. In prod 70k AUX files cause increase of startup time to 40 minutes: https://neondb.slack.com/archives/C03F5SM1N02/p1708513010480179 ## Summary of changes Enforce storing full AUX directory (some analog of FPI) each 1024 files. Time of creation 70k snapshots is reduced to 6 minutes and startup time - to 1.5 minutes (100 seconds). ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-02-27 21:18:46 +02:00
John Spray	a8ec18c0f4	refactor: move storage controller API structs into pageserver_api (#6927 ) ## Problem This is a precursor to adding a convenience CLI for the storage controller. ## Summary of changes - move controller api structs into pageserver_api::controller_api to make them visible to other crates - rename pageserver_api::control_api to pageserver_api::upcall_api to match the /upcall/v1/ naming in the storage controller. Why here rather than a totally separate crate? It's convenient to have all the pageserver-related stuff in one place, and if we ever wanted to move it to a different crate it's super easy to do that later.	2024-02-27 17:24:01 +00:00
Arpad Müller	045bc6af8b	Add new compaction abstraction, simulator, and implementation. (#6830 ) Rebased version of #5234, part of #6768 This consists of three parts: 1. A refactoring and new contract for implementing and testing compaction. The logic is now in a separate crate, with no dependency on the 'pageserver' crate. It defines an interface that the real pageserver must implement, in order to call the compaction algorithm. The interface models things like delta and image layers, but just the parts that the compaction algorithm needs to make decisions. That makes it easier unit test the algorithm and experiment with different implementations. I did not convert the current code to the new abstraction, however. When compaction algorithm is set to "Legacy", we just use the old code. It might be worthwhile to convert the old code to the new abstraction, so that we can compare the behavior of the new algorithm against the old one, using the same simulated cases. If we do that, have to be careful that the converted code really is equivalent to the old. This inclues only trivial changes to the main pageserver code. All the new code is behind a tenant config option. So this should be pretty safe to merge, even if the new implementation is buggy, as long as we don't enable it. 2. A new compaction algorithm, implemented using the new abstraction. The new algorithm is tiered compaction. It is inspired by the PoC at PR #4539, although I did not use that code directly, as I needed the new implementation to fit the new abstraction. The algorithm here is less advanced, I did not implement partial image layers, for example. I wanted to keep it simple on purpose, so that as we add bells and whistles, we can see the effects using the included simulator. One difference to #4539 and your typical LSM tree implementations is how we keep track of the LSM tree levels. This PR doesn't have a permanent concept of a level, tier or sorted run at all. There are just delta and image layers. However, when compaction starts, we look at the layers that exist, and arrange them into levels, depending on their shapes. That is ephemeral: when the compaction finishes, we forget that information. This allows the new algorithm to work without any extra bookkeeping. That makes it easier to transition from the old algorithm to new, and back again. There is just a new tenant config option to choose the compaction algorithm. The default is "Legacy", meaning the current algorithm in 'main'. If you set it to "Tiered", the new algorithm is used. 3. A simulator, which implements the new abstraction. The simulator can be used to analyze write and storage amplification, without running a test with the full pageserver. It can also draw an SVG animation of the simulation, to visualize how layers are created and deleted. To run the simulator: cargo run --bin compaction-simulator run-suite --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-02-27 17:15:46 +01:00
siegerts	c8ac4c054e	readme: Update Neon link URL (#6918 ) ## Problem ## Summary of changes Updates the neon.tech link to point to a /github page in order to correctly attribute visits originating from the repo.	2024-02-27 11:08:43 -05:00
Anna Khanova	896d51367e	proxy: introdice is cold start for analytics (#6902 ) ## Problem Data team cannot distinguish between cold start and not cold start. ## Summary of changes Report `is_cold_start` to analytics. --------- Co-authored-by: Conrad Ludgate <conrad@neon.tech>	2024-02-27 19:53:02 +04:00
Joonas Koivunen	a691786ce2	fix: logical size calculation gating (#6915 ) Noticed that we are failing to handle `Result::Err` when entering a gate for logical size calculation. Audited rest of the gate enters, which seem fine, unified two instances. Noticed that the gate guard allows to remove a failpoint, then noticed that adjacent failpoint was blocking the executor thread instead of using `pausable_failpoint!`, fix both. eviction_task.rs now maintains a gate guard as well. Cc: #4733	2024-02-27 14:27:13 +00:00
Roman Zaynetdinov	2991d01b61	Export connection counts from sql_exporter (#6926 ) ## Problem We want to show connection counts to console users. ## Summary of changes Start exporting connection counts grouped by database name and connection state.	2024-02-27 13:47:05 +00:00
Konstantin Knizhnik	e895644555	Show LFC statistic in EXPLAIN (#6851 ) ## Problem LFC has high impact on Neon application performance but there is no way for user to check efficiency of its usage ## Summary of changes Show LFC statistic in EXPLAIN ANALYZE ## Description Local file cache (LFC) A layer of caching that stores frequently accessed data from the storage layer in the local memory of the Neon compute instance. This cache helps to reduce latency and improve query performance by minimizing the need to fetch data from the storage layer repeatedly. Externalization of LFC in explain output Then EXPLAIN ANALYZE output is extended to display important counts for local file cache (LFC) hits and misses. This works both, for EXPLAIN text and json output. File cache: hits Whenever the Postgres backend retrieves a page/block from SGMR, it is not found in shared buffer but the page is already found in the LFC this counter is incremented. File cache: misses Whenever the Postgres backend retrieves a page/block from SGMR, it is not found in shared buffer and also not in then LFC but the page is retrieved from Neon storage (page server) this counter is incremented. Example (for explain text output) ```sql explain (analyze,buffers,prefetch,filecache) select count(*) from pgbench_accounts; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=214486.94..214486.95 rows=1 width=8) (actual time=5195.378..5196.034 rows=1 loops=1) Buffers: shared hit=178875 read=143691 dirtied=128597 written=127346 Prefetch: hits=0 misses=1865 expired=0 duplicates=0 File cache: hits=141826 misses=1865 -> Gather (cost=214486.73..214486.94 rows=2 width=8) (actual time=5195.366..5196.025 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=178875 read=143691 dirtied=128597 written=127346 Prefetch: hits=0 misses=1865 expired=0 duplicates=0 File cache: hits=141826 misses=1865 -> Partial Aggregate (cost=213486.73..213486.74 rows=1 width=8) (actual time=5187.670..5187.670 rows=1 loops=3) Buffers: shared hit=178875 read=143691 dirtied=128597 written=127346 Prefetch: hits=0 misses=1865 expired=0 duplicates=0 File cache: hits=141826 misses=1865 -> Parallel Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.43..203003.02 rows=4193481 width=0) (actual time=0.574..4928.995 rows=3333333 loops=3) Heap Fetches: 3675286 Buffers: shared hit=178875 read=143691 dirtied=128597 written=127346 Prefetch: hits=0 misses=1865 expired=0 duplicates=0 File cache: hits=141826 misses=1865 ``` The json output uses the following keys and provides integer values for those keys: ``` ... "File Cache Hits": 141826, "File Cache Misses": 1865 ... ``` ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-02-27 14:45:54 +02:00
Christian Schwarz	62d77e263f	test_remote_timeline_client_calls_started_metric: fix flakiness (#6911 ) fixes https://github.com/neondatabase/neon/issues/6889 # Problem The failure in the last 3 flaky runs on `main` is ``` test_runner/regress/test_remote_storage.py:460: in test_remote_timeline_client_calls_started_metric churn("a", "b") test_runner/regress/test_remote_storage.py:457: in churn assert gc_result["layers_removed"] > 0 E assert 0 > 0 ``` That's this code `cd449d66ea/test_runner/regress/test_remote_storage.py (L448-L460)` So, the test expects GC to remove some layers but the GC doesn't. # Fix My impression is that the VACUUM isn't re-using pages aggressively enough, but I can't really prove that. Tried to analyze the layer map dump but it's too complex. So, this PR: - Creates more churn by doing the overwrite twice. - Forces image layer creation. It also drive-by removes the redundant call to timeline_compact, because, timeline_checkpoint already does that internally.	2024-02-27 10:55:10 +01:00
Alex Chi Z	b2bbc20311	fix: only alter default privileges when public schema exists (#6914 ) ## Problem Following up https://github.com/neondatabase/neon/pull/6885, only alter default privileges when the public schema exists. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-02-26 11:48:56 -09:00
Vlad Lazar	5accf6e24a	attachment_service: JWT auth enforcement (#6897 ) ## Problem Attachment service does not do auth based on JWT scopes. ## Summary of changes Do JWT based permission checking for requests coming into the attachment service. Requests into the attachment service must use different tokens based on the endpoint: * `/control` and `/debug` require `admin` scope * `/upcall` requires `generations_api` scope * `/v1/...` requires `pageserverapi` scope Requests into the pageserver from the attachment service must use `pageserverapi` scope.	2024-02-26 18:17:06 +00:00
Andreas Scherbaum	0881d4f9e3	Update README, include cleanup details (#6816 ) ## Problem README.md is missing cleanup instructions ## Summary of changes Add cleanup instructions Add instructions how to handle errors during initialization --------- Co-authored-by: Andreas Scherbaum <andreas@neon.tech>	2024-02-26 18:53:48 +01:00
Alexander Bayandin	975786265c	CI: Delete GitHub Actions caches once PR is closed (#6900 ) ## Problem > Approaching total cache storage limit (9.25 GB of 10 GB Used) > Least recently used caches will be automatically evicted to limit the total cache storage to 10 GB. [Learn more about cache usage.](https://docs.github.com/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy) From https://github.com/neondatabase/neon/actions/caches Some of these caches are from closed/merged PRs. ## Summary of changes - Add a workflow that deletes caches for closed branches	2024-02-26 18:17:22 +01:00
Christian Schwarz	c4059939e6	fixup(#6893 ): report_size() still used pageserver_created_persistent_* metrics (#6909 ) Use the remote_timeline_client metrics instead, they work for layer file uploads and are reasonable close to what the `pageserver_created_persistent_*` metrics were. Should we wait for empty upload queue before calling `report_size()`? part of https://github.com/neondatabase/neon/issues/6737	2024-02-26 17:28:00 +01:00
Bodobolero	75baf83fce	externalize statistics on LFC cache usage (#6906 ) ## Problem Customers should be able to determine the size of their workload's working set to right size their compute. Since Neon uses Local file cache (LFC) instead of shared buffers on bigger compute nodes to cache pages we need to externalize a means to determine LFC hit ratio in addition to shared buffer hit ratio. Currently the following end user documentation `fb7cd3af0e/content/docs/manage/endpoints.md (L137)` is wrong because it describes how to right size a compute node based on shared buffer hit ratio. Note that the existing functionality in extension "neon" is NOT available to end users but only to superuser / cloud_admin. ## Summary of changes - externalize functions and views in neon extension to end users - introduce a new view `NEON_STAT_FILE_CACHE` with the following DDL ```sql CREATE OR REPLACE VIEW NEON_STAT_FILE_CACHE AS WITH lfc_stats AS ( SELECT stat_name, count FROM neon_get_lfc_stats() AS t(stat_name text, count bigint) ), lfc_values AS ( SELECT MAX(CASE WHEN stat_name = 'file_cache_misses' THEN count ELSE NULL END) AS file_cache_misses, MAX(CASE WHEN stat_name = 'file_cache_hits' THEN count ELSE NULL END) AS file_cache_hits, MAX(CASE WHEN stat_name = 'file_cache_used' THEN count ELSE NULL END) AS file_cache_used, MAX(CASE WHEN stat_name = 'file_cache_writes' THEN count ELSE NULL END) AS file_cache_writes, -- Calculate the file_cache_hit_ratio within the same CTE for simplicity CASE WHEN MAX(CASE WHEN stat_name = 'file_cache_misses' THEN count ELSE 0 END) + MAX(CASE WHEN stat_name = 'file_cache_hits' THEN count ELSE 0 END) = 0 THEN NULL ELSE ROUND((MAX(CASE WHEN stat_name = 'file_cache_hits' THEN count ELSE 0 END)::DECIMAL / (MAX(CASE WHEN stat_name = 'file_cache_hits' THEN count ELSE 0 END) + MAX(CASE WHEN stat_name = 'file_cache_misses' THEN count ELSE 0 END))) * 100, 2) END AS file_cache_hit_ratio FROM lfc_stats ) SELECT file_cache_misses, file_cache_hits, file_cache_used, file_cache_writes, file_cache_hit_ratio from lfc_values; ``` This view can be used by an end user as follows: ```sql CREATE EXTENSION NEON; SELECT * from neon. NEON_STAT_FILE_CACHE" ``` The output looks like the following: ``` select * from NEON_STAT_FILE_CACHE; file_cache_misses \| file_cache_hits \| file_cache_used \| file_cache_writes \| file_cache_hit_ratio -------------------+-----------------+-----------------+-------------------+---------------------- 2133643 \| 108999742 \| 607 \| 10767410 \| 98.08 (1 row) ``` ## Checklist before requesting a review - [x ] I have performed a self-review of my code. - [x ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2024-02-26 16:06:00 +00:00
Roman Zaynetdinov	459c2af8c1	Expose LFC cache size limit from sql_exporter (#6912 ) ## Problem We want to report how much cache was used and what the limit was. ## Summary of changes Added one more query to sql_exporter to expose `neon.file_cache_size_limit`.	2024-02-26 10:36:11 -05:00

1 2 3 4 5 ...

4767 Commits